From 6a72cfd2875e4bc003332bc19b9a50fdbceac593 Mon Sep 17 00:00:00 2001
From: Aaron Raimist
Date: Wed, 1 May 2019 13:00:06 -0500
Subject: Convert olm.rst to markdown

Signed-off-by: Aaron Raimist
---
 docs/olm.md | 328 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 328 insertions(+)
 create mode 100644 docs/olm.md

(limited to 'docs/olm.md')

diff --git a/docs/olm.md b/docs/olm.md
new file mode 100644
index 0000000..e9bb4ae
--- /dev/null
+++ b/docs/olm.md
@@ -0,0 +1,328 @@
+# Olm: A Cryptographic Ratchet
+
+An implementation of the double cryptographic ratchet described by
+https://whispersystems.org/docs/specifications/doubleratchet/.
+
+## Notation
+
+This document uses $`\parallel`$ to represent string concatenation. When
+$`\parallel`$ appears on the right-hand side of an $`=`$ it means that
+the inputs are concatenated. When $`\parallel`$ appears on the left-hand
+side of an $`=`$ it means that the output is split.
+
+When this document uses $`ECDH\left(K_A,\,K_B\right)`$ it means that each
+party computes a Diffie-Hellman agreement using their private key and the
+remote party's public key.
+So party $`A`$ computes $`ECDH\left(K_B^{public},\,K_A^{private}\right)`$
+and party $`B`$ computes $`ECDH\left(K_A^{public},\,K_B^{private}\right)`$.
+
+Where this document uses $`HKDF\left(salt,\,IKM,\,info,\,L\right)`$ it
+refers to the [HMAC-based key derivation function][] with a salt value of
+$`salt`$, input key material of $`IKM`$, context string $`info`$,
+and output keying material length of $`L`$ bytes.
+
+## The Olm Algorithm
+
+### Initial setup
+
+The setup takes four [Curve25519][] inputs: identity keys for Alice and Bob,
+$`I_A`$ and $`I_B`$, and one-time keys for Alice and Bob,
+$`E_A`$ and $`E_B`$. A shared secret, $`S`$, is generated using
+[Triple Diffie-Hellman][].
+The initial 256 bit root key, $`R_0`$, and 256
+bit chain key, $`C_{0,0}`$, are derived from the shared secret using an
+HMAC-based Key Derivation Function using [SHA-256][] as the hash function
+([HKDF-SHA-256][]) with default salt and ``"OLM_ROOT"`` as the info.
+
+```math
+\begin{aligned}
+    S&=ECDH\left(I_A,\,E_B\right)\;\parallel\;ECDH\left(E_A,\,I_B\right)\;
+        \parallel\;ECDH\left(E_A,\,E_B\right)\\
+    R_0\;\parallel\;C_{0,0}&=
+        HKDF\left(0,\,S,\,\text{"OLM\_ROOT"},\,64\right)
+\end{aligned}
+```
+
+### Advancing the root key
+
+Advancing a root key takes the previous root key, $`R_{i-1}`$, and two
+Curve25519 inputs: the previous ratchet key, $`T_{i-1}`$, and the current
+ratchet key $`T_i`$. The even ratchet keys are generated by Alice.
+The odd ratchet keys are generated by Bob. A shared secret is generated
+using Diffie-Hellman on the ratchet keys. The next root key, $`R_i`$, and
+chain key, $`C_{i,0}`$, are derived from the shared secret using
+[HKDF-SHA-256][] using $`R_{i-1}`$ as the salt and ``"OLM_RATCHET"`` as the
+info.
+
+```math
+\begin{aligned}
+    R_i\;\parallel\;C_{i,0}&=HKDF\left(
+        R_{i-1},\,
+        ECDH\left(T_{i-1},\,T_i\right),\,
+        \text{"OLM\_RATCHET"},\,
+        64
+    \right)
+\end{aligned}
+```
+
+### Advancing the chain key
+
+Advancing a chain key takes the previous chain key, $`C_{i,j-1}`$. The next
+chain key, $`C_{i,j}`$, is the [HMAC-SHA-256][] of ``"\x02"`` using the
+previous chain key as the key.
+
+```math
+\begin{aligned}
+    C_{i,j}&=HMAC\left(C_{i,j-1},\,\text{"\x02"}\right)
+\end{aligned}
+```
+
+### Creating a message key
+
+Creating a message key takes the current chain key, $`C_{i,j}`$. The
+message key, $`M_{i,j}`$, is the [HMAC-SHA-256][] of ``"\x01"`` using the
+current chain key as the key. The message keys where $`i`$ is even are used
+by Alice to encrypt messages. The message keys where $`i`$ is odd are used
+by Bob to encrypt messages.
+
+```math
+\begin{aligned}
+    M_{i,j}&=HMAC\left(C_{i,j},\,\text{"\x01"}\right)
+\end{aligned}
+```
+
+## The Olm Protocol
+
+### Creating an outbound session
+
+Bob publishes the public parts of his identity key, $`I_B`$, and some
+single-use one-time keys $`E_B`$.
+
+Alice downloads Bob's identity key, $`I_B`$, and a one-time key,
+$`E_B`$. She generates a new single-use key, $`E_A`$, and computes a
+root key, $`R_0`$, and a chain key $`C_{0,0}`$. She also generates a
+new ratchet key $`T_0`$.
+
+### Sending the first pre-key messages
+
+Alice computes a message key, $`M_{0,j}`$, and a new chain key,
+$`C_{0,j+1}`$, using the current chain key. She replaces the current chain
+key with the new one.
+
+Alice encrypts her plain-text with the message key, $`M_{0,j}`$, using an
+authenticated encryption scheme (see below) to get a cipher-text,
+$`X_{0,j}`$.
+
+She then sends the following to Bob:
+ * The public part of her identity key, $`I_A`$
+ * The public part of her single-use key, $`E_A`$
+ * The public part of Bob's single-use key, $`E_B`$
+ * The current chain index, $`j`$
+ * The public part of her ratchet key, $`T_0`$
+ * The cipher-text, $`X_{0,j}`$
+
+Alice will continue to send pre-key messages until she receives a message from
+Bob.
+
+### Creating an inbound session from a pre-key message
+
+Bob receives a pre-key message as above.
+
+Bob looks up the private part of his single-use key, $`E_B`$. He can now
+compute the root key, $`R_0`$, and the chain key, $`C_{0,0}`$, from
+$`I_A`$, $`E_A`$, $`I_B`$, and $`E_B`$.
+
+Bob then advances the chain key $`j`$ times, to compute the chain key used
+by the message, $`C_{0,j}`$. He now creates the
+message key, $`M_{0,j}`$, and attempts to decrypt the cipher-text,
+$`X_{0,j}`$. If the cipher-text's authentication is correct then Bob can
+discard the private part of his single-use one-time key, $`E_B`$.
+
+Bob stores Alice's initial ratchet key, $`T_0`$, until he wants to
+send a message.
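
Bob's step of advancing the chain key $`j`$ times and then creating the message key can be sketched with Python's standard `hmac` module. This is a minimal illustration under stated assumptions, not the libolm implementation: the function names (`advance_chain_key`, `create_message_key`, `message_key_at`) are hypothetical, and the all-zero starting chain key in the usage note is a placeholder rather than a real $`C_{0,0}`$.

```python
import hashlib
import hmac


def advance_chain_key(chain_key: bytes) -> bytes:
    """C_{i,j} = HMAC-SHA-256(C_{i,j-1}, "\\x02")."""
    return hmac.new(chain_key, b"\x02", hashlib.sha256).digest()


def create_message_key(chain_key: bytes) -> bytes:
    """M_{i,j} = HMAC-SHA-256(C_{i,j}, "\\x01")."""
    return hmac.new(chain_key, b"\x01", hashlib.sha256).digest()


def message_key_at(chain_key_0: bytes, j: int) -> bytes:
    """Advance C_{i,0} j times, then derive M_{i,j} (hypothetical helper)."""
    chain_key = chain_key_0
    for _ in range(j):
        chain_key = advance_chain_key(chain_key)
    return create_message_key(chain_key)
```

For example, `message_key_at(bytes(32), 3)` walks a placeholder chain key forward three steps and returns a 32 byte message key. Because each step is a one-way HMAC, holding $`C_{0,j}`$ does not reveal earlier chain keys or message keys.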
+
+### Sending normal messages
+
+Once a message has been received from the other side, a session is considered
+established, and a more compact form is used.
+
+To send a message, the user checks if they have a sender chain key,
+$`C_{i,j}`$. Alice uses chain keys where $`i`$ is even. Bob uses chain
+keys where $`i`$ is odd. If the chain key doesn't exist then a new ratchet
+key $`T_i`$ is generated and a new root key $`R_i`$ and chain key
+$`C_{i,0}`$ are computed using $`R_{i-1}`$, $`T_{i-1}`$ and
+$`T_i`$.
+
+A message key,
+$`M_{i,j}`$, is computed from the current chain key, $`C_{i,j}`$, and
+the chain key is replaced with the next chain key, $`C_{i,j+1}`$. The
+plain-text is encrypted with $`M_{i,j}`$, using an authenticated encryption
+scheme (see below) to get a cipher-text, $`X_{i,j}`$.
+
+The user then sends the following to the recipient:
+ * The current chain index, $`j`$
+ * The public part of the current ratchet key, $`T_i`$
+ * The cipher-text, $`X_{i,j}`$
+
+### Receiving messages
+
+The user receives a message as above with the sender's current chain index, $`j`$,
+the sender's ratchet key, $`T_i`$, and the cipher-text, $`X_{i,j}`$.
+
+The user checks if they have a receiver chain with the correct
+$`i`$ by comparing the ratchet key, $`T_i`$. If the chain doesn't exist
+then they compute a new root key, $`R_i`$, and a new receiver chain, with
+chain key $`C_{i,0}`$, using $`R_{i-1}`$, $`T_{i-1}`$ and
+$`T_i`$.
+
+If the $`j`$ of the message is less than
+the current chain index on the receiver then the message may only be decrypted
+if the receiver has stored a copy of the message key $`M_{i,j}`$. Otherwise
+the receiver computes the chain key, $`C_{i,j}`$. The receiver computes the
+message key, $`M_{i,j}`$, from the chain key and attempts to decrypt the
+cipher-text, $`X_{i,j}`$.
+
+If the decryption succeeds the receiver updates the chain key for $`T_i`$
+with $`C_{i,j+1}`$ and stores the message keys that were skipped in the
+process so that they can decrypt out-of-order messages. If the receiver created
+a new receiver chain then they discard their current sender chain so that
+they will create a new chain when they next send a message.
+
+## The Olm Message Format
+
+Olm uses two types of messages. The underlying transport protocol must provide
+a means for recipients to distinguish between them.
+
+### Normal Messages
+
+Olm messages start with a one byte version followed by a variable length
+payload followed by a fixed length message authentication code.
+
+```
+ +--------------+------------------------------------+-----------+
+ | Version Byte | Payload Bytes                      | MAC Bytes |
+ +--------------+------------------------------------+-----------+
+```
+
+The version byte is ``"\x03"``.
+
+The payload consists of key-value pairs where the keys are integers and the
+values are integers or strings. The keys are encoded as a variable length
+integer tag where the 3 lowest bits indicate the type of the value:
+0 for integers, 2 for strings. If the value is an integer then the tag is
+followed by the value encoded as a variable length integer. If the value is
+a string then the tag is followed by the length of the string encoded as
+a variable length integer followed by the string itself.
+
+Olm uses a variable length encoding for integers. Each integer is encoded as a
+sequence of bytes with the high bit set followed by a byte with the high bit
+clear. The seven low bits of each byte store the bits of the integer. The least
+significant bits are stored in the first byte.
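
The variable length integer encoding described above can be sketched as follows. This is a minimal illustration, not code from libolm: the function names `encode_varint` and `decode_varint` are hypothetical, and the decoder assumes its input is well-formed (it does not guard against truncated data).

```python
from typing import Tuple


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer, least significant 7 bits first."""
    if value < 0:
        raise ValueError("only non-negative integers can be encoded")
    out = bytearray()
    while value >= 0x80:
        # More bytes follow: store the low 7 bits with the high bit set.
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    # Final byte: high bit clear.
    out.append(value)
    return bytes(out)


def decode_varint(data: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode one integer starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7
```

For example, 300 is `0b10_0101100` in binary, so `encode_varint(300)` gives the two bytes `0xAC 0x02`: the low seven bits with the high bit set, then the remaining bits with the high bit clear. The value type of a decoded tag is its three lowest bits, `tag & 0x07`.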
+
+**Name**|**Tag**|**Type**|**Meaning**
+:-----:|:-----:|:-----:|:-----:
+Ratchet-Key|0x0A|String|The public part of the ratchet key, $`T_i`$, of the message
+Chain-Index|0x10|Integer|The chain index, $`j`$, of the message
+Cipher-Text|0x22|String|The cipher-text, $`X_{i,j}`$, of the message
+
+The length of the MAC is determined by the authenticated encryption algorithm
+being used. (Olm version 1 uses [HMAC-SHA-256][], truncated to 8 bytes.) The
+MAC protects all of the bytes preceding the MAC.
+
+### Pre-Key Messages
+
+Olm pre-key messages start with a one byte version followed by a variable
+length payload.
+
+```
+ +--------------+------------------------------------+
+ | Version Byte | Payload Bytes                      |
+ +--------------+------------------------------------+
+```
+
+The version byte is ``"\x03"``.
+
+The payload uses the same key-value format as for normal messages.
+
+**Name**|**Tag**|**Type**|**Meaning**
+:-----:|:-----:|:-----:|:-----:
+One-Time-Key|0x0A|String|The public part of Bob's single-use key, $`E_B`$.
+Base-Key|0x12|String|The public part of Alice's single-use key, $`E_A`$.
+Identity-Key|0x1A|String|The public part of Alice's identity key, $`I_A`$.
+Message|0x22|String|An embedded Olm message with its own version and MAC.
+
+## Olm Authenticated Encryption
+
+### Version 1
+
+Version 1 of Olm uses [AES-256][] in [CBC][] mode with [PKCS#7][] padding for
+encryption and [HMAC-SHA-256][] (truncated to 64 bits) for authentication. The
+256 bit AES key, 256 bit HMAC key, and 128 bit AES IV are derived from the
+message key using [HKDF-SHA-256][] using the default salt and an info of
+``"OLM_KEYS"``.
+
+```math
+\begin{aligned}
+    AES\_KEY_{i,j}\;\parallel\;HMAC\_KEY_{i,j}\;\parallel\;AES\_IV_{i,j}
+    &= HKDF\left(0,\,M_{i,j},\,\text{"OLM\_KEYS"},\,80\right) \\
+\end{aligned}
+```
+
+The plain-text is encrypted with AES-256, using the key $`AES\_KEY_{i,j}`$
+and the IV $`AES\_IV_{i,j}`$ to give the cipher-text, $`X_{i,j}`$.
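
The derivation of the AES key, HMAC key and AES IV from the message key can be sketched with the standard library alone. This is an illustration under assumptions rather than the libolm code: `hkdf_sha256` is a hand-written rendering of the extract-and-expand steps of RFC 5869, `derive_cipher_keys` is a hypothetical helper name, and the "default salt" is taken to mean the RFC 5869 default of a hash-length string of zero bytes.

```python
import hashlib
import hmac
from typing import Tuple


def hkdf_sha256(salt: bytes, ikm: bytes, info: bytes, length: int) -> bytes:
    """HKDF with SHA-256, following the extract and expand steps of RFC 5869."""
    if not salt:
        # Assumption: the default salt is HashLen (32) zero bytes.
        salt = bytes(hashlib.sha256().digest_size)
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()  # extract
    okm = b""
    block = b""
    counter = 1
    while len(okm) < length:  # expand
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_cipher_keys(message_key: bytes) -> Tuple[bytes, bytes, bytes]:
    """Split 80 bytes of HKDF output into (AES key, HMAC key, AES IV)."""
    okm = hkdf_sha256(b"", message_key, b"OLM_KEYS", 80)
    return okm[:32], okm[32:64], okm[64:80]
```

The split follows the order on the left-hand side of the equation: the first 32 bytes are the AES key, the next 32 bytes are the HMAC key, and the last 16 bytes are the IV.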
+
+Then the entire message (including the Version Byte and all Payload Bytes) is
+passed through [HMAC-SHA-256][]. The first 8 bytes of the MAC are appended to the message.
+
+## Message authentication concerns
+
+To avoid unknown key-share attacks, the application must include identifying
+data for the sending and receiving user in the plain-text of (at least) the
+pre-key messages. Such data could be a user ID or a telephone number;
+alternatively it could be the public part of a keypair which the relevant user
+has proven ownership of.
+
+### Example attacks
+
+1. Alice publishes her public [Curve25519][] identity key, $`I_A`$. Eve
+   publishes the same identity key, claiming it as her own. Bob downloads
+   Eve's keys, and associates $`I_A`$ with Eve. Alice sends a message to
+   Bob; Eve intercepts it before forwarding it to Bob. Bob believes the
+   message came from Eve rather than Alice.
+
+   This is prevented if Alice includes her user ID in the plain-text of the
+   pre-key message, so that Bob can see that the message was sent by Alice
+   originally.
+
+2. Bob publishes his public [Curve25519][] identity key, $`I_B`$. Eve
+   publishes the same identity key, claiming it as her own. Alice downloads
+   Eve's keys, and associates $`I_B`$ with Eve. Alice sends a message to
+   Eve; Eve cannot decrypt it, but forwards it to Bob. Bob believes that
+   Alice sent the message to him, whereas Alice intended it to go to Eve.
+
+   This is prevented by Alice including the user ID of the intended recipient
+   (Eve) in the plain-text of the pre-key message. Bob can now tell that the
+   message was meant for Eve rather than him.
+
+## IPR
+
+The Olm specification (this document) is hereby placed in the public domain.
+
+## Feedback
+
+Can be sent to olm at matrix.org.
+
+## Acknowledgements
+
+The ratchet that Olm implements was designed by Trevor Perrin and Moxie
+Marlinspike - details at https://whispersystems.org/docs/specifications/doubleratchet/.
+Olm is an entirely new implementation written by the Matrix.org team.
+
+[Curve25519]: http://cr.yp.to/ecdh.html
+[Triple Diffie-Hellman]: https://whispersystems.org/blog/simplifying-otr-deniability/
+[HMAC-based key derivation function]: https://tools.ietf.org/html/rfc5869
+[HKDF-SHA-256]: https://tools.ietf.org/html/rfc5869
+[HMAC-SHA-256]: https://tools.ietf.org/html/rfc2104
+[SHA-256]: https://tools.ietf.org/html/rfc6234
+[AES-256]: http://csrc.nist.gov/publications/fips/fips197/fips-197.pdf
+[CBC]: http://csrc.nist.gov/publications/nistpubs/800-38a/sp800-38a.pdf
+[PKCS#7]: https://tools.ietf.org/html/rfc2315
--
cgit v1.2.3