From e273189af328c99d88df4ccfb5d508ee84ef0fb5 Mon Sep 17 00:00:00 2001 From: Aaron Raimist Date: Wed, 1 May 2019 11:55:21 -0500 Subject: Convert megolm.rst to markdown Signed-off-by: Aaron Raimist --- docs/megolm.md | 325 ++++++++++++++++++++++++++++++++++++++++++++++++++ docs/megolm.rst | 362 -------------------------------------------------------- 2 files changed, 325 insertions(+), 362 deletions(-) create mode 100644 docs/megolm.md delete mode 100644 docs/megolm.rst (limited to 'docs') diff --git a/docs/megolm.md b/docs/megolm.md new file mode 100644 index 0000000..f9acfd0 --- /dev/null +++ b/docs/megolm.md @@ -0,0 +1,325 @@ +# Megolm group ratchet + +An AES-based cryptographic ratchet intended for group communications. + +## Background + +The Megolm ratchet is intended for encrypted messaging applications where there +may be a large number of recipients of each message, thus precluding the use of +peer-to-peer encryption systems such as [Olm][]. + +It also allows a recipient to decrypt received messages multiple times. For +instance, in client/server applications, a copy of the ciphertext can be stored +on the (untrusted) server, while the client need only store the session keys. + +## Overview + +Each participant in a conversation uses their own outbound session for +encrypting messages. A session consists of a ratchet and an [Ed25519][] keypair. + +Secrecy is provided by the ratchet, which can be wound forwards but not +backwards, and is used to derive a distinct message key for each message. + +Authenticity is provided via Ed25519 signatures. + +The value of the ratchet, and the public part of the Ed25519 key, are shared +with other participants in the conversation via secure peer-to-peer +channels. Provided that peer-to-peer channel provides authenticity of the +messages to the participants and deniability of the messages to third parties, +the Megolm session will inherit those properties. + +## The Megolm ratchet algorithm + +The Megolm ratchet $`R_i`$ consists of four parts, $`R_{i,j}`$ for +$`j \in {0,1,2,3}`$. The length of each part depends on the hash function +in use (256 bits for this version of Megolm). + +The ratchet is initialised with cryptographically-secure random data, and +advanced as follows: + +```math +\begin{aligned} +R_{i,0} &= + \begin{cases} + H_0\left(R_{2^24(n-1),0}\right) &\text{if }\exists n | i = 2^24n\\ + R_{i-1,0} &\text{otherwise} + \end{cases}\\ +R_{i,1} &= + \begin{cases} + H_1\left(R_{2^24(n-1),0}\right) &\text{if }\exists n | i = 2^24n\\ + H_1\left(R_{2^16(m-1),1}\right) &\text{if }\exists m | i = 2^16m\\ + R_{i-1,1} &\text{otherwise} + \end{cases}\\ +R_{i,2} &= + \begin{cases} + H_2\left(R_{2^24(n-1),0}\right) &\text{if }\exists n | i = 2^24n\\ + H_2\left(R_{2^16(m-1),1}\right) &\text{if }\exists m | i = 2^16m\\ + H_2\left(R_{2^8(p-1),2}\right) &\text{if }\exists p | i = 2^8p\\ + R_{i-1,2} &\text{otherwise} + \end{cases}\\ +R_{i,3} &= + \begin{cases} + H_3\left(R_{2^24(n-1),0}\right) &\text{if }\exists n | i = 2^24n\\ + H_3\left(R_{2^16(m-1),1}\right) &\text{if }\exists m | i = 2^16m\\ + H_3\left(R_{2^8(p-1),2}\right) &\text{if }\exists p | i = 2^8p\\ + H_3\left(R_{i-1,3}\right) &\text{otherwise} + \end{cases} +\end{aligned} +``` + +where $`H_0`$, $`H_1`$, $`H_2`$, and $`H_3`$ are different hash +functions. In summary: every $`2^8`$ iterations, $`R_{i,3}`$ is +reseeded from $`R_{i,2}`$. Every $`2^16`$ iterations, $`R_{i,2}`$ +and $`R_{i,3}`$ are reseeded from $`R_{i,1}`$. Every $`2^24`$ +iterations, $`R_{i,1}`$, $`R_{i,2}`$ and $`R_{i,3}`$ are reseeded +from $`R_{i,0}`$. + +The complete ratchet value, $`R_{i}`$, is hashed to generate the keys used +to encrypt each message. This scheme allows the ratchet to be advanced an +arbitrary amount forwards while needing at most 1023 hash computations. A +client can decrypt chat history onwards from the earliest value of the ratchet +it is aware of, but cannot decrypt history from before that point without +reversing the hash function. + +This allows a participant to share its ability to decrypt chat history with +another from a point in the conversation onwards by giving a copy of the +ratchet at that point in the conversation. + + +## The Megolm protocol + +### Session setup + +Each participant in a conversation generates their own Megolm session. A +session consists of three parts: + +* a 32 bit counter, $`i`$. +* an [Ed25519][] keypair, $`K`$. +* a ratchet, $`R_i`$, which consists of four 256-bit values, + $`R_{i,j}`$ for $`j \in {0,1,2,3}`$. + +The counter $`i`$ is initialised to $`0`$. A new Ed25519 keypair is +generated for $`K`$. The ratchet is simply initialised with 1024 bits of +cryptographically-secure random data. + +A single participant may use multiple sessions over the lifetime of a +conversation. The public part of $`K`$ is used as an identifier to +discriminate between sessions. + +### Sharing session data + +To allow other participants in the conversation to decrypt messages, the +session data is formatted as described in [Session-sharing format](#Session-sharing-format). It is then +shared with other participants in the conversation via a secure peer-to-peer +channel (such as that provided by [Olm][]). + +When the session data is received from other participants, the recipient first +checks that the signature matches the public key. They then store their own +copy of the counter, ratchet, and public key. + +### Message encryption + +This version of Megolm uses AES-256_ in CBC_ mode with [PKCS#7][] padding and +HMAC-SHA-256_ (truncated to 64 bits). The 256 bit AES key, 256 bit HMAC key, +and 128 bit AES IV are derived from the megolm ratchet $`R_i`$: + +```math +\begin{aligned} +AES\_KEY_{i}\;\parallel\;HMAC\_KEY_{i}\;\parallel\;AES\_IV_{i} + &= HKDF\left(0,\,R_{i},\text{"MEGOLM\_KEYS"},\,80\right) \\ +\end{aligned} +``` + +where $`\parallel`$ represents string splitting, and +$`HKDF\left(salt,\,IKM,\,info,\,L\right)`$ refers to the [HMAC-based key +derivation function][] using using [SHA-256][] as the hash function +([HKDF-SHA-256][]) with a salt value of $`salt`$, input key material of +$`IKM`$, context string $`info`$, and output keying material length of +$`L`$ bytes. + +The plain-text is encrypted with AES-256, using the key $`AES\_KEY_{i}`$ +and the IV $`AES\_IV_{i}`$ to give the cipher-text, $`X_{i}`$. + +The ratchet index $`i`$, and the cipher-text $`X_{i}`$, are then packed +into a message as described in [Message format](#message-format). Then the entire message +(including the version bytes and all payload bytes) are passed through +HMAC-SHA-256. The first 8 bytes of the MAC are appended to the message. + +Finally, the authenticated message is signed using the Ed25519 keypair; the 64 +byte signature is appended to the message. + +The complete signed message, together with the public part of $`K`$ (acting +as a session identifier), can then be sent over an insecure channel. The +message can then be authenticated and decrypted only by recipients who have +received the session data. + +### Advancing the ratchet + +After each message is encrypted, the ratchet is advanced. This is done as +described in [The Megolm ratchet algorithm](#the-megolm-ratchet-algorithm), using the following definitions: + +```math +\begin{aligned} + H_0(A) &\equiv HMAC(A,\text{"\x00"}) \\ + H_1(A) &\equiv HMAC(A,\text{"\x01"}) \\ + H_2(A) &\equiv HMAC(A,\text{"\x02"}) \\ + H_3(A) &\equiv HMAC(A,\text{"\x03"}) \\ +\end{aligned} +``` + +where $`HMAC(A, T)`$ is the HMAC-SHA-256 of ``T``, using ``A`` as the +key. + +For outbound sessions, the updated ratchet and counter are stored in the +session. + +In order to maintain the ability to decrypt conversation history, inbound +sessions should store a copy of their earliest known ratchet value (unless they +explicitly want to drop the ability to decrypt that history - see [Partial +Forward Secrecy](#partial-forward-secrecy)). They may also choose to cache calculated ratchet values, +but the decision of which ratchet states to cache is left to the application. + +## Data exchange formats + +### Session-sharing format + +The Megolm key-sharing format is as follows: + +``` ++---+----+--------+--------+--------+--------+------+-----------+ +| V | i | R(i,0) | R(i,1) | R(i,2) | R(i,3) | Kpub | Signature | ++---+----+--------+--------+--------+--------+------+-----------+ +0 1 5 37 69 101 133 165 229 bytes +``` + +The version byte, ``V``, is ``"\x02"``. + +This is followed by the ratchet index, $`i`$, which is encoded as a +big-endian 32-bit integer; the ratchet values $`R_{i,j}`$; and the public +part of the Ed25519 keypair $`K`$. + +The data is then signed using the Ed25519 keypair, and the 64-byte signature is +appended. + +### Message format + +Megolm messages consist of a one byte version, followed by a variable length +payload, a fixed length message authentication code, and a fixed length +signature. + +``` ++---+------------------------------------+-----------+------------------+ +| V | Payload Bytes | MAC Bytes | Signature Bytes | ++---+------------------------------------+-----------+------------------+ +0 1 N N+8 N+72 bytes +``` + +The version byte, ``V``, is ``"\x03"``. + +The payload uses a format based on the [Protocol Buffers encoding][]. It +consists of the following key-value pairs: + +**Name**|**Tag**|**Type**|**Meaning** +:-----:|:-----:|:-----:|:-----: +Message-Index|0x08|Integer|The index of the ratchet, i +Cipher-Text|0x12|String|The cipher-text, Xi, of the message + +Within the payload, integers are encoded using a variable length encoding. Each +integer is encoded as a sequence of bytes with the high bit set followed by a +byte with the high bit clear. The seven low bits of each byte store the bits of +the integer. The least significant bits are stored in the first byte. + +Strings are encoded as a variable-length integer followed by the string itself. + +Each key-value pair is encoded as a variable-length integer giving the tag, +followed by a string or variable-length integer giving the value. + +The payload is followed by the MAC. The length of the MAC is determined by the +authenticated encryption algorithm being used (8 bytes in this version of the +protocol). The MAC protects all of the bytes preceding the MAC. + +The length of the signature is determined by the signing algorithm being used +(64 bytes in this version of the protocol). The signature covers all of the +bytes preceding the signature. + +## Limitations + +### Message Replays + +A message can be decrypted successfully multiple times. This means that an +attacker can re-send a copy of an old message, and the recipient will treat it +as a new message. + +To mitigate this it is recommended that applications track the ratchet indices +they have received and that they reject messages with a ratchet index that +they have already decrypted. + +### Lack of Transcript Consistency + +In a group conversation, there is no guarantee that all recipients have +received the same messages. For example, if Alice is in a conversation with Bob +and Charlie, she could send different messages to Bob and Charlie, or could +send some messages to Bob but not Charlie, or vice versa. + +Solving this is, in general, a hard problem, particularly in a protocol which +does not guarantee in-order message delivery. For now it remains the subject of +future research. + +### Lack of Backward Secrecy + +Once the key to a Megolm session is compromised, the attacker can decrypt any +future messages sent via that session. + +In order to mitigate this, the application should ensure that Megolm sessions +are not used indefinitely. Instead it should periodically start a new session, +with new keys shared over a secure channel. + + + +### Partial Forward Secrecy + +Each recipient maintains a record of the ratchet value which allows them to +decrypt any messages sent in the session after the corresponding point in the +conversation. If this value is compromised, an attacker can similarly decrypt +those past messages. + +To mitigate this issue, the application should offer the user the option to +discard historical conversations, by winding forward any stored ratchet values, +or discarding sessions altogether. + +### Dependency on secure channel for key exchange + +The design of the Megolm ratchet relies on the availability of a secure +peer-to-peer channel for the exchange of session keys. Any vulnerabilities in +the underlying channel are likely to be amplified when applied to Megolm +session setup. + +For example, if the peer-to-peer channel is vulnerable to an unknown key-share +attack, the entire Megolm session become similarly vulnerable. For example: +Alice starts a group chat with Eve, and shares the session keys with Eve. Eve +uses the unknown key-share attack to forward the session keys to Bob, who +believes Alice is starting the session with him. Eve then forwards messages +from the Megolm session to Bob, who again believes they are coming from +Alice. Provided the peer-to-peer channel is not vulnerable to this attack, Bob +will realise that the key-sharing message was forwarded by Eve, and can treat +the Megolm session as a forgery. + +A second example: if the peer-to-peer channel is vulnerable to a replay +attack, this can be extended to entire Megolm sessions. + +## License + +The Megolm specification (this document) is licensed under the Apache License, +Version 2.0 http://www.apache.org/licenses/LICENSE-2.0. + +[Ed25519]: http://ed25519.cr.yp.to/ +[HMAC-based key derivation function]: https://tools.ietf.org/html/rfc5869 +[HKDF-SHA-256]: https://tools.ietf.org/html/rfc5869 +[HMAC-SHA-256]: https://tools.ietf.org/html/rfc2104 +[SHA-256]: https://tools.ietf.org/html/rfc6234 +[AES-256]: http://csrc.nist.gov/publications/fips/fips197/fips-197.pdf +[CBC]: http://csrc.nist.gov/publications/nistpubs/800-38a/sp800-38a.pdf +[PKCS#7]: https://tools.ietf.org/html/rfc2315 +[Olm]: https://gitlab.matrix.org/matrix-org/olm/blob/master/docs/olm.md +[Protocol Buffers encoding]: https://developers.google.com/protocol-buffers/docs/encoding diff --git a/docs/megolm.rst b/docs/megolm.rst deleted file mode 100644 index 03ee426..0000000 --- a/docs/megolm.rst +++ /dev/null @@ -1,362 +0,0 @@ -.. Copyright 2016 OpenMarket Ltd -.. -.. Licensed under the Apache License, Version 2.0 (the "License"); -.. you may not use this file except in compliance with the License. -.. You may obtain a copy of the License at -.. -.. http://www.apache.org/licenses/LICENSE-2.0 -.. -.. Unless required by applicable law or agreed to in writing, software -.. distributed under the License is distributed on an "AS IS" BASIS, -.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -.. See the License for the specific language governing permissions and -.. limitations under the License. - - -Megolm group ratchet -==================== - -An AES-based cryptographic ratchet intended for group communications. - -.. contents:: - -Background ----------- - -The Megolm ratchet is intended for encrypted messaging applications where there -may be a large number of recipients of each message, thus precluding the use of -peer-to-peer encryption systems such as `Olm`_. - -It also allows a recipient to decrypt received messages multiple times. For -instance, in client/server applications, a copy of the ciphertext can be stored -on the (untrusted) server, while the client need only store the session keys. - -Overview --------- - -Each participant in a conversation uses their own outbound session for -encrypting messages. A session consists of a ratchet and an `Ed25519`_ keypair. - -Secrecy is provided by the ratchet, which can be wound forwards but not -backwards, and is used to derive a distinct message key for each message. - -Authenticity is provided via Ed25519 signatures. - -The value of the ratchet, and the public part of the Ed25519 key, are shared -with other participants in the conversation via secure peer-to-peer -channels. Provided that peer-to-peer channel provides authenticity of the -messages to the participants and deniability of the messages to third parties, -the Megolm session will inherit those properties. - -The Megolm ratchet algorithm ----------------------------- - -The Megolm ratchet :math:`R_i` consists of four parts, :math:`R_{i,j}` for -:math:`j \in {0,1,2,3}`. The length of each part depends on the hash function -in use (256 bits for this version of Megolm). - -The ratchet is initialised with cryptographically-secure random data, and -advanced as follows: - -.. math:: - \begin{align} - R_{i,0} &= - \begin{cases} - H_0\left(R_{2^24(n-1),0}\right) &\text{if }\exists n | i = 2^24n\\ - R_{i-1,0} &\text{otherwise} - \end{cases}\\ - R_{i,1} &= - \begin{cases} - H_1\left(R_{2^24(n-1),0}\right) &\text{if }\exists n | i = 2^24n\\ - H_1\left(R_{2^16(m-1),1}\right) &\text{if }\exists m | i = 2^16m\\ - R_{i-1,1} &\text{otherwise} - \end{cases}\\ - R_{i,2} &= - \begin{cases} - H_2\left(R_{2^24(n-1),0}\right) &\text{if }\exists n | i = 2^24n\\ - H_2\left(R_{2^16(m-1),1}\right) &\text{if }\exists m | i = 2^16m\\ - H_2\left(R_{2^8(p-1),2}\right) &\text{if }\exists p | i = 2^8p\\ - R_{i-1,2} &\text{otherwise} - \end{cases}\\ - R_{i,3} &= - \begin{cases} - H_3\left(R_{2^24(n-1),0}\right) &\text{if }\exists n | i = 2^24n\\ - H_3\left(R_{2^16(m-1),1}\right) &\text{if }\exists m | i = 2^16m\\ - H_3\left(R_{2^8(p-1),2}\right) &\text{if }\exists p | i = 2^8p\\ - H_3\left(R_{i-1,3}\right) &\text{otherwise} - \end{cases} - \end{align} - -where :math:`H_0`, :math:`H_1`, :math:`H_2`, and :math:`H_3` are different hash -functions. In summary: every :math:`2^8` iterations, :math:`R_{i,3}` is -reseeded from :math:`R_{i,2}`. Every :math:`2^16` iterations, :math:`R_{i,2}` -and :math:`R_{i,3}` are reseeded from :math:`R_{i,1}`. Every :math:`2^24` -iterations, :math:`R_{i,1}`, :math:`R_{i,2}` and :math:`R_{i,3}` are reseeded -from :math:`R_{i,0}`. - -The complete ratchet value, :math:`R_{i}`, is hashed to generate the keys used -to encrypt each message. This scheme allows the ratchet to be advanced an -arbitrary amount forwards while needing at most 1023 hash computations. A -client can decrypt chat history onwards from the earliest value of the ratchet -it is aware of, but cannot decrypt history from before that point without -reversing the hash function. - -This allows a participant to share its ability to decrypt chat history with -another from a point in the conversation onwards by giving a copy of the -ratchet at that point in the conversation. - - -The Megolm protocol -------------------- - -Session setup -~~~~~~~~~~~~~ - -Each participant in a conversation generates their own Megolm session. A -session consists of three parts: - -* a 32 bit counter, :math:`i`. -* an `Ed25519`_ keypair, :math:`K`. -* a ratchet, :math:`R_i`, which consists of four 256-bit values, - :math:`R_{i,j}` for :math:`j \in {0,1,2,3}`. - -The counter :math:`i` is initialised to :math:`0`. A new Ed25519 keypair is -generated for :math:`K`. The ratchet is simply initialised with 1024 bits of -cryptographically-secure random data. - -A single participant may use multiple sessions over the lifetime of a -conversation. The public part of :math:`K` is used as an identifier to -discriminate between sessions. - -Sharing session data -~~~~~~~~~~~~~~~~~~~~ - -To allow other participants in the conversation to decrypt messages, the -session data is formatted as described in `Session-sharing format`_. It is then -shared with other participants in the conversation via a secure peer-to-peer -channel (such as that provided by `Olm`_). - -When the session data is received from other participants, the recipient first -checks that the signature matches the public key. They then store their own -copy of the counter, ratchet, and public key. - -Message encryption -~~~~~~~~~~~~~~~~~~ - -This version of Megolm uses AES-256_ in CBC_ mode with `PKCS#7`_ padding and -HMAC-SHA-256_ (truncated to 64 bits). The 256 bit AES key, 256 bit HMAC key, -and 128 bit AES IV are derived from the megolm ratchet :math:`R_i`: - -.. math:: - - \begin{align} - AES\_KEY_{i}\;\parallel\;HMAC\_KEY_{i}\;\parallel\;AES\_IV_{i} - &= HKDF\left(0,\,R_{i},\text{"MEGOLM\_KEYS"},\,80\right) \\ - \end{align} - -where :math:`\parallel` represents string splitting, and -:math:`HKDF\left(salt,\,IKM,\,info,\,L\right)` refers to the `HMAC-based key -derivation function`_ using using `SHA-256`_ as the hash function -(`HKDF-SHA-256`_) with a salt value of :math:`salt`, input key material of -:math:`IKM`, context string :math:`info`, and output keying material length of -:math:`L` bytes. - -The plain-text is encrypted with AES-256, using the key :math:`AES\_KEY_{i}` -and the IV :math:`AES\_IV_{i}` to give the cipher-text, :math:`X_{i}`. - -The ratchet index :math:`i`, and the cipher-text :math:`X_{i}`, are then packed -into a message as described in `Message format`_. Then the entire message -(including the version bytes and all payload bytes) are passed through -HMAC-SHA-256. The first 8 bytes of the MAC are appended to the message. - -Finally, the authenticated message is signed using the Ed25519 keypair; the 64 -byte signature is appended to the message. - -The complete signed message, together with the public part of :math:`K` (acting -as a session identifier), can then be sent over an insecure channel. The -message can then be authenticated and decrypted only by recipients who have -received the session data. - -Advancing the ratchet -~~~~~~~~~~~~~~~~~~~~~ - -After each message is encrypted, the ratchet is advanced. This is done as -described in `The Megolm ratchet algorithm`_, using the following definitions: - -.. math:: - \begin{align} - H_0(A) &\equiv HMAC(A,\text{"\textbackslash x00"}) \\ - H_1(A) &\equiv HMAC(A,\text{"\textbackslash x01"}) \\ - H_2(A) &\equiv HMAC(A,\text{"\textbackslash x02"}) \\ - H_3(A) &\equiv HMAC(A,\text{"\textbackslash x03"}) \\ - \end{align} - -where :math:`HMAC(A, T)` is the HMAC-SHA-256_ of ``T``, using ``A`` as the -key. - -For outbound sessions, the updated ratchet and counter are stored in the -session. - -In order to maintain the ability to decrypt conversation history, inbound -sessions should store a copy of their earliest known ratchet value (unless they -explicitly want to drop the ability to decrypt that history - see `Partial -Forward Secrecy`_\ ). They may also choose to cache calculated ratchet values, -but the decision of which ratchet states to cache is left to the application. - -Data exchange formats ---------------------- - -Session-sharing format -~~~~~~~~~~~~~~~~~~~~~~ - -The Megolm key-sharing format is as follows: - -.. code:: - - +---+----+--------+--------+--------+--------+------+-----------+ - | V | i | R(i,0) | R(i,1) | R(i,2) | R(i,3) | Kpub | Signature | - +---+----+--------+--------+--------+--------+------+-----------+ - 0 1 5 37 69 101 133 165 229 bytes - -The version byte, ``V``, is ``"\x02"``. - -This is followed by the ratchet index, :math:`i`, which is encoded as a -big-endian 32-bit integer; the ratchet values :math:`R_{i,j}`; and the public -part of the Ed25519 keypair :math:`K`. - -The data is then signed using the Ed25519 keypair, and the 64-byte signature is -appended. - -Message format -~~~~~~~~~~~~~~ - -Megolm messages consist of a one byte version, followed by a variable length -payload, a fixed length message authentication code, and a fixed length -signature. - -.. code:: - - +---+------------------------------------+-----------+------------------+ - | V | Payload Bytes | MAC Bytes | Signature Bytes | - +---+------------------------------------+-----------+------------------+ - 0 1 N N+8 N+72 bytes - -The version byte, ``V``, is ``"\x03"``. - -The payload uses a format based on the `Protocol Buffers encoding`_. It -consists of the following key-value pairs: - -============= ===== ======== ================================================ - Name Tag Type Meaning -============= ===== ======== ================================================ -Message-Index 0x08 Integer The index of the ratchet, :math:`i` -Cipher-Text 0x12 String The cipher-text, :math:`X_{i}`, of the message -============= ===== ======== ================================================ - -Within the payload, integers are encoded using a variable length encoding. Each -integer is encoded as a sequence of bytes with the high bit set followed by a -byte with the high bit clear. The seven low bits of each byte store the bits of -the integer. The least significant bits are stored in the first byte. - -Strings are encoded as a variable-length integer followed by the string itself. - -Each key-value pair is encoded as a variable-length integer giving the tag, -followed by a string or variable-length integer giving the value. - -The payload is followed by the MAC. The length of the MAC is determined by the -authenticated encryption algorithm being used (8 bytes in this version of the -protocol). The MAC protects all of the bytes preceding the MAC. - -The length of the signature is determined by the signing algorithm being used -(64 bytes in this version of the protocol). The signature covers all of the -bytes preceding the signature. - -Limitations ------------ - -Message Replays ---------------- - -A message can be decrypted successfully multiple times. This means that an -attacker can re-send a copy of an old message, and the recipient will treat it -as a new message. - -To mitigate this it is recommended that applications track the ratchet indices -they have received and that they reject messages with a ratchet index that -they have already decrypted. - -Lack of Transcript Consistency -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -In a group conversation, there is no guarantee that all recipients have -received the same messages. For example, if Alice is in a conversation with Bob -and Charlie, she could send different messages to Bob and Charlie, or could -send some messages to Bob but not Charlie, or vice versa. - -Solving this is, in general, a hard problem, particularly in a protocol which -does not guarantee in-order message delivery. For now it remains the subject of -future research. - -Lack of Backward Secrecy -~~~~~~~~~~~~~~~~~~~~~~~~ - -Once the key to a Megolm session is compromised, the attacker can decrypt any -future messages sent via that session. - -In order to mitigate this, the application should ensure that Megolm sessions -are not used indefinitely. Instead it should periodically start a new session, -with new keys shared over a secure channel. - -.. TODO: Can we recommend sensible lifetimes for Megolm sessions? Probably - depends how paranoid we're feeling, but some guidelines might be useful. - -Partial Forward Secrecy -~~~~~~~~~~~~~~~~~~~~~~~ - -Each recipient maintains a record of the ratchet value which allows them to -decrypt any messages sent in the session after the corresponding point in the -conversation. If this value is compromised, an attacker can similarly decrypt -those past messages. - -To mitigate this issue, the application should offer the user the option to -discard historical conversations, by winding forward any stored ratchet values, -or discarding sessions altogether. - -Dependency on secure channel for key exchange -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -The design of the Megolm ratchet relies on the availability of a secure -peer-to-peer channel for the exchange of session keys. Any vulnerabilities in -the underlying channel are likely to be amplified when applied to Megolm -session setup. - -For example, if the peer-to-peer channel is vulnerable to an unknown key-share -attack, the entire Megolm session become similarly vulnerable. For example: -Alice starts a group chat with Eve, and shares the session keys with Eve. Eve -uses the unknown key-share attack to forward the session keys to Bob, who -believes Alice is starting the session with him. Eve then forwards messages -from the Megolm session to Bob, who again believes they are coming from -Alice. Provided the peer-to-peer channel is not vulnerable to this attack, Bob -will realise that the key-sharing message was forwarded by Eve, and can treat -the Megolm session as a forgery. - -A second example: if the peer-to-peer channel is vulnerable to a replay -attack, this can be extended to entire Megolm sessions. - -License -------- - -The Megolm specification (this document) is licensed under the `Apache License, -Version 2.0 `_. - - -.. _`Ed25519`: http://ed25519.cr.yp.to/ -.. _`HMAC-based key derivation function`: https://tools.ietf.org/html/rfc5869 -.. _`HKDF-SHA-256`: https://tools.ietf.org/html/rfc5869 -.. _`HMAC-SHA-256`: https://tools.ietf.org/html/rfc2104 -.. _`SHA-256`: https://tools.ietf.org/html/rfc6234 -.. _`AES-256`: http://csrc.nist.gov/publications/fips/fips197/fips-197.pdf -.. _`CBC`: http://csrc.nist.gov/publications/nistpubs/800-38a/sp800-38a.pdf -.. _`PKCS#7`: https://tools.ietf.org/html/rfc2315 -.. _`Olm`: ./olm.html -.. _`Protocol Buffers encoding`: https://developers.google.com/protocol-buffers/docs/encoding -- cgit v1.2.3 From 6a72cfd2875e4bc003332bc19b9a50fdbceac593 Mon Sep 17 00:00:00 2001 From: Aaron Raimist Date: Wed, 1 May 2019 13:00:06 -0500 Subject: Convert olm.rst to markdown Signed-off-by: Aaron Raimist --- docs/olm.md | 328 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ docs/olm.rst | 358 ----------------------------------------------------------- 2 files changed, 328 insertions(+), 358 deletions(-) create mode 100644 docs/olm.md delete mode 100644 docs/olm.rst (limited to 'docs') diff --git a/docs/olm.md b/docs/olm.md new file mode 100644 index 0000000..e9bb4ae --- /dev/null +++ b/docs/olm.md @@ -0,0 +1,328 @@ +# Olm: A Cryptographic Ratchet + +An implementation of the double cryptographic ratchet described by +https://whispersystems.org/docs/specifications/doubleratchet/. + +## Notation + +This document uses $`\parallel`$ to represent string concatenation. When +$`\parallel`$ appears on the right hand side of an $`=`$ it means that +the inputs are concatenated. When $`\parallel`$ appears on the left hand +side of an $`=`$ it means that the output is split. + +When this document uses $`ECDH\left(K_A,\,K_B\right)`$ it means that each +party computes a Diffie-Hellman agreement using their private key and the +remote party's public key. +So party $`A`$ computes $`ECDH\left(K_B^{public},\,K_A^{private}\right)`$ +and party $`B`$ computes $`ECDH\left(K_A^{public},\,K_B^{private}\right)`$. + +Where this document uses $`HKDF\left(salt,\,IKM,\,info,\,L\right)`$ it +refers to the [HMAC-based key derivation function][] with a salt value of +$`salt`$, input key material of $`IKM`$, context string $`info`$, +and output keying material length of $`L`$ bytes. + +## The Olm Algorithm + +### Initial setup + +The setup takes four [Curve25519][] inputs: Identity keys for Alice and Bob, +$`I_A`$ and $`I_B`$, and one-time keys for Alice and Bob, +$`E_A`$ and $`E_B`$. A shared secret, $`S`$, is generated using +[Triple Diffie-Hellman][]. The initial 256 bit root key, $`R_0`$, and 256 +bit chain key, $`C_{0,0}`$, are derived from the shared secret using an +HMAC-based Key Derivation Function using [SHA-256][] as the hash function +([HKDF-SHA-256][]) with default salt and ``"OLM_ROOT"`` as the info. + +```math +\begin{aligned} + S&=ECDH\left(I_A,\,E_B\right)\;\parallel\;ECDH\left(E_A,\,I_B\right)\; + \parallel\;ECDH\left(E_A,\,E_B\right)\\ + R_0\;\parallel\;C_{0,0}&= + HKDF\left(0,\,S,\,\text{"OLM\_ROOT"},\,64\right) +\end{aligned} +``` + +### Advancing the root key + +Advancing a root key takes the previous root key, $`R_{i-1}`$, and two +Curve25519 inputs: the previous ratchet key, $`T_{i-1}`$, and the current +ratchet key $`T_i`$. The even ratchet keys are generated by Alice. +The odd ratchet keys are generated by Bob. A shared secret is generated +using Diffie-Hellman on the ratchet keys. The next root key, $`R_i`$, and +chain key, $`C_{i,0}`$, are derived from the shared secret using +[HKDF-SHA-256][] using $`R_{i-1}`$ as the salt and ``"OLM_RATCHET"`` as the +info. + +```math +\begin{aligned} + R_i\;\parallel\;C_{i,0}&=HKDF\left( + R_{i-1},\, + ECDH\left(T_{i-1},\,T_i\right),\, + \text{"OLM\_RATCHET"},\, + 64 + \right) +\end{aligned} +``` + +### Advancing the chain key + +Advancing a chain key takes the previous chain key, $`C_{i,j-1}`$. The next +chain key, $`C_{i,j}`$, is the [HMAC-SHA-256][] of ``"\x02"`` using the +previous chain key as the key. + +```math +\begin{aligned} + C_{i,j}&=HMAC\left(C_{i,j-1},\,\text{"\x02"}\right) +\end{aligned} +``` + +### Creating a message key + +Creating a message key takes the current chain key, $`C_{i,j}`$. The +message key, $`M_{i,j}`$, is the [HMAC-SHA-256][] of ``"\x01"`` using the +current chain key as the key. The message keys where $`i`$ is even are used +by Alice to encrypt messages. The message keys where $`i`$ is odd are used +by Bob to encrypt messages. + +```math +\begin{aligned} + M_{i,j}&=HMAC\left(C_{i,j},\,\text{"\x01"}\right) +\end{aligned} +``` + +## The Olm Protocol + +### Creating an outbound session + +Bob publishes the public parts of his identity key, $`I_B`$, and some +single-use one-time keys $`E_B`$. + +Alice downloads Bob's identity key, $`I_B`$, and a one-time key, +$`E_B`$. She generates a new single-use key, $`E_A`$, and computes a +root key, $`R_0`$, and a chain key $`C_{0,0}`$. She also generates a +new ratchet key $`T_0`$. + +### Sending the first pre-key messages + +Alice computes a message key, $`M_{0,j}`$, and a new chain key, +$`C_{0,j+1}`$, using the current chain key. She replaces the current chain +key with the new one. + +Alice encrypts her plain-text with the message key, $`M_{0,j}`$, using an +authenticated encryption scheme (see below) to get a cipher-text, +$`X_{0,j}`$. + +She then sends the following to Bob: + * The public part of her identity key, $`I_A`$ + * The public part of her single-use key, $`E_A`$ + * The public part of Bob's single-use key, $`E_B`$ + * The current chain index, $`j`$ + * The public part of her ratchet key, $`T_0`$ + * The cipher-text, $`X_{0,j}`$ + +Alice will continue to send pre-key messages until she receives a message from +Bob. + +### Creating an inbound session from a pre-key message + +Bob receives a pre-key message as above. + +Bob looks up the private part of his single-use key, $`E_B`$. He can now +compute the root key, $`R_0`$, and the chain key, $`C_{0,0}`$, from +$`I_A`$, $`E_A`$, $`I_B`$, and $`E_B`$. + +Bob then advances the chain key $`j`$ times, to compute the chain key used +by the message, $`C_{0,j}`$. He now creates the +message key, $`M_{0,j}`$, and attempts to decrypt the cipher-text, +$`X_{0,j}`$. If the cipher-text's authentication is correct then Bob can +discard the private part of his single-use one-time key, $`E_B`$. + +Bob stores Alice's initial ratchet key, $`T_0`$, until he wants to +send a message. + +### Sending normal messages + +Once a message has been received from the other side, a session is considered +established, and a more compact form is used. + +To send a message, the user checks if they have a sender chain key, +$`C_{i,j}`$. Alice uses chain keys where $`i`$ is even. Bob uses chain +keys where $`i`$ is odd. If the chain key doesn't exist then a new ratchet +key $`T_i`$ is generated and a new root key $`R_i`$ and chain key +$`C_{i,0}`$ are computed using $`R_{i-1}`$, $`T_{i-1}`$ and +$`T_i`$. + +A message key, +$`M_{i,j}`$ is computed from the current chain key, $`C_{i,j}`$, and +the chain key is replaced with the next chain key, $`C_{i,j+1}`$. The +plain-text is encrypted with $`M_{i,j}`$, using an authenticated encryption +scheme (see below) to get a cipher-text, $`X_{i,j}`$. + +The user then sends the following to the recipient: + * The current chain index, $`j`$ + * The public part of the current ratchet key, $`T_i`$ + * The cipher-text, $`X_{i,j}`$ + +### Receiving messages + +The user receives a message as above with the sender's current chain index, $`j`$, +the sender's ratchet key, $`T_i`$, and the cipher-text, $`X_{i,j}`$. + +The user checks if they have a receiver chain with the correct +$`i`$ by comparing the ratchet key, $`T_i`$. If the chain doesn't exist +then they compute a new root key, $`R_i`$, and a new receiver chain, with +chain key $`C_{i,0}`$, using $`R_{i-1}`$, $`T_{i-1}`$ and +$`T_i`$. + +If the $`j`$ of the message is less than +the current chain index on the receiver then the message may only be decrypted +if the receiver has stored a copy of the message key $`M_{i,j}`$. Otherwise +the receiver computes the chain key, $`C_{i,j}`$. The receiver computes the +message key, $`M_{i,j}`$, from the chain key and attempts to decrypt the +cipher-text, $`X_{i,j}`$. + +If the decryption succeeds the receiver updates the chain key for $`T_i`$ +with $`C_{i,j+1}`$ and stores the message keys that were skipped in the +process so that they can decode out of order messages. If the receiver created +a new receiver chain then they discard their current sender chain so that +they will create a new chain when they next send a message. + +## The Olm Message Format + +Olm uses two types of messages. The underlying transport protocol must provide +a means for recipients to distinguish between them. + +### Normal Messages + +Olm messages start with a one byte version followed by a variable length +payload followed by a fixed length message authentication code. + +``` + +--------------+------------------------------------+-----------+ + | Version Byte | Payload Bytes | MAC Bytes | + +--------------+------------------------------------+-----------+ +``` + +The version byte is ``"\x03"``. + +The payload consists of key-value pairs where the keys are integers and the +values are integers and strings. The keys are encoded as a variable length +integer tag where the 3 lowest bits indicates the type of the value: +0 for integers, 2 for strings. If the value is an integer then the tag is +followed by the value encoded as a variable length integer. If the value is +a string then the tag is followed by the length of the string encoded as +a variable length integer followed by the string itself. + +Olm uses a variable length encoding for integers. Each integer is encoded as a +sequence of bytes with the high bit set followed by a byte with the high bit +clear. The seven low bits of each byte store the bits of the integer. The least +significant bits are stored in the first byte. + +**Name**|**Tag**|**Type**|**Meaning** +:-----:|:-----:|:-----:|:-----: +Ratchet-Key|0x0A|String|The public part of the ratchet key, Ti, of the message +Chain-Index|0x10|Integer|The chain index, j, of the message +Cipher-Text|0x22|String|The cipher-text, Xi, j, of the message + +The length of the MAC is determined by the authenticated encryption algorithm +being used. (Olm version 1 uses [HMAC-SHA-256][], truncated to 8 bytes). The +MAC protects all of the bytes preceding the MAC. + +### Pre-Key Messages + +Olm pre-key messages start with a one byte version followed by a variable +length payload. + +``` + +--------------+------------------------------------+ + | Version Byte | Payload Bytes | + +--------------+------------------------------------+ +``` + +The version byte is ``"\x03"``. + +The payload uses the same key-value format as for normal messages. + +**Name**|**Tag**|**Type**|**Meaning** +:-----:|:-----:|:-----:|:-----: +One-Time-Key|0x0A|String|The public part of Bob's single-use key, Eb. +Base-Key|0x12|String|The public part of Alice's single-use key, Ea. +Identity-Key|0x1A|String|The public part of Alice's identity key, Ia. +Message|0x22|String|An embedded Olm message with its own version and MAC. + +## Olm Authenticated Encryption + +### Version 1 + +Version 1 of Olm uses [AES-256][] in [CBC][] mode with [PKCS#7][] padding for +encryption and [HMAC-SHA-256][] (truncated to 64 bits) for authentication. The +256 bit AES key, 256 bit HMAC key, and 128 bit AES IV are derived from the +message key using [HKDF-SHA-256][] using the default salt and an info of +``"OLM_KEYS"``. + +```math +\begin{aligned} + AES\_KEY_{i,j}\;\parallel\;HMAC\_KEY_{i,j}\;\parallel\;AES\_IV_{i,j} + &= HKDF\left(0,\,M_{i,j},\text{"OLM\_KEYS"},\,80\right) \\ +\end{aligned} +``` + +The plain-text is encrypted with AES-256, using the key $`AES\_KEY_{i,j}`$ +and the IV $`AES\_IV_{i,j}`$ to give the cipher-text, $`X_{i,j}`$. + +Then the entire message (including the Version Byte and all Payload Bytes) are +passed through [HMAC-SHA-256][]. The first 8 bytes of the MAC are appended to the message. + +## Message authentication concerns + +To avoid unknown key-share attacks, the application must include identifying +data for the sending and receiving user in the plain-text of (at least) the +pre-key messages. Such data could be a user ID, a telephone number; +alternatively it could be the public part of a keypair which the relevant user +has proven ownership of. + +### Example attacks + +1. Alice publishes her public [Curve25519][] identity key, $`I_A`$. Eve + publishes the same identity key, claiming it as her own. Bob downloads + Eve's keys, and associates $`I_A`$ with Eve. Alice sends a message to + Bob; Eve intercepts it before forwarding it to Bob. Bob believes the + message came from Eve rather than Alice. + + This is prevented if Alice includes her user ID in the plain-text of the + pre-key message, so that Bob can see that the message was sent by Alice + originally. + +2. Bob publishes his public [Curve25519][] identity key, $`I_B`$. Eve + publishes the same identity key, claiming it as her own. Alice downloads + Eve's keys, and associates $`I_B`$ with Eve. Alice sends a message to + Eve; Eve cannot decrypt it, but forwards it to Bob. Bob believes the + Alice sent the message to him, wheras Alice intended it to go to Eve. + + This is prevented by Alice including the user ID of the intended recpient + (Eve) in the plain-text of the pre-key message. Bob can now tell that the + message was meant for Eve rather than him. + +## IPR + +The Olm specification (this document) is hereby placed in the public domain. + +## Feedback + +Can be sent to olm at matrix.org. + +## Acknowledgements + +The ratchet that Olm implements was designed by Trevor Perrin and Moxie +Marlinspike - details at https://whispersystems.org/docs/specifications/doubleratchet/. Olm is +an entirely new implementation written by the Matrix.org team. + +[Curve25519]: http://cr.yp.to/ecdh.html +[Triple Diffie-Hellman]: https://whispersystems.org/blog/simplifying-otr-deniability/ +[HMAC-based key derivation function]: https://tools.ietf.org/html/rfc5869 +[HKDF-SHA-256]: https://tools.ietf.org/html/rfc5869 +[HMAC-SHA-256]: https://tools.ietf.org/html/rfc2104 +[SHA-256]: https://tools.ietf.org/html/rfc6234 +[AES-256]: http://csrc.nist.gov/publications/fips/fips197/fips-197.pdf +[CBC]: http://csrc.nist.gov/publications/nistpubs/800-38a/sp800-38a.pdf +[PKCS#7]: https://tools.ietf.org/html/rfc2315 diff --git a/docs/olm.rst b/docs/olm.rst deleted file mode 100644 index 9c13478..0000000 --- a/docs/olm.rst +++ /dev/null @@ -1,358 +0,0 @@ -Olm: A Cryptographic Ratchet -============================ - -An implementation of the double cryptographic ratchet described by -https://whispersystems.org/docs/specifications/doubleratchet/. - - -Notation --------- - -This document uses :math:`\parallel` to represent string concatenation. When -:math:`\parallel` appears on the right hand side of an :math:`=` it means that -the inputs are concatenated. When :math:`\parallel` appears on the left hand -side of an :math:`=` it means that the output is split. - -When this document uses :math:`ECDH\left(K_A,\,K_B\right)` it means that each -party computes a Diffie-Hellman agreement using their private key and the -remote party's public key. -So party :math:`A` computes :math:`ECDH\left(K_B_public,\,K_A_private\right)` -and party :math:`B` computes :math:`ECDH\left(K_A_public,\,K_B_private\right)`. - -Where this document uses :math:`HKDF\left(salt,\,IKM,\,info,\,L\right)` it -refers to the `HMAC-based key derivation function`_ with a salt value of -:math:`salt`, input key material of :math:`IKM`, context string :math:`info`, -and output keying material length of :math:`L` bytes. - -The Olm Algorithm ------------------ - -Initial setup -~~~~~~~~~~~~~ - -The setup takes four Curve25519_ inputs: Identity keys for Alice and Bob, -:math:`I_A` and :math:`I_B`, and one-time keys for Alice and Bob, -:math:`E_A` and :math:`E_B`. A shared secret, :math:`S`, is generated using -`Triple Diffie-Hellman`_. The initial 256 bit root key, :math:`R_0`, and 256 -bit chain key, :math:`C_{0,0}`, are derived from the shared secret using an -HMAC-based Key Derivation Function using SHA-256_ as the hash function -(HKDF-SHA-256_) with default salt and ``"OLM_ROOT"`` as the info. - -.. math:: - \begin{align} - S&=ECDH\left(I_A,\,E_B\right)\;\parallel\;ECDH\left(E_A,\,I_B\right)\; - \parallel\;ECDH\left(E_A,\,E_B\right)\\ - R_0\;\parallel\;C_{0,0}&= - HKDF\left(0,\,S,\,\text{"OLM\_ROOT"},\,64\right) - \end{align} - -Advancing the root key -~~~~~~~~~~~~~~~~~~~~~~ - -Advancing a root key takes the previous root key, :math:`R_{i-1}`, and two -Curve25519 inputs: the previous ratchet key, :math:`T_{i-1}`, and the current -ratchet key :math:`T_i`. The even ratchet keys are generated by Alice. -The odd ratchet keys are generated by Bob. A shared secret is generated -using Diffie-Hellman on the ratchet keys. The next root key, :math:`R_i`, and -chain key, :math:`C_{i,0}`, are derived from the shared secret using -HKDF-SHA-256_ using :math:`R_{i-1}` as the salt and ``"OLM_RATCHET"`` as the -info. - -.. math:: - \begin{align} - R_i\;\parallel\;C_{i,0}&=HKDF\left( - R_{i-1},\, - ECDH\left(T_{i-1},\,T_i\right),\, - \text{"OLM\_RATCHET"},\, - 64 - \right) - \end{align} - - -Advancing the chain key -~~~~~~~~~~~~~~~~~~~~~~~ - -Advancing a chain key takes the previous chain key, :math:`C_{i,j-1}`. The next -chain key, :math:`C_{i,j}`, is the HMAC-SHA-256_ of ``"\x02"`` using the -previous chain key as the key. - -.. math:: - \begin{align} - C_{i,j}&=HMAC\left(C_{i,j-1},\,\text{"\textbackslash x02"}\right) - \end{align} - -Creating a message key -~~~~~~~~~~~~~~~~~~~~~~ - -Creating a message key takes the current chain key, :math:`C_{i,j}`. The -message key, :math:`M_{i,j}`, is the HMAC-SHA-256_ of ``"\x01"`` using the -current chain key as the key. The message keys where :math:`i` is even are used -by Alice to encrypt messages. The message keys where :math:`i` is odd are used -by Bob to encrypt messages. - -.. math:: - \begin{align} - M_{i,j}&=HMAC\left(C_{i,j},\,\text{"\textbackslash x01"}\right) - \end{align} - - -The Olm Protocol ----------------- - -Creating an outbound session -~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Bob publishes the public parts of his identity key, :math:`I_B`, and some -single-use one-time keys :math:`E_B`. - -Alice downloads Bob's identity key, :math:`I_B`, and a one-time key, -:math:`E_B`. She generates a new single-use key, :math:`E_A`, and computes a -root key, :math:`R_0`, and a chain key :math:`C_{0,0}`. She also generates a -new ratchet key :math:`T_0`. - -Sending the first pre-key messages -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Alice computes a message key, :math:`M_{0,j}`, and a new chain key, -:math:`C_{0,j+1}`, using the current chain key. She replaces the current chain -key with the new one. - -Alice encrypts her plain-text with the message key, :math:`M_{0,j}`, using an -authenticated encryption scheme (see below) to get a cipher-text, -:math:`X_{0,j}`. - -She then sends the following to Bob: - * The public part of her identity key, :math:`I_A` - * The public part of her single-use key, :math:`E_A` - * The public part of Bob's single-use key, :math:`E_B` - * The current chain index, :math:`j` - * The public part of her ratchet key, :math:`T_0` - * The cipher-text, :math:`X_{0,j}` - -Alice will continue to send pre-key messages until she receives a message from -Bob. - -Creating an inbound session from a pre-key message -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Bob receives a pre-key message as above. - -Bob looks up the private part of his single-use key, :math:`E_B`. He can now -compute the root key, :math:`R_0`, and the chain key, :math:`C_{0,0}`, from -:math:`I_A`, :math:`E_A`, :math:`I_B`, and :math:`E_B`. - -Bob then advances the chain key :math:`j` times, to compute the chain key used -by the message, :math:`C_{0,j}`. He now creates the -message key, :math:`M_{0,j}`, and attempts to decrypt the cipher-text, -:math:`X_{0,j}`. If the cipher-text's authentication is correct then Bob can -discard the private part of his single-use one-time key, :math:`E_B`. - -Bob stores Alice's initial ratchet key, :math:`T_0`, until he wants to -send a message. - -Sending normal messages -~~~~~~~~~~~~~~~~~~~~~~~ - -Once a message has been received from the other side, a session is considered -established, and a more compact form is used. - -To send a message, the user checks if they have a sender chain key, -:math:`C_{i,j}`. Alice uses chain keys where :math:`i` is even. Bob uses chain -keys where :math:`i` is odd. If the chain key doesn't exist then a new ratchet -key :math:`T_i` is generated and a new root key :math:`R_i` and chain key -:math:`C_{i,0}` are computed using :math:`R_{i-1}`, :math:`T_{i-1}` and -:math:`T_i`. - -A message key, -:math:`M_{i,j}` is computed from the current chain key, :math:`C_{i,j}`, and -the chain key is replaced with the next chain key, :math:`C_{i,j+1}`. The -plain-text is encrypted with :math:`M_{i,j}`, using an authenticated encryption -scheme (see below) to get a cipher-text, :math:`X_{i,j}`. - -The user then sends the following to the recipient: - * The current chain index, :math:`j` - * The public part of the current ratchet key, :math:`T_i` - * The cipher-text, :math:`X_{i,j}` - -Receiving messages -~~~~~~~~~~~~~~~~~~ - -The user receives a message as above with the sender's current chain index, :math:`j`, -the sender's ratchet key, :math:`T_i`, and the cipher-text, :math:`X_{i,j}`. - -The user checks if they have a receiver chain with the correct -:math:`i` by comparing the ratchet key, :math:`T_i`. If the chain doesn't exist -then they compute a new root key, :math:`R_i`, and a new receiver chain, with -chain key :math:`C_{i,0}`, using :math:`R_{i-1}`, :math:`T_{i-1}` and -:math:`T_i`. - -If the :math:`j` of the message is less than -the current chain index on the receiver then the message may only be decrypted -if the receiver has stored a copy of the message key :math:`M_{i,j}`. Otherwise -the receiver computes the chain key, :math:`C_{i,j}`. The receiver computes the -message key, :math:`M_{i,j}`, from the chain key and attempts to decrypt the -cipher-text, :math:`X_{i,j}`. - -If the decryption succeeds the receiver updates the chain key for :math:`T_i` -with :math:`C_{i,j+1}` and stores the message keys that were skipped in the -process so that they can decode out of order messages. If the receiver created -a new receiver chain then they discard their current sender chain so that -they will create a new chain when they next send a message. - -The Olm Message Format ----------------------- - -Olm uses two types of messages. The underlying transport protocol must provide -a means for recipients to distinguish between them. - -Normal Messages -~~~~~~~~~~~~~~~ - -Olm messages start with a one byte version followed by a variable length -payload followed by a fixed length message authentication code. - -.. code:: - - +--------------+------------------------------------+-----------+ - | Version Byte | Payload Bytes | MAC Bytes | - +--------------+------------------------------------+-----------+ - -The version byte is ``"\x03"``. - -The payload consists of key-value pairs where the keys are integers and the -values are integers and strings. The keys are encoded as a variable length -integer tag where the 3 lowest bits indicates the type of the value: -0 for integers, 2 for strings. If the value is an integer then the tag is -followed by the value encoded as a variable length integer. If the value is -a string then the tag is followed by the length of the string encoded as -a variable length integer followed by the string itself. - -Olm uses a variable length encoding for integers. Each integer is encoded as a -sequence of bytes with the high bit set followed by a byte with the high bit -clear. The seven low bits of each byte store the bits of the integer. The least -significant bits are stored in the first byte. - -=========== ===== ======== ================================================ - Name Tag Type Meaning -=========== ===== ======== ================================================ -Ratchet-Key 0x0A String The public part of the ratchet key, :math:`T_{i}`, - of the message -Chain-Index 0x10 Integer The chain index, :math:`j`, of the message -Cipher-Text 0x22 String The cipher-text, :math:`X_{i,j}`, of the message -=========== ===== ======== ================================================ - -The length of the MAC is determined by the authenticated encryption algorithm -being used. (Olm version 1 uses HMAC-SHA-256, truncated to 8 bytes). The -MAC protects all of the bytes preceding the MAC. - -Pre-Key Messages -~~~~~~~~~~~~~~~~ - -Olm pre-key messages start with a one byte version followed by a variable -length payload. - -.. code:: - - +--------------+------------------------------------+ - | Version Byte | Payload Bytes | - +--------------+------------------------------------+ - -The version byte is ``"\x03"``. - -The payload uses the same key-value format as for normal messages. - -============ ===== ======== ================================================ - Name Tag Type Meaning -============ ===== ======== ================================================ -One-Time-Key 0x0A String The public part of Bob's single-use key, - :math:`E_b`. -Base-Key 0x12 String The public part of Alice's single-use key, - :math:`E_a`. -Identity-Key 0x1A String The public part of Alice's identity key, - :math:`I_a`. -Message 0x22 String An embedded Olm message with its own version and - MAC. -============ ===== ======== ================================================ - -Olm Authenticated Encryption ----------------------------- - -Version 1 -~~~~~~~~~ - -Version 1 of Olm uses AES-256_ in CBC_ mode with `PKCS#7`_ padding for -encryption and HMAC-SHA-256_ (truncated to 64 bits) for authentication. The -256 bit AES key, 256 bit HMAC key, and 128 bit AES IV are derived from the -message key using HKDF-SHA-256_ using the default salt and an info of -``"OLM_KEYS"``. - -.. math:: - - \begin{align} - AES\_KEY_{i,j}\;\parallel\;HMAC\_KEY_{i,j}\;\parallel\;AES\_IV_{i,j} - &= HKDF\left(0,\,M_{i,j},\text{"OLM\_KEYS"},\,80\right) \\ - \end{align} - -The plain-text is encrypted with AES-256, using the key :math:`AES\_KEY_{i,j}` -and the IV :math:`AES\_IV_{i,j}` to give the cipher-text, :math:`X_{i,j}`. - -Then the entire message (including the Version Byte and all Payload Bytes) are -passed through HMAC-SHA-256. The first 8 bytes of the MAC are appended to the message. - -Message authentication concerns -------------------------------- - -To avoid unknown key-share attacks, the application must include identifying -data for the sending and receiving user in the plain-text of (at least) the -pre-key messages. Such data could be a user ID, a telephone number; -alternatively it could be the public part of a keypair which the relevant user -has proven ownership of. - -.. admonition:: Example attacks - - 1. Alice publishes her public Curve25519 identity key, :math:`I_A`. Eve - publishes the same identity key, claiming it as her own. Bob downloads - Eve's keys, and associates :math:`I_A` with Eve. Alice sends a message to - Bob; Eve intercepts it before forwarding it to Bob. Bob believes the - message came from Eve rather than Alice. - - This is prevented if Alice includes her user ID in the plain-text of the - pre-key message, so that Bob can see that the message was sent by Alice - originally. - - 2. Bob publishes his public Curve25519 identity key, :math:`I_B`. Eve - publishes the same identity key, claiming it as her own. Alice downloads - Eve's keys, and associates :math:`I_B` with Eve. Alice sends a message to - Eve; Eve cannot decrypt it, but forwards it to Bob. Bob believes the - Alice sent the message to him, wheras Alice intended it to go to Eve. - - This is prevented by Alice including the user ID of the intended recpient - (Eve) in the plain-text of the pre-key message. Bob can now tell that the - message was meant for Eve rather than him. - -IPR ---- - -The Olm specification (this document) is hereby placed in the public domain. - -Feedback --------- - -Can be sent to olm at matrix.org. - -Acknowledgements ----------------- - -The ratchet that Olm implements was designed by Trevor Perrin and Moxie -Marlinspike - details at https://whispersystems.org/docs/specifications/doubleratchet/. Olm is -an entirely new implementation written by the Matrix.org team. - -.. _`Curve25519`: http://cr.yp.to/ecdh.html -.. _`Triple Diffie-Hellman`: https://whispersystems.org/blog/simplifying-otr-deniability/ -.. _`HMAC-based key derivation function`: https://tools.ietf.org/html/rfc5869 -.. _`HKDF-SHA-256`: https://tools.ietf.org/html/rfc5869 -.. _`HMAC-SHA-256`: https://tools.ietf.org/html/rfc2104 -.. _`SHA-256`: https://tools.ietf.org/html/rfc6234 -.. _`AES-256`: http://csrc.nist.gov/publications/fips/fips197/fips-197.pdf -.. _`CBC`: http://csrc.nist.gov/publications/nistpubs/800-38a/sp800-38a.pdf -.. _`PKCS#7`: https://tools.ietf.org/html/rfc2315 -- cgit v1.2.3 From 73288e6f2a37ef622ae0bbe107d1cebfc6caaa9c Mon Sep 17 00:00:00 2001 From: Aaron Raimist Date: Wed, 1 May 2019 19:20:08 -0500 Subject: Convert signing.rst to markdown Signed-off-by: Aaron Raimist --- docs/signing.md | 102 +++++++++++++++++++++++++++++++++++++++++++++++ docs/signing.rst | 118 ------------------------------------------------------- 2 files changed, 102 insertions(+), 118 deletions(-) create mode 100644 docs/signing.md delete mode 100644 docs/signing.rst (limited to 'docs') diff --git a/docs/signing.md b/docs/signing.md new file mode 100644 index 0000000..fcc5342 --- /dev/null +++ b/docs/signing.md @@ -0,0 +1,102 @@ +# Signature keys and user identity in libolm + +The use of any public-key based cryptography system such as Olm presents the +need for our users Alice and Bob to verify that they are in fact communicating +with each other, rather than a man-in-the-middle. Typically this requires an +out-of-band process in which Alice and Bob verify that they have the correct +public keys for each other. For example, this might be done via physical +presence or via a voice call. + +In the basic [Olm][] protocol, it is sufficient to compare the public +Curve25519 identity keys. As a naive example, Alice would meet Bob and ensure +that the identity key she downloaded from the key server matched that shown by +his device. This prevents the eavesdropper Eve from decrypting any messages +sent from Alice to Bob, or from masquerading as Bob to send messages to Alice: +she has neither Alice's nor Bob's private identity key, so cannot successfully +complete the triple-DH calculation to compute the shared secret, $`S`$, +which in turn prevents her decrypting intercepted messages, or from creating +new messages with valid MACs. Obviously, for protection to be complete, Bob +must similarly verify Alice's key. + +However, the use of the Curve25519 key as the "fingerprint" in this way makes +it difficult to carry out signing operations. For instance, it may be useful to +cross-sign identity keys for different devices, or, as discussed below, to sign +one-time keys. Curve25519 keys are intended for use in DH calculations, and +their use to calculate signatures is non-trivial. + +The solution adopted in this library is to generate a signing key for each +user. This is an [Ed25519][] keypair, which is used to calculate a signature on +an object including both the public Ed25519 signing key and the public +Curve25519 identity key. It is then the **public Ed25519 signing key** which is +used as the device fingerprint which Alice and Bob verify with each other. + +By verifying the signatures on the key object, Alice and Bob then get the same +level of assurance about the ownership of the Curve25519 identity keys as if +they had compared those directly. + +## Signing one-time keys + +The Olm protocol requires users to publish a set of one-time keys to a key +server. To establish an Olm session, the originator downloads a key for the +recipient from this server. The decision of whether to sign these one-time keys +is left to the application. There are both advantages and disadvantages to +doing so. + +Consider the scenario where one-time keys are unsigned. Alice wants to initiate +an Olm session with Bob. Bob uploads his one-time keys, $`E_B`$, but Eve +replaces them with ones she controls, $`E_E`$. Alice downloads one of the +compromised keys, and sends a pre-key message using a shared secret $`S`$, +where: + +```math +S = ECDH\left(I_A,\,E_E\right)\;\parallel\;ECDH\left(E_A,\,I_B\right)\; + \parallel\;ECDH\left(E_A,\,E_E\right) +``` + +Eve cannot decrypt the message because she does not have the private parts of +either $`E_A`$ nor $`I_B`$, so cannot calculate +$`ECDH\left(E_A,\,I_B\right)`$. However, suppose she later compromises +Bob's identity key $`I_B`$. This would give her the ability to decrypt any +pre-key messages sent to Bob using the compromised one-time keys, and is thus a +problematic loss of forward secrecy. If Bob signs his keys with his Ed25519 +signing key (and Alice verifies the signature before using them), this problem +is avoided. + +On the other hand, signing the one-time keys leads to a reduction in +deniability. Recall that the shared secret is calculated as follows: + +```math +S = ECDH\left(I_A,\,E_B\right)\;\parallel\;ECDH\left(E_A,\,I_B\right)\; + \parallel\;ECDH\left(E_A,\,E_B\right) +``` + +If keys are unsigned, a forger can make up values of $`E_A`$ and +$`E_B`$, and construct a transcript of a conversation which looks like it +was between Alice and Bob. Alice and Bob can therefore plausibly deny their +partition in any conversation even if they are both forced to divulge their +private identity keys, since it is impossible to prove that the transcript was +a conversation between the two of them, rather than constructed by a forger. + +If $`E_B`$ is signed, it is no longer possible to construct arbitrary +transcripts. Given a transcript and Alice and Bob's identity keys, we can now +show that at least one of Alice or Bob was involved in the conversation, +because the ability to calculate $`ECDH\left(I_A,\,E_B\right)`$ requires +knowledge of the private parts of either $`I_A`$ (proving Alice's +involvement) or $`E_B`$ (proving Bob's involvement, via the +signature). Note that it remains impossible to show that *both* Alice and Bob +were involved. + +In conclusion, applications should consider whether to sign one-time keys based +on the trade-off between forward secrecy and deniability. + +## License + +This document is licensed under the Apache License, Version 2.0 +http://www.apache.org/licenses/LICENSE-2.0. + +## Feedback + +Questions and feedback can be sent to olm at matrix.org. + +[Ed25519]: http://ed25519.cr.yp.to/ +[Olm]: https://gitlab.matrix.org/matrix-org/olm/blob/master/docs/olm.md diff --git a/docs/signing.rst b/docs/signing.rst deleted file mode 100644 index 05c55eb..0000000 --- a/docs/signing.rst +++ /dev/null @@ -1,118 +0,0 @@ -.. Copyright 2016 OpenMarket Ltd -.. -.. Licensed under the Apache License, Version 2.0 (the "License"); -.. you may not use this file except in compliance with the License. -.. You may obtain a copy of the License at -.. -.. http://www.apache.org/licenses/LICENSE-2.0 -.. -.. Unless required by applicable law or agreed to in writing, software -.. distributed under the License is distributed on an "AS IS" BASIS, -.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -.. See the License for the specific language governing permissions and -.. limitations under the License. - - -Signature keys and user identity in libolm -========================================== - -The use of any public-key based cryptography system such as Olm presents the -need for our users Alice and Bob to verify that they are in fact communicating -with each other, rather than a man-in-the-middle. Typically this requires an -out-of-band process in which Alice and Bob verify that they have the correct -public keys for each other. For example, this might be done via physical -presence or via a voice call. - -In the basic `Olm `_ protocol, it is sufficient to compare the public -Curve25519 identity keys. As a naive example, Alice would meet Bob and ensure -that the identity key she downloaded from the key server matched that shown by -his device. This prevents the eavesdropper Eve from decrypting any messages -sent from Alice to Bob, or from masquerading as Bob to send messages to Alice: -she has neither Alice's nor Bob's private identity key, so cannot successfully -complete the triple-DH calculation to compute the shared secret, :math:`S`, -which in turn prevents her decrypting intercepted messages, or from creating -new messages with valid MACs. Obviously, for protection to be complete, Bob -must similarly verify Alice's key. - -However, the use of the Curve25519 key as the "fingerprint" in this way makes -it difficult to carry out signing operations. For instance, it may be useful to -cross-sign identity keys for different devices, or, as discussed below, to sign -one-time keys. Curve25519 keys are intended for use in DH calculations, and -their use to calculate signatures is non-trivial. - -The solution adopted in this library is to generate a signing key for each -user. This is an `Ed25519`_ keypair, which is used to calculate a signature on -an object including both the public Ed25519 signing key and the public -Curve25519 identity key. It is then the **public Ed25519 signing key** which is -used as the device fingerprint which Alice and Bob verify with each other. - -By verifying the signatures on the key object, Alice and Bob then get the same -level of assurance about the ownership of the Curve25519 identity keys as if -they had compared those directly. - -Signing one-time keys ---------------------- - -The Olm protocol requires users to publish a set of one-time keys to a key -server. To establish an Olm session, the originator downloads a key for the -recipient from this server. The decision of whether to sign these one-time keys -is left to the application. There are both advantages and disadvantages to -doing so. - -Consider the scenario where one-time keys are unsigned. Alice wants to initiate -an Olm session with Bob. Bob uploads his one-time keys, :math:`E_B`, but Eve -replaces them with ones she controls, :math:`E_E`. Alice downloads one of the -compromised keys, and sends a pre-key message using a shared secret :math:`S`, -where: - -.. math:: - S = ECDH\left(I_A,\,E_E\right)\;\parallel\;ECDH\left(E_A,\,I_B\right)\; - \parallel\;ECDH\left(E_A,\,E_E\right) - -Eve cannot decrypt the message because she does not have the private parts of -either :math:`E_A` nor :math:`I_B`, so cannot calculate -:math:`ECDH\left(E_A,\,I_B\right)`. However, suppose she later compromises -Bob's identity key :math:`I_B`. This would give her the ability to decrypt any -pre-key messages sent to Bob using the compromised one-time keys, and is thus a -problematic loss of forward secrecy. If Bob signs his keys with his Ed25519 -signing key (and Alice verifies the signature before using them), this problem -is avoided. - -On the other hand, signing the one-time keys leads to a reduction in -deniability. Recall that the shared secret is calculated as follows: - -.. math:: - S = ECDH\left(I_A,\,E_B\right)\;\parallel\;ECDH\left(E_A,\,I_B\right)\; - \parallel\;ECDH\left(E_A,\,E_B\right) - -If keys are unsigned, a forger can make up values of :math:`E_A` and -:math:`E_B`, and construct a transcript of a conversation which looks like it -was between Alice and Bob. Alice and Bob can therefore plausibly deny their -partition in any conversation even if they are both forced to divulge their -private identity keys, since it is impossible to prove that the transcript was -a conversation between the two of them, rather than constructed by a forger. - -If :math:`E_B` is signed, it is no longer possible to construct arbitrary -transcripts. Given a transcript and Alice and Bob's identity keys, we can now -show that at least one of Alice or Bob was involved in the conversation, -because the ability to calculate :math:`ECDH\left(I_A,\,E_B\right)` requires -knowledge of the private parts of either :math:`I_A` (proving Alice's -involvement) or :math:`E_B` (proving Bob's involvement, via the -signature). Note that it remains impossible to show that *both* Alice and Bob -were involved. - -In conclusion, applications should consider whether to sign one-time keys based -on the trade-off between forward secrecy and deniability. - -License -------- - -This document is licensed under the `Apache License, Version 2.0 -`_. - -Feedback --------- - -Questions and feedback can be sent to olm at matrix.org. - -.. _`Ed25519`: http://ed25519.cr.yp.to/ -- cgit v1.2.3 From b6cd1690f2728d9ffc08be7bf7a3589b6e5a28e2 Mon Sep 17 00:00:00 2001 From: Matthew Hodgson Date: Mon, 20 May 2019 21:38:16 +0100 Subject: merge --- docs/megolm.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'docs') diff --git a/docs/megolm.md b/docs/megolm.md index f9acfd0..b9eedec 100644 --- a/docs/megolm.md +++ b/docs/megolm.md @@ -76,7 +76,7 @@ from $`R_{i,0}`$. The complete ratchet value, $`R_{i}`$, is hashed to generate the keys used to encrypt each message. This scheme allows the ratchet to be advanced an -arbitrary amount forwards while needing at most 1023 hash computations. A +arbitrary amount forwards while needing at most 1020 hash computations. A client can decrypt chat history onwards from the earliest value of the ratchet it is aware of, but cannot decrypt history from before that point without reversing the hash function. -- cgit v1.2.3