diff --git a/protocols/BTP.md b/protocols/BTP.md
index 7706c5a7d3893e4e249b85646619e9d438a8a5d2..af96bb8151040a9674fab9279d69a11a1e2d6e06 100644
--- a/protocols/BTP.md
+++ b/protocols/BTP.md
@@ -1,6 +1,6 @@
 # Bramble Transport Protocol, version 2
 
-## Introduction
+## 1 Introduction
 
 Bramble Transport Protocol (BTP) provides a secure channel between two endpoint devices, ensuring the confidentiality, integrity, authenticity and forward secrecy of their communication across a wide range of underlying transports.
 
@@ -14,63 +14,63 @@ The BTP wire protocol includes optional padding and does not use any timeouts, h
 
 BTP does not attempt to conceal the identities of the communicating parties or the fact that they are communicating - in other words, it does not provide anonymity, unlinkability or unobservability. If such properties are required, BTP can use anonymity systems such as Tor and Mixminion as underlying transports.
 
-Forward secrecy is achieved by establishing an initial shared secret between each pair of endpoint devices and using a one-way key derivation function to generate a series of temporary shared secrets from the initial shared secret. Once both devices have destroyed a given temporary secret, any keys derived from it cannot be re-derived if the devices are later compromised. A protocol for establishing the shared secret is described in other work.
+Forward secrecy is achieved by establishing an initial shared secret between each pair of endpoint devices and using a one-way key derivation function to generate a series of temporary keys from the shared secret. Once both devices have erased a given key, it cannot be re-derived if the devices are later compromised. A protocol for establishing the shared secret is described in other work.
 
-### Motivation
+### 1.1 Motivation
  
-The primary motivation for BTP's design is to provide a building block for censorship-resistant and privacy-preserving communication systems, given powerful adversaries such as governments who can compromise the availability and/or privacy of centralised servers.
+The primary motivation for BTP's design is to provide a building block for censorship-resistant communication systems, given powerful adversaries such as governments who can compromise the availability and/or privacy of centralised servers.
 
 BTP is therefore designed to be used across a diverse mixture of transports, both online and offline, with varying properties. The transports are not given access to unencrypted or unauthenticated data; nor are they required to ensure the confidentiality, integrity, authenticity or forward secrecy of the data they carry. BTP is responsible for providing those properties.
 
-BTP is unusual in that it provides forward secrecy without two-way communication, and also has features that can make it non-trivial for an adversary to detect when BTP is in use.
+BTP is unusual in that it provides forward secrecy without ongoing two-way communication, and also has features that can make it non-trivial for an adversary to detect when BTP is in use.
 
-### Design Requirements
+### 1.2 Design Requirements
 
--   ***Flexibility***
+-   **Flexibility**
 
     BTP should be able to operate over a wide range of underlying transports with bandwidths varying from kilobits to gigabits per second, and with latencies varying from milliseconds to days.
 
--   ***Layering***
+-   **Layering**
 
     BTP should treat each underlying transport connection as a unidirectional or bidirectional sequence of bytes with a simple socket-like interface (open, read/write, close). Likewise, BTP should provide a similar interface to higher protocol layers.
 
--   ***Concealability***
+-   **Concealability**
 
     BTP should not reveal any plaintext fields that would make it easily distinguishable from other protocols. It should be compatible with techniques such as traffic morphing that are designed to resist traffic analysis and traffic classification.
 
--   ***Confidentiality***
+-   **Confidentiality**
 
     The adversary should not be able to learn what data is being transported across a BTP stream.
 
--   ***Integrity***
+-   **Integrity**
 
     The adversary should not be able to cause either endpoint of a BTP stream to read data from the BTP layer that differs from the data written to the BTP layer by the other endpoint. If the adversary truncates a BTP stream, the receiving endpoint should be able to detect that this has happened.
 
--   ***Authenticity***
+-   **Authenticity**
 
     The adversary should not be able to cause either endpoint of a BTP stream to accept data from any third party as though it came from the other endpoint.
 
--   ***Forward Secrecy***
+-   **Forward Secrecy**
 
     The adversary should not be able to learn what data was transported across a BTP stream if, at some later time, the adversary compromises one or both of the endpoint devices.
 
 BTP is not required to conceal the identities of the communicating parties or the fact that they are communicating.
 
-### Adversary Model
+### 1.3 Adversary Model
 
 BTP is intended to be used in systems that resist surveillance and censorship by powerful adversaries, such as governments. We must therefore assume:
 
-• The adversary can observe, block, delay, replay and modify traffic on all underlying transports.
+- The adversary can observe, block, delay, replay and modify traffic on all underlying transports.
 
-• The adversary can choose the data written to the BTP layer by higher protocol layers.
+- The adversary can choose the data written to the BTP layer by higher protocol layers.
 
-• The adversary has a limited ability to compromise endpoint devices. If a device is compromised, the adversary can access any information held in the device's volatile memory or persistent storage.
+- The adversary has a limited ability to compromise endpoint devices. If a device is compromised, the adversary can access any information held in the device's volatile memory or persistent storage.
 
-• The adversary cannot break standard cryptographic primitives such as block ciphers and message authentication codes.
+- The adversary cannot break standard cryptographic primitives such as block ciphers and message authentication codes.
 
-### Underlying Transports
+### 1.4 Underlying Transports
 
-BTP can operate over any transport that can deliver a unidirectional or bidirectional sequence of bytes, which we refer to as a ***connection***. BTP uses each unidirectional connection to carry an encrypted and authenticated sequence of bytes, which we refer to as a ***stream***. BTP treats each bidirectional connection as a pair of complementary unidirectional connections and uses it to carry a stream in each direction.
+BTP can operate over any transport that can deliver a unidirectional or bidirectional sequence of bytes, which we refer to as a **connection**. BTP uses each unidirectional connection to carry an encrypted and authenticated sequence of bytes, which we refer to as a **stream**. BTP treats each bidirectional connection as a pair of complementary unidirectional connections and uses it to carry a stream in each direction.
 
 Transports must ensure that bytes within a given connection arrive in the correct order. If the bytes within a connection are reordered, none of BTP's security properties are lost, but it will reject the stream or streams carried by the connection.
 
@@ -78,137 +78,135 @@ Transport connections themselves may be reordered. There is no requirement that
 
 BTP can be used even on transports with very high latency, such as disks sent through the mail.
 
-If a transport imposes a maximum connection length, such as the storage capacity of a disk, BTP passes this restriction on to higher protocol layers; it does not fragment a stream across multiple connections. BTP cannot use transport connections with a capacity less than the minimum length of a stream. (*Note*: In the current version of the protocol, the minimum length of a stream is 100 bytes.)
+If a transport imposes a maximum connection length, such as the storage capacity of a disk, BTP passes this restriction on to higher protocol layers; it does not fragment a stream across multiple connections. BTP cannot use transport connections with a capacity less than the minimum length of a stream. (*Note:* In the current version of the protocol, the minimum length of a stream is 100 bytes.)
 
 The use of BTP over a datagram-oriented transport such as UDP (which does not have a concept of connections) would require the use of an intermediate connection-oriented protocol such as UDT.
 
-### Initial state
+### 1.5 Initial State
 
 Before two devices can communicate using BTP they must establish the following initial state:
 
--   An agreement as to which device plays the role of ***Alice*** and which plays the role of ***Bob***
+- An agreement as to which device plays the role of **Alice** and which plays the role of **Bob**
 
--   A shared secret KEY\_LEN bytes long, ***S***
+- A shared secret KEY\_LEN bytes long, **S**
 
--   A timestamp in seconds since the Unix epoch, ***T***
+- A timestamp in seconds since the Unix epoch, **T**
 
--   The maximum expected difference between the devices' clocks in seconds, ***D***
+- The maximum expected difference between the devices' clocks in seconds, **D**
 
--   The maximum expected latency of the transport in seconds, ***L***
+- The maximum expected latency of the transport in seconds, **L**
 
 The establishment of the initial state is not addressed by BTP itself. BTP is designed to be used with a separate key agreement protocol that securely establishes the initial state.
 
-#### Roles of the two parties
+##### Roles of the two parties
 
 The roles of Alice and Bob are identical except for some key derivation constants. It does not matter which device plays which role, as long as one is Alice and the other Bob.
 
-#### Shared secret
+##### Shared secret
 
-The shared secret must have KEY\_LEN * 8 bits of entropy. BTP does not place any restrictions on the method used to establish the initial shared secret, except that it must not be possible to re-derive the secret from any information retained by the parties after the secret has been destroyed. If Alice and Bob wish to communicate across more than one transport, they must establish a separate initial shared secret for each transport to ensure they do not reuse keys.
+The shared secret must have KEY\_LEN * 8 bits of entropy. BTP does not place any restrictions on the method used to establish the shared secret, except that it must not be possible to re-derive the secret from any information retained by the parties after the secret has been erased. If Alice and Bob wish to communicate across more than one transport, they must establish a separate shared secret for each transport to ensure they do not reuse keys.
 
-#### Timestamp
+##### Timestamp
 
 T must be in the past according to both devices' clocks. T may be hard-coded or negotiated by the key agreement protocol.
 
-#### Maximum clock difference
+##### Maximum clock difference
 
-D may be hard-coded. The current version of BTP assumes D = 86400 (24 hours). This is to accommodate mobile devices, which often have inaccurate clocks.
+D may be hard-coded. The current version of BTP assumes D = 86,400 seconds (24 hours). This is to accommodate mobile devices, which often have inaccurate clocks.
 
-#### Maximum latency
+##### Maximum latency
 
 Alice and Bob must also agree on a maximum latency for each transport they wish to use. If one endpoint of a connection starts writing to the underlying transport at time T0 and the other endpoint starts reading from the transport at time T1, we call T1 - T0 the latency of the connection. For any given transport we can choose some maximum latency, L , such that the latency of any connection is unlikely to exceed L under normal conditions. For example, we might choose one minute as the maximum latency for TCP, or two weeks as the maximum latency for disks sent through the mail.
 
 If a connection exceeds the maximum latency, none of BTP's security properties are lost but it may reject the stream or streams carried by the connection.
 
-### Cryptographic Primitives
+### 1.6 Cryptographic Primitives
 
-**NOTATION**: || denotes concatenation, double quotes denote an ASCII string, int(x) denotes x represented as a 64-bit integer, and len(x) denotes the length of x in bytes, represented as a 32-bit integer. All integers in BTP are big-endian.
+**Notation:** || denotes concatenation, double quotes denote an ASCII string, int(x) denotes x represented as a 64-bit integer, and len(x) denotes the length of x in bytes, represented as a 32-bit integer. All integers in BTP are big-endian.
 
 BTP uses three cryptographic primitives:
 
-1.  ***A pseudo-random function***, PRF(k, m)
+1.  **A pseudo-random function**, PRF(k, m)
 
-    The output of PRF(k, m) is PRF\_LEN bytes. *Note*: The current version of BTP uses keyed BLAKE2s as the pseudo-random function, giving PRF\_LEN = 32.
+    The output of PRF(k, m) is PRF\_LEN bytes. (*Note:* The current version of BTP uses keyed BLAKE2s as the pseudo-random function, giving PRF\_LEN = 32.)
 
-2.  ***An authenticated cipher***, ENC(k, n, m) and DEC(k, n, m), where n is a nonce
+2.  **An authenticated cipher**, ENC(k, n, m) and DEC(k, n, m), where n is a nonce
 
-    The output of ENC(k, n, m) is AUTH\_LEN bytes longer than m. All keys are KEY\_LEN bytes and all nonces are NONCE\_LEN bytes. For simplicity we require that PRF\_LEN == KEY\_LEN. *Note*: The current version of BTP uses XSalsa20/Poly1305 for the authenticated cipher, giving KEY\_LEN = 32, NONCE\_LEN = 24, and AUTH\_LEN = 16.
+    The output of ENC(k, n, m) is AUTH\_LEN bytes longer than m. All keys are KEY\_LEN bytes and all nonces are NONCE\_LEN bytes. For simplicity we require that PRF\_LEN = KEY\_LEN. (*Note:* The current version of BTP uses XSalsa20/Poly1305 for the authenticated cipher, giving KEY\_LEN = 32, NONCE\_LEN = 24, and AUTH\_LEN = 16.)
 
-3.  ***A random number generator***, R(n) with an output length of n bytes
+3.  **A random number generator**, R(n), with an output length of n bytes
 
     R(n) must be either a true random number generator or a cryptographically secure pseudo-random number generator.
 
-## Key Management Protocol
+## 2 Key Management Protocol
 
-### Key Derivation Function
+### 2.1 Key Derivation Function
 
-BTP's forward secrecy relies on the one-way nature of the key derivation function.
+BTP uses a **key derivation function** to derive encryption and authentication keys from the shared secret S. Forward secrecy relies on the one-way nature of the key derivation function.
 
-BTP uses PRF(k, m) to define the key derivation function:
+The key derivation function is based on PRF(k, m):
 
--   KDF(k, x\_1, ..., x\_n) == PRF(k, len(x\_1) || x\_1 || ... || len(x\_n) || x\_n)
+- KDF(k, x\_1, ..., x\_n) = PRF(k, len(x\_1) || x\_1 || ... || len(x\_n) || x\_n)
 
-### Initial keys
+### 2.2 Initial Keys
 
-Each device derives four initial keys from S. Alice derives her initial keys as follows:
+Each endpoint derives four initial keys from S. Alice derives her initial keys as follows:
 
--   outgoing\_tag\_key = KDF(S, "ALICE\_TAG\_KEY")
+- outgoing\_tag\_key := KDF(S, "ALICE\_TAG\_KEY")
 
--   outgoing\_header\_key = KDF(S, "ALICE\_HEADER\_KEY")
+- outgoing\_header\_key := KDF(S, "ALICE\_HEADER\_KEY")
 
--   incoming\_tag\_key = KDF(S, "BOB\_TAG\_KEY")
+- incoming\_tag\_key := KDF(S, "BOB\_TAG\_KEY")
 
--   incoming\_header\_key = KDF(S, "BOB\_HEADER\_KEY")
+- incoming\_header\_key := KDF(S, "BOB\_HEADER\_KEY")
 
 Bob derives his initial keys as follows:
 
--   outgoing\_tag\_key = KDF(S, "BOB\_TAG\_KEY")
+- outgoing\_tag\_key := KDF(S, "BOB\_TAG\_KEY")
 
--   outgoing\_header\_key = KDF(S, "BOB\_HEADER\_KEY")
+- outgoing\_header\_key := KDF(S, "BOB\_HEADER\_KEY")
 
--   incoming\_tag\_key = KDF(S, "ALICE\_TAG\_KEY")
+- incoming\_tag\_key := KDF(S, "ALICE\_TAG\_KEY")
 
--   incoming\_header\_key = KDF(S, "ALICE\_HEADER\_KEY")
+- incoming\_header\_key := KDF(S, "ALICE\_HEADER\_KEY")
 
-Thus Alice's outgoing keys are Bob's incoming keys and vice versa. Both devices MUST then erase the shared secret S.
+Thus Alice's outgoing keys are Bob's incoming keys and vice versa. Both devices must then erase the shared secret S.
 
-### Rotation and Retention Periods
+### 2.3 Rotation and Retention Periods
 
 BTP achieves forward secrecy by rotating keys periodically. The key rotation function is deterministic, so devices that start from the same S and T will have matching keys in each rotation period.
 
 The length of each rotation period is R = D + L seconds. Rotation periods are aligned with the Unix epoch.
 
-If the sender starts sending a stream at time t according to the sender's clock, the recipient may start receiving the stream at any time between t - D and t + D + L according to the recipient's clock. Therefore each device MUST retain the incoming keys for the previous, current and next rotation periods, along with the outgoing keys for the current rotation period.
+If the sender starts sending a stream at time t according to the sender's clock, the recipient may start receiving the stream at any time between t - D and t + D + L according to the recipient's clock. Therefore each device must retain the incoming keys for the previous, current and next rotation periods, along with the outgoing keys for the current rotation period.
 
 The initial keys derived from S are used as the keys for the rotation period that precedes the period containing the timestamp T.
 
 The keys for the i<sup>th</sup> rotation period are derived from the previous period's keys as follows:
 
--   outgoing\_tag\_key = KDF(outgoing\_tag\_key, "ROTATE", int(i))
+- outgoing\_tag\_key := KDF(outgoing\_tag\_key, "ROTATE", int(i))
 
--   outgoing\_header\_key = KDF(outgoing\_header\_key, "ROTATE", int(i))
+- outgoing\_header\_key := KDF(outgoing\_header\_key, "ROTATE", int(i))
 
--   incoming\_tag\_key = KDF(incoming\_tag\_key, "ROTATE", int(i))
+- incoming\_tag\_key := KDF(incoming\_tag\_key, "ROTATE", int(i))
 
--   incoming\_header\_key = KDF(incoming\_header\_key, "ROTATE", int(i))
+- incoming\_header\_key := KDF(incoming\_header\_key, "ROTATE", int(i))
 
-Keys MUST be erased when they are no longer needed , to ensure forward secrecy.
+Keys must be erased when they are no longer needed, to ensure forward secrecy.
 
-## Wire Protocol
+## 3 Wire Protocol
 
-A stream consists of three parts: a ***tag***, a ***stream header***, and one or more ***frames***.
+A stream consists of three parts: a **tag**, a **stream header**, and one or more **frames**.
 
-BTP uses ***reordering windows*** operating on the tags as a mechanism to allow the recipient to recognise streams that are received out of order due to reordering and/or loss of connections by the underlying transport.
+### 3.1 Tags
 
-### Tags
-
-The sender starts each stream with a pseudo-random tag, which is TAG\_LEN bytes long. (*Note*: In the current version of the protocol, TAG\_LEN = 16.) The recipient calculates the same tag in advance and uses it to recognise which sender the stream comes from and which incoming header key should be used for the stream.
+Each stream starts with a pseudo-random tag TAG\_LEN bytes long. (*Note:* In the current version of the protocol, TAG\_LEN = 16.) The recipient calculates the same tag in advance and uses it to recognise which sender the stream comes from and which incoming header key should be used for the stream.
 
 The tag for the i<sup>th</sup> stream from a given sender to a given recipient in a given rotation period is the first TAG\_LEN bytes of PRF(k, int(i)), where k is the sender's outgoing tag key (which is also the recipient's incoming tag key). For simplicity we require that TAG\_LEN ≤ PRF\_LEN. Streams are counted from zero in each rotation period.
 
-### Stream Headers
+### 3.2 Stream Headers
 
-The pseudo-random tag is followed by the stream header, which consists of a true random ***initialisation vector*** (***IV***) followed by a symmetric ***ephemeral cipher key.***
+The pseudo-random tag is followed by the stream header, which consists of a true random **initialisation vector (IV)** followed by a symmetric **ephemeral cipher key**.
 
 The ephemeral cipher key is encrypted and authenticated with the sender's outgoing header key, using the random IV as the nonce. The ephemeral cipher key is used for encrypting and authenticating the rest of the stream.
 
@@ -216,69 +214,65 @@ The stream header is NONCE\_LEN + KEY\_LEN + AUTH\_LEN bytes long.
 
 The stream header is composed as follows:
 
--   stream\_iv = R(NONCE\_LEN)
-
--   stream\_header = stream\_iv || ENC(outgoing\_header\_key, stream\_iv, ephemeral\_cipher\_key)
+- stream\_iv = R(NONCE\_LEN)
 
-### Frames
+- stream\_header = stream\_iv || ENC(outgoing\_header\_key, stream\_iv, ephemeral\_cipher\_key)
 
-The remainder of the stream consists of one or more frames. Each frame has a fixed-length ***frame header*** and a variable-length ***frame body*** that may contain data and/or padding. The frames in each stream are numbered from zero. A stream may not contain more than 2<sup>63</sup> frames.
+### 3.3 Frames
 
-The header and body of the frame are encrypted and authenticated separately as described below.
+The remainder of the stream consists of one or more frames. Each frame has a fixed-length **frame header** and a variable-length **frame body** that may contain data and/or padding. The frames in each stream are numbered from zero. A stream may not contain more than 2<sup>63</sup> frames.
 
-#### Frame header
+##### Frame header
 
-The plaintext frame header is 4 bytes long, with the following format:
+The plaintext frame header is four bytes long, with the following format:
 
--   Bit 0: Final frame flag, set to one if this is the last frame in the stream
+- Bit 0: Final frame flag, set to one if this is the last frame in the stream
 
--   Bits 1-15: Length of the data in bytes as a 15-bit integer
+- Bits 1-15: Length of the data in bytes as a 15-bit integer
 
--   Bit 16: Zero
+- Bit 16: Zero
 
--   Bits 17-31: Length of the padding in bytes as a 15-bit integer
+- Bits 17-31: Length of the padding in bytes as a 15-bit integer
 
 The final frame flag allows the recipient to detect the end of the stream without reading to EOF, which is not possible for all transports on all platforms.
 
-#### Frame body
+##### Frame body
 
 The plaintext frame body contains data and/or padding. The total length of the data and padding must be less than 2<sup>15</sup> bytes.
 
-#### Encryption and authentication
+##### Encryption and authentication
 
-The header and body are encrypted and authenticated separately using the ephemeral cipher key and deterministic nonces. The nonces are not sent.
-
-The encrypted and authenticated frame header is 4 + AUTH\_LEN bytes long, while the encrypted and authenticated frame body is AUTH\_LEN bytes longer than the data and padding.
-
-#### Nonces
+The header and body of each frame are encrypted and authenticated separately using the ephemeral cipher key and deterministic nonces. The nonces are not sent.
 
 The nonce for the frame header is NONCE\_LEN bytes long, with the following format:
 
--   Bit 0: Header flag, set to one
+- Bit 0: Header flag, set to one
+
+- Bits 1-63: Frame number as a 63-bit integer
 
--   Bits 1-63: Frame number as a 63-bit integer
+- Remaining bits: Zero
 
--   Remaining bits: Zero
+The nonce for the frame body is NONCE\_LEN bytes long, with the following format:
 
-    The nonce for the frame body is NONCE\_LEN bytes long, with the following format:
+- Bit 0: Header flag, set to zero
 
--   Bit 0: Header flag, set to zero
+- Bits 1-63: Frame number as a 63-bit integer
 
--   Bits 1-63: Frame number as a 63-bit integer
+- Remaining bits: Zero
 
--   Remaining bits: Zero
+The encrypted and authenticated frame header is 4 + AUTH\_LEN bytes long, while the encrypted and authenticated frame body is AUTH\_LEN bytes longer than the data and padding.
 
-### Reordering windows
+## 4 Reordering Windows
 
-Each endpoint maintains reordering windows for the previous, current and next rotation periods. The windows are used to recognise incoming streams by their tags.
+BTP uses **reordering windows** to allow endpoints to recognise streams that are received out of order due to reordering and/or loss of connections by the underlying transport. Each endpoint maintains reordering windows for the previous, current and next rotation periods. The windows are used to recognise incoming streams by their tags.
 
-Each reordering window contains ***W*** tags, each of which is marked as *seen* or *unseen*. W is an implementation parameter. Maintaining larger windows makes it possible to tolerate more reordering and/or loss by the underlying transport, at the cost of storing more tags. (*Note*: In the current implementation of the protocol, W = 32.)
+Each window contains **W** tags, each of which is marked as *seen* or *unseen*. W is an implementation parameter. Maintaining larger windows makes it possible to tolerate more reordering and/or loss by the transport, at the cost of storing more tags. (*Note:* In the current implementation of the protocol, W = 32.)
 
 When a previously unseen tag is marked as seen, the window slides according to the following rules:
 
-1.  Slide the window until all tags in the top half of the window are unseen.
+1. Slide the window until all tags in the top half of the window are unseen.
 
-2.  Slide the window until the lowest tag in the window is unseen.
+2. Slide the window until the lowest tag in the window is unseen.
 
 If the window slides past a tag that has not been seen, the recipient can no longer recognise the corresponding stream.
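
The tag calculation and the two sliding rules above can be sketched in Python. This is an illustrative reading of the text, not a reference implementation. It assumes keyed BLAKE2s from the standard library as the PRF, and it assumes the "top half" of a window means the W/2 highest stream numbers. The names `tag`, `ReorderingWindow`, `base`, `expected_tags` and `mark_seen` are hypothetical and do not come from the specification.

```python
import hashlib
import struct

TAG_LEN = 16  # tag length in bytes, as stated in the text
W = 32        # window size, as stated for the current implementation


def tag(tag_key: bytes, stream_number: int) -> bytes:
    """First TAG_LEN bytes of PRF(k, int(i)), using keyed BLAKE2s as the PRF."""
    message = struct.pack(">Q", stream_number)  # int(i): 64-bit big-endian
    digest = hashlib.blake2s(message, key=tag_key, digest_size=32).digest()
    return digest[:TAG_LEN]


class ReorderingWindow:
    """Window of W expected tags for one rotation period (hypothetical layout)."""

    def __init__(self, tag_key: bytes, size: int = W):
        self.tag_key = tag_key
        self.size = size
        self.base = 0      # lowest stream number currently in the window
        self.seen = set()  # stream numbers in the window marked as seen

    def expected_tags(self) -> dict:
        """Map each unseen tag in the window to its stream number."""
        return {
            tag(self.tag_key, n): n
            for n in range(self.base, self.base + self.size)
            if n not in self.seen
        }

    def mark_seen(self, received_tag: bytes):
        """Return the stream number for a recognised tag, or None otherwise."""
        n = self.expected_tags().get(received_tag)
        if n is None:
            return None  # unknown tag, replayed tag, or slid out of the window
        self.seen.add(n)
        # Rule 1: slide until every tag in the top half is unseen.
        self.base = max(self.base, max(self.seen) - self.size // 2 + 1)
        # Rule 2: slide until the lowest tag in the window is unseen.
        while self.base in self.seen:
            self.base += 1
        # Forget stream numbers that have left the window.
        self.seen = {s for s in self.seen if s >= self.base}
        return n
```

Under these assumptions, a recipient would keep one such window per sender for each of the previous, current and next rotation periods, and would look up the first TAG\_LEN bytes of every incoming connection with `mark_seen`. A real implementation would probably cache the computed tags rather than recompute them on every lookup.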