diff --git a/protocols/BSP.md b/protocols/BSP.md
index af1c116358da6ed117fff8fb3851faa83c4527a2..e0cfee8dea076f8578c05d02989f8992a3812c7c 100644
--- a/protocols/BSP.md
+++ b/protocols/BSP.md
@@ -51,17 +51,17 @@ BSP uses a major/minor numbering scheme to distinguish between versions of each
 
 The client identifier and major version are included in the calculation of group identifiers, so different major versions of a given client use distinct groups, whereas different minor versions use the same groups.
 
-The minor version is used to indicate backward-compatible changes. A client may not send a message that another client with the same identifier and major version but a different minor version would not be able to handle.
-
 The major version is used to indicate backward-incompatible changes. A client may send a message that another client with the same identifier but a different major version would not be able to handle. The use of distinct groups for each major version makes this safe.
 
+The minor version is used to indicate backward-compatible changes. A client may not send a message that another client with the same identifier and major version but a different minor version would not be able to handle.
+
 ### 2.3 Groups
 
 Each group has a unique identifier HASH\_LEN bytes long. The identifier is calculated by hashing the client identifier and major version with a byte string called the **group descriptor**:
 
-- group\_id = HASH("org.briarproject.bramble/GROUP\_ID", int\_8(format\_version), client\_id, int\_32(client\_major\_version), group\_descriptor)
+- group\_id = HASH("org.briarproject.bramble/GROUP\_ID", int\_8(group\_format\_version), client\_id, int\_32(client\_major\_version), group\_descriptor)
 
-The format version is 1 for the current version of BSP.
+The group format version is 1 for the current version of BSP.
 
 The group descriptor is supplied by the client and is not interpreted by BSP. The maximum length of a group descriptor is 16 KiB.
 
@@ -73,11 +73,15 @@ The message body is supplied by the client and is not interpreted by BSP. The ma
 
 Each message has a unique identifier HASH\_LEN bytes long. The identifier is calculated by hashing the group identifier, timestamp and message body:
 
-- body\_hash = HASH("org.briarproject.bramble/MESSAGE\_BLOCK", int\_8(format\_version), message\_body)
-- message\_id = HASH("org.briarproject.bramble/MESSAGE\_ID", int\_8(format\_version), group\_id, int\_64(timestamp), body\_hash)
+- body\_hash = HASH("org.briarproject.bramble/MESSAGE\_BLOCK", int\_8(message\_format\_version), message\_body)
+- message\_id = HASH("org.briarproject.bramble/MESSAGE\_ID", int\_8(message\_format\_version), group\_id, int\_64(timestamp), body\_hash)
+
+The message format version is 1 for the current version of BSP.
 
 ## 3 Wire Protocol
 
+### 3.1 Record Format
+
 Peers synchronise data by exchanging **records** via the transport layer security protocol. Each record starts with a four-byte header with the following format:
 
 - Bits 0-7: Protocol version
@@ -88,6 +92,8 @@ Peers synchronise data by exchanging **records** via the transport layer securit
 
 The maximum length of the payload is 48 KiB.
 
+### 3.2 Record Types
+
 The current version of the protocol is 0, which has five record types:
 
 **0: ACK** - The record's payload consists of one or more message identifiers. This record informs the recipient that the sender has seen the listed messages.
@@ -116,9 +122,9 @@ A device stores the following synchronisation state for each message it is shari
 
 - **Send count** - The number of times the message has been offered or sent to the peer
 
-- **Send time** - The time at which the message can next be offered or sent to the peer
+- **Send time** - The time at which the message can next be offered or sent to the peer, or zero if the message has never been offered or sent to the peer
 
-- **Expected arrival time** - The time at which the offer or message is expected to arrive at the peer, or zero if the message has not been offered or sent to the peer
+- **Expected arrival time** - The time at which the offer or message is expected to arrive at the peer, or zero if the message has never been offered or sent to the peer
 
 The device may also store a list of message identifiers that have been offered by the peer and not yet requested by the device. This list allows requests to be sent asynchronously. The length of the list may be bounded and the device may discard offered message identifiers when the list is full.
 
@@ -130,7 +136,7 @@ Interactive mode uses less bandwidth than batch mode, but needs two round-trips
 
 The device may choose a mode based on prior knowledge or measurements of the underlying transport, such as the round-trip time, or it may use any other method. Devices may use different methods from their peers.
 
-##### Interactive Mode
+##### 4.2.1 Interactive Mode
 
 In interactive mode, messages are offered before being sent. The device does the following, in any order:
 
@@ -144,7 +150,7 @@ In interactive mode, messages are offered before being sent. The device does the
 
 - **Offer** any messages that the device is **sharing** with the peer, and does not know whether the peer has seen, and that are ready to send
 
-##### Batch Mode
+##### 4.2.2 Batch Mode
 
 In batch mode, messages are sent without being offered. The device does the following, in any order:
 
@@ -162,7 +168,7 @@ In batch mode, messages are sent without being offered. The device does the foll
 
 Whenever a device offers or sends a message to a peer, it increases the message's **send count** and **send time** and updates the message's **expected arrival time**.
 
-A device may increase the send time based on prior knowledge or measurements of the underlying transport, such as the round-trip time, or it may use any other method. Devices may use different methods from their peers. BSP only requires that the amount by which the send time is increased should increase exponentially with the send count. In other words, retransmission should use some form of exponential backoff.
+A device may increase the send time based on prior knowledge or measurements of the underlying transport, such as the round-trip time, or it may use any other method. Devices may use different methods from their peers. BSP only requires that the send time should increase exponentially with the send count. In other words, retransmission should use some form of exponential backoff.
 
 Similarly, a device may update the expected arrival time based on prior knowledge or measurements of the underlying transport, or it may use any other method. Devices may use different methods from their peers.