BSP.md 12.2 KB
Newer Older
1
# Bramble Synchronisation Protocol, version 0
2 3 4 5 6 7 8 9 10 11 12

## 1 Introduction

Bramble Synchronisation Protocol (BSP) is an application layer data synchronisation protocol suitable for delay-tolerant networks.

### 1.1 System Model

BSP synchronises data among dynamic sets of **devices**. Each device has a dynamic set of **peers** with which it communicates.

Data is organised into **groups**. Each group is an independent synchronisation scope containing a **message graph** made up of immutable **messages**.

13
If a device takes part in synchronising a group's messages, we say the device is a **member** of the group. If two peers synchronise a group's messages with each other, we say they **share** the group with each other. Group membership and sharing are dynamic. Peers that are members of a group do not necessarily share it with each other.
14

15
Each member of a group stores a partial copy of the message graph. Each message in the partial copy of the graph may be **shared** or **deleted**. If a message is shared, the device will synchronise the message to any peers with which it shares the group. If a message is deleted, the device will delete its copy of the message but retain some information about the message's position in the graph.
16

17
Each group belongs to a **client**, which is an application component that uses BSP to synchronise data. The client is responsible for deciding which peers the group should be shared with, what constitutes a valid message, which messages should be shared, and which messages should be deleted. BSP implements these decisions on behalf of the client.
18

19
### 1.2 Transport Layer Security
20

21
BSP requires a **transport layer security protocol** that can deliver data from one device to another on a best-effort basis, meaning that data may be delayed, lost, reordered or duplicated. The transport layer security protocol is responsible for ensuring the confidentiality, integrity, authenticity and forward secrecy of the data it carries.
22

23
### 1.3 Notation
24

25 26 27 28
* || denotes concatenation
* Double quotes denote an ASCII string
* len(x) denotes the length of x in bytes
* int\_n(x) denotes x represented as an unsigned, big-endian, n-bit integer
29

30
### 1.4 Cryptographic Primitives
31

32
BSP uses a **cryptographic hash function**, H(m).
33 34 35

We use H(m) to define a multi-argument hash function:

36
- HASH(x\_1, ..., x\_n) = H(int\_32(len(x\_1)) || x\_1 || ... || int\_32(len(x\_n)) || x\_n)
37

38 39 40
All hashes are HASH\_LEN bytes.

(*Note:* The current version of the protocol uses BLAKE2b as the hash function, with an output length of 32 bytes, giving HASH\_LEN = 32.)
41 42 43 44 45

## 2 Data Formats

### 2.1 Client Identifiers

46 47 48 49 50 51 52 53 54
Each client is identified by a namespaced string. The namespace is based on domain names and is intended to prevent accidental collisions between clients created by different developers. For example, all clients developed by the Briar project have identifiers starting with `org.briarproject`.

### 2.2 Client Versioning

BSP uses a major/minor numbering scheme to distinguish between versions of each client.

The client identifier and major version are included in the calculation of group identifiers, so different major versions of a given client use distinct groups, whereas different minor versions use the same groups.

The major version is used to indicate backward-incompatible changes. A client may send a message that another client with the same identifier but a different major version would not be able to handle. The use of distinct groups for each major version makes this safe.
55

56 57
The minor version is used to indicate backward-compatible changes. A client may not send a message that another client with the same identifier and major version but a different minor version would not be able to handle.

58
### 2.3 Groups
59

60
Each group has a unique identifier HASH\_LEN bytes long. The identifier is calculated by hashing the client identifier and major version with a byte string called the **group descriptor**:
61

62
- group\_id = HASH("org.briarproject.bramble/GROUP\_ID", int\_8(group\_format\_version), client\_id, int\_32(client\_major\_version), group\_descriptor)
63

64
The group format version is 1 for the current version of BSP.
65

66
The group descriptor is supplied by the client and is not interpreted by BSP. The maximum length of a group descriptor is 16 KiB.
67

68
### 2.4 Messages
69

70
A message consists of a byte string called the **message body** and a timestamp indicating the time at which the message was created, represented in milliseconds since the Unix epoch.
71

72
The message body is supplied by the client and is not interpreted by BSP. The maximum length of a message body is 32 KiB.
73 74 75

Each message has a unique identifier HASH\_LEN bytes long. The identifier is calculated by hashing the group identifier, timestamp and message body:

76 77 78 79
- body\_hash = HASH("org.briarproject.bramble/MESSAGE\_BLOCK", int\_8(message\_format\_version), message\_body)
- message\_id = HASH("org.briarproject.bramble/MESSAGE\_ID", int\_8(message\_format\_version), group\_id, int\_64(timestamp), body\_hash)

The message format version is 1 for the current version of BSP.
80 81 82

## 3 Wire Protocol

83 84
### 3.1 Record Format

akwizgran's avatar
akwizgran committed
85
Peers synchronise data by exchanging **records** via the transport layer security protocol. Each record has the following format:
86

akwizgran's avatar
akwizgran committed
87
- record\_header = int\_8(protocol\_version) || int\_8(record\_type) || int\_16(len(payload))
88

akwizgran's avatar
akwizgran committed
89
- record = record\_header || payload
90

91
The maximum length of the payload is 48 KiB.
92

93 94
### 3.2 Record Types

95
The current version of the protocol is 0, which has five record types:
96

97
**0: ACK** - The record's payload consists of one or more message identifiers. This record informs the recipient that the sender has seen the listed messages.
98

99
**1: MESSAGE** - The record's payload consists of a message (group identifier, timestamp and message body).
100

101 102 103 104 105 106 107
**2: OFFER** - The record's payload consists of one or more message identifiers. This record informs the recipient that the sender has seen the listed messages, is sharing them with the recipient, and does not know whether the recipient has seen them.

**3: REQUEST** - The record's payload consists of one or more message identifiers. This record asks the recipient to send the listed messages to the sender.

**4: VERSIONS** - The record's payload consists of up to ten 8-bit integers, each identifying a version of BSP supported by the sender. This record will be used for version negotiation by future versions of BSP.

A device should reject any record with an unsupported protocol version and ignore any record with a supported protocol version but an unrecognised record type. This allows new record types to be added without breaking compatibility.
108 109 110 111 112

## 4 Synchronisation

### 4.1 Synchronisation State

113
A device stores the following synchronisation state for each message it is sharing with a peer:
114

115
- **Seen flag** - Raised if the peer is known to have seen the message, otherwise lowered
116

117
- **Ack flag** - Raised if the peer has offered or sent the message since the device last acknowledged it, otherwise lowered
118

119
- **Request flag** - Raised if the peer has requested the message since the device last offered or sent it, otherwise lowered
120 121 122

- **Send count** - The number of times the message has been offered or sent to the peer

123
- **Send time** - The time at which the message can next be offered or sent to the peer, or zero if the message has never been offered or sent to the peer
124

125
- **Expected arrival time** - The time at which the offer or message is expected to arrive at the peer, or zero if the message has never been offered or sent to the peer
126 127

The device may also store a list of message identifiers that have been offered by the peer and not yet requested by the device. This list allows requests to be sent asynchronously. The length of the list may be bounded and the peer may discard offered message identifiers when the list is full.
128 129 130 131 132 133 134

### 4.2 Synchronisation Modes

Each time a device has an opportunity to send records to a peer, it decides whether to use ***interactive mode*** or ***batch mode***.

Interactive mode uses less bandwidth than batch mode, but needs two round-trips for synchronisation, whereas batch mode needs one.

135
The device may choose a mode based on prior knowledge or measurements of the underlying transport, such as the round-trip time, or it may use any other method. Devices may use different methods from their peers.
136

137
##### 4.2.1 Interactive Mode
138 139 140 141 142

In interactive mode, messages are offered before being sent. The device does the following, in any order:

- **Acknowledge** any messages **received** from the peer that the device has not yet acknowledged

143
- **Acknowledge** any messages **offered** by the peer that the device has seen and has not yet acknowledged
144

145
- **Request** any messages **offered** by the peer that the device has not seen and has not yet requested
146

147
- **Send** any messages that the device is **sharing** with the peer, that have been **requested** by the peer, and that are **ready to send** (see below)
148

149
- **Offer** any messages that the device is **sharing** with the peer, and does not know whether the peer has seen, and that are ready to send
150

151
##### 4.2.2 Batch Mode
152 153 154 155 156

In batch mode, messages are sent without being offered. The device does the following, in any order:

- **Acknowledge** any messages **sent** by the peer that the device has not yet acknowledged

157
- **Acknowledge** any messages **offered** by the peer that the device has seen and has not yet acknowledged
158

159
- **Request** any messages **offered** by the peer that the device has not seen and has not yet requested
160

161
- **Send** any messages that the device is **sharing** with the peer, that have been **requested** by the peer, and that are ready to send
162

163
- **Send** any messages that the device is **sharing** with the peer, and does not know whether the peer has seen, and that are ready to send
164 165 166

### 4.3 Retransmission

167 168
Whenever a device offers or sends a message to a peer, it increases the message's **send count** and **send time** and updates the message's **expected arrival time**.

169
A device may increase the send time based on prior knowledge or measurements of the underlying transport, such as the round-trip time, or it may use any other method. Devices may use different methods from their peers. BSP only requires that the send time should increase exponentially with the send count. In other words, retransmission should use some form of exponential backoff.
170 171

Similarly, a device may update the expected arrival time based on prior knowledge or measurements of the underlying transport, or it may use any other method. Devices may use different methods from their peers.
172

173
A message is considered **ready to send** if either its send time has been reached, or the expected arrival time would be reduced if the message was sent now.
174 175 176 177 178 179 180 181 182 183 184 185

## 5 The Message Graph

Each group belongs to a client that is responsible for deciding which peers the group should be shared with, what constitutes a valid message, which messages should be shared, and which messages should be deleted.

The message graph is a directed graph where each message points to its **dependencies**. The list of dependencies is encoded in the message body in a format determined by the client. Since messages are immutable, dependency relationships are also immutable and the graph is acyclic. The graph may have more than one component.

When a device receives a message for the first time, the client responsible for the corresponding group parses the message body and extracts a list of dependencies. BSP uses the list of dependencies to determine the message's position in the message graph. If the client cannot parse the message, BSP marks it as invalid and deletes it. Any messages that transitively depend on it are also marked as invalid and deleted.

A message is eligible to be **delivered** to the client when all its dependencies have been delivered. This provides **causal consistency**. When a message is delivered, the client validates the message in the context of its dependencies. If the client decides the message is invalid, BSP marks it as invalid and deletes it. Any messages that transitively depend on it are also marked as invalid and deleted.

The client may share any message that has been delivered, in which case the message's transitive dependencies are also shared. The client may delete any message that it no longer needs to store. This does not affect the message's dependencies or dependents.