akwizgran · ceaf0b6f
--- a/BSP.md
+++ b/BSP.md
+BSP is an application layer data synchronisation protocol for delay-tolerant networks. It can operate over any transport that can deliver a simplex stream of bytes from a sender to a recipient on a best-effort basis, meaning that streams may be delayed, lost, reordered or duplicated by the transport. BSP does not ensure the confidentiality, authenticity or integrity of streams; that is the responsibility of a transport layer security protocol such as [BTP](BTP).
+
+BSP synchronises data between two devices referred to as the local and remote peers. The data to be synchronised consists of messages, which are organised into channels. From BSP's point of view, a message is simply a sequence of bytes and a channel is simply a set of messages. A message created by the local peer is called a local message, while a message synchronised from the remote peer is called a remote message.
+
+Each channel on the local peer belongs to a client, which is responsible for deciding which messages are valid (the validity policy), which remote messages should be stored on the local peer (the storage policy), and which local messages should be shared with the remote peer (the sharing policy).
+
+### Notation
+
+We use || to denote concatenation, double quotes to denote an ASCII string, and len(x) to denote the length of x in bytes, represented as a 32-bit integer. All integers in BSP are big-endian.
+
+### Crypto primitives
+
+BSP uses a cryptographic hash function, H(m), with an output length of HASH_LEN bytes, and a random number generator, R(n), with an output length of n bytes. R(n) must be either a true random number generator or a cryptographically secure pseudo-random number generator. We use H(m) to define a multi-argument hash function:
+
+* `HASH(x_1, ..., x_n) == H(len(x_1) || x_1 || ... || len(x_n) || x_n)`
+
+> Implementation note: We propose to use BLAKE2s as the hash function, which gives HASH_LEN = 32.
+
+### Client identifiers
+
+Each client has a unique random identifier HASH_LEN bytes long:
+
+* `client_id = R(HASH_LEN)`
+
+### Channel identifiers
+
+Each channel has a unique identifier HASH_LEN bytes long. The identifier is calculated by hashing the client identifier and a data structure called the channel descriptor:
+
+* `channel_id = HASH("CHANNEL_ID", client_id, channel_descriptor)`
+
+The channel descriptor is supplied by the client and is not interpreted by BSP. Including the client identifier in the hash prevents collisions between clients that use similar data structures for their descriptors.
+
+### Message format
+
+Each message consists of one or more blocks. Each block is BLOCK_LEN bytes long, except the last block of the message, which may be shorter. (We require that BLOCK_LEN &lt;= 2^15 .)
+
+* `message = block_1 || ... || block_n`
+
+The blocks form the leaves of a binary hash tree. Each parent node consists of the concatenated hashes of its children. If the number of blocks in the message is not a power of two, some parent nodes will only have one child. The hash of the root node is called the root hash.
+
+* `block_hash = HASH("BLOCK", block)`
+* `parent_node = only_child_hash`
+* `parent_node = left_child_hash || right_child_hash`
+* `parent_hash = HASH("TREE", parent_node)`
+
+The message's unique identifier is calculated by hashing a message header and the root hash. The message header consists of the channel identifier, a salt, a timestamp, the message length and the message type.
+
+* `salt = R(HASH_LEN)`
+* `message_header = channel_id || salt || timestamp || message_length || message_type`
+* `message_id = HASH("MESSAGE_ID", message_header, root_hash)`
+
+The random salt prevents collisions between otherwise identical messages. The timestamp is a 64-bit integer representing seconds since the Unix epoch. (All integers in BSP are big-endian.) The message length is a 64-bit integer representing the length of the message in bytes. The message type is a single byte that is supplied by the client and is not interpreted by BSP.
+
+Each block has a unique identifier, which is calculated by hashing the message header, a block header and the hash of the block itself. The block header consists of the block number and a list of hashes called the path.
+
+* `block_header = block_number || path_1 || ... || path_n`
+* `block_id = HASH("BLOCK_ID", message_header, block_header, block_hash)`
+
+The block number is a 64-bit integer starting from zero for the first block of the message. The path consists of the hashes of the siblings of the block's ancestors in the hash tree. If the number of blocks in the message is not a power of two, some of the block's ancestors may not have siblings. The positions of any such ancestors can be calculated from the message length and the block number.
+
+A block accompanied by its message and block headers is called a portable block. A portable block is valid if the message length, block number, path and block length are consistent with each other. A valid portable block contains all the information needed to calculate the message identifier. Any valid portable blocks that produce the same message identifier are guaranteed to be consistent with each other.
+
+### Records
+
+The local and remote peers synchronise data by sending simplex streams of bytes to each other. Each stream consists of a series of records. Each record starts with a record header that is 4 bytes long, with the following format:
+
+* Bits 0-7: Protocol version
+* Bits 8-15: Record type
+* Bits 16-31: Length of the payload in bytes as a 16-bit integer
+
+A stream may contain records of any type in any order. If the recipient does not recognise a record's protocol version or record type, the recipient skips to the next record in the stream.
+
+The current version of the protocol is 1, which has five record types:
+
+**0: OFFER** - The payload consists of one or more block identifiers. This record informs the recipient that the sender holds the listed blocks and asks the recipient whether to send them.
+
+**1: REQUEST** - The payload consists of one or more block identifiers. This record asks the recipient to send the listed blocks.
+
+**2: DATA** - The payload consists of a portable block.
+
+**3: ACK** - The payload consists of one or more block identifiers. This record informs the recipient that the sender holds the listed blocks.
+
+**4. RESET** - The payload is empty. This record asks the recipient to discard any information about which blocks the sender holds.
+
+The local peer is said to hold a block if it is storing the block and sharing it with the remote peer according to the client's sharing policy. If the local peer is storing a block but not sharing it with the remote peer, the local peer acts as though it were not storing the block.
+
+### Synchronisation state
+
+The local peer stores the following synchronisation state for each block it holds:
+
+* Hold flag - Set to 1 if the remote peer is known to hold the block, otherwise 0
+* Ack flag - Set to 1 if the local peer needs to acknowledge the block, otherwise 0
+* Request flag - Set to 1 if the remote peer has requested the block since it was last sent, otherwise 0
+* Send count - The number of times the block has been offered or sent to the remote peer
+* Send time - A timestamp indicating when the block can next be offered or sent to the remote peer, measured in seconds since the Unix epoch and initialised to 0
+
+The local peer also stores a list of block identifiers that have been offered by the remote peer and not yet acknowledged or requested by the local peer. The length of this list should be bounded, and the local peer should replace the oldest entry in the list if the maximum length is reached.
+
+### Interactive mode and batch mode
+
+The sender of each stream decides whether the stream will use interactive mode or batch mode. Interactive mode uses less bandwidth than batch mode, but needs two round-trips for synchronisation, whereas batch mode needs one. The sender's choice may depend on prior knowledge or measurements of the underlying transport. BSP does not specify how to decide which mode to use - the local and remote peers may use different criteria, and peers may change their criteria at any time.
+
+In interactive mode, blocks are offered before being sent. The sender does the following, in any order:
+
+* Acknowledge any blocks sent by the recipient
+* Acknowledge any blocks offered by the recipient that the sender holds
+* Request any blocks offered by the recipient that the sender does not hold
+* Offer any blocks that the sender does not know whether the recipient holds
+* Send any blocks requested by the recipient
+
+In batch mode, blocks are sent without being offered. The sender does the following, in any order:
+
+* Acknowledge any blocks sent by the recipient
+* Acknowledge any blocks offered by the recipient that the sender holds
+* Request any blocks offered by the recipient that the sender does not hold
+* Send any blocks requested by the recipient
+* Send any blocks that the sender does not know whether the recipient holds
+
+A block is not offered or sent until its send time is reached.
+
+### Retransmission
+
+Whenever the local peer offers or sends a block it increases the block's send count and send time. BSP does not specify how the send time should be increased, except that the amount by which it is increased should increase exponentially with the send count. The local peer may increase the send time based on measurements of the transport's round-trip time and round-trip time variance, as in TCP, or it may use any other method.
+
+### Resetting
+
+If the local peer crashes, it may fail to store blocks that it has already acknowledged. (This can happen even if the local peer stores the blocks before acknowledging them, as there are many layers of buffers between the application and the physical storage medium.) The remote peer will no longer offer or send blocks that the local peer has acknowledged, so the peers may remain out of sync indefinitely. If the local peer detects that it has crashed, it should send a reset record to reset the remote peer's information about which blocks the local peer holds.