Changes

akwizgran · a3b4a65e
--- a/Analysis-of-Human-Mobility-Datasets.md
+++ b/Analysis-of-Human-Mobility-Datasets.md
@@ -705,7 +705,6 @@ For example, looking at the left edge of the Haggle figure, we can see that 31%
 The distribution changes throughout the week due to weekly patterns in the participants' mobility.

 It's clear that for all of the datasets, these latencies are much longer than the latencies that users of Internet-based communication tools are used to.
-Many activities that we're used to coordinating through such tools would be seriously disrupted by communication latencies on the order of days or weeks.

 ### Maximum Path Latency

@@ -719,28 +718,44 @@ This is also the *maximum useful time-to-live* for the dataset, as increasing th
 | Office  | 17 days              |
 | Village | 22 days              |

+The maximum path latency gives us a lower bound on the worst-case delivery latency that we could expect any real delay-tolerant network to achieve, given the mobility patterns recorded in the datasets.
+For example, for the Haggle dataset, no real delay-tolerant network could achieve a delivery latency of less than 14 days in the worst case (ie, in the case of the sender-recipient pairs that are connected by the slowest paths).
+
 It's worth mentioning that only a small fraction of paths come close to the maximum path latency.
 In other words, the time-to-live could be set a few days shorter than the maximum path latency without a large impact on reachability.

 This is likely to be a relevant concern in real networks, where increasing the time-to-live may increase the storage and bandwidth costs of carrying a given volume of message traffic.
-Real-world performance may therefore *decrease* when the time-to-live is increased, as old messages (which have a low probability of reaching their destinations if they have not already done so, and are redundant if they *have* already done so) compete for storage and bandwidth with new messages.
+Real-world performance may therefore *decrease* when the time-to-live is increased beyond a certain point, as old messages compete for storage and bandwidth with new messages.

-Fully understanding these performance tradeoffs requires a more realistic simulation of a specific delay-tolerant network design, including its rules for deciding which messages to store and forward when storage and bandwidth are finite.
-
-The maximum path latency gives us a lower bound on the worst-case delivery latency that we could expect any real delay-tolerant network to achieve, given the mobility patterns recorded in the datasets.
-For example, for the Haggle dataset, no real delay-tolerant network could achieve a delivery latency of less than 14 days in the worst case (ie, in the case of the sender-recipient pairs that are connected by the slowest paths).
+Fully understanding these performance tradeoffs requires a more realistic simulation of a specific delay-tolerant network design, including its rules for deciding which messages to store and forward when storage and bandwidth are limited.

 ## Summary of Findings

-* All of the datasets show daily patterns in the frequency of contact events and the number of connected devices.
+* All of the datasets show daily patterns in the frequency of contact events and the number of connected devices. The Haggle and office datasets also show weekly patterns.
 * The datasets differ greatly in the amount of time devices spend connected.
-* The static contact graphs have different structures: the Haggle graph has a dense core and a sparse periphery, the office graph is uniform, and the village graph is sparse and clustered.
-* Static reachability is high (at least 92.7%) for all datasets. It's below 100% for the Haggle and village datasets because of a few isolated nodes and small components. These reflect realistic conditions: some people are socially isolated or unable to leave their homes.
+* The static contact graphs for the datasets have different structures: the Haggle graph has a dense core and a sparse periphery, the office graph is uniform, and the village graph is sparse and clustered.
+* Static reachability is high (at least 92.7%) for all datasets. It's below 100% for the Haggle and village datasets because of a few isolated nodes and small components.
 * The datasets are too short to be used for realistic network simulations, or for understanding how reachability changes with time: a boundary effect caused by the end of the dataset would distort the results.
-* We can correct for this by selecting a week's worth of data and looping it, then analysing the "clean" data that's outside the boundary effect.
-* The static contact graphs for the looped data are qualitatively similar to those for the original data, but with a higher incidence of isolated nodes and small components, and thus static reachability for the office and village datasets is lower.
-* Path latencies are high for all datasets. For the Haggle dataset, 21 to 56% of paths reach their destinations within one day (depending on the time of week) and some paths take 14 days to reach their destinations. The other datasets have even higher latencies.
+* We can correct for this by selecting a week's worth of data and looping it to create a longer dataset, then analysing the "clean" data that's outside the boundary effect.
+* The static contact graphs for the looped data are qualitatively similar to those for the original data, but with a higher incidence of isolated nodes and small components. Thus static reachability for the office and village datasets is lower for the looped data than for the original data.
+* Path latencies are high for all datasets. For the Haggle dataset, 44-79% of paths take more than a day to reach their destinations (depending on the time of week), while the slowest paths take 14 days to reach their destinations. The other datasets have even higher latencies.
+
+## Conclusions and Future Work
+
+We extracted static and dynamic contact graphs from three datasets and analysed the graphs to obtain some bounds on the performance achievable by any delay-tolerant network, given the mobility patterns recorded in the datasets.
+
+First we established an upper bound for delivery probability.
+This upper bound is high, meaning that under the conditions recorded in the datasets, a delay-tolerant network can, in principle, deliver a message from any sender to any recipient with high probability.
+
+There's an important caveat, however: the datasets show that some devices don't make contact with any other devices, while others belong to small groups of devices that only make contact with each other (these correspond to isolated nodes and small components in the static contact graph, respectively).
+This reflects realistic conditions: some people are relatively socially isolated, and delay-tolerant networks are unlikely to serve these people as effectively as people who have more frequent social contacts.
+
+Next we established a lower bound for delivery latency.
+This lower bound is high, meaning that under the conditions recorded in the datasets, a delay-tolerant network cannot, even in principle, deliver messages quickly between all senders and recipients.
+For many sender-recipient pairs, latencies are on the order of days or weeks - much longer than the latencies of seconds or minutes that users of Internet-based communication tools are used to.

-## Conclusions
+Many activities that we're used to coordinating through Internet-based tools would be seriously disrupted by communication latencies of days or weeks, so this finding calls into question the whole notion of using tools built on delay-tolerant networks as functional replacements for Internet-based tools.

-TODO
\ No newline at end of file
+All of the datasets we studied were collected by asking people to go about their everyday lives while carrying or wearing measurement devices: the participants were not asked to make any special effort to have encounters that could be useful for forming a delay-tolerant network.
+This raises the possibility that performance (specifically latency) might be improved if some of the participants were purposefully taking part in carrying and passing on messages, rather than merely doing so as a side-effect of going about their usual business.
+This is a possibility we intend to explore in future research.