Write a script that produces a reproducible APK from a fresh checkout of the repo. This should include optionally compiling the Tor binaries rather than downloading them.
This might be tricky to do with a single script that can be run in diverse environments with different software versions.
The script could either check all dependencies and their versions, or we could consider a different approach, like building in a Docker container where all dependencies are fixed, though that could be considered cheating ;)
For the android / gradle part, which dependencies do we have besides bash to run the script?
I'm perfectly happy with using a container/VM - that's how I assumed we'd do it. We just need a way for people to verify the binaries, based on some reasonable trusted foundations like Debian and the Android SDK. (Trusted in the sense that if they're compromised it's game over anyway.)
Sorry, if you meant that creating our own container/VM image would be cheating, I agree. We should use an off-the-shelf image from some reasonably trusted source and then script all the changes needed to get it to the point where it can run the build.
F-Droid currently verifies that an APK was built reproducibly like this: the signature from the reference APK is transferred to the newly built APK, and then jarsigner is used to verify that the signature from the reference APK is also valid for the newly built one.
jarsigner seems to compare the SHA-1 digest of each file for this.
However, the current state of the art is to use apksigner instead. Verifying with apksigner fails, because F-Droid's approach does not seem to work for the new v2 APK signatures.
Alternatively, we could also aim at 100% binary equality which is more tricky to achieve since it would also take file timestamps and APK file metadata into account.
Shall we try running sortjar on the original (minus signature) and reproduced APKs, and see if that gives us binary equality? I doubt it will work, but if it does it could save us a lot of effort.
Otherwise maybe we should look into how much work would be involved in adding apksigner support to F-Droid and getting it merged upstream? Presumably this is something they'll need to work on eventually - we can't be the only free software project using v2 signatures.
A third possibility would be to write our own script that either unpacks the original (minus signature) and reproduced APKs and compares the SHA-256 hashes of the files, or transplants the v1 and v2 signatures into the reproduced APK and runs apksigner verify.
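For what it's worth, the first variant of that script (compare per-file SHA-256 hashes, ignoring the signature) could look roughly like this in Python. It's only a sketch: the function names are made up, and the pattern for what counts as a v1 signature file (including `META-INF/MANIFEST.MF`) is an assumption that would need checking against real APKs. The v2 signature is not a ZIP entry, so it never shows up in this comparison.

```python
import hashlib
import re
import zipfile

# Assumption: v1 (JAR) signature material is MANIFEST.MF plus *.SF/*.RSA/*.DSA/*.EC
# directly inside META-INF/. Adjust if the real APKs contain something else.
SIGNATURE_FILE = re.compile(r"^META-INF/(MANIFEST\.MF|[^/]+\.(SF|RSA|DSA|EC))$")


def entry_hashes(apk):
    """Map each non-signature entry name to the SHA-256 of its uncompressed content."""
    hashes = {}
    with zipfile.ZipFile(apk) as zf:
        for info in zf.infolist():
            if info.is_dir() or SIGNATURE_FILE.match(info.filename):
                continue
            hashes[info.filename] = hashlib.sha256(zf.read(info)).hexdigest()
    return hashes


def differing_entries(reference_apk, rebuilt_apk):
    """Return the sorted names of entries that are missing on one side or differ."""
    ref = entry_hashes(reference_apk)
    new = entry_hashes(rebuilt_apk)
    return sorted(
        name for name in ref.keys() | new.keys() if ref.get(name) != new.get(name)
    )
```

An empty list from `differing_entries("reference.apk", "rebuilt.apk")` would mean the file contents match; it says nothing about timestamps, entry order or other ZIP metadata.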
I don't feel great about relying on SHA-1 at this point in history, but I guess that's our fallback option.
Alternatively, we could also aim at 100% binary equality which is more tricky to achieve since it would also take file timestamps and APK file metadata into account.
File timestamps should be fixable without too much work, shouldn't they?
It installs the required dependencies, fetches briar's source and builds the given (or the latest) tag from the repo and compares it to the reference binary. The comparison works by repacking the APKs and then comparing the hashes of the repacked version.
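To illustrate the repack-and-hash idea, here is a minimal Python sketch. It is not the reproducer's actual code: the fixed timestamp, the fixed permissions, storing entries uncompressed and the list of signature suffixes are all assumptions made for the example.

```python
import hashlib
import io
import zipfile

FIXED_TIME = (1980, 1, 1, 0, 0, 0)  # arbitrary constant timestamp (assumption)
# Assumption: these suffixes identify v1 signature files inside META-INF/.
SIGNATURE_SUFFIXES = (".SF", ".RSA", ".DSA", ".EC", "MANIFEST.MF")


def is_signature(name):
    """Hypothetical helper: true for v1 signature files in META-INF/."""
    return name.startswith("META-INF/") and name.endswith(SIGNATURE_SUFFIXES)


def repacked_digest(apk):
    """Repack an APK with normalized metadata and return the SHA-256 of the result."""
    out = io.BytesIO()
    with zipfile.ZipFile(apk) as src, zipfile.ZipFile(out, "w") as dst:
        for info in src.infolist():  # keeps the entry order of the input
            if info.is_dir() or is_signature(info.filename):
                continue
            clean = zipfile.ZipInfo(info.filename, date_time=FIXED_TIME)
            clean.external_attr = 0o644 << 16  # same permissions for every entry
            dst.writestr(clean, src.read(info))  # stored uncompressed
    return hashlib.sha256(out.getvalue()).hexdigest()
```

Two APKs would then count as matching when `repacked_digest("reference.apk") == repacked_digest("rebuilt.apk")`. Because the entry order is kept as-is, a difference in order alone still changes the digest.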
Next step is adding a CI. Ours doesn't run in (insecure) privileged mode (needed for docker inside docker), so I am using the one on gitlab.com for now. You can check out an example run here: https://gitlab.com/grote/briar-reproducer/-/jobs/68468149 (it still combines build, success test and failure test in one stage for now)
If there are no fundamental problems, I am also going to write a bit of documentation for people wanting to run this locally.
Depending on how we resolve the CI issue (second limited runner, other server or using gitlab.com) I would also like to trigger an automated run whenever a new tag is pushed to the briar repo.
I wonder if we could occasionally run into reproducibility issues when Debian updates a package that we depend on.
Would it be possible to avoid that by specifying an exact version of Debian (rather than current stable), pinning the versions of any packages we install (including any dependencies they pull in), and not running apt-get upgrade?
Or maybe this is unlikely to happen since most sources of non-reproducibility are in the native toolchain, which is provided by the Android NDK rather than Debian?
Looks like the repacking may depend on the order of the entries in the original APKs, which may in turn depend on the order in which the filesystem returns directory entries, which isn't deterministic in general. Unless the APK format ensures a canonical ordering? ZIP/JAR don't.
I actually hoped that docker is using an FS which has a deterministic file ordering :/
Nice! Comparison failed for me but that was to be expected?
The current 1.0.3 release fails because of the timestamp issue. 1.0.2 fails as well, and there I haven't been able to find the reason: resources.arsc differs, but its content doesn't. 1.0.1 builds reproducibly.
Would it be possible to avoid that by specifying an exact version of Debian (rather than current stable), pinning the versions of any packages we install (including any dependencies they pull in), and not running apt-get upgrade?
The way this normally works is that the CI image is tagged and uploaded to some Docker registry. So it stays stable until we build a new one.
Or maybe this is unlikely to happen since most sources of non-reproducibility are in the native toolchain, which is provided by the Android NDK rather than Debian?
I haven't yet gotten to the NDK part, but yes I think Debian and its packages shouldn't matter too much. However, I guess we will find out soon ;)
Looks like the repacking may depend on the order of the entries in the original APKs, which may in turn depend on the order in which the filesystem returns directory entries, which isn't deterministic in general. Unless the APK format ensures a canonical ordering? ZIP/JAR don't.
The repacking maintains the order, and so far the order in both APKs was always the same. But I haven't yet seen whether this is reliably enforced and, if so, where. I guess we can always introduce sorting if the order ever differs.
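A quick way to find out whether the order ever differs would be a check like the following Python sketch (the function names are hypothetical). It only looks at the entry names in the order Python's zipfile module reports them.

```python
import zipfile


def entry_names(apk):
    """Return the entry names in the order zipfile reports them."""
    with zipfile.ZipFile(apk) as zf:
        return [info.filename for info in zf.infolist()]


def compare_entry_order(first_apk, second_apk):
    """Classify two APKs as 'identical', 'reordered' or 'different' by entry names."""
    first = entry_names(first_apk)
    second = entry_names(second_apk)
    if first == second:
        return "identical"
    if sorted(first) == sorted(second):
        return "reordered"  # same entries, so sorting before repacking would help
    return "different"  # entries were added or removed, sorting won't fix that
```

If this ever returned "reordered", sorting the entries by name before repacking would be the obvious fix.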
Looks like apktool is downloaded but not used (yet) - unless it's an optional dependency of diffoscope?
Yes exactly! Diffoscope is using it indirectly to show you how two APKs differ. The CI run I posted above uses it in the last example which fails to reproduce the 1.0.3 release.
So if someone else wants to reproduce the APK, they need to use our image?
This would be the easiest path. Then they'd need to trust that our image was really built based on the published files. However, I plan on adding instructions for the full process that includes building our image locally as well.
Then however, people might run into reproducibility issues due to Debian package updates. That's right.
I'll change the base image from debian:stable to debian:stretch though, because otherwise issues are quite likely when the next stable comes out.
So I talked to Hans from the GuardianProject who does most of the reproducible builds work in the Android space and F-Droid specifically.
He said that the jarsigner verification is using SHA-1, because Android API < 18 can only handle those. So we would need to set our minSdkVersion to 18 to get SHA-256 hashes in our APK's MANIFEST.MF. Then jarsigner could/would use those for verification.
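To see which digests a given APK's MANIFEST.MF actually contains, something like this Python sketch could be used. It assumes the usual JAR manifest attribute names (`SHA1-Digest`, `SHA-256-Digest`); the function name is made up.

```python
import re
import zipfile

# Assumption: per-file digests appear as "<ALGORITHM>-Digest: <base64>" lines,
# e.g. "SHA1-Digest: ..." or "SHA-256-Digest: ...".
DIGEST_ATTR = re.compile(r"^([A-Za-z0-9-]+)-Digest:", re.MULTILINE)


def manifest_digest_algorithms(apk):
    """Return the set of digest algorithm names used in META-INF/MANIFEST.MF."""
    with zipfile.ZipFile(apk) as zf:
        manifest = zf.read("META-INF/MANIFEST.MF").decode("utf-8")
    return set(DIGEST_ATTR.findall(manifest))
```

For an APK whose manifest only has `SHA1-Digest` lines this returns `{"SHA1"}`; after the change described above one would expect to see `"SHA-256"` instead.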
Hans prefers using the standard verification mechanism, because this is what Android itself uses. While he didn't see any immediate problems with our approach of repacking the APKs, he mentioned the Janus Vulnerability and the Master Key Vulnerability as examples of unexpected things that can go wrong.
We agreed that the best approach would be supporting and using the Android v2 signature. However, I am not sure it would be worth the extra effort for our needs at this stage. Also, there don't really seem to be libraries supporting it yet. There's apk-signature-verify, which supports v2 verification, but it looks a bit cryptic and comes with Chinese code comments.