Lessons From a Two Week Sprint
Learning from the Android/iPhone team's work on exposure notification
by Joe Honton
It’s a marvel that these two large organizations could move this quickly. Getting it right, the first time around, is vitally important.
Two weeks ago, Apple and Google announced a partnership to develop a cellphone protocol to help us in the fight against Covid-19. It will assist epidemiologists and health care workers with the important work of contact tracing.
Currently, contact tracing is carried out in a laborious process that requires many manual steps. Furthermore, it relies on a patient's memory to recall who they have come in contact with in the recent past. With this new protocol, contact tracing can be accomplished more efficiently and potentially with greater fidelity.
Having good contact tracing in place is one of the essential steps listed by government authorities before reopening society.
When the partnership was first announced, I took a special interest in understanding how this privacy-preserving protocol worked. The cybersecurity details were examined in an earlier article.
That was a good first design, with privacy concerns taking center stage. But the design has improved since then, and there are important lessons to be learned by studying what's changed.
For programmers, there are several lessons to be gleaned from the work that the Apple/Google team has done over the past two weeks.
Here's what's new:
- The Bluetooth transmit level is used for better distance approximation.
- The protocol version is recorded in a new encrypted metadata section.
- There is no longer a device tracing key.
- Rolling proximity identifiers now use the Advanced Encryption Standard.
- Temporary exposure keys are used instead of date and time.
- Contact Tracing has been renamed to Privacy Preserving Exposure Notification.
Let's examine these one by one, and see what we can learn.
The Bluetooth transmit level is now recorded at the time of the encounter. The power level of low-energy Bluetooth transmissions fluctuates with interference from purses, pockets, and other obstacles. This in turn biases the determination of how far apart two cellphones are.
Keeping this power level together with each contact record will allow potentially exposed individuals to make their own risk determination. They will know best whether or not their cellphone was subject to that type of interference.
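To see why the transmit level matters, here is a minimal sketch. It is not taken from the specification: the function names, the 41 dB reference loss at one meter, and the path-loss exponent are all assumptions chosen for illustration. The idea is only that attenuation is the transmit power minus the received signal strength, and that a log-distance model turns attenuation into a rough distance.

```python
def attenuation_db(tx_power_dbm: float, rssi_dbm: float) -> float:
    """Signal lost between sender and receiver, in dB."""
    return tx_power_dbm - rssi_dbm


def estimate_distance_m(
    tx_power_dbm: float,
    rssi_dbm: float,
    ref_loss_db: float = 41.0,        # assumed loss at 1 m (hypothetical calibration)
    path_loss_exponent: float = 2.0,  # assumed open-space value; obstacles raise it
) -> float:
    """Rough distance from the log-distance model:
    loss = ref_loss + 10 * n * log10(distance)."""
    loss = attenuation_db(tx_power_dbm, rssi_dbm)
    return 10 ** ((loss - ref_loss_db) / (10 * path_loss_exponent))


# Same received strength, two different transmit levels.
far = estimate_distance_m(tx_power_dbm=-20, rssi_dbm=-81)
near = estimate_distance_m(tx_power_dbm=-40, rssi_dbm=-81)
print(f"{far:.1f} m vs {near:.1f} m")
```

Under these assumptions, the same received strength of -81 dBm looks ten times farther away when the sender transmits at -20 dBm than when it transmits at -40 dBm. A reading without the transmit level is hard to interpret.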
In this case, community review of the initial protocol specification was the impetus for the change, and it was heard loud and clear.
Lesson: Listen to feedback from your peers.
The protocol version is now recorded as well. Recording the version will allow updates to be deployed when the protocol's limitations are discovered. Omitting this identifier would have been a mistake, one that software architects often make, whether they're seniors or newbies.
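As a rough illustration of carrying a version alongside the payload, here is a sketch that packs a version and the transmit level into four bytes before any encryption is applied. The layout is an assumption made for this example (two bits each for major and minor version in the first byte, a signed transmit-power byte, two reserved bytes), and the function names are hypothetical. The published specification is the place to check for the real layout.

```python
import struct
from typing import Tuple

# Assumed layout: version byte, signed tx-power byte, two reserved zero bytes.
_LAYOUT = "<Bb2x"


def pack_metadata(major: int, minor: int, tx_power_dbm: int) -> bytes:
    """Pack version and transmit power into 4 bytes (plaintext, pre-encryption)."""
    if not (0 <= major <= 3 and 0 <= minor <= 3):
        raise ValueError("version fields must fit in two bits each")
    version_byte = (major << 6) | (minor << 4)
    return struct.pack(_LAYOUT, version_byte, tx_power_dbm)


def unpack_metadata(blob: bytes) -> Tuple[int, int, int]:
    """Return (major, minor, tx_power_dbm) from a 4-byte metadata blob."""
    version_byte, tx_power_dbm = struct.unpack(_LAYOUT, blob)
    return (version_byte >> 6) & 0b11, (version_byte >> 4) & 0b11, tx_power_dbm


blob = pack_metadata(major=1, minor=0, tx_power_dbm=-20)
print(blob.hex(), unpack_metadata(blob))
```

A receiver that reads the version first can decide how to interpret the remaining bytes, which is what makes later revisions deployable.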
Lesson: Anticipate the inevitability of change.
There is no longer a master device tracing key. This was previously specified to be a cryptographic private key that was to be generated when the software was first installed. It was never to leave the cellphone. It was to be used only in the generation of randomized daily keys.
Psychologically, this felt like an extra layer of privacy, but architecturally the device tracing key served no purpose. Instead, a new daily tracing key is generated randomly every 24 hours and kept on the device for a two-week period.
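The simplified scheme can be sketched in a few lines. This is an illustration under stated assumptions, not the reference implementation: the class name, the 16-byte key length, and the use of an integer day index as the lookup key are all hypothetical choices made for this example.

```python
import secrets

KEY_BYTES = 16        # assumed key length
RETENTION_DAYS = 14   # the two-week window described above


class DailyKeyStore:
    """Holds one random key per day and forgets keys older than two weeks."""

    def __init__(self) -> None:
        self._keys = {}  # day index -> key bytes

    def key_for_day(self, day: int) -> bytes:
        # No master key involved: each day's key comes straight from the CSPRNG.
        if day not in self._keys:
            self._keys[day] = secrets.token_bytes(KEY_BYTES)
            self._prune(day)
        return self._keys[day]

    def stored_days(self) -> list:
        return sorted(self._keys)

    def _prune(self, today: int) -> None:
        for old in [d for d in self._keys if d <= today - RETENTION_DAYS]:
            del self._keys[old]


store = DailyKeyStore()
for day in range(100, 120):
    store.key_for_day(day)
print(store.stored_days())  # only the most recent 14 days remain
```

Nothing here depends on a long-lived secret, so there is nothing to protect beyond the daily keys themselves.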
Lesson: Simple designs are better.
Rolling proximity identifiers are now computed with the fast AES-128 cryptographic standard. The result is a 16-byte block that cannot be reversed without the key. Previously the spec called for the use of tamper-proof SHA-256 HMACs, with only the first 16 bytes of the 32-byte result being saved.
That early decision was odd. Using a heavier (more time-consuming) keyed hash, and then throwing away half of its output, was a head-scratcher. The revised approach shows an important evolution in the engineers' thinking.
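To make the earlier approach concrete, here is a sketch of a truncated HMAC using only the Python standard library. The function name and the message layout (a short label followed by a one-byte interval number) are assumptions for illustration. The standard library has no AES, so the revised scheme is not shown.

```python
import hashlib
import hmac
import secrets


def truncated_hmac_identifier(daily_key: bytes, interval_number: int) -> bytes:
    """Earlier-style identifier: HMAC-SHA256, then keep only the first 16 bytes."""
    # Assumed message layout: label plus a single-byte interval number.
    message = b"CT-RPI" + interval_number.to_bytes(1, "big")
    full_digest = hmac.new(daily_key, message, hashlib.sha256).digest()
    assert len(full_digest) == 32  # SHA-256 always yields 32 bytes
    return full_digest[:16]        # the other 16 bytes are discarded


key = secrets.token_bytes(16)
identifier = truncated_hmac_identifier(key, interval_number=42)
print(len(identifier), identifier.hex())
```

Half of the computed digest never leaves the function, which is the waste that the move to a 16-byte block cipher removes.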
Lesson: Review your assumptions, and don't be afraid to admit mistakes.
Temporary exposure keys now completely replace the need for dates. The previously specified daily tracing number (the number of days since the Unix epoch) is no longer part of the protocol. The time interval number, which is the number of 10-minute intervals elapsed since midnight UTC, is retained and repurposed into the calculation of temporary exposure keys.
Once the design of the daily tracing key was refactored, the Unix epoch became a redundant bit of data.
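The interval number, as described above, is simple arithmetic. The sketch below follows this article's wording (10-minute intervals since midnight UTC). The function name is hypothetical, and the published specification should be consulted for the exact definition.

```python
from datetime import datetime, timezone

INTERVAL_SECONDS = 10 * 60  # one interval is ten minutes


def interval_number(moment: datetime) -> int:
    """Count of whole 10-minute intervals elapsed since midnight UTC."""
    utc = moment.astimezone(timezone.utc)
    seconds_since_midnight = utc.hour * 3600 + utc.minute * 60 + utc.second
    return seconds_since_midnight // INTERVAL_SECONDS


stamp = datetime(2020, 4, 24, 12, 34, 56, tzinfo=timezone.utc)
print(interval_number(stamp))  # 75
```

A day holds 144 such intervals, so the value always falls between 0 and 143.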
Lesson: Get rid of redundancies by refactoring aggressively.
Finally, a non-technical change was made that is potentially more important than the others. The protocol's name, Contact Tracing, has been replaced with Privacy Preserving Exposure Notification.
Even a quick glance at the news shows that there are quite a few naysayers out there. Big tech has gotten a black eye recently when it comes to tracking, so it's no wonder that something as similar-sounding as tracing would be looked upon with suspicion.
But the new name is more than just good marketing. The Apple/Google protocol has its genesis in Singapore's TraceTogether Bluetooth contact tracing system. Jason Bay, the project lead, has emphasized that automated contact tracing is not a panacea, and that keeping the human-in-the-loop is a critical part of the project's success. Rebranding the protocol as Privacy Preserving Exposure Notification feels like a nod in the right direction.
Lesson: Software that's not fully trusted will not be used.
Overall, I think it's remarkable how quickly things have evolved. Those of us who use agile methodologies in our work know all too well that a two-week sprint can sometimes feel like just chipping away at the bigger problem.
It's a marvel that these two large organizations could move this quickly, especially because the stakes are so high: this code will be pushed to billions of cellphones in the very near future.
And that's the final lesson. Getting it right, the first time around, is vitally important.