Several countries are currently considering tracking the spread of HCoV-19 using mobile location data. Good knowledge of how the virus spreads through communities and countries would certainly be a huge benefit to everyone, as it provides data that allows fast and decisive, but ultimately localized, action. However, gathering health and location data is a huge invasion of privacy, and we should not be too keen on simply handing this information to the government or any third-party app.
Here is a rough idea that I think could allow tracking infections in a way that respects privacy and (in part) anonymity.
- When Alice and Bob meet, they exchange IDs and keys. They also save the date of their encounter.
- Alice and Bob publish their health status as ciphertext (using the ID and keys from earlier) on a centralized server.
- When Alice falls ill and tests positive, she updates her status.
- Bob regularly checks the list for updates under Alice's ID. He can decipher Alice's status and will be warned that he was in contact with Alice at a time when she was not showing symptoms but might have been infectious. He can now choose to self-isolate or implement stronger social distancing measures in order to reduce the number of people he infects.
- At the same time Charlie does not have access to Alice's or Bob's health status, nor should he be able to guess from the ID who the person is.
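To make the steps above concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the names `Person`, `meet`, `publish` and `check` are hypothetical, the "server" is just a dictionary, and each encounter uses a fresh ID/key pair (one possible reading of "exchange IDs and keys"). The `encrypt`/`decrypt` helpers are a toy encrypt-then-MAC construction built from HMAC-SHA256 so the example runs with the standard library only; a real system should use a vetted authenticated cipher instead.

```python
import hashlib
import hmac
import secrets


def _subkey(key: bytes, label: bytes) -> bytes:
    # Derive independent encryption and MAC keys from the shared key.
    return hmac.new(key, label, hashlib.sha256).digest()


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy counter-mode keystream from HMAC-SHA256 (illustration only).
    out, counter = b"", 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(4, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]


def encrypt(key: bytes, plaintext: bytes) -> bytes:
    nonce = secrets.token_bytes(16)  # fresh nonce, so every update looks different
    stream = _keystream(_subkey(key, b"enc"), nonce, len(plaintext))
    body = bytes(a ^ b for a, b in zip(plaintext, stream))
    tag = hmac.new(_subkey(key, b"mac"), nonce + body, hashlib.sha256).digest()
    return nonce + body + tag


def decrypt(key: bytes, blob: bytes) -> bytes:
    nonce, body, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(_subkey(key, b"mac"), nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or tampered entry")
    stream = _keystream(_subkey(key, b"enc"), nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))


class Person:
    def __init__(self, name: str):
        self.name = name
        self.given = []     # (id, key) pairs handed out to contacts
        self.received = []  # (id, key, date) triples received from contacts


def meet(a: Person, b: Person, date: str) -> None:
    # Each side hands the other a fresh random ID and key and notes the date.
    for giver, taker in ((a, b), (b, a)):
        pair = (secrets.token_hex(16), secrets.token_bytes(32))
        giver.given.append(pair)
        taker.received.append((pair[0], pair[1], date))


def publish(server: dict, person: Person, status: str) -> None:
    # One ciphertext per contact; the status is padded to a fixed length
    # so the ciphertext size does not leak it.
    for entry_id, key in person.given:
        server[entry_id] = encrypt(key, status.ljust(16).encode())


def check(server: dict, person: Person) -> list:
    # Look up every ID received from a contact and decrypt its status.
    results = []
    for entry_id, key, date in person.received:
        blob = server.get(entry_id)
        if blob is not None:
            results.append((date, decrypt(key, blob).decode().rstrip()))
    return results


server = {}
alice, bob, charlie = Person("Alice"), Person("Bob"), Person("Charlie")
meet(alice, bob, "2020-03-20")
publish(server, alice, "healthy")
publish(server, alice, "ill")   # Alice updates her status after a positive test
print(check(server, bob))       # Bob sees the date of the encounter and the status
print(check(server, charlie))   # Charlie holds no IDs or keys and learns nothing
```

Note that in this sketch the server still sees how many entries each upload contains and when they change, which is exactly the kind of metadata the questions below are about.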
This procedure can certainly be generalized or modified in several ways. One could, e.g., also assign IDs to meetings, gyms, or grocery stores, so that it is not necessary to exchange IDs and keys with every person. These would be assigned a virtual health status, depending on whether one of the attendees might have been infectious.
Another idea: instead of only publishing "healthy/ill", one could publish one's degree of separation from the nearest ill person. A value of 2 would mean that one of my friends was in contact with someone who has tested positive, and that I subsequently met this friend at a time when he might have been infectious but had not yet shown symptoms. This would then signal to me that I should skip large meetings and reduce contact with at-risk persons for the next 2 weeks.
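As a rough illustration of the degree-of-separation idea, the value I publish could be derived from the values my recent contacts publish. The function name `own_degree`, the 14-day window and the convention that a positive test corresponds to degree 0 are assumptions made for this sketch. It also ignores the ordering of encounters (whether my friend met the ill person before meeting me), which a real scheme would have to handle.

```python
from datetime import date, timedelta
from typing import List, Optional, Tuple

WINDOW = timedelta(days=14)  # assumed relevance window ("the next 2 weeks")


def own_degree(
    tested_positive: bool,
    contacts: List[Tuple[date, Optional[int]]],
    today: date,
) -> Optional[int]:
    """Return my degree of separation from the nearest ill person.

    `contacts` holds (date of encounter, degree published by that contact);
    a degree of None means the contact knows of no path to an ill person.
    """
    if tested_positive:
        return 0
    # Only encounters inside the window count, and only contacts with a known degree.
    recent = [deg for met, deg in contacts if deg is not None and today - met <= WINDOW]
    return 1 + min(recent) if recent else None


today = date(2020, 4, 1)
contacts = [
    (date(2020, 3, 25), 1),     # friend who met someone who tested positive
    (date(2020, 3, 1), 0),      # too long ago, ignored
    (date(2020, 3, 30), None),  # no known link to an ill person
]
print(own_degree(False, contacts, today))
```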
My questions are:
- Are there glaring privacy and anonymity flaws in the procedure outlined above?
- Is there a way to further anonymize everything? E.g., such that Bob can't reveal Alice's ID to Charlie, or such that checking the published list only reveals that one of my recent contacts has tested positive, without making it possible to deduce which one.
- Is there a protocol that already implements the cryptographic requirements of this procedure? I imagine standard end-to-end encryption will not scale well. Also, since Alice would need to encrypt and publish her status once for every person she had contact with in the last few weeks, it might be possible to reverse-engineer parts of the social graph, which might compromise anonymity.