MIT Technology Review has a brief article highlighting recent research on protocols for privacy-preserving data mining. The article’s focus is a paper by Andrew Lindell, which he recently presented at Black Hat. From the article:
Lindell is one of a community of researchers studying ways to share this sort of information without exposing private details. Cryptographers have been working on solutions since the 1980s, and as more data is collected about individuals, Lindell says that it becomes increasingly important to find ways to protect data while also allowing it to be compared. Recently, he presented a cryptographic protocol that uses smart cards to solve the problem.
To use Lindell’s new protocol, the first party (“Alice” in cryptography speak) would create a key with which both parties could encrypt their data. The key would be stored on a special kind of secure smart card. Alice would then hand over the smart card to the second party in the scenario (known as “Bob”), and both parties would use the key to encrypt their respective databases. Next Alice sends her encrypted database to Bob.
The contents of Alice’s encrypted database cannot be read by Bob, but he can see where it matches entries in the encrypted version of his own database. In this way, Bob can see what information both he and Alice share. For extra protection, Bob would only have a limited amount of time to use the secret key on the smart card because it is deleted remotely by Alice, using a special messaging protocol.
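The core idea described above can be illustrated with a toy sketch. This is not Lindell’s actual smart-card protocol (which adds key distribution, remote key deletion, and stronger guarantees); it only shows the matching principle: if both parties apply the same deterministic keyed function to their entries, equal plaintexts produce equal tags, so Bob can spot overlaps without reading Alice’s non-matching entries. The key value and database contents here are invented for illustration.

```python
import hmac
import hashlib

def tag_entries(key: bytes, entries) -> dict:
    # HMAC-SHA256 acts as a deterministic keyed function: the same
    # plaintext under the same key always yields the same tag, so
    # matches are visible while non-matching entries stay opaque.
    return {hmac.new(key, e.encode(), hashlib.sha256).hexdigest(): e
            for e in entries}

# Shared key -- in the protocol sketched above, this would be held
# on the secure smart card that Alice hands to Bob.
key = b"key-held-on-smart-card"

alice_db = {"alice@example.com", "carol@example.com", "dave@example.com"}
bob_db = {"bob@example.com", "carol@example.com", "dave@example.com"}

# Alice sends only her tags; Bob tags his own database locally.
alice_tags = set(tag_entries(key, alice_db).keys())
bob_tags = tag_entries(key, bob_db)

# Bob learns exactly which of his entries Alice also holds, nothing more.
shared = sorted(entry for tag, entry in bob_tags.items() if tag in alice_tags)
print(shared)
```

One design note this toy version makes visible: because the scheme is deterministic, anyone holding the key can test guesses against the tags, which is why the real protocol confines the key to a tamper-resistant smart card and deletes it after a limited time.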
The reporter of this article contacted me, asking for my perspective on the “societal implications” of this research. My quote:
Michael Zimmer, an assistant professor at the University of Wisconsin-Milwaukee who studies privacy and surveillance, says that Lindell is working on an important problem: “There can be some great benefits to data mining and the comparison of databases, and if we can arrive at methods to do this in privacy-protecting ways, that’s a good thing.” But he believes that developing secure ways of sharing information might encourage organizations to share even more data, raising new privacy concerns.
This is an active, and important, research area. (When I was at NYU, I participated in the PORTIA Project, which did quite a bit of work trying to create similar solutions for privacy-protecting data mining.) But I hadn’t really thought about the concern expressed above until reflecting on it for this story. As I told the reporter, if new information-sharing activities emerge as a result of this kind of research, there will be great pressure to ensure that any new protocol has been sufficiently tested so that re-identification is truly impossible.