I’ve blogged about the concerns with commercial data aggregation, the power of data mining, and about how “security via obscurity” no longer applies when databases are online and searchable.
Here’s a case showing just how easy it can be for amateurs to do a little data mining, track someone’s online activities, and perhaps even piece together some of their identity:
Over the weekend, Cory Doctorow blogged about this story of a woman who lost her camera while on vacation, only to have the family who found it refuse to return it because their child liked it so much. A few days later, Cory received an e-mail from someone who claimed his name was “Don Deveny,” purportedly a Canadian lawyer, who implied that the post was illegal and that Cory was liable for making it. Cory doubted the legitimacy of the writer (misspelling “lawyer,” among other words, in the e-mail exchange sorta tipped him off), and decided to see what he could find out about “Don.”
Cory first contacted many of the law societies in Canada, none of whom had any record of a “Don Deveny” licensed to practice law in Canada. (BTW, it is illegal to pretend to be a lawyer). From their e-mail exchange, Cory was able to isolate the writer’s real e-mail address from the message headers, and through a Google search, find other pages that contain that address. That led Cory to a profile page for a user of the website called “Canada Kick A**” who shared the very same e-mail address. That profile page had a different person’s name (perhaps “Don’s” real name?), and also listed a location and profession for the user (he’s not a lawyer). Once Cory blogged about his discovery of this user page, its content was changed. (Cory has a screen shot of the original version on his site)
It didn’t take much to figure out (or at least get a better clue as to) who this e-mailer was.
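For readers curious what “isolating an address from the message headers” can look like in practice, here is a minimal Python sketch. It is not Cory’s actual method (his post doesn’t spell that out); the sample message, the addresses, and the choice of which headers to inspect are all assumptions made up for illustration.

```python
from email import message_from_string
from email.utils import getaddresses

# Hypothetical message: every name and address below is invented.
SAMPLE = "\n".join([
    "Return-Path: <real.sender@example.net>",
    'From: "Don Deveny" <don@example.com>',
    "Reply-To: don@example.com",
    "Subject: Your post",
    "",
    "Message body goes here.",
])

# Assumption: these are the headers worth comparing. A display name in
# "From" is typed by the sender, while other headers may carry a
# different underlying address.
HEADERS_TO_CHECK = ("From", "Reply-To", "Sender", "Return-Path")


def header_addresses(raw_message):
    """Return {header name: [addresses]} for each checked header present."""
    msg = message_from_string(raw_message)
    found = {}
    for name in HEADERS_TO_CHECK:
        values = msg.get_all(name, [])
        # getaddresses yields (display name, address) pairs.
        addresses = [addr for _, addr in getaddresses(values) if addr]
        if addresses:
            found[name] = addresses
    return found


if __name__ == "__main__":
    for header, addresses in header_addresses(SAMPLE).items():
        print(header, addresses)
```

Any address that turns up in one header but not in the others becomes a natural search term, which is roughly the step that connected “Don” to the forum profile.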
Readers of Cory’s blog did some data mining of their own, and discovered a commenter at the original story’s site who shared many of the same sentiments as “Don,” along with many of the same spelling errors. This commenter used a different screen name, but when asked to identify himself, said he was a lawyer (Don, is that you?). Another reader then discovered that a user with that same screen name recently bid on eBay for memory cards of the kind the stolen camera would use. Have we found the thief?
NOW, a few of my own comments.
First, this shows how easy it can be to track and cross-reference identities in different databases through examining e-mail headers, Google searches, and even IP tracking (which Cory didn’t mention doing, however). Anonymity is not easy to come by when we have so many different markers of our identity scattered throughout the Internet.
Second, many readers of the original post about the camera urged the woman to reveal the identity of the family who found it but refused to return it. As far as I can tell, she hasn’t given out their name, and withholding it is the right thing to do. Even when spurned, we should respect the privacy of those against us.
Third, Cory did decide to publish the entire correspondence between him and “Don,” as well as “Don’s” real e-mail address, screenshots of the profile page, etc. I haven’t duplicated those here because I’m not fully comfortable disclosing that personal information (you can find it on Cory’s page if you really want it). This brings up issues of expectations of privacy, expectations that still exist even when we are trying to hide our true identity and eventually are “found out.”
And fourth, this makes me want to go and change many of my online accounts to use unique user names & e-mail addresses, limiting the ability of even an amateur to aggregate the data between my eBay purchases, Wikipedia edits, and Slashdot comments….
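To make the first and fourth comments a bit more concrete, here is a rough Python sketch of how little it takes to link accounts that reuse an identifier. The records, site names, and field names are entirely hypothetical; this only illustrates the general idea of matching on a shared user name or e-mail address, not any tool actually used in this story.

```python
from collections import defaultdict

# Hypothetical records, as an amateur might collect them from public pages.
SAMPLE_RECORDS = [
    {"site": "forum", "username": "Snowman42", "email": "a@example.com"},
    {"site": "auction", "username": "snowman42"},
    {"site": "blog-comments", "email": "A@example.com "},
    {"site": "wiki", "username": "other"},
]


def link_by_identifier(records):
    """Map each identifier seen on more than one site to those sites."""
    linked = defaultdict(set)
    for record in records:
        for field in ("email", "username"):
            value = record.get(field)
            if value:
                # Normalize case and whitespace so trivial variations match.
                linked[(field, value.strip().lower())].add(record["site"])
    # Keep only identifiers that connect at least two sites.
    return {key: sorted(sites) for key, sites in linked.items() if len(sites) > 1}


if __name__ == "__main__":
    for (field, value), sites in link_by_identifier(SAMPLE_RECORDS).items():
        print(field, value, "->", sites)
```

In this made-up data the “wiki” account never shows up in the output, because its user name appears nowhere else. That is the whole argument for unique user names and e-mail addresses per site: there is nothing to join on.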
UPDATE: More amateur data mining led to the discovery of another forum page of a user with the same username and a common signature file – and this page provides a possible photo of “Don Deveny” – see the end of Cory’s post for the link.