
Draft Paper: “But the Data is Already Public”: On the Ethics of Research in Facebook

18 June 2009

Next week I will be attending the 8th International Conference of Computer Ethics: Philosophical Enquiry in Corfu, Greece, where I will be presenting an early draft of a paper based on my critique of the “Taste, Ties, and Time” Facebook data release.

Recall that last fall, a group of researchers affiliated with the Berkman Center for Internet & Society at Harvard University released a dataset of Facebook profile information from an entire cohort (the class of 2009) of college students from “an anonymous, northeastern American university.” While the researchers took good-faith steps to preserve the anonymity of the source of the data (and, presumably, the privacy of the subjects), I quickly narrowed it down to seven possible universities and then, with only a little more effort, identified the source (with some confidence) as Harvard College. I did all this without ever downloading or looking at the actual data.

The researchers have since pulled the data out of circulation, and plan to make it available again this month, presumably with some of the anonymity and privacy concerns addressed.

The draft paper I am presenting, “But the Data is Already Public”: On the Ethics of Research in Facebook, recounts the circumstances of the “Taste, Ties, and Time” (T3) project and my partial re-identification of the dataset. It also describes some of the good-faith efforts the T3 researchers made to ensure the anonymity of the data, and exposes the limitations and errors in their procedures. Finally, it highlights the broader challenges that this case brings to light for research on and in social networking sites. These include:

  • the nature of consent in online research
  • identifying and respecting expectations of privacy on social network sites
  • developing sufficient strategies for data anonymization prior to the public release of potentially personally identifiable data
  • measuring the relative expertise of institutional review boards when confronted with innovative research projects based on data gleaned from social media

Future versions of the paper will attempt to provide some guidelines in this regard. In the meantime, I welcome any comments on this draft. E-mail me if you would like to receive a copy.

The PDF of my CEPE presentation is here.
