My Research in The Chronicle of Higher Education: “Harvard’s Privacy Meltdown”; some annotations

The Chronicle of Higher Education has published an article featuring my critique of the privacy protections and research methods related to the “Taste, Ties, and Time” (T3) Facebook research study conducted by a set of Harvard sociologists. Written by Marc Parry, the article is not-so-subtly teased as “Harvard’s Privacy Meltdown” on the Chronicle’s front page, and carries the title “Harvard Researchers Accused of Breaching Students’ Privacy: Social-network project shows promise and peril of doing social science online” within the link.

It is a well-written article, quite balanced, and features me, the T3 principal researcher Jason Kaufman, and fellow Internet research experts Alex Halavais, Fred Stutzman, and Elizabeth Buchanan (I am friends with the latter three, for disclosure). The Chronicle also tracked down a Harvard student presumably within the dataset.

For those looking, my initial blog posts (from 2008) regarding the T3 dataset are here and here, and my full treatment of the dataset release was published here:

I don’t want to rehash the entire article or episode, but would like to provide a few annotations:


The article does a nice job pointing out the dual challenges of “Researchers [who] must navigate the shifting privacy standards of social networks and their users”, as well as “the committees set up to protect research subjects—institutional review boards, or IRB’s—[who] lack experience with Web-based research.”

These are critical revelations that we cannot take lightly. There is much work to be done to ensure researchers of all disciplines and levels recognize and respond to the complexities of engaging in this kind of research online, and that IRBs are sufficiently trained to recognize issues related to Internet research ethics.

To these ends, the Association of Internet Researchers (AoIR) has published an ethics guide (now undergoing revisions) as “at least a starting point for their inquiries and reflection”, and we’ve held various workshops on the subject. Elizabeth Buchanan and Charles Ess have spearheaded important research on IRBs’ awareness of Internet-related concerns, and have launched the Internet Research Ethics Digital Library, Resource Center and Commons website as a valuable resource.

And, specific to the article’s mention that I have “pointed to the Harvard case in urging the federal government to do more to educate IRB’s about Web research”, I was privileged to present before the Secretary’s Advisory Committee on Human Research Protections (SACHRP), part of the Office for Human Research Protections in the United States Department of Health and Human Services (HHS). Joined by Elizabeth Buchanan, Montana Miller, and John Palfrey (of Harvard’s Berkman Center, by the way), we discussed emerging ethical issues with Internet-based research and urged the committee to take steps to ensure IRBs and researchers were suitably trained to recognize and address these important ethical issues.


In the context of this entire debate (and some of the original comments left on my blog posts), this passage from the article is quite telling:

But Mr. Kaufman talks openly about another controversial piece of his data gathering: Students were not informed of it. He discussed this with the institutional review board. Alerting students risked “frightening people unnecessarily,” he says.

“We all agreed that it was not necessary, either legally or ethically,” Mr. Kaufman says.

Frankly, I’m troubled by this statement. I will leave it to legal experts to determine if the research violated the consent requirements of the Federal Regulations for the Protection of Human Subjects (45 CFR 46), but from an ethical standpoint, I argue the researchers did have an obligation to respect the intentions of those students who might have restricted their Facebook profiles to only be visible to members of the Harvard community. The researchers’ own codebook acknowledged that the assistants used to access the profile data might have had preferential access to a profile, and that “a given student’s information should not be considered objectively ‘public’ or ‘private’”. This realization should have triggered an ethical concern over whether each student truly intended to have their profile data publicly visible and accessible for downloading.

This is the crux of the issue, and my earlier attempts to learn if and how this apparent waiver of the consent requirement was deliberated by Harvard’s IRB were unsuccessful. Perhaps now we can gain a bit more understanding of why it was deemed that consent wasn’t necessary (and I hope it was a more nuanced decision than simply avoiding “frightening people unnecessarily”).


I agree with the article’s conclusion that the “biggest victim” in this episode is academic scholarship.

The uniqueness of this dataset is of obvious value for sociologists and Internet researchers, and it wasn’t my goal to shut down this research project. It is unfortunate the researchers haven’t been able to find a suitable means of re-releasing the data, but just like the AOL search data release forced us to rethink methods of anonymization before again releasing large datasets of transaction logs, I’m hopeful that this episode can prompt meaningful consideration and debate of our understandings of privacy, anonymity/identifiability, consent, and harm when it comes to Internet-based research.


Finally, I wanted to provide a brief response to the implicit accusation made in the article that I’m a part of some kind of “academic paparazzi”.

I’m not even sure what this means. Perhaps someone thinks I spend my time trolling through other people’s research hoping to find a place where they slip up so I can have a “gotcha” moment? Hardly. I had never written on research ethics until I came across this particular case. I saw a passing mention of the data release on another scholar’s blog, and the ensuing discussion there about how the presumed anonymity of the dataset should be questioned due to its unique data variables. So I started to explore, and my discoveries followed. I’m not out to get anyone, but rather have taken quite a number of proactive steps to help researchers (both the T3 team and more broadly) address these complexities.

Research ethics and methodology in today’s Internet-based environment are complex, and I’m just starting to scratch the surface. But I don’t take this lightly; I’m a scholar, not a paparazzo.

As I conclude in my full article:

The purpose of this critical analysis of the T3 project is not to place blame or single out these researchers for condemnation, but to use it as a case study to help expose the emerging challenges of engaging in research within online social network settings. …The T3 research project might very well be ushering in “a new way of doing social science”, but it is our responsibility as scholars to ensure our research methods and processes remain rooted in long-standing ethical practices. Concerns over consent, privacy and anonymity do not disappear simply because subjects participate in online social networks; rather, they become even more important.

I hope that’s the takeaway from all this.
