
    AOL Search Log Profiles Unmasked

    Posted on Wednesday, August 9th, 2006 at 7:56 am

    It is not that hard to identify actual users from the “anonymous” search data released by AOL. The New York Times quickly found user No. 4417749:

    No. 4417749 conducted hundreds of searches over a three-month period on topics ranging from “numb fingers” to “60 single men” to “dog that urinates on everything.”

    And search by search, click by click, the identity of AOL user No. 4417749 became easier to discern. There are queries for “landscapers in Lilburn, Ga,” several people with the last name Arnold and “homes sold in shadow lake subdivision gwinnett county georgia.”

    It did not take much investigating to follow that data trail to Thelma Arnold, a 62-year-old widow who lives in Lilburn, Ga., frequently researches her friends’ medical ailments and loves her three dogs. “Those are my searches,” she said, after a reporter read part of the list to her.

    And Philipp Lenssen has compiled many additional profiles of users who probably hope they won’t be found out:

    User 6426084

    6426084 is a definite fan of pitbull dogs. And pitbull fighting. Looks like he wants to register a pitbull dog now, too. Other than that, 6426084 likes to search for “gangbuses” and “gangboats”.

    User 8268

    We got a power searcher here. 8268 makes frequent use of the minus search operator, and is interested in anything from aerospace technology to Thai food, from the Windows Multimedia Knowledgecenter to the Alias season finale.

    User 29665

    29665 is one of the more innocent searchers, looking for Johnny Cash, the Middle East, and pictures of famous psychologists. 29665 also wants to know how to save the rainforest.

    User 19655

    It’s after midnight. 19655 is looking for “dirty jokes for Christians”. Later, 19655 clarifies; “clean dirty jokes” is what he’s after. Finally, 19655 decides to settle for “inspiring bible quotes”.

    In another search, 19655 reveals a full name, including when and where that person went to University, and other names of that family (as well as their jobs).

    User 3286034

    Like many other AOL users, 3286034 got hit with a phishing mail addressed to him (“dear [full name]…”), and he pasted it into the search box to check on it. This reveals his full name, which can then be connected to all the other searches he did over the course of three months.

    User 1045042

    1045042 is researching the relationship between Republicans and terrorism.

    User 24868

    The life of 24868 circles around pottery barns, HTML, MySpace (one of AOL users’ favorites), camping, limos, bedroom furniture and hair extension tools.

    User 11829

    11829 is also into dogs (bulldogs), though not quite as obsessed as 6426084 above. Who knows why 11829 looked up red roofs, palm trees, kibbutz houses and chicken houses… or “little Arabian boys.” Wait… March 7, several searches for “dog porn”. OK, maybe 11829 is obsessed with dogs. Searches for “submit pictures of dogs online” follow. (Is 11829 producing dog porn?) Other searches reveal what might be 11829’s home town. A couple of more regular searches, like “hairy chests”, “fake hairy chests” and “the theme from jurassic park” round up the day.

    User 20320

    We got a horse guy here. 20320’s searches circle around saddlery, horse racing, and jockeys. Other searches reveal 20320’s hometown, the age of 20320’s children, and the summer camp they’re going to. A search on May 18 is compiling facts on a “fast divorce”.

    User 22542

    AOL user 22542 is a classic case of confusing the search box with the browser address bar. Almost all searches are URLs, like www.bowwowinformation.com and www.barbie.com.

    User 22817

    User 22817 seems to look up every word in a dictionary. 22817’s quest starts on March 12, 5 PM:

    what does acute mean
    what does accompany mean
    what does adrenaline mean
    what does alternative mean
    what does acute mean
    what does ample mean
    what does abundant mean
    what does ambition mean
    what does ambiguous mean
    what does agony mean
    what does achieve mean
    what does apprehend mean
    what does annoy mean
    what does aggravate mean

    22817 gives up after just two hours. A while later, 22817 searches for “summer activities”. Maybe there’s something more interesting to do?

    User 28963

    At 10:08 PM, 28963 looks for “porn sites”. 28963 quickly amends the search query to read “freee porn sites”. (Two days later, 28963 shows a sudden interest in genital warts.)

    User 29076

    Hip Hop fan 29076 likes AntiStudy.com. His searches include “disney chanal”, “emty lots”, “michael jordon timeline” and “goolge”.

    User 1133

    1133 is looking for “Google grass”. (What’s Google grass?)

    User 2761

    2761 wants to acquire a box of lobster tails. Might come in handy for the trip to Amsterdam…


    7 Responses to “AOL Search Log Profiles Unmasked”

    1. AOL Security ‘Screw-Up’: Search Data Released -- misunderestimation.com Says:

      [...] AOL Search Log Profiles Unmasked [...]

    2. max Says:

      There’s a website to analyze and discuss particular AOL users: http://aol.zanoza.lv/

      “My neighbour is killing cats”: http://aol.zanoza.lv/user/723190
      “ways to kill yourself”: http://aol.zanoza.lv/user/9486162
      “wife killer”: http://aol.zanoza.lv/user/17556639

      A Face Is Exposed for AOL Searcher No. 4417749: http://aol.zanoza.lv/user/4417749

    3. Mike Says:

      Are there any other logs from other search engines available on the web? I am looking for the raw data and not the personal information. Please help!

      Regards,
      Mike
      http://www.ICannotFindIT.com

    4. Mike Says:

      If you are trying to find something on the web and you are not successful, please report it to us. We are researching the commercialization of the web content and we need some raw data. No personal information is captured or asked for.

      Regards,
      Mike
      http://www.ICannotFindIT.com

    5. michaelzimmer.org » Archives » On the “Anonymity” of the Facebook Dataset Says:

      [...] course, this sounds like an AOL-search-data-release-style privacy disaster waiting to happen. Recognizing this, the researchers detail some of the steps they’ve taken [...]

    6. michaelzimmer.org » Archives » More On the “Anonymity” of the Facebook Dataset - It’s Harvard College (Updated) Says:

      [...] the codebook, reading a press release, and watching a video presentation. The New York Times did it with the AOL search data release, and I’m sure someone will do it with this Facebook [...]

    7. Liminal states » Berkman Center doesn’t bother to consult with privacy experts before publishing 1700 students’ Facebook data (DRAFT) Says:

      [...] 30: Of course, this sounds like an AOL-search-data-release-style privacy disaster waiting to [...]
