
    AOL Proudly Releases Massive Amounts of Private Data

    Posted on Monday, August 7th, 2006 at 9:48 am

    [I've pasted this in its entirety from TechCrunch - unbelievable]

    AOL must have missed the uproar over the DOJ’s demand for “anonymized” search data last year that caused all sorts of pain for Microsoft and Google. That’s the only way to explain their release of data that includes 20 million web queries from 650,000 AOL users.

    The data includes all searches from those users for a three-month period this year, as well as whether they clicked on a result, what that result was and where it appeared on the result page. It’s a 439 MB compressed download that expands to just over 2 GB. The data is available here [UPDATE: they've removed the file] and the output is in ten tab-delimited text files.

    The utter stupidity of this is staggering. AOL has released very private data about its users without their permission. While the AOL username has been changed to a random ID number, the ability to analyze all searches by a single user will often make it easy to determine who the user is and what they are up to. The data includes personal names, addresses, social security numbers and everything else someone might type into a search box.

    The most serious problem is the fact that many people often search on their own name, or those of their friends and family, to see what information is available about them on the net. Combine these ego searches with porn queries and you have a serious embarrassment. Combine them with “buy ecstasy” and you have evidence of a crime. Combine it with an address, social security number, etc., and you have an identity theft waiting to happen. The possibilities are endless.

    Marketers are going nuts over the possibilities, users are calling for a boycott of AOL, and others are just enraged:

    User 491577 searches for “florida cna pca lakeland tampa”, “emt school training florida”, “low calorie meals”, “infant seat”, and “fisher price roller blades”. Among user 39509’s hundreds of searches are: “ford 352”, “oklahoma disciplined pastors”, “oklahoma disciplined doctors”, “home loans”, and some other personally identifying and illegal stuff I’m going to leave out of here. Among user 545605’s searches are “shore hills park mays landing nj”, “frank william sindoni md”, “ceramic ashtrays”, “transfer money to china”, and “capital gains on sale of house”. Compared to some of the data, these examples are on the safe side. I’m leaving out the worst of it - searches for names of specific people, addresses, telephone numbers, illegal drugs, and more. There is no question that law enforcement, employers, or friends could figure out who some of these people are.
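To see why a random ID number offers so little protection, here is a minimal sketch of the grouping step that turns a flat query log into per-user profiles. It assumes a tab-delimited layout like the one described above; the column names (`user_id`, `query`, `clicked_rank`, `clicked_url`) are hypothetical, and the sample rows reuse only queries quoted in this post, with the click fields left empty because the post does not give them.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample: column names are assumptions, not the real file header.
# Queries are the ones quoted in the post; click fields are left blank.
SAMPLE = (
    "user_id\tquery\tclicked_rank\tclicked_url\n"
    "491577\tflorida cna pca lakeland tampa\t\t\n"
    "545605\tceramic ashtrays\t\t\n"
    "491577\temt school training florida\t\t\n"
    "491577\tinfant seat\t\t\n"
    "545605\tcapital gains on sale of house\t\t\n"
)


def group_queries_by_user(tsv_text):
    """Collect every query issued under the same anonymized ID."""
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    profiles = defaultdict(list)
    for row in reader:
        profiles[row["user_id"]].append(row["query"])
    return dict(profiles)


profiles = group_queries_by_user(SAMPLE)
for user_id, queries in profiles.items():
    print(user_id, queries)
```

Each ID on its own reveals nothing, but the list of queries attached to it reads like a profile: a location, a line of work, a family situation. That is the point the examples above make.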

    There is some really scary stuff in this data.

    I am assuming that AOL will take this page and the data down soon, but as of the time of this post it has been downloaded 809 times already. People I’ve spoken with are already building a web interface to the data. If you are an AOL customer, I feel sorry for you.

    Note that Microsoft has proposed releasing similar data to researchers, although with an important difference - the data is not associated with a user. Excite released data very similar to what AOL has released here, with user associations, in 1999.

    [More coverage here: siliconbeat, digg, reddit, zoli's blog]


    3 Responses to “AOL Proudly Releases Massive Amounts of Private Data”

    1. Privacy Digest: Privacy News (Civil Rights, Encryption, Free Speech, Cryptography) Says:

      More (disturbing) AOL profiles….

    2. ty Says:

      A site where you can search the data is here:

      http://www.datablunder.com/logitems/query/

    3. Privacy Digest: Privacy News (Civil Rights, Encryption, Free Speech, Cryptography) Says:

      TrackMeNot Firefox Extension Obfuscates Your Search History….
