Image courtesy: The Bureau of Investigative Journalism
The paparazzi hide in bushes and use telephoto lenses to snap pictures of celebrities. The “cyberazzi” parachute into web browsers and sneak up behind mobile phones to spy on ordinary people. By next week, nine such data mining companies must report what personal information they gather for sale.
The U.S. Federal Trade Commission (FTC) has set a deadline of February 1st for the nine companies - Acxiom, Corelogic, Datalogix, eBureau, ID Analytics, Intelius, Peekyou, Rapleaf and Recorded Future - to answer a series of questions about what data they gather from online activities, how often they gather it, whether they get permission first, and how they resell it, i.e., whether it is sold with any kind of identifying data.
While their names may sound obscure, these companies silently lurk inside popular websites and games like Facebook and Farmville. Take RapLeaf, based in San Francisco, California, which has over a billion names and associated data harvested from unsuspecting visitors to websites like About.com, and which sells that data to political campaigns. A recent Wall Street Journal article described how the company provided data to a Republican Senate campaign in New Hampshire.
Acxiom of Conway, Arkansas, claims to have data on 500 million active consumers around the world, with about 1,500 data “points” per person derived from over 50 trillion data “transactions” a year. “Do you really know your customers?” an Acxiom sales pitch asks. “Simply asking for name and address information poses many challenges: transcription errors, increased checkout time and, worse yet, losing customers who feel that you’re invading their privacy.”
The company – which was ranked as the top advertising agency in the United States by Advertising Age magazine – has managed databases for 47 out of the Fortune 100 companies. One of the products it offers is a “race model” that a report in the New York Times noted “provides information on the major racial categories: Caucasians, Hispanics, African-Americans, or Asians.”
Recorded Future of Cambridge, Massachusetts – which has the dubious distinction of being funded by both the Central Intelligence Agency and Google – mines articles, blogs and Twitter for information that it analyzes. “Our customers are some of the largest corporations in the world that are interested in world events, hedge funds who do political risk-trading and even government agencies,” founder Christopher Ahlberg told the Financial Times.
In 2011, Jon Leibowitz, the chairman of the FTC, coined the word “cyberazzi” to describe these data companies. Last year his agency issued a report titled “Protecting Consumer Privacy in an Era of Rapid Change,” which set out a series of recommendations, including “targeted legislation to provide greater transparency for, and control over, the practices of information brokers.”
While one arm of the U.S. government is concerned about protecting the privacy of consumers, at least two other government agencies are looking to hire such companies to help them spy on citizens.
Last January the Federal Bureau of Investigation posted a request for an application that would allow it to “provide an automated search and scrape capability of social networks including Facebook and Twitter … and (i)mmediately translate foreign language tweets into English.”
And about ten days ago the Transportation Security Administration asked data broker companies to propose applications “to generate an assessment of the risk to the aviation transportation system that may be posed by a specific individual” using “specific sources of current, accurate, and complete non-governmental data.” The initial plan is to use it to screen volunteer flyers who will be offered the benefits of “expedited screening lanes … leave on their shoes, light outerwear and belts, as well as leave laptops and … compliant liquids in carry-on bags.”
The biggest problem with mining online social media is the likelihood of major mistakes, Jennifer Granick, director of civil liberties for Stanford University’s Center for Internet and Society, told the NextGov website. “You can have 15 percent accuracy for advertising,” she said. “But if you are getting 85 percent of it wrong when you are denying people government benefits or sending out police to interview them, that would be completely wasteful and dangerous.”