Drone, Inc. - I. Social Network Analysis
Battalion commanders are well aware that the poor quality of the video, thermal imagery, radar data, and phone location systems aboard drones means they provide little more than a voyeuristic look at remote locations, even with motion detection and geolocation algorithms to help point cameras.
Instead, analysts rely heavily on human intelligence (such as interrogations and tips from sources) as well as social network analysis software that claims to offer a sophisticated way to spot suspicious connections between people.145 Algorithms are applied to detect patterns inside the large quantities of raw data gathered.
Before the advent of these computer programs, the military conducted “link analysis.”146 Based on information derived from interviews, documents, and other intelligence sources, it involved physically sketching possible relationships between individuals. Results from such analyses are inherently subjective. They depend on the analyst’s intuition as well as the quality of the original data.
If any one source lies or errs, the entire map of relationships can be thrown off completely.
Social network analysis, by contrast, purports to provide objective answers.
The idea is that patterns in large data collections can help provide a visual depiction of a person’s objective importance within a given social network. Someone who is connected to many people may not be as important as someone who is connected to a few important people. The software also looks for factors like “betweenness” (how often a person sits on the shortest path between two others) and “closeness” (how few steps separate a person from everyone else in the network).147 By using social network analysis in combination with traditional link analysis, network experts like Valdis Krebs say it is possible to “uncloak” a hidden network.148
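To make these measures concrete, here is a minimal sketch using the standard textbook definitions of degree, closeness, and betweenness. The seven-person network is invented, and the formulas the military's software actually uses are not given in the sources, so this shows only the general idea, not any vendor's method.

```python
from collections import deque

# Hypothetical toy network; the names and ties are invented for illustration.
# Two tight triangles (A-B-C and E-F-G) are joined only through D.
EDGES = [
    ("A", "B"), ("A", "C"), ("B", "C"),
    ("E", "F"), ("E", "G"), ("F", "G"),
    ("C", "D"), ("D", "E"),
]


def build_graph(edges):
    """Build an undirected adjacency map from a list of (u, v) ties."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def shortest_paths(graph, source):
    """Breadth-first search: hop distance and shortest-path count per node."""
    dist = {source: 0}
    count = {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in graph[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                count[nbr] = 0
                queue.append(nbr)
            if dist[nbr] == dist[node] + 1:
                count[nbr] += count[node]
    return dist, count


def degree(graph):
    """Number of direct contacts per person."""
    return {node: len(nbrs) for node, nbrs in graph.items()}


def closeness(graph):
    """(n - 1) / total hop distance to everyone else (connected graph assumed)."""
    scores = {}
    for node in graph:
        dist, _ = shortest_paths(graph, node)
        scores[node] = (len(graph) - 1) / sum(dist.values())
    return scores


def betweenness(graph):
    """Unnormalized share of shortest paths between other pairs through each node."""
    nodes = sorted(graph)
    info = {node: shortest_paths(graph, node) for node in nodes}
    scores = {node: 0.0 for node in nodes}
    for i, s in enumerate(nodes):
        dist_s, count_s = info[s]
        for t in nodes[i + 1:]:
            dist_t, count_t = info[t]
            for v in nodes:
                if v not in (s, t) and dist_s[v] + dist_t[v] == dist_s[t]:
                    scores[v] += count_s[v] * count_t[v] / count_s[t]
    return scores


graph = build_graph(EDGES)
deg, clo, btw = degree(graph), closeness(graph), betweenness(graph)
for node in sorted(graph):
    print(node, deg[node], round(clo[node], 3), btw[node])
```

In this toy network, D has only two contacts yet scores highest on both betweenness and closeness, because every path between the two triangles runs through D. The scores describe only the shape of the recorded ties and say nothing about why D is a go-between.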
After social network analysis has painted a picture of a network to be targeted, “pattern-of-life analysis” becomes a key technique to find specific targets. People perceived to show indicators of suspicious behavior are considered legitimate targets, and an analyst is assigned to track where they go, with whom they socialize, and what other potentially suspicious activities they engage in.149
But unless the data set is verified, social network analysis and pattern-of-life analysis are prone to error and to confirmation bias (the tendency to use new data to support an unproven theory).
After 2001, the NSA conducted social network analysis on U.S. phone records and other electronic communications metadata with the help of a secret program, “Stellar Wind.”150 It pinpointed common phone numbers believed to be central nodes in a possible terrorist network. Instead, the numbers often turned out to be takeaway food outlets, prompting the FBI to dub them “Pizza Hut cases.” Bureau Director Robert Mueller estimated that 99 percent of Stellar Wind tips “wash[ed] out.”151
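The failure is easy to reproduce in miniature. The sketch below ranks phone numbers by how many distinct numbers they are in contact with, which is a simple stand-in for "central node" detection and not the NSA's actual method. All call records and the business directory are invented.

```python
from collections import defaultdict

# Hypothetical call records as (caller, callee); every number is invented.
CALLS = [
    ("555-0101", "555-0199"),
    ("555-0102", "555-0199"),
    ("555-0103", "555-0199"),
    ("555-0104", "555-0199"),
    ("555-0101", "555-0102"),
    ("555-0103", "555-0104"),
]

# Hypothetical directory lookup of the kind a follow-up check would consult.
KNOWN_BUSINESSES = {"555-0199": "takeaway pizza outlet"}


def distinct_contacts(calls):
    """Count how many different numbers each number is in contact with."""
    contacts = defaultdict(set)
    for caller, callee in calls:
        contacts[caller].add(callee)
        contacts[callee].add(caller)
    return {number: len(others) for number, others in contacts.items()}


def rank_hubs(calls):
    """Order numbers from most to fewest distinct contacts (ties by number)."""
    counts = distinct_contacts(calls)
    return sorted(counts, key=lambda number: (-counts[number], number))


ranked = rank_hubs(CALLS)
top = ranked[0]
print(top, KNOWN_BUSINESSES.get(top, "unidentified"))
```

The most "central" number here is simply the one that everybody calls. Only the directory check, which sits outside the network analysis, reveals that it is a food outlet.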
Despite these errors, such mapping is the military’s main way to try to identify secret networks152 within large populations. Tragically, when such methods are applied from a great distance, it becomes very easy to confuse a politician or a reporter with a military target, even after correcting for blatant errors like the Pizza Hut example.
Two examples illustrate the confusion and its consequences.
The case of Muhammad Amin
On September 2, 2010, the NATO-led International Security Assistance Force (ISAF) in Kabul, Afghanistan, announced that a “precision air strike” had killed Muhammad Amin, a senior Taliban figure who was acting as shadow governor of Takhar province in northern Afghanistan. The military claimed that Amin and 8-12 other insurgents had been traveling in a convoy of six cars in a rural area.153
A jet dropped a large bomb on the car allegedly carrying Amin and his security detail. A helicopter gunship then swooped in to shoot other members of the convoy, some of whom were seen carrying weapons.
But Amin was not killed in the attack. He was later tracked down in Pakistan, where he was not only alive but gave an interview to Michael Semple, a professor at Harvard University.154 An investigation by the Afghanistan Analysts Network revealed that the convoy really belonged to an Afghan parliamentary candidate campaigning for upcoming elections.155 The principal victim turned out to be the candidate’s uncle, Zabet Amanullah, a well-known and respected member of the local community.
Kate Clark, a former BBC reporter who personally knew Amanullah, says that the U.S. military told her that an Afghan detainee in U.S. custody provided interrogators with a mobile phone number for Muhammad Amin, to whom the detainee said he was related.
Clark says that the military told her that they never verified the original claim. “They didn’t do any background checks on either person. They had almost no knowledge about Amin and they hadn’t bothered to get any knowledge about Amanullah,” Clark told Andrew Cockburn, author of Kill Chain.156 Had they watched local TV news broadcasts, for that matter, they could have quickly realized that the two men were different individuals.
Instead, intelligence analysts created a network map of the calls made by Amanullah (including some to the real Amin in Pakistan) and those made by the recipients of his calls. Not surprisingly, this network corresponded with some of the most important players in the province, given that the phone’s owner was, indeed, a prominent political figure.
But rather than following up in person, the U.S. military waited for a moment when they had a clear shot at Amanullah and then killed him.
The case of Ahmed Zaidan
Ahmed Zaidan, the former Islamabad bureau chief for the Al Jazeera television network, grew up in Syria. A fluent Arabic speaker, he met Osama bin Laden in Kabul, Afghanistan, in November 2000, while on assignment as a reporter.157 A couple of months later, he was invited to attend the wedding of bin Laden’s son in Kandahar, which he also covered for the television network. After the U.S. invaded Afghanistan in 2001, Zaidan became one of the few recipients of Al Qaeda video tapes released over the next ten years, on which he also reported.158
Zaidan quickly became recognized as an expert on Al Qaeda and wrote the 2002 book, Bin Laden, Unmasked.159 He was not the only writer to capitalize on such meetings with the Al Qaeda leader. Peter Bergen of the New America Foundation published Holy War, Inc: Inside the Secret World of Osama Bin Laden and The Osama Bin Laden I Know.160
In May 2015, The Intercept released a leaked NSA document that identified Zaidan as a “member of Al Qa’ida.” The undated document explains how the SKYNET computer program examined some 80 variables like travel behavior, social networks and “patterns of life” from a trove of 55 million Pakistani mobile phone records that was gathered via an NSA collection program named Demonspit.161
SKYNET was trained by applying Random Decision Forests to 100,000 of these records. We don’t know exactly what bundles of variables the NSA created, but hypothetically they might have combined data from phone calls between Karachi and Waziristan with the ages of the users. Another bundle might have joined data on frequent travelers with unusual patterns of phone usage.
Inside that group of 100,000, the NSA included seven individuals alleged to be terrorists. Since SKYNET had originally been given only six of those identities, the spy agency was ecstatic when the software identified the seventh.
But this method has come in for severe criticism. “First, there are very few ‘known terrorists’ to use to train and test the model,” Patrick Ball, director of research at the Human Rights Data Analysis Group, told the Ars Technica website.162 “If they are using the same records to train the model as they are using to test the model, their assessment of the fit is completely bullshit. The usual practice is to hold some of the data out of the training process so that the test includes records the model has never seen before. Without this step, their classification fit assessment is ridiculously optimistic.”
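Ball's point about held-out data can be shown with a deliberately extreme sketch. The "model" below is not a random forest. It is an assumed stand-in that simply memorizes the positive records it was trained on, which makes the effect easy to see. The 100 records (seven positives, 93 negatives) and their features are invented.

```python
# Each hypothetical record is (features, label), where label 1 means "flagged".
positives = [((100 + k, k), 1) for k in range(7)]
negatives = [((i % 10, i % 7), 0) for i in range(93)]
data = positives + negatives


def train(records):
    """A deliberately overfit 'model': memorize the features of known positives."""
    return {features for features, label in records if label == 1}


def predict(model, features):
    """Flag a record only if its exact features were seen as positive in training."""
    return 1 if features in model else 0


def recall(model, records):
    """Fraction of the true positives in `records` that the model flags."""
    targets = [features for features, label in records if label == 1]
    hits = sum(predict(model, features) for features in targets)
    return hits / len(targets)


# Flawed assessment: test on the very records used for training.
model_all = train(data)
recall_same_records = recall(model_all, data)

# Usual practice: hold some records out, train on the rest, test on the held-out set.
held_out = positives[:2] + negatives[:20]
training = positives[2:] + negatives[20:]
model_holdout = train(training)
recall_held_out = recall(model_holdout, held_out)

print(recall_same_records, recall_held_out)
```

Tested on its own training records, the memorizing model finds all seven positives. Tested on records it has never seen, it finds neither of the two held-out positives. A real classifier would fall between those extremes, but only the held-out figure says anything about new data.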
Further compromising the model, the NSA assumed that none of the other 100,000 individuals were terrorists. In real life, if the training data contained such individuals, the NSA would effectively be training the algorithm to ignore them.
Zaidan condemned the analysis of his mobile phone calls. “It is interesting to point [out] that the document also mentioned that I have the telephone numbers of very important people,” the reporter wrote on Al Jazeera’s website. “Was I supposed to have the phone numbers, with all due respect, only of garbage collectors, for example? Am I supposed to only have the contacts of unimportant people?”163
Zaidan pointed out that the NSA analysis neglected to consider the obvious: It “ignored my taped reports on Al Jazeera television that showed where I was and with whom I was meeting between 2001 and 2011.”
Last but not least, Zaidan pointed out that the analysis contained glaring factual errors: It claimed, for example, that he was simultaneously a member of Al Qaeda and the Muslim Brotherhood, which are sworn enemies.
It is still not clear if the NSA was using Zaidan simply as a case study on SKYNET or if it was convinced of his guilt. Either way, he has been luckier than Amanullah. Because he is a member of the media, colleagues like Bergen vouched for his innocence. Still, deciding to err on the safe side after the document was leaked, Zaidan left Pakistan to work out of the United Arab Emirates.164
Graph Databases & Semantic Wikis
Many companies sell social network analysis tools to the Pentagon to help the government mine the vast silos of sensor and related surveillance data gathered on a daily basis. These “Big Data” tools attempt to quantify uncertainty in complex problems. While mathematicians and statisticians who design such tools are wary of promising that their data models can identify individual criminals or potential attackers, corporate sales departments at military contractors are not shy about hyping their wares.
Palantir is one of the best known companies in this field, but others, including Modus Operandi and Leidos, offer add-on tools like Halogen and Wisdom to analyze information from the open internet, the deep web and social media accounts.
One of the key products on offer is the “semantic wiki,” so called because users can edit and update it like a Wikipedia page. But unlike Wikipedia, which is a collection of static text pages connected via hyperlinks, semantic software also classifies the information inside data sets and attempts to interpret it.165
To begin with, users regularly add new information such as field reports from soldiers, news articles, social media posts, sales and bank records, floor plans, maps as well as video and phone location records from drones. The software then maps, tags, and stores this information in “triple store” databases (so named because they contain three elements: class, attribute, and value) or a “graph” database.166
The tagging is often done using natural language processing algorithms that examine the structure of words, phrases and sentences and assign specific values to each “object” in the database. Critical to this kind of search is the definition of the relationships among the various objects in the database.167
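A rough sketch of what a triple store does is below. The field names follow the subject/predicate/object convention common in such databases, which is an assumption here, since the description above calls the three elements class, attribute, and value. The entities and relationships are invented, and commercial products are far more elaborate.

```python
from collections import namedtuple

# Assumed field names (subject/predicate/object); the text above calls the
# three elements class, attribute, and value.
Triple = namedtuple("Triple", ["subject", "predicate", "obj"])


class TripleStore:
    """Minimal in-memory store of three-element statements."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        """Record one statement, e.g. ('person:1', 'member_of', 'org:alpha')."""
        self._triples.add(Triple(subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return the stored triples that fit the pattern; None is a wildcard."""
        return sorted(
            t for t in self._triples
            if (subject is None or t.subject == subject)
            and (predicate is None or t.predicate == predicate)
            and (obj is None or t.obj == obj)
        )


store = TripleStore()
# Hypothetical entries of the kind a tagging step might produce.
store.add("person:1", "member_of", "org:alpha")
store.add("person:2", "member_of", "org:alpha")
store.add("person:1", "called", "person:3")
store.add("report:17", "mentions", "person:1")

members = [t.subject for t in store.match(predicate="member_of", obj="org:alpha")]
about_person_1 = store.match(subject="person:1")
mentions_of_person_1 = store.match(predicate="mentions", obj="person:1")
print(members)
```

Every query is a pattern over relationships, so the answers can be no better than the relationships that were entered or inferred by the tagging step.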
In many ways, these graph and triple store databases are a sophisticated version of Google, which uses proprietary algorithms to rank information and extract answers from vast databases of documents. The military databases, of course, have access to a parallel universe of classified data unavailable to public search engines like Google and Yahoo.
A major selling point of these tools is the ability to create multiple, different visual displays of stored data as a system of objects, properties and relationships to help military analysts—typically aged between 19 and 25—make sense of raw data from other countries, cultures, and languages and spot potential threats.
Modus Operandi, based close to Cape Canaveral in Florida, sells the military a product named Blade.168 It offers a Google-like query system for soldiers to look up information in intelligence databases indexed using the company’s Wave system and automatically generate tailored reports.
“If you input a new piece of information, like ‘This guy has a connection to this organization,’ that organization will appear on his page. And so you can click on that organization and it will take you to the page for the organization, then it gives you the links back to all the underlying reports where the information in the graph came from,” Eric Little, Modus Operandi’s chief scientist, told Datanami. “Our system connects dots. We actually make the data smart and we make the data easily consumable for people to actually use for real decision-making.”169
Modus Operandi was awarded research contracts from the Navy to create wiki pages for military mobile devices called WISER (Wiki for Intelligent Semantic Event Reporting) and STAFF (Semantic Targeting and All-source Fusion Framework) to help track entities of interest and improve targeting effectiveness.170
Other research projects awarded to Modus Operandi include the Clear Heart project to analyze sensor data “to recognize adversarial intent in public areas,” and POLIS (Pattern Of Life Integrated System) “to find behavioral patterns that may indicate mal-intent.”171
“By analyzing data from many different sensors, this system will indicate—with visual alerts—if something deviates from expected patterns,” Peter Dyson, Modus Operandi CEO said in a press release about POLiS. “These red flags can be tremendously helpful in preventing malicious behavior in almost any setting, whether it’s a war front or an urban environment.”172
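How POLiS decides that something "deviates from expected patterns" is not public. One common and very simple approach, assumed here purely for illustration, is to flag any reading that lies more than a set number of standard deviations from a historical baseline. The baseline counts and the threshold below are invented.

```python
from statistics import mean, stdev

# Hypothetical baseline: vehicles counted at one location on seven earlier days.
BASELINE = [10, 12, 11, 9, 10, 11, 10]
THRESHOLD = 3.0  # assumed cut-off, in standard deviations


def z_score(value, history):
    """How many sample standard deviations `value` lies from the baseline mean."""
    return (value - mean(history)) / stdev(history)


def is_flagged(value, history, threshold=THRESHOLD):
    """Raise an alert when a reading is unusually far from the baseline."""
    return abs(z_score(value, history)) > threshold


print(is_flagged(11, BASELINE))  # an ordinary day
print(is_flagged(25, BASELINE))  # an unusually busy day
```

A flag of this kind says only that a number is unusual. A wedding, a funeral, or a market day would trip the same alert as a hostile gathering.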
Modus Operandi now has a number of contracts for the DCGS computer network that forms the heart of the drone system, notably with the Army and the Marines.173
Easily the best known company in this field, Palantir of California was created with funding from In-Q-Tel, the investment arm of the CIA.174 Like Modus Operandi, Palantir sells database visualization products under brand names like Gotham and Raptor to the Pentagon and to industry.
One of Palantir’s first customers was the Joint Improvised Explosive Device Defeat Organization (JIEDDO), which bought a license to the Palantir software package to track down planters of roadside bombs in Iraq. Soldiers fell in love with the visually stunning reports, which replaced legacy spreadsheets whose rows and columns they had to search one at a time.
“It’s like plugging into the Matrix,” an anonymous Special Forces member stationed in Afghanistan told Bloomberg. “The first time I saw it, I was like, ‘Holy crap. Holy crap. Holy crap.’ ”175
“It supports the cops on the streets and the officers doing the investigations. They can now exactly see great information and the links between events and people,” Sgt. Peter Jackson of the Los Angeles Police Department was quoted in a company document. “Detectives love the type of information it provides. They can now do things that we could not do before.”176
Not everybody agrees. Many critics say that while Palantir software can create dazzling displays, it isn’t magic. Indeed, internal company documents leaked to Buzzfeed suggest that a number of big clients including American Express, Coca-Cola, and Nasdaq have canceled contracts for Palantir visualization software.
For example, in January 2015, after Palantir’s software failed to yield results, Coca-Cola backed out of a five-year project to create a data-sharing consortium between consumer packaged goods companies. American Express canceled a contract after 18 months. “We struggled from day 1 to make Palantir a sticky product for users and generate wins,” a Palantir employee said of the American Express contract. And Michele Buck, the North American president of Hershey’s, said she “did not see value from Palantir.”177
What does set Palantir apart is an aggressive and unusual marketing strategy. The company regularly gives away software to cultivate well-known clients like the International Consortium of Investigative Journalists.178 It has also sued the Army to try to force the Pentagon to buy its software and replace the existing DCGS.179
Leidos Wisdom and Halogen
Leidos, which is based in Reston, Virginia, offers two data mining products: Halogen and Wisdom. Both products were originally developed by Lockheed Martin.180 (Lockheed also offers Dragon Dome, a full suite of intelligence tools for aircraft, drones, and satellites as well as GeoLAMP, which manages video and radar data from airborne sensors.)181
Wisdom, according to Lockheed’s original promotional literature, is “a predictive analytics and big data technology tool that monitors and analyzes rapidly changing open-source intelligence data.” Halogen provides security-cleared staff analysts to help compile reports that identify and analyze “human networks and their key components, to include leaders, facilitators, and influencers, as well as the threats and opportunities created by them.”182
Clients who have bought Wisdom include Walmart, which hired Lockheed to monitor the social media accounts of union activists with Organization United for Respect at Walmart (OUR Walmart) and to track protests planned for Walmart’s June 2013 week-long annual meeting in Bentonville, Arkansas.183
“With some assistance from LM [Lockheed Martin] we have created the attached map to track the caravan movements and approximate participants,” Kris Russell, a Walmart risk program senior manager, wrote in an internal email that was revealed after activists sued the company for retaliating against employees who took part in protests.
Organizers who read the internal memos later said that Lockheed’s analysis was often wrong. Others noted that it was actually quite easy to mislead the surveillance team. “I sent a couple of fake tweets about where we would be or what we were doing,” Angela Williamson, an OUR Walmart organizer, told Bloomberg. “I wonder how people feel about Walmart wasting money by hiring Lockheed Martin to read my tweets. I wouldn’t be happy about that if I was a shareholder.”