Can Congress — or Robert Mueller — Order Facebook to Direct Its Machine Learning?

The other day I pointed out that two articles (WSJ, CNN) — both of which infer that Robert Mueller obtained a probable cause search warrant on Facebook based on an interpretation that Facebook’s privacy policy would require a warrant — actually ignored two other possibilities. Without something stronger than inference, then, these articles do not prove Mueller got a search warrant (particularly given that both skip the logical step of establishing that the things Facebook shared with Mueller count as content rather than business records).

In response to that and to this column arguing that Facebook should provide more information, some of the smartest surveillance lawyers in the country discussed what kind of legal process would be required, but were unable to come to any conclusions.

Last night, WaPo published a story making clear that Congress wants far more than WSJ and CNN had suggested (their reporting largely covered business records and the ads shown to targets, the latter of which Congress had been able to see but not keep). What Congress is really after is details about the machine learning Facebook used to detect the malicious activity described in April and the ads described in its most recent report, to test whether Facebook’s study was thorough enough.

A 13-page “white paper” that Facebook published in April drew from this fuller internal report but left out critical details about how the Russian operation worked and how Facebook discovered it, according to people briefed on its contents.

Investigators believe the company has not fully examined all potential ways that Russians could have manipulated Facebook’s sprawling social media platform.

[snip]

Congressional investigators are questioning whether the Facebook review that yielded those findings was sufficiently thorough.

They said some of the ad purchases that Facebook has unearthed so far had obvious Russian fingerprints, including Russian addresses and payments made in rubles, the Russian currency.

Investigators are pushing Facebook to use its powerful data-crunching ability to track relationships among accounts and ad purchases that may not be as obvious, with the goal of potentially detecting subtle patterns of behavior and content shared by several Facebook users or advertisers.

Such connections — if they exist and can be discovered — might make clear the nature and reach of the Russian propaganda campaign and whether there was collusion between foreign and domestic political actors. Investigators also are pushing for fuller answers from Google and Twitter, both of which may have been targets of Russian propaganda efforts during the 2016 campaign, according to several independent researchers and Hill investigators.

“The internal analysis Facebook has done [on Russian ads] has been very helpful, but we need to know if it’s complete,” Schiff said. “I don’t think Facebook fully knows the answer yet.”

[snip]

In the white paper, Facebook noted new techniques the company had adopted to trace propaganda and disinformation.

Facebook said it was using a data-mining technique known as machine learning to detect patterns of suspicious behavior. The company said its systems could detect “repeated posting of the same content” or huge spikes in the volume of content created as signals of attempts to manipulate the platform.
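Facebook hasn’t published its models, but the two signals the white paper names — repeated posting of the same content and spikes in the volume of content — can be illustrated with a toy heuristic. This is a sketch of the general technique, not Facebook’s actual system; every name and threshold here is invented:

```python
import hashlib
import statistics
from collections import Counter, defaultdict

def flag_suspicious(posts, min_accounts=3, z_threshold=3.0):
    """posts: list of (account_id, hour, text) tuples.
    Toy illustration of the two signals named in Facebook's white paper;
    nothing here reflects Facebook's actual implementation."""
    # Signal 1: the same content posted by many distinct accounts
    accounts_by_hash = defaultdict(set)
    for account, hour, text in posts:
        digest = hashlib.sha256(text.encode()).hexdigest()
        accounts_by_hash[digest].add(account)
    repeated = {h for h, accts in accounts_by_hash.items()
                if len(accts) >= min_accounts}

    # Signal 2: hourly posting volume far above the mean (simple z-score)
    hourly = Counter(hour for _, hour, _ in posts)
    counts = list(hourly.values())
    mean = statistics.mean(counts)
    sd = statistics.pstdev(counts) or 1.0
    spikes = {h for h, c in hourly.items() if (c - mean) / sd > z_threshold}
    return repeated, spikes
```

A production system would learn such thresholds rather than hard-code them, but the signals themselves really are that simple to state — which is part of why Congress wants to know how much further Facebook’s analysis went.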

The push to do more — led largely by Adam Schiff and Mark Warner (both of whom have gotten ahead of the evidence at times in their respective investigations) — is totally understandable. We need to know how malicious foreign actors manipulate the social media platforms headquartered in Schiff’s home state to sway elections. That’s presumably why Facebook voluntarily conducted the study of ads in response to cajoling from Warner.

But the demands they’re making are also fairly breathtaking. They’re demanding that Facebook use its own intelligence resources to respond to the questions posed by Congress. They’re also demanding that Facebook reveal those resources to the public.

Now, I’d be surprised (pleasantly) if either Schiff or Warner made such detailed demands of the NSA. Hell, Congress can’t even get NSA to count how many Americans are swept up under Section 702, and that takes far less bulk analysis than Facebook appears to have conducted. And Schiff and Warner surely would never demand that NSA reveal the extent of machine learning techniques that it uses on bulk data, even though that, too, has implications for privacy and democracy (America’s and other countries’). And yet they’re asking Facebook to do just that.

And consider how two laws might offer guidelines, but (in my opinion) fall far short of authorizing such a request.

There’s Section 702, which permits the government to oblige providers to provide certain data on foreign intelligence targets. Section 702’s minimization procedures even permit Congress to obtain data collected by the NSA for their oversight purposes.

Certainly, the Russian (and now Macedonian and Belarusian) troll farms Congress wants investigated fall squarely under the definition of permissible targets under the Foreign Government certificate. But there’s no public record of NSA making a request as breathtaking as this one, that Facebook (or any other provider) use its own intelligence resources to answer questions the government wants answered. While the NSA does draw from far more data than most people understand (including, probably, providers’ own algorithms about individually targeted accounts), the most sweeping request we know of involves Yahoo scanning all its email servers for a signature.

Then there’s CISA, which permits providers to voluntarily share cyber threat indicators with the federal government, using these definitions:

(A) IN GENERAL.—Except as provided in subparagraph (B), the term “cybersecurity threat” means an action, not protected by the First Amendment to the Constitution of the United States, on or through an information system that may result in an unauthorized effort to adversely impact the security, availability, confidentiality, or integrity of an information system or information that is stored on, processed by, or transiting an information system.

(B) EXCLUSION.—The term “cybersecurity threat” does not include any action that solely involves a violation of a consumer term of service or a consumer licensing agreement.

(6) CYBER THREAT INDICATOR.—The term “cyber threat indicator” means information that is necessary to describe or identify—

(A) malicious reconnaissance, including anomalous patterns of communications that appear to be transmitted for the purpose of gathering technical information related to a cybersecurity threat or security vulnerability;

(B) a method of defeating a security control or exploitation of a security vulnerability;

(C) a security vulnerability, including anomalous activity that appears to indicate the existence of a security vulnerability;

(D) a method of causing a user with legitimate access to an information system or information that is stored on, processed by, or transiting an information system to unwittingly enable the defeat of a security control or exploitation of a security vulnerability;

(E) malicious cyber command and control;

(F) the actual or potential harm caused by an incident, including a description of the information exfiltrated as a result of a particular cybersecurity threat;

(G) any other attribute of a cybersecurity threat, if disclosure of such attribute is not otherwise prohibited by law; or

(H) any combination thereof.

Since January, discussions of Russian tampering have conflated Russia’s efforts on social media with its various hacks. Certainly, Russian abuse of social media has been treated as exploiting a vulnerability. But none of this language defining a cyber threat indicator envisions the malicious use of legitimate ad systems.

Plus, CISA is entirely voluntary. While Facebook thus far has seemed willing to be cajoled into doing these studies, that willingness might change quickly if they had to expose their sources and methods, just as NSA clams up every time you ask about their sources and methods.

Moreover, unlike the sharing provisions in 702 minimization procedures, I’m aware of no language in CISA that permits sharing of this information with Congress.

Mind you, part of the problem may be that we’ve got global companies that have sources and methods that are as sophisticated as those of most nation-states. And, inadequate as they are, Facebook is hypothetically subject to more controls than nation-state intelligence agencies because of Europe’s data privacy laws.

All that said, let’s be aware of what Schiff and Warner are asking for, however justified it may be from an investigative standpoint. They’re asking for things from Facebook that they, NSA’s overseers, have been unable to ask of NSA.

If we’re going to demand transparency on sources and methods, perhaps we should demand it all around?

Senate Intelligence Committee Doesn’t Think the Intelligence Community Inspector General Does Enough All-IC Oversight

The Intelligence Community Inspector General receives just two mentions in the Intelligence Authorization released earlier this month. The first is a standalone section that will permit it to hire expert auditors, as other Inspectors General can. The bill report explains that section this way.

Section 307. Inspector General of the Intelligence Community auditing authority

Section 307 permits the IC IG to hire contractor or expert auditors to meet audit requirements, similar to other Federal IGs. Section 307 responds to the Committee’s concerns that the IC Inspector General (IC IG) is at risk of failing to meet its legislative requirements due to its inability to hire qualified auditors by granting the IC IG independent hiring practices identical to other IGs.

Good to see that eight years after it was created, the ICIG will be able to start doing competent financial audits.

In addition, the unclassified portion of the Intel Authorization includes the ICIG among those Inspectors General that must see whether its agencies are classifying and declassifying things properly.

Which suggests this passage — which goes far beyond those two passages — may correspond to some language within the classified portion of the bill.

Inspector General of the Intelligence Community role and responsibilities

The Inspector General of the Intelligence Community (IC IG) was established by the Intelligence Authorization Act for Fiscal Year 2010 to initiate and “conduct independent reviews investigations, inspections, audits, and reviews on programs and activities within the responsibility and authority of the Director of National Intelligence” and to lead the IG community in its activities. The Committee is concerned that this intent is not fully exercised by the IC IG and reiterates the Congress’s intent that it consider its role as an IG over all IC-wide activities in addition to the ODNI. To support this intent, the Committee has directed a number of requirements to strengthen the IC IG’s role and expects full cooperation from all Offices of Inspector General across the IC.

The Committee remains concerned about the level of protection afforded to whistleblowers within the IC and the level of insight congressional committees have into their disclosures. It is the Committee’s expectation that all Offices of Inspector General across the IC will fully cooperate with the direction provided elsewhere in the bill to ensure both the Director of National Intelligence and the congressional committees have more complete awareness of the disclosures made to any IG about any National Intelligence Program funded activity.

Ron Wyden submitted — but then withdrew — language extending whistleblower protection to contractors. Instead there’s just this language nodding, yet again, toward protecting whistleblowers.

But I’m as interested that SSCI “reiterate[d] the Congress’s intent that [ICIG] consider its role as an IG over all IC-wide activities in addition to the ODNI.”

Going back to 2011, the ICIG refused to do a community-wide review of the way Section 702 works (or count how many Americans get sucked up). With EO 12333 sharing raw data with other agencies, it behooves the ICIG to review how that process works.

The Intel Authorization also requires a review to make sure all the agencies shared the data they should have on Russian tampering with the election. It turns out the interagency “Task Force” John Brennan set up in the summer was a CIA-led task force. It wasn’t until December that a broader set of analysts were permitted to review the intelligence, leading to new discoveries (including, it seems, new conversations between Trump officials and Russians of interest). And it seems highly likely that DHS was left out of the loop, which would be especially problematic given that that’s the agency that talks to state electoral officials.

As Mike Pompeo seems intent on politicizing Iran intelligence and killing diversity at CIA, I hope ICIG gets directed to review CIA’s approach to both of those issues.

There are likely more items of interest addressed in the “requirements to strengthen the IC IG’s role.” Which is a good thing.

UNITEDRAKE and Hacking under FISA Orders

As I noted yesterday, along with the encrypted files you have to pay for, on September 6, Shadow Brokers released the manual for an NSA tool called UNITEDRAKE.

As Bruce Schneier points out, the tool has shown up in released documents on multiple occasions — in the catalog of TAO tools leaked by a second source (not Snowden) and released by Jacob Appelbaum, and in three other Snowden documents (one, two, three) talking about how the US hacks other computers, all of which first appeared in Der Spiegel’s reporting (one, two, three). [Update: See ElectroSpaces comments about this Spiegel reporting and its source.]

The copy, as released, is a mess — it appears to have been altered with an open source graphics program and then re-saved as a PDF. Along with classification marks, the margins and the address of the company behind it appear to have been altered.

The NSA is surely doing a comparison with the real manual (presumably as it existed at the time it may have been stolen) in an effort to understand how and why it got manipulated.

I suspect Shadow Brokers released it as a message to those pursuing him as much as to entice more Warez sales, for the reasons I lay out below.

The tool permits NSA hackers to track and control implants, doing things like prioritizing collection, controlling when an implant calls back and how much data is collected at a given time, and destroying an implant and the associated UNITEDRAKE code (PDF 47 and following includes descriptions of these functions).

It includes doing things like impersonating the user of an implanted computer.

Depending on how dated this manual is, it may demonstrate that Shadow Brokers knows what ports the NSA will generally use to hack a target, and what code might be associated with an implant.

It also makes clear, at a time when the US is targeting Russia’s use of botnets, that the NSA carries out its own sophisticated bot-facilitated collection.

Finally of particular interest to me, the manual shows that UNITEDRAKE can be used to hack targets of FISA orders.

To target people under a FISA order, the NSA hacker would have to enter both the FISA order number and the date the order expires. After that date, UNITEDRAKE will simply stop collecting off that implant.

Note, I believe that — at least in this deployment — these FISA orders would be strictly for use overseas. One of the previous references to UNITEDRAKE describes doing a USSID-18 check on location.

SEPI analysts validate the target’s identity and location (USSID-18 check), then provide a deployment list to Olympus operators to load a more sophisticated Trojan implant (currently OLYMPUS, future UNITEDRAKE).

That suggests this would be exclusively EO 12333 collection — or collection under FISA 704/705(b) orders.

But the way UNITEDRAKE is used with FISA is problematic. Note that the tasking doesn’t include a start date. So the NSA could collect data from before the period when the court permitted the government to spy on a target. If an American were targeted only under Title I (which permits collection of data in motion, and therefore prospective data), they’d automatically qualify for 705(b) targeting with Attorney General approval if they traveled overseas. Using UNITEDRAKE on — say — the laptop they brought with them would allow the NSA to exfiltrate historic data, effectively collecting on a person from a time when they weren’t targeted under FISA. I believe this kind of temporal problem explains a lot of the recent problems NSA has had complying with 704/705(b) collection.
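The temporal gap is easy to see in a minimal sketch. The function and field names here are mine, invented for illustration — the manual shows only an order number and an expiration date in the tasking form:

```python
from datetime import date

def exfil_filter(records, fisa_start, fisa_expiry):
    """Contrast what an end-date-only filter sweeps in (what the
    UNITEDRAKE tasking form implies) with what the order's actual
    window would cover. Purely illustrative; field names invented."""
    # End date only: everything on the implant up to expiry,
    # including historic data predating the authorization
    end_only = [r for r in records if r["created"] <= fisa_expiry]
    # Start and end: only data from the period the court authorized
    windowed = [r for r in records
                if fisa_start <= r["created"] <= fisa_expiry]
    return end_only, windowed
```

Any file created before the order’s start date shows up in the first list but not the second — and that difference is exactly the kind of over-collection that would register as a compliance problem.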

In any case, Shadow Brokers may or may not have UNITEDRAKE among the files he is selling. But what he has done by publishing this manual is tell the world a lot of details about how NSA uses implants to collect intelligence.

And very significantly for anyone who might be targeted by NSA hacking tools under FISA (including, presumably, him), he has also made it clear that with the click of a button, the NSA can pretend to be the person operating the computer. This should create real problems for using data hacked by NSA in criminal prosecutions.

Except, of course, especially given the provenance problems with this document, no defendant will ever be able to use it to challenge such hacking.

EO 12333 Sharing Will Likely Expose Security Researchers Even More Via Back Door Searches

At Motherboard, I have a piece arguing that the best way to understand the Marcus Hutchins (MalwareTech) case is not from what we see in his indictment for authoring code that appears in a piece of Kronos malware sold in 2015. Instead, we should consider why Hutchins would look different to the FBI in 2016 (when the government didn’t arrest him while he was in Las Vegas) and 2017 (when it did). In 2016, he’d look like a bit player in a minor dark market purchase made in 2015. In 2017, he might look like a guy who had his finger on the WannaCry malware, but also whose purported product, Kronos, had been incorporated into a really powerful bot he had long closely tracked, Kelihos.

Hutchins’ name shows up in chats obtained in an investigation in some other district. Just one alias for Hutchins — his widely known “MalwareTech” — is mentioned in the indictment. None of the four or more aliases Hutchins may have used, mostly while still a minor, is included in the indictment, as those aliases likely would have been if the case in chief relied on evidence tied to them.

Presuming the government’s collection of both sets of chat logs predates the WannaCry outbreak, if the FBI searched on Hutchins after he sinkholed the ransomware, both sets of chat logs would come up. So would any other chat logs or — for example — email communications collected under Section 702 from providers like Yahoo, Google, and Apple (business records from which are included in the discovery to be provided in Hutchins’ case) that were in the FBI’s possession at that time. Indeed, such data would come up even if it showed no evidence of guilt on Hutchins’ part, but might still interest or alarm FBI investigators.

There is another known investigation that might elicit real concern (or interest) at the FBI if Hutchins’s name showed up in its internal Google search: the investigation into the Kelihos botnet, for which the government obtained a Rule 41 hacking warrant in Alaska on April 10 and announced the indictment of Russian Pyotr Levashov in Connecticut on April 21. Eleven lines describing the investigation in the affidavit for the hacking warrant remain redacted. In both its announcement of his arrest and in the complaint against Levashov for operating the Kelihos botnet, the government describes the Kelihos botnet loading “a malicious Word document designed to infect the computer with the Kronos banking Trojan.”

Hutchins has tracked the Kelihos botnet for years — he even attributes his job to that effort. Before his arrest and for a period that extended after Levashov’s arrest, Hutchins ran a Kelihos tracker, though it has gone dead since his arrest. In other words, the government believes a later version of the malware it accuses Hutchins of having a hand in writing was — up until the months before the WannaCry outbreak — being deployed by a botnet he closely tracked.

There are a number of other online discussions Hutchins might have participated in that would come up in an FBI search (again, even putting aside more dated activity from when he was a teenager). Notably, the attack on two separate fundraisers for his legal defense by credit card fraudsters suggests that corner of the criminal world doesn’t want Hutchins to mount an aggressive defense.

All of which is to say that the FBI is seeing a picture of Hutchins that is vastly different than the public is seeing from either just the indictment and known facts about Kronos, or even open source investigations into Hutchins’ past activity online.

To understand why Hutchins was arrested in 2017 but not in 2016, I argue, you need to understand what a back door search conducted on him in May would look like in connection with the WannaCry malware, not what the Kronos malware looks like as a risk to the US (it’s not a big one).

I also note, however, that in addition to the things FBI admitted they searched on during their FBI Google searches — Customs and Border Protection data, foreign intelligence reports, FBI’s own case files, and FISA data (both traditional and 702) — there’s something new in that pot: data collected under EO 12333 shared under January’s new sharing procedures.

That data is likely to expose a lot more security researchers for behavior that looks incriminating. That’s because FBI is almost certainly prioritizing asking NSA to share criminal hacker forums — where security researchers may interact with people they’re trying to defend against in ways that can look suspicious if reviewed out of context. That’s true, first of all, because many of those forums (and other dark web sites) are overseas, and so are more accessible to NSA collection. The crimes those forums facilitate definitely impact US victims. But criminal hacking data — as distinct from hacking data tied to a group that the government has argued is sponsored by a nation-state — is also less available via Section 702 collection, which as far as we know still limits cybersecurity collection to the Foreign Government certificate.

If I were the FBI I would have used the new rules to obtain vast swaths of data sitting in NSA’s coffers to facilitate cybersecurity investigations.

So among the NSA-collected data we should expect FBI newly obtained in raw form in January is that from criminal hacking forums. Indeed, new dark web collection may have facilitated FBI’s rather impressive global bust of several dark web marketing sites this year. (The sharing also means FBI will no longer have to go the same lengths to launder such data it obtains targeting kiddie porn, which it appears to have done in the PlayPen case.)

As I think is clear, such data will be invaluable for FBI as it continues to fight online crime that operates internationally. But because back door searches happen out of context, at a time when the FBI may not really understand what it is looking at, it also risks exposing security researchers in new ways to FBI’s scrutiny.

 

Facebook’s Global Data: A Parallel Intelligence Source Rivaling NSA

In April, Facebook released a laudable (if incredible) report on Russian influence operations on Facebook during the election; the report found that just 0.1% of what got shared in election-related activity was shared by malicious state-backed actors.

Facebook conducted research into overall civic engagement during this time on the platform, and determined that the reach of the content shared by false amplifiers was marginal compared to the overall volume of civic content shared during the US election.

[snip]

The reach of the content spread by these accounts was less than one-tenth of a percent of the total reach of civic content on Facebook.

Facebook also rather coyly confirmed they had reached the same conclusion the Intelligence Community had about Russia’s role in tampering with the election.

Facebook is not in a position to make definitive attribution to the actors sponsoring this activity. It is important to emphasize that this example case comprises only a subset of overall activities tracked and addressed by our organization during this time period; however our data does not contradict the attribution provided by the U.S. Director of National Intelligence in the report dated January 6, 2017.

Though skeptics haven’t paid attention to this coy passage (and Facebook certainly never called attention to it), it means that a second entity with access to global data — like the NSA, but private — believes Russia was behind the election tampering.

Yesterday, Facebook came out with another report, quantifying how many ads came from entities that might be Russian information operations. They searched for two different things. First, ads from obviously fake accounts. They found 470 inauthentic accounts paid for 3,000 ads costing $100,000. But most of those didn’t explicitly discuss a presidential candidate, and more of the geo-targeted ones appeared in 2015 than in 2016.

  • The vast majority of ads run by these accounts didn’t specifically reference the US presidential election, voting or a particular candidate.
  • Rather, the ads and accounts appeared to focus on amplifying divisive social and political messages across the ideological spectrum — touching on topics from LGBT matters to race issues to immigration to gun rights.
  • About one-quarter of these ads were geographically targeted, and of those, more ran in 2015 than 2016.
  • The behavior displayed by these accounts to amplify divisive messages was consistent with the techniques mentioned in the white paper we released in April about information operations.

Elsewhere Facebook has said some or all of these accounts are associated with a troll farm in St. Petersburg, the Internet Research Agency.

The Intelligence Community report on the Russian hacks specifically mentioned the Internet Research Agency — suggesting it probably had close ties to Putin. But it also suggested there was significant advertising that was explicitly pro-Trump, which may be inconsistent with Facebook’s observation that the majority of these ads pushed divisive issues rather than candidates.

Russia used trolls as well as RT as part of its influence efforts to denigrate Secretary Clinton. This effort amplified stories on scandals about Secretary Clinton and the role of WikiLeaks in the election campaign.

  • The likely financier of the so-called Internet Research Agency of professional trolls located in Saint Petersburg is a close Putin ally with ties to Russian intelligence.
  • A journalist who is a leading expert on the Internet Research Agency claimed that some social media accounts that appear to be tied to Russia’s professional trolls—because they previously were devoted to supporting Russian actions in Ukraine—started to advocate for President-elect Trump as early as December 2015.

The other thing Facebook did was measure how many ads that might have originated in Russia without mobilizing an obviously fake account. That added another $50,000 in advertising to the pot of potential Russian disinformation.

In this latest review, we also looked for ads that might have originated in Russia — even those with very weak signals of a connection and not associated with any known organized effort. This was a broad search, including, for instance, ads bought from accounts with US IP addresses but with the language set to Russian — even though they didn’t necessarily violate any policy or law. In this part of our review, we found approximately $50,000 in potentially politically related ad spending on roughly 2,200 ads.
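Facebook hasn’t described its query, but the one signal it names — ad buys from accounts with US IP addresses and a Russian language setting — amounts to a simple filter. A toy version, with field names invented for illustration (Facebook’s real schema and query are not public):

```python
def weak_signal_ads(ads):
    """Flag ad purchases matching the weak signal Facebook describes:
    US IP address, interface language set to Russian. Field names are
    invented; this only illustrates the shape of the search."""
    flagged = [ad for ad in ads
               if ad["ip_country"] == "US" and ad["language"] == "ru"]
    total_spend = sum(ad["spend_usd"] for ad in flagged)
    return flagged, total_spend
```

A signal this weak guarantees false positives — a Russian-speaking American buying ads would match — which is presumably why Facebook frames the $50,000 as only “potentially” politically related.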

Still, that’s not all that much — which may explain why Facebook found only 0.1% of activity was organized disinformation.

In its report, Facebook revealed that it had shared this information with those investigating the election.

We have shared our findings with US authorities investigating these issues, and we will continue to work with them as necessary.

Subsequent reporting has made clear that includes the Congressional committees and Robert Mueller’s team. I’m curious whether Mueller made the request (whether using legal process or not), and Facebook took it upon itself to share the topline data publicly. If so, we should be asking where the results of similar requests to Twitter and Google are.

I’m interested in this data — though I agree both with those who argue this advertising needs to be covered by campaign regulations, and with those who hope independent scholars can review and vet Facebook’s methodology. But I’m just as interested in the fact that we’re getting it at all.

Facebook isn’t running around bragging about this; if too many people grokked it, more and more might stop using Facebook. But what these two reports reflect is the global collection of intelligence. That intelligence is usually used to sell highly targeted advertisements. But in the wake of Russia’s tampering with last year’s election, Facebook has had the ability to take a global view of what occurred. Arguably, it has shared more of that intelligence than the IC has, and on the specific question of whether the Internet Research Agency focused more on boosting Trump or on exacerbating racial divisions in the country, it has presented somewhat different results than the IC has.

So in addition to observing (and treating just as skeptically as we would data from the NSA) the data Facebook reports, we would do well to recognize that we’re getting reports from a parallel global intelligence collector.


Rick Ledgett’s Straw Malware

For some reason, over a month after NotPetya and almost two months after WannaCry, former Deputy DIRNSA Rick Ledgett has decided now’s the time to respond to them by inventing a straw man argument against vulnerability disclosure. In the same (opening) paragraph where he claims the malware attacks have revived calls for the government to release all the vulnerabilities it holds, he accuses his opponents of oversimplification.

The WannaCry and Petya malware, both of which are partially based on hacking tools allegedly developed by the National Security Agency, have revived calls for the U.S. government to release all vulnerabilities that it holds.  Proponents argue this will allow for the development of patches, which will in turn ensure networks are secure.  On the face of it, this argument might seem to make sense, but it is actually a gross oversimplification of the problem, would not have the desired effect, and would in fact be dangerous.

Yet it’s Ledgett who is oversimplifying. What most people engaging in the VEP debate have asked for — even before the release of two worms based, in part, on tools stolen from NSA — is some kind of sense and transparency in the process by which NSA reviews vulnerabilities for disclosure. Ledgett instead poses his opponents as absolutists, asking for everything to be disclosed.

Ledgett then spends part of his column claiming that WannaCry targeted XP.

Users agree to buy the software “as is” and most software companies will attempt to patch vulnerabilities as they are discovered, unless the software has been made obsolete by the company, as was the case with Windows XP that WannaCry exploited.

[snip]

Customers who buy software should expect to have to patch it and update it to new versions periodically.

Except multiple reports said that XP wasn’t the problem; Windows 7 was. Ledgett’s mistake is all the more curious given reports that EternalBlue was blue-screening at NSA back when — while he was still at the agency — it was primarily focused on XP. That is, Ledgett is one of the people who might have expected WannaCry to crash XP; that he doesn’t know this even when I do doesn’t say a lot for NSA’s oversight of its exploits.

Ledgett then goes on to claim that WannaCry was a failed ransomware attack, even though that’s not entirely clear.

At least he understands NotPetya better, noting that the NSA component of that worm was largely a shiny object.

In fact, the primary damage caused by Petya resulted from credential theft, not an exploit.

The most disturbing part of Ledgett’s column, however, is that it takes him a good eight (of nine total) paragraphs to get around to addressing what really has been the specific response to WannaCry and NotPetya, a response shared by people on both sides of the VEP debate: NSA needs to secure its shit.

Some have made the analogy that the alleged U.S. government loss of control of their software tools is tantamount to losing control of Tomahawk missile systems, with the systems in the hands of criminal groups threatening to use them.  While the analogy is vivid, it incorrectly places all the fault on the government.  A more accurate rendering would be a missile in which the software industry built the warhead (vulnerabilities in their products), their customers built the rocket motor (failing to upgrade and patch), and the ransomware is the guidance system.

We are almost a full year past the day ShadowBrokers first came on the scene, threatening to leak NSA’s tools. A recent CyberScoop article suggests that, while government investigators now have a profile they believe ShadowBrokers matches, they’re not even entirely sure whether they’re looking for a disgruntled former IC insider, a current employee, or a contractor.

The U.S. government’s counterintelligence investigation into the so-called Shadow Brokers group is currently focused on identifying a disgruntled, former U.S. intelligence community insider, multiple people familiar with the matter told CyberScoop.

[snip]

While investigators believe that a former insider is involved, the expansive probe also spans other possibilities, including the threat of a current intelligence community employee being connected to the mysterious group.

[snip]

It’s not clear if the former insider was once a contractor or in-house employee of the secretive agency. Two people familiar with the matter said the investigation “goes beyond” Harold Martin, the former Booz Allen Hamilton contractor who is currently facing charges for taking troves of classified material outside a secure environment.

At least some of Shadow Brokers’ tools were stolen after Edward Snowden walked out of NSA Hawaii with the crown jewels, at a time when Rick Ledgett, personally, was leading a leak investigation into NSA’s vulnerabilities. And yet, over three years after Snowden stole his documents, the Rick Ledgett-led NSA still had servers sitting unlocked in their racks, still hadn’t addressed its privileged user issues.

Rick Ledgett, the guy inventing straw man arguments about absolutist VEP demands, would do the country far more good if he talked about what NSA can do to lock down its shit — and explained why that shit didn’t get locked down when he was working on those issues specifically.

But he barely mentions that part of the response to WannaCry and NotPetya.


The Problems with Rosemary Collyer’s Shitty Upstream 702 Opinion

This post took a great deal of time, both in this go-around, and over the years to read all of these opinions carefully. Please consider donating to support this work. 

It often surprises people when I tell them this, but in general, I’ve got a much better opinion of the FISA Court than most other civil libertarians. I do so because I’ve actually read the opinions. And while there are some real stinkers in the bunch, I recognize that the court has long been a source of some control over the executive branch, at times even applying more stringent standards than criminal courts.

But Rosemary Collyer’s April 26, 2017 opinion approving new Section 702 certificates undermines all the trust and regard I have for the FISA Court. It embodies everything that can go wrong with the court — which is all the more inexcusable given efforts to improve the court’s transparency and process since the Snowden leaks. I don’t think she understood what she was ruling on. And when faced with evidence of years of abuse (and the government’s attempt to hide it), she did little to rein in or even ensure accountability for those abuses.

This post is divided into three sections:

  • My analysis of the aspects of the opinion that deal with the upstream surveillance
    • Describing upstream searches
    • Refusing to count the impact
    • Treating the problem as exclusively about MCTs, not SCTs
    • Defining key terms
    • Failing to appoint (much less consider appointing) an amicus
    • Approving back door upstream searches
    • Imposing no consequences
  • A description of all the documents I Con the Record released — and, more importantly, the ones it did not release (if you’re in the mood for weeds, start there)
  • A timeline showing how NSA tried to hide these violations from FISC

Opinion

The Collyer opinion deals with a range of issues: an expansion of data sharing with the National Counterterrorism Center, the resolution of past abuses, and the rote approval of 702 certificates for form and content.

But the big news from the opinion is that the NSA discovered it had been violating the terms of upstream FISA collection set in 2011 (after violating the terms of upstream FISA set in 2007-2008, terms which were themselves set after Stellar Wind had been violating FISA since 2002). After five months of trying and failing to find an adequate solution to fix the problem, NSA proposed and Collyer approved new rules for upstream collection. The collection conducted under FISA Section 702 is narrower than it had been because NSA can no longer do “about” searches (which are basically searches for some signature in the “content” of a communication). But it is broader — and still potentially problematic — because NSA now has permission to do the back door searches of upstream collected data that it had, in reality, been doing all along.

My analysis here will focus on the issue of upstream collection, because that is what matters going forward, though I will note problems with the opinion addressing other topics to the extent they support my larger point.

Describing upstream searches

Upstream collection under Section 702 is the collection of communications identified by packet sniffing for a selector at telecommunication switches. As an example, if the NSA wants to collect the communications of someone who doesn’t use Google or Yahoo, they will search for the email address as it passes across circuits the government has access to (overseas, under EO 12333) or that a US telecommunications company runs (domestically, under 702; note many of the data centers at which this occurs have recently changed hands). Stellar Wind — the illegal warrantless wiretap program done under Bush — was upstream surveillance. The period in 2007 when the government tried to replace Stellar Wind under traditional FISA was upstream surveillance. And the Protect America Act and FISA Amendments Act have always included upstream surveillance as part of the mix, even as they moved more (roughly 90% according to a 2011 estimate) of the collection to US-based providers.
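To make the mechanics concrete, here is a purely illustrative sketch of selector-based filtering at a collection point — the real systems, selectors, and data formats are classified, and every name below is invented:

```python
# Illustrative only: packets simplified to dicts; a real system operates on
# raw network traffic, not parsed structures like these.
SELECTOR = "target@example.com"  # hypothetical tasked selector

def matches_selector(packet: dict) -> bool:
    """True if the selector appears in the packet's addressing metadata."""
    return SELECTOR in (packet.get("to", ""), packet.get("from", ""))

packets = [
    {"to": "target@example.com", "from": "alice@example.org", "body": "hi"},
    {"to": "bob@example.org", "from": "carol@example.org", "body": "lunch?"},
]

# Only traffic to or from the tasked selector is kept.
collected = [p for p in packets if matches_selector(p)]
```

The point of the sketch is just the distinction that matters later in this post: here the selector is matched against addressing metadata, whereas “about” collection matches it anywhere in the transaction.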

The thing is, there’s no reason to believe NSA has ever fully explained how upstream surveillance works to the FISC, not even in this most recent go-around (and it’s now clear that they always lied about how they were using and processing a form of upstream collection to get Internet metadata from 2004 to 2011). Perhaps ironically, the most detailed discussions of the technology behind it likely occurred in 2004 and 2010 in advance of opinions authorizing collection of metadata, not content, but NSA was definitely not fully forthcoming in those discussions about how it processed upstream data.

In 2011, the NSA explained (for the first time) that it was not just collecting communications by searching for a selector in metadata, but also collecting communications that included a selector as content. One reason they might do this is to obtain forwarded emails involving a target, but there are clearly other reasons. As a result of looking for selectors as content, NSA got a lot of entirely domestic communications, both in what NSA called multiple communication transactions (“MCTs,” basically emails and other things sent in bundles) and in single communication transactions (SCTs) that NSA didn’t identify as domestic, perhaps because they used Tor or a VPN or were routed overseas for some other reason. The presiding judge in 2011, John Bates, ruled that the bundled stuff violated the Fourth Amendment and imposed new protections — including the requirement that NSA segregate that data — for some of the MCTs. Bizarrely, he did not rule the domestic SCTs problematic, on the logic that those entirely domestic communications might have foreign intelligence value.

In the same order, John Bates for the first time let CIA and NSA do something FBI had already been doing: taking US person selectors (like an email address) and searching through already collected content to see what communications they were involved in (this was partly a response to the 2009 Nidal Hasan attack, which FBI didn’t prevent in part because they were never able to pull up all of Hasan’s communications with Anwar al-Awlaki at once). Following Ron Wyden’s lead, these searches on US person content are often called “back door searches” for the way they let the government read Americans’ communications without a warrant. Because of the newly disclosed risk that upstream collection could pick up domestic communications, however, when Bates approved back door searches in 2011, he explicitly prohibited the back door searching of data collected via upstream searches. He prohibited this for all of it — MCTs (many of which were segregated from general repositories) and SCTs (none of which were segregated).

As I’ve noted, as early as 2013, NSA knew it was conducting “many” back door searches of upstream data. The reasons why it was doing so were stupid: in part, because to avoid upstream searches analysts had to exclude upstream repositories from the search query (basically by writing “NOT upstream” in a Boolean query), which also required them to realize they were searching on a US person selector. For whatever reason, though, no one got alarmed by reports this was going on — not NSA’s overseers, not FISC (which reportedly got notices of these searches), and not Congress (which got notices of them in Semiannual reports, which is how I knew they were going on). So the problem continued; I noted that this was a persistent problem back in August, when NSA and DOJ were still hiding the extent of the problems from FISC.
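A hypothetical sketch of why that design was so error-prone: the exclusion only happened if the analyst remembered to add it, so the safe behavior depended on a manual step rather than a default. All names and query syntax here are invented for illustration; the actual tools are classified.

```python
# The repositories and the exclusion clause are made up; only the failure
# mode is the point: forgetting the clause silently searches upstream data.
REPOS = ["prism", "upstream"]

def build_query(selector: str, exclude_upstream: bool) -> str:
    query = f'selector:"{selector}"'
    if exclude_upstream:
        query += " NOT repo:upstream"  # the step analysts had to remember
    return query

def searched_repos(query: str) -> list:
    """Which repositories a query runs against."""
    if "NOT repo:upstream" in query:
        return [r for r in REPOS if r != "upstream"]
    return list(REPOS)

compliant = searched_repos(build_query("usperson@example.com", True))
forgotten = searched_repos(build_query("usperson@example.com", False))
```

A default-deny design — excluding upstream repositories unless affirmatively included — would have eliminated this class of error, which is presumably the kind of technical fix NSA refused to consider.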

It became clear the problem was far worse than known, however, when NSA started looking into how it dealt with 704 surveillance. Section 704 is the authority the NSA uses to spy on Americans who are overseas. It basically amounts to getting a FISC order to use EO 12333 spying on an American. An IG Report completed in January 2016 generally found 704 surveillance to be a clusterfuck; as part of that, though, the NSA discovered that there were a whole bunch of 704 backdoor searches that weren’t following the rules, in part because they were collecting US person communications for periods outside of the period when the FISC had authorized surveillance (for 705(b) surveillance, which covers Americans who are simply traveling overseas, this might mean NSA used EO 12333 to collect on an American when they were in the US). Then NSA’s Compliance people (OCO) did some more checking and found still worse problems.

And then the government — the same government that boasted about properly disclosing this to FISC — tried to bury it, basically not even telling FISC about how bad the problem was until days before Collyer was set to approve new certificates in October 2016. Once they did disclose it, Judge Collyer gave NSA first one and then another extension for them to figure out what went wrong. After 5 months of figuring, they were still having problems nailing it down or even finding where the data and searches had occurred. So, finally, facing a choice of ending “about” collection (only under 702 — they can still accomplish the very same thing under EO 12333) or ending searches of upstream data, they chose the former option, which Collyer approved with almost no accountability for all the problems she saw in the process.

Refusing to count the impact

I believe that (at least given what has been made public) Collyer didn’t really understand the issue placed before her. One thing she does is just operate on assumptions about the impact of certain practices. For example, she uses the 2011 number for the volume of total 702 collection accomplished using upstream collection to claim that it is “a small percentage of NSA’s overall collection of Internet communications under Section 702.” That’s likely still true, but she provides no basis for the claim, and it’s possible changes in communication — such as the increased popularity of Twitter — would change the mix significantly.

Similarly, she assumes that MCTs that involve “a non-U.S. person outside the United States” will be “for that reason [] less likely to contain a large volume of information about U.S. person or domestic communications.” She makes a similar assumption (this time in her treatment of the new NCTC raw take) about 702 data being less intrusive than individual orders targeted at someone in the US, “which often involve targets who are United States persons and typically are directed at persons in the United States.” In both of these, she repeats an assumption John Bates made in 2011 when he first approved back door searches using the same logic — that it was okay to provide raw access to this data, collected without a warrant, because it wouldn’t be as impactful as the data collected with an individual order. And the assumption may be true in both cases. But in an age of increasingly global data flows, that remains unproven. Certainly, with ISIS recruiters located in Syria attempting to recruit Americans, that would not be true at all.

Collyer makes the same move at a critical point in the opinion, when she asserts that “NSA’s elimination of ‘abouts’ collection should reduce the number of communications acquired under Section 702 to which a U.S. person or a person in the United States is a party.” Again, that’s probably true, but it is not clear she has investigated all the possible ways Americans will still be sucked up (which she acknowledges will happen).

And she does this even as NSA was providing her unreliable numbers.

The government later reported that it had inadvertently misstated the percentage of NSA’s overall upstream Internet collection during the relevant period that could have been affected by this [misidentification of MCTs] error (the government first reported the percentage as roughly 1.3% when it was roughly 3.7%).

Collyer’s reliance on assumptions rather than real numbers is all the more unforgivable given one of the changes she approved with this order: basically, permitting the agencies to conduct otherwise impermissible searches to be able to count how many Americans get sucked up under 702. In other words, she was told, at length, that Congress wants this number (the government’s application even cites the April 22, 2016 letter from members of the House Judiciary Committee asking for such a number). Moreover, she was told that NSA had already started trying to do such counts.

The government has since [that is, sometime between September 26 and April 26] orally notified the Court that, in order to respond to these requests and in reliance on this provision of its minimization procedures, NSA has made some otherwise-noncompliant queries of data acquired under Section 702 by means other than upstream Internet collection.

And yet she doesn’t then demand real numbers herself (again, in 2011, Bates got NSA to do at least a limited count of the impact of the upstream problems).

Treating the problem as exclusively about MCTs, not SCTs

But the bigger problem with Collyer’s discussion is that she treats all of the problem of upstream collection as being about MCTs, not SCTs. This is true in general — the term single communication transaction or SCT doesn’t appear at all in the opinion. But she also, at times, makes claims about MCTs that are more generally true for SCTs. For example, she cites one aspect of NSA’s minimization procedures that applies generally to all upstream collection, but describes it as only applying to MCTs.

A shorter retention period was also put into place, whereby an MCT of any type could not be retained longer than two years after the expiration of the certificate pursuant to which it was acquired, unless applicable criteria were met. And, of greatest relevance to the present discussion, those procedures categorically prohibited NSA analysts from using known U.S.-person identifiers to query the results of upstream Internet collection. (17-18)

Here’s the section of the minimization procedures that imposed the two year retention deadline, which is an entirely different section than that describing the special handling for MCTs.

Similarly, Collyer cites a passage from the 2015 Hogan opinion stating that upstream “is more likely than other forms of section 702 collection to contain information of or concerning United States person with no foreign intelligence value” (see page 17). But that passage cites to a passage of the 2011 Bates opinion that includes SCTs in its discussion, as in this sentence.

In addition to these MCTs, NSA likely acquires tens of thousands more wholly domestic communications every year, given that NSA’s upstream collection devices will acquire a wholly domestic “about” SCT if it is routed internationally. (33)

Collyer’s failure to address SCTs is problematic because — as I explain here — the bulk of the searches implicating US persons almost certainly searched SCTs, not MCTs. That’s true for two reasons. First, because (at least according to Bates’ 2011 guesstimate) NSA collects (or collected) far more entirely domestic communications via SCTs than via MCTs. Here’s how Bates made that calculation in 2011 (see footnote 32).

NSA ultimately did not provide the Court with an estimate of the number of wholly domestic “about” SCTs that may be acquired through its upstream collection. Instead, NSA has concluded that “the probability of encountering wholly domestic communications in transactions that feature only a single, discrete communication should be smaller — and certainly no greater — than potentially encountering wholly domestic communications within MCTs.” Sept. 13 Submission at 2.

The Court understands this to mean that the percentage of wholly domestic communications within the universe of SCTs acquired through NSA’s upstream collection should not exceed the percentage of MCTs within its statistical sample. Since NSA found 10 MCTs with wholly domestic communications within the 5,081 MCTs reviewed, the relevant percentage is .197% (10/5,081). Aug. 16 Submission at 5.

NSA’s manual review found that approximately 90% of the 50,440 transactions in the sample were SCTs. Id. at 3. Ninety percent of the approximately 13.25 million total Internet transactions acquired by NSA through its upstream collection during the six-month period works out to be approximately 11,925,000 transactions. Those 11,925,000 transactions would constitute the universe of SCTs acquired during the six-month period, and .197% of that universe would be approximately 23,000 wholly domestic SCTs. Thus, NSA may be acquiring as many as 46,000 wholly domestic “about” SCTs each year, in addition to the 2,000-10,000 MCTs referenced above.
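Bates’ arithmetic can be reproduced directly from the numbers in the quoted passage:

```python
# Bates' 2011 back-of-envelope estimate, using only figures from the opinion.
mct_sample = 5_081        # MCTs manually reviewed
domestic_mcts = 10        # wholly domestic MCTs found in the sample
rate = domestic_mcts / mct_sample          # ~0.197%

total_transactions = 13_250_000            # six months of upstream collection
scts = 0.9 * total_transactions            # ~90% were SCTs -> ~11,925,000

domestic_scts_6mo = scts * rate            # ~23,000 per six months
domestic_scts_year = 2 * domestic_scts_6mo # ~46,000 per year
```

Note how much weight the estimate puts on a single assumption: that the wholly domestic rate among SCTs is no greater than the rate observed in the MCT sample.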

Assuming some of this happens because people use VPNs or Tor, then the amount of entirely domestic communications collected via upstream would presumably have increased significantly in the interim period. Indeed, the redaction in this passage likely hides a reference to technologies that obscure location.

If so, it would seem to acknowledge NSA collects entirely domestic communications using upstream that obscure their location.

The other reason the problem is likely worse with SCTs is because — as I noted above — no SCTs were segregated from NSA’s general repositories, whereas some MCTs were supposed to be (and in any case, in 2011 the SCTs constituted by far the bulk of upstream collection).

Now, Collyer’s failure to deal with SCTs may or may not matter for her ultimate analysis that upstream collection without “about” collection solves the problem. Collyer limits the collection of abouts by limiting upstream collection to communications where “the active user is the target of acquisition.” She describes “active user” as “the user of a communication service to or from whom the MCT is in transit when it is acquired (e.g., the user of an e-mail account [half line redacted].” If upstream signatures are limited to emails and texts, that would seem to fix the problem. But upstream wouldn’t necessarily be limited to emails and texts — upstream collection would be particularly valuable for searching on other kinds of selectors, such as an encryption key, and there may be more than one person who would use those other kinds of selectors. And when Collyer says, “NSA may target for acquisition a particular ‘selector,’ which is typically a facility such as a telephone number or e-mail address,” I worry she’s unaware or simply not ensuring that NSA won’t use upstream to search for non-typical signatures that might function as abouts even if they’re not “content.” The problem is treating this as a content/metadata distinction, when “metadata” (however far down in the packet you go) could include stuff that functions like an about selector.
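A minimal sketch of the “active user” limit as I read Collyer’s description — a transaction survives only if the targeted selector is an actual party to it, not merely mentioned inside it. The data structures are invented for illustration, and note the sketch only works cleanly for selectors like email addresses, which is exactly my worry about non-typical signatures above.

```python
# Invented structures: "active_users" are the accounts actually sending or
# receiving the transaction; "mentions" are selectors appearing only in content.
def permitted(transaction: dict, target: str) -> bool:
    """Retain the transaction only when the target is the active user."""
    return target in transaction["active_users"]

transactions = [
    # target is a party to the communication
    {"active_users": {"target@example.com"}, "mentions": set()},
    # target appears only in content -- an "about" acquisition
    {"active_users": {"a@x.org", "b@y.org"}, "mentions": {"target@example.com"}},
]

kept = [t for t in transactions if permitted(t, "target@example.com")]
```

The rule is easy to state for an email address; it is much less clear what “active user” means for a selector like an encryption key that multiple people might use.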

Defining key terms

Collyer did define “active user,” however inadequately. But there are a number of other terms that go undefined in this opinion. By far the funniest is when Collyer notes that the government’s March 30 submission promises to sequester upstream data that is stored in “institutionally managed repositories.” In a footnote, she notes they don’t define the term. Then she pretty much drops the issue. This comes in an opinion that shows FBI data has been wandering around in repositories where it didn’t belong and that NSA can’t identify where all its 704 data is. Yet she’s told there is some other kind of repository and she doesn’t make a point of figuring out what the hell that means.

Later, in a discussion of other violations, Collyer introduces the term “data object,” which she always uses in quotation marks, without explaining what that is.

Failing to appoint (or even consider) amicus

In any case, this opinion makes clear that what should have happened, years ago, is a careful discussion of how packet sniffing works, and where a packet collected by a backbone provider stops being metadata and starts being content, and all the kinds of data NSA might want to and does collect via domestic packet sniffing. (They collect far more under EO 12333.) As mentioned, some of that discussion may have taken place in advance of the 2004 and 2010 opinions approving upstream collection of Internet metadata (though, again, I’m now convinced NSA was always lying about what it would take to process that data). But there’s no evidence the discussion has ever happened when discussing the collection of upstream content. As a result, judges are still using made up terms like MCTs, rather than adopting terms that have real technical meaning.

For that reason, it’s particularly troubling Collyer didn’t use — didn’t even consider using, according to the available documentation — an amicus. As Collyer herself notes, upstream surveillance “has represented more than its share of the challenges in implementing Section 702” (and, I’d add, Internet metadata collection).

At a minimum, when NSA was pitching fixes to this, she should have stopped and said, “this sounds like a significant decision” and brought in amicus Amy Jeffress or Marc Zwillinger to help her think through whether this solution really fixes the problem. Even better, she should have brought in a technical expert who, at a minimum, could have explained to her that SCTs pose as big a problem as MCTs; Steve Bellovin — one of the authors of this paper that explores the content versus metadata issue in depth — was already cleared to serve as the Privacy and Civil Liberties Oversight Board’s technical expert, so presumably could easily have been brought in to consult here.

That didn’t happen. And while the decision whether or not to appoint an amicus is at the court’s discretion, Collyer is obligated to explain why she didn’t choose to appoint one for anything that presents a significant interpretation of the law.

A court established under subsection (a) or (b), consistent with the requirement of subsection (c) and any other statutory requirement that the court act expeditiously or within a stated time–

(A) shall appoint an individual who has been designated under paragraph (1) to serve as amicus curiae to assist such court in the consideration of any application for an order or review that, in the opinion of the court, presents a novel or significant interpretation of the law, unless the court issues a finding that such appointment is not appropriate;

For what it’s worth, my guess is that Collyer didn’t want to extend the 2015 certificates (as it was, she didn’t extend them as long as NSA had asked in January), so figured there wasn’t time. There are other aspects of this opinion that make it seem like she just gave up at the end. But that still doesn’t excuse her from explaining why she didn’t appoint one.

Instead, she wrote a shitty opinion that doesn’t appear to fully understand the issue and that defers, once again, the issue of what counts as content in a packet.

Approving back door upstream searches

Collyer’s failure to appoint an amicus is most problematic when it comes to her decision to reverse John Bates’ restriction on doing back door searches on upstream data.

To restate what I suggested above, by all appearances, NSA largely blew off Bates’ restriction. Indeed, Collyer notes in passing that, “In practice, however, no analysts received the requisite training to work with the segregated MCTs.” Given the persistent problems with back door searches on upstream data, it’s hard to believe NSA took that restriction seriously at all (particularly since it refused to consider a technical fix to the requirement to exclude upstream from searches). So Collyer’s approval of back door searches of upstream data is, for all intents and purposes, the sanctioning of behavior that NSA refused to stop, even when told to.

And the way in which she sanctions it is very problematic.

First, in spite of her judgment that ending about searches would fix the problems in (as she described it) MCT collection, she nevertheless laid out a scenario (see page 27) where an MCT would acquire an entirely domestic communication.

Having laid out that there will still be some entirely domestic comms in the collection, Collyer then goes on to say this:

The Court agrees that the removal of “abouts” communications eliminates the types of communications presenting the Court the greatest level of constitutional and statutory concern. As discussed above, the October 3, 2011 Memorandum Opinion (finding the then-proposed NSA Minimization Procedures deficient in their handling of some types of MCTs) noted that MCTs in which the target was the active user, and therefore a party to all of the discrete communications within the MCT, did not present the same statutory and constitutional concerns as other MCTs. The Court is therefore satisfied that queries using U.S.-person identifiers may now be permitted to run against information obtained by the above-described, more limited form of upstream Internet collection, subject to the same restrictions as apply to querying other forms of Section

This is absurd! She has just laid out that there will be some exclusively domestic comms in the collection. Not as much as there was before NSA stopped collecting abouts, but it’ll still be there. So she’s basically permitting domestic communications to be back door searched, which, if they’re found (as she notes), might be kept based on some claim of foreign intelligence value.

And this is where her misunderstanding of the MCT/SCT distinction is her undoing. Bates prohibited back door searching of all upstream data, both that supposedly segregated because it was most likely to have unrelated domestic communications in it, and that not segregated because even the domestic communications would have intelligence value. Bates’ specific concerns about MCTs are irrelevant to his analysis about back door searches, but that’s precisely what Collyer cites to justify her own decision.

She then applies the 2015 opinion, with its input from amicus Amy Jeffress stating that NSA back door searches that excluded upstream collection were constitutional, to claim that back door searches that include upstream collection would meet Fourth Amendment standards.

The revised procedures subject NSA’s use of U.S. person identifiers to query the results of its newly-limited upstream Internet collection to the same limitations and requirements that apply to its use of such identifiers to query information acquired by other forms of Section 702 collection. See NSA Minimization Procedures § 3(b)(5). For that reason, the analysis in the November 6, 2015 Opinion remains valid regarding why NSA’s procedures comport with Fourth Amendment standards of reasonableness with regard to such U.S. person queries, even as applied to queries of upstream Internet collection. (63)

As with her invocation of Bates’ 2011 opinion, she applies analysis that may not fully apply to the question — because it’s not actually clear that the active user restriction really equates newly limited upstream collection to PRISM collection — before her as if it does.

Imposing no consequences

The other area where Collyer’s opinion fails to meet the standards of prior ones is in resolution of the problem. In 2009, when Reggie Walton was dealing with first phone and then Internet dragnet problems, he required the NSA to do complete end-to-end reviews of the programs. In the case of the Internet dragnet, the report was ridiculous (because it failed to identify that the entire program had always been violating category restrictions). He demanded IG reports, which seems to be what led the NSA to finally admit the Internet dragnet program was broken. He shut down production twice, first of foreign call records, from July to September 2009, then of the entire Internet dragnet sometime in fall 2009. Significantly, he required the NSA to track down and withdraw all the reports based on violative production.

In 2010 and 2011, dealing with the Internet dragnet and upstream problems, John Bates similarly required written details (and, as noted, actual volume of the upstream problem). Then, when the NSA wanted to retain the fruits of its violative collection, Bates threatened to find NSA in violation of 50 USC 1809(a) — basically, threatened to declare them to be conducting illegal wiretapping — to make them actually fix their prior violations. Ultimately, NSA destroyed (or said they destroyed) their violative collection and the fruits of it.

Even Thomas Hogan threatened NSA with 50 USC 1809(a) to make them clean up willful flouting of FISC orders.

Not Collyer. She went from issuing stern complaints (John Bates was admittedly also good at this) back in October…

At the October 26, 2016 hearing, the Court ascribed the government’s failure to disclose those IG and OCO reviews at the October 4, 2016 hearing to an institutional “lack of candor” on NSA’s part and emphasized that “this is a very serious Fourth Amendment issue.”

… to basically reauthorizing 702 without using the reauthorization process as leverage over NSA.

Of course, NSA still needs to take all reasonable and necessary steps to investigate and close out the compliance incidents described in the October 26, 2016 Notice and subsequent submissions relating to the improper use of U.S.-person identifiers to query terms in NSA upstream data. The Court is approving on a going-forward basis, subject to the above-mentioned requirements, use of U.S.-person identifiers to query the results of a narrower form of Internet upstream collection. That approval, and the reasoning that supports it, by no means suggest that the Court approves or excuses violations that occurred under the prior procedures.

That is particularly troubling given that there is no indication, even six months after NSA first (belatedly) disclosed the back door search problems to FISC, that it had finally gotten ahold of the problem.

As Collyer noted, weeks before it submitted its new application, NSA still didn’t know where all the upstream data lived. “On March 17, 2017, the government reported that NSA was still attempting to identify all systems that store upstream data and all tools used to query such data.” She revealed that some of the queries of US persons do not interact with “NSA’s query audit system,” meaning they may have escaped notice forever (I’ve had former NSA people tell me even they don’t believe this claim, as seemingly nothing should be this far beyond auditability). Which is presumably why, “The government still had not ascertained the full range of systems that might have been used to conduct improper U.S.-person queries.” There’s the data that might be in repositories that weren’t run by NSA, alluded to above. There’s the fact that on April 7, even after NSA submitted its new plan, it was discovering that someone had mislabeled upstream data as PRISM, allowing it to be queried.

Here’s the thing. There seems to be no way to have that bad an idea of where the data is and what functions access the data, and still be able to claim, as Mike Rogers, Dan Coats, and Jeff Sessions apparently did in the certificates submitted in March that didn’t get publicly released, to be able to fulfill the promises they made to FISC. How can the NSA promise to destroy upstream data at an accelerated pace if it admits it doesn’t know where that data is? How can NSA promise to implement new limits on upstream collection if that data doesn’t get audited?

And Collyer excuses John Bates’ past decision (and, by association, her continued reliance on his logic to approve back door searches) by saying the decision wasn’t so much the problem, but the implementation of it was.

When the Court approved the prior, broader form of upstream collection in 2011, it did so partly in reliance on the government’s assertion that some communications of foreign intelligence interest could only be acquired by such means. See October 3, 2011 Memorandum Opinion at 31 & n. 27, 43, 57-58. This Opinion and Order does not question the propriety of acquiring “abouts” communications and MCTs as approved by the Court since 2011, subject to the rigorous safeguards imposed on such acquisitions. The concerns raised in the current matters stem from NSA’s failure to adhere fully to those safeguards.

If problems arise because NSA has failed, over 6 years, to adhere to safeguards imposed because NSA hadn’t adhered to the rules for the 3 years before that, which came after NSA had just blown off the law itself for the 6 years before that, what basis is there to believe they’ll adhere to the safeguards she herself imposed, particularly given that unlike her predecessors in similar moments, she gave up any leverage she had over the agency?

The other thing Collyer does differently from her predecessors is that she lets NSA keep data that arose from violations.

Certain records derived from upstream Internet communications (many of which have been evaluated and found to meet retention standards) will be retained by NSA, even though the underlying raw Internet transactions from which they are derived might be subject to destruction. These records include serialized intelligence reports and evaluated and minimized traffic disseminations, completed transcripts and transcriptions of Internet transactions, [redacted] information used to support Section 702 taskings and FISA applications to this Court, and [redacted].

If “many” of these communications have been found to meet retention standards, it suggests that “some” have not. Meaning they should never have been retained in the first place. Yet Collyer lets an entire stream of reporting — and the Section 702 taskings that arise from that stream of reporting — remain unrecalled. Effectively, even while issuing stern warning after stern warning, by letting NSA keep this stuff, she is letting the agency commit violations for years without any disincentive.

Now, perhaps Collyer is availing herself of the exception offered in Section 301 of the USA Freedom Act, which permits the government to retain illegally obtained material if it is corrected by subsequent minimization procedures.

Exception.–If the Government corrects any deficiency identified by the order of the Court under subparagraph (B), the Court may permit the use or disclosure of information obtained before the date of the correction under such minimization procedures as the Court may approve for purposes of this clause.

Except that she doesn’t cite that provision, nor is there any evidence deficiencies have been corrected.

Which should mean, especially given the way Collyer depends on the prior opinions of Bates and Hogan, she should likewise rely on their practice of treating this as a potential violation of 50 USC 1809(a) to ensure the harm to Americans doesn’t persist. She did no such thing, basically sanctioning the illegal use of back door searches to spy on Americans.

Up until this opinion, I was generally willing to argue for the efficacy of the FISC (even while arguing the job could and should be devolved to district courts for more rigorous testing of the law). But not now. This opinion discredits the entire court.

Last April when Collyer became presiding FISC judge, I pointed to what I considered Rosemary Collyer’s worst FISC decision, which was actually a District Court opinion that permitted the NSA to keep aspects of its upstream problems secret from EFF, which is suing over those same issues. I predicted then that, “I fear she will be a crummy presiding judge, making the FISC worse than it already is.”

In my opinion — as a civil libertarian who has been willing to defend the FISC in the past — with this opinion she has done real damage to any credibility or legitimacy the FISC has.

Update: Fixed latter for former regarding which choice the Administration picked, h/t CS.

The Documents

Here’s what I Con the Record released.

January 7, 2016 IG Report

This heavily redacted report describes a review of NSA’s compliance with 704/705b of Title VII of FISA, the authority NSA uses to spy on Americans who are located overseas (see my report on the 704 problems here). It was conducted from March through August 2015 and reviewed data from January through March 2015. It basically showed there were no compliance mechanisms in place for 704/705b, and NSA couldn’t even reliably identify the queries that had been conducted under the authority. This report is relevant to the reauthorization, because Americans targeted in individual FISA orders are approved (and almost certainly tasked) by default for 702 back door searches. Though the report was obviously done well before the 702 certifications were submitted on September 26, it was not noticed to FISC until days before the court would otherwise have approved the certifications in conjunction with the upstream problems.

September 26, 2016 702 Certification Package 

ICTR released much if not all of the materials submitted for 702 reauthorization in September 2016. The package includes:

Certification cover filing: This is basically the application, which the metadata reveals is actually two parts merged. It describes the changes to the certificates from the past year, most notably a request to share raw 702 data directly from NSA or FBI to NCTC, some tweaks to the FBI targeting and minimization procedures, and permission for NSA, FBI, and CIA to deviate from minimization procedures to develop a count of how many US persons get collected under 702.

The report also describes how the government has fulfilled reporting requirements imposed in 2015. Several of the reports pertain to destroying data it should not have had. The most interesting one is the report on how many criminal queries of 702 data FBI does that result in the retrieval and review of US person data; as I note in this post, the FBI really didn’t (and couldn’t, and can’t, given the oversight regime currently in place) comply with the intent of the reporting requirement.

Very importantly: this application did not include any changes to upstream collection, in large part because NSA did not tell FISC (more specifically, Chief Judge Rosemary Collyer) about the problems they had always had preventing queries of upstream data in its initial application. In NSA’s April statement on ending upstream “about” collection, it boasts, “Although the incidents were not willful, NSA was required to, and did, report them to both Congress and the FISC.” But that’s a load of horse manure: in fact, NSA and DOJ sat on this information for months. And even with this disclosure, because the government didn’t release the later application that did describe those changes, we don’t actually get to see the government’s description of the problems; we only get to see Collyer’s (I believe mis-) understanding of them.

Procedures and certifications accepted: The September 26 materials also include the targeting and minimization procedures that were accepted in the form in which they were submitted on that date. These include:

Procedures and certificates not accepted: The materials include the documents that the government would have to change before approval on April 26. These include,

Note, I include the latter two items because I believe they would have had to be resubmitted on March 30, 2017 with the updated NSA documents and the opinion makes clear a new DIRNSA affidavit was submitted (see footnote 10), but the release doesn’t give us those. I have mild interest in that, not least because the AG/DNI one would be the first big certification to FISC signed by Jeff Sessions and Dan Coats.

October 26, 2016 Extension

The October 26 extension of 2015’s 702 certificates is interesting primarily for its revelation that the government waited until October 24, 2016 to disclose problems that had been simmering since 2013.

March 30, 2017 Submissions

The release includes two of what I suspect are at least four items submitted on March 30, which are:

April 26, 2017 Opinion

This is the opinion that reauthorized 702, with the now-restricted upstream search component. My comments below largely lay out the problems with it.

April 11, 2017 ACLU Release

I Con the Record also released the FOIAed documents released earlier in April to ACLU, which are on their website in searchable form here. I still have to finish my analysis of that (which includes new details about how the NSA was breaking the law in 2011), but these posts cover some of those files and are relevant to these 702 changes:

Importantly, the ACLU documents as a whole reveal what kinds of US persons are approved for back door searches at NSA (largely, but not exclusively, Americans for whom an individual FISA order has already been approved, importantly including 704 targets, as well as more urgent terrorist targets), and reveal that one reason NSA was able to shut down the PRTT metadata dragnet in 2011 was because John Bates had permitted them to query the metadata from upstream collection.

Not included

Given the point I noted above — that the application submitted on September 26 did not address the problem with upstream surveillance and that we only get to see Collyer’s understanding of it — I wanted to capture the documents that should or do exist that we haven’t seen.

  • October 26, 2016 Preliminary and Supplemental Notice of Compliance Incidents Regarding the Querying of Section 702-Acquired Data
  • January 3, 2017: Supplemental Notice of Compliance Incidents Regarding the Querying of Section 702-Acquired Data
  • NSA Compliance Officer (OCO) review covering April through December 2015
  • OCO review covering April through July of 2016
  • IG Review covering first quarter of 2016 (22)
  • January 27, 2017: Letter In re: DNI/AG 702(g) Certifications asking for another extension
  • January 27, 2017: Order extending 2015 certifications (and noting concern with “important safeguards for interests protected by the Fourth Amendment”)
  • March 30, 2017: Amendment to [Certificates]; includes (or is) second explanatory memo, referred to as “March 30, 2017 Memorandum” in Collyer’s opinion; this would include a description of the decision to shut down about searches
  • March 30, 2017 AG/DNI Certification (?)
  • March 30, 2017 DIRNSA Certification
  • April 7, 2017 preliminary notice

Other Relevant Documents

Because they’re important to this analysis and get cited extensively in Collyer’s opinion, I’m including:

Timeline

November 30, 2013: Latest possible date at which upstream search problems identified

October 2014: Semiannual Report shows problems with upstream searches during period from June 1, 2013 – November 30, 2013

October 2014: SIGINT Compliance (SV) begins helping NSD review 704/705b compliance

June 2015: Semiannual Report shows problems with upstream searches during period from December 1, 2013 – May 31, 2014

December 18, 2015: Quarterly Report to the FISC Concerning Compliance Matters Under Section 702 of FISA

January 7, 2016: IG Report on controls over §§704/705b released

January 26, 2016: Discovery of error in upstream collection

March 9, 2016: FBI releases raw data

March 18, 2016: Quarterly Report to the FISC Concerning Compliance Matters Under Section 702 of FISA

May and June, 2016: Discovery of querying problem dating back to 2012

May 17, 2016: Opinion relating to improper retention

June 17, 2016: Quarterly Report to the FISC Concerning Compliance Matters Under Section 702 of FISA

August 24, 2016: Pre-tasking review update

September 16, 2016: Quarterly Report to the FISC Concerning Compliance Matters Under Section 702 of FISA

September 26, 2016: Submission of certifications

October 4, 2016: Hearing on compliance issues

October 24, 2016: Notice of compliance errors

October 26, 2016: Formal notice, with hearing; FISC extends the 2015 certifications to January 31, 2017

November 5, 2016: Date on which 2015 certificates would have expired without extension

December 15, 2016: James Clapper approves EO 12333 Sharing Procedures

December 16, 2016: Quarterly Report to the FISC Concerning Compliance Matters Under Section 702 of FISA

December 29, 2016: Government plans to deal with indefinite retention of data on FBI systems

January 3, 2017: DOJ provides supplemental report on compliance programs; Loretta Lynch approves new EO 12333 Sharing Procedures

January 27, 2017: DOJ informs FISC they won’t be able to fully clarify before January 31 expiration, ask for extension to May 26; FISC extends to April 28

January 31, 2017: First extension date for 2015 certificates

March 17, 2017: Quarterly Report to the FISC Concerning Compliance Matters Under Section 702 of FISA; Probable halt of upstream “about” collection

March 30, 2017: Submission of amended NSA certifications

April 7, 2017: Preliminary notice of more query violations

April 28, 2017: Second extension date for 2015 certificates

May 26, 2017: Requested second extension date for 2015 certificates

June 2, 2017: Deadline for report on outstanding issues

I Rarely Say I Told You So, Section 704 I Told You So Edition

Since 2014, I have been trying to alert anyone who would listen about Section 704.

That’s a part of FISA Title VII — the part of FISA that will be reauthorized this year. When Congress passed FISA Amendments Act in 2008, they promised they’d protect US persons overseas by requiring an order to surveil them. Almost always, the section invoked as accomplishing that was Section 703, which is basically PRISM for Americans overseas.

Except I discovered when I (briefly) worked at the Intercept that NSA never uses 703. Ever. Which meant that what they use to surveil Americans overseas is the somewhat looser Section 704 (or, for Americans against whom there is a traditional domestic FISA order, 705b). Except no one — and I mean literally no one, not in the NGO community nor on the Hill — understood how Section 704 was used.

Exactly a year ago, I laid all this out in a post and suggested that, as part of the Section 702 reauthorization this year, Congress should finally figure out how 704 works and whether there are any particular concerns about it.

It turns out, four months before I wrote that, NSA’s Inspector General had finalized a report showing that in the seven and a half years since Section 704 was purportedly protecting Americans overseas, it wasn’t. The report is heavily redacted, but what isn’t redacted showed that the NSA had never set up a means to identify all 704/705b queries, and so couldn’t reliably oversee whether analysts were following the rules. The report showed that Signals Intelligence Compliance and Oversight only started helping DOJ and ODNI do their compliance reviews of 704/705b in October 2014, by providing the queries they could identify to the reviewers. But not all queries can be audited, because not all the feeds in question can be sent to NSA’s auditing and logging system.

The review itself — conducted from March to August of 2015 on data from the first quarter of that year — showed a not insignificant amount of querying non-compliance.

The 704 compliance problems are a part of the problem with NSA’s decision to shut down upstream surveillance (because 704 collection authorization is one of the things that automatically gets a US person approved for upstream searches). Though, in her most biting comment in an otherwise pathetic opinion, Chief FISC judge Rosemary Collyer noted that the failure to tell her about this when 702 certificates were submitted in September, or in an October 4 hearing, showed a lack of candor.

At the October 26, 2016 hearing, the Court ascribed the government’s failure to disclose those IG and OCO reviews at the October 4, 2016 hearing to an institutional “lack of candor” on NSA’s part and emphasized that “this is a very serious Fourth Amendment issue.”

A review that post-dated the IG Report revealed the problem was even bigger than that. In the compliance section of the report, Collyer noted that 85% of the 704/705b queries conducted using one particular tool (which was rolled out in 2012) were non-compliant.

NSA examined all queries using identifiers for “U.S. persons targeted pursuant to Sections 704 and 705(b) of FISA using the tool [redacted] in [redacted] . . . from November 1, 2015 to May 1, 2016.” Id. at 2-3 (footnote omitted). Based on that examination, “NSA estimates that approximately eighty-five percent of those queries, representing [redacted] queries conducted by approximately [redacted] targeted offices, were not compliant with the applicable minimization procedures.” Id. at 3. Many of these non-compliant queries involved use of the same identifiers over different date ranges. Id. Even so, a non-compliance rate of 85% raises substantial questions about the propriety of using of [redacted] to query FISA data. While the government reports that it is unable to provide a reliable estimate of the number of non-compliant queries since 2012, id., there is no apparent reason to believe the November 2015-April 2016 period coincided with an unusually high error rate.

And NSA was unable to chase down the reporting based off this non-compliant querying.

The government reports that NSA “is unable to identify any reporting or other disseminations that may have been based on information returned by [these] non-compliant queries” because “NSA’s disseminations are sourced to specific objects,” not to the queries that may have presented those objects to the analyst. Id. at 6. Moreover, [redacted] query results are generally retained for just [redacted].

All of which is to say that the authority the government has been pointing to for years to show how great Title VII is turns out to be a dumpster fire of compliance problems.

And still, we know very little about how this authority is used.

The number of Americans affected is not huge — roughly 80 people approved under 704 plus anyone approved for domestic FISA order that goes overseas (though that would almost certainly include Carter Page). Still, if this is supposed to be the big protection Americans overseas receive, it hasn’t been providing much protection.

The Curious Silence about the Mostly Unremarked Russian BGP Hijack

These days, it seems that NYT-approved columnists and self-appointed THREADsters can start a conspiracy theory about anything just by slapping the label “Russia” on it. Which is why I find it so curious that the BGP hijack last week of a bunch of finance companies (and some other interesting targets) by Russian telecom Rostelecom has gone generally unnoticed, except by Ars’ Dan Goodin.

Here’s a great description of what the Border Gateway Protocol is — and why it’s ripe for hijacking.

Such is the story of the “three-napkins protocol,” more formally known as Border Gateway Protocol, or BGP.

At its most basic level, BGP helps routers decide how to send giant flows of data across the vast mesh of connections that make up the Internet. With infinite numbers of possible paths — some slow and meandering, others quick and direct — BGP gives routers the information they need to pick one, even though there is no overall map of the Internet and no authority charged with directing its traffic.

The creation of BGP, which relies on individual networks continuously sharing information about available data links, helped the Internet continue its growth into a worldwide network. But BGP also allows huge swaths of data to be “hijacked” by almost anyone with the necessary skills and access.

The main reason is that BGP, like many key systems on the Internet, is built to automatically trust users — something that may work on smaller networks but leaves a global one ripe for attack.

As BGPstream first noted, the data streams for 37 entities were manually rerouted by Rostelecom last Wednesday for a six-minute period.

Starting at April 26 22:36 UTC till approximately 22:43 UTC AS12389 (PJSC Rostelecom) started to originate 50 prefixes for numerous other Autonomous systems. The 50 hijacked prefixes included 37 unique autonomous systems

The victims include Visa, Mastercard, Verisign, and Symantec.

Oh — and according to BGPmon, the victims also include Alfa bank — the bank that got mentioned in Christopher Steele’s dossier, that had some weird behavior involving a Trump marketing server last summer, and one of two banks for which the FBI allegedly got a FISA order as part of the investigation into Russia’s interference in the US election.

BGPmon provides one possible innocent explanation (which is, in fact, the analogue of the innocent explanation offered for the Alfa-Trump traffic): it could be BGP advertising gone wrong.

It’s also worth noting that at the same time as the hijacks we did see many (78) new advertisements originated by 12389 for prefixes by ‘other’ Rostelecom telecom ASns (29456,21378,13056,13118,8570). So something probably went wrong internally causing Rostelecom to start originating these new prefixes.

Never attribute to malice that which is adequately explained by… well let’s say an innocent misconfiguration. If this was in-fact an attempt to on purpose redirect traffic for some of these financial institutions, it was done in a very visible and large scale manner, so from that perspective perhaps not too likely. Then again, given the number of high value prefixes of all the same category (financial institutions and credit card processors) it seems a bit more than an innocent accidental hijack, especially considering the fact that new more specific prefixes were introduced.

But Goodin provides some reasons why the hijack should be treated with suspicion. First, Rostelecom — the company that hijacked this traffic — is considered an official Russian government entity.

According to shareholder information provided by Rostelecom, the Russian government owns 49 percent of the telecom’s ordinary shares. The US Department of Commerce lists Rostelecom as a state-owned enterprise and reports that one or more senior government officials have seats on Rostelecom’s board of directors. Rostelecom officials didn’t respond to e-mail seeking comment for this post.

He cites Dyn’s Doug Madory explaining why the targeted nature of this hijack should rouse suspicion.

“I would classify this as quite suspicious,” Doug Madory, director of Internet analysis at network management firm Dyn, told Ars. “Typically accidental leaks appear more voluminous and indiscriminate. This would appear to be targeted to financial institutions. A typical cause of these errors [is] in some sort of internal traffic engineering, but it would seem strange that someone would limit their traffic engineering to mostly financial networks.”

As Goodin notes, and as I have before, one reason an entity (especially a government) might want to hijack traffic is to make it cross a router where it has the ability to collect it for spying purposes. That process was described in some presentations from an NSA hacker that the Intercept published last year.

As Goodin notes, given that the victims here should be presumed to be using the best encryption, it would take some work for Rostelecom to obtain the financial and other data in the traffic it hijacked.

Such interception or manipulation would be most easily done to data that wasn’t encrypted, but even in cases when it was encrypted, traffic might still be decrypted using attacks with names such as Logjam and DROWN, which work against outdated transport layer security implementations that some organizations still use.

Madory said that even if data couldn’t be decrypted, attackers could potentially use the diverted traffic to enumerate what parties were initiating connections to MasterCard and the other affected companies. The attacker could then target those parties, which may have weaker defenses.

But there’s at least one other reason someone might hijack traffic. If you were able to pull traffic off of switches you knew to be accessible to an adversary that was spying on you, you might succeed in detasking that spying, even if only for 6 minutes.

One of my all-time favorite Snowden disclosures revealed that the NSA was forced to detask from some IRGC Yahoo accounts because they were being spammed and the data was flooding NSA’s systems. That happened at precisely the moment that the FBI was trying to catch some IRGC figures in trying to assassinate then Saudi Ambassador to the US (and current Foreign Secretary) Adel al-Jubeir, which I find to be a mighty interesting coinkydink.

This hypothetically could be something similar: a very well-timed effort to thwart surveillance by making it inaccessible to the switches from which the NSA was collecting it (though honestly, it would take some doing to pull traffic off all collection points accessible to the NSA, and I’m not even sure that would be possible for transatlantic traffic).

Don’t get me wrong. Accidental or not, this was a foot-stomping event. I’m sure the competent and responsible authorities at both the victim companies and the NSA have taken notice of this event, and are working to understand why it happened and if anything was compromised by it.

But I find it striking that the thousands of people spending all their time fervently creating conspiracies where none exist have not even noticed this event which, whatever explains it, was a real event, and one involving the bank that has been at the center of so many real and imagined conspiracies.

I Con the Record Transparency Bingo (4): How 151 Million Call Events Can Look Reasonable But Is Besides the Point

Other entries in I Con the Record Transparency Bingo:

(1) Only One Positive Hit on a Criminal Search

(2): The Inexplicable Drop in PRTT Numbers

(3): CIA Continues to Hide Its US Person Network Analysis

If your understanding of the USA Freedom Act phone dragnet that replaced the old one came from the public claims of USA Freedom Act boosters or from this NYT article on the I Con the Record report, you might believe 42 terrorist suspects and their 3,150 friends each made 48,000 phone calls last year, which would work out to 130 calls a day … or maybe 24,000 perfectly duplicative calls, which works out to about 65 calls a day.

That’s the math suggested by these two entries in the I Con the Record Transparency Report — showing that the 42 targets of the new phone dragnet generated over 151 million “call detail records.” But as I’ll show, the impact of the 151 million [corrected] records collected last year is in some ways far lower than collecting 65 calls a day, which is a good thing! But it supports a claim that USAF has an entirely different function than boosters understood.


Here’s the math for assuming these are just phone calls. There were 42 targets approved for use in the new phone dragnet for some part of last year. Given the data showing just 40 orders, they might only be approved for six months of the year (each order lasts for 180 days), but we’ll just assume the NSA gets multiple targets approved with each order and that all 42 targets were tasked for the entirety of last year (for example, you could have just two orders getting 42 targets approved to cover all these people for a year).

In its report on the phone dragnet, PCLOB estimated that each target might have 75 total contacts. So a first round would collect on 42 targets, but with a second round you would be collecting on 3,192 people. That would mean each of those 3,192 people would be responsible for roughly 48,000 calls a year, every single one of which might represent a new totally innocent American sucked into NSA’s maw for the short term [update: that would be up to a total of 239,400 2nd-degree interlocutors]. The I Con the Record report says that, “the metric provided is over‐inclusive because the government counts each record separately even if the government receives the same record multiple times (whether from one provider or multiple providers).” If these were phone calls between just two people, then if our terrorist buddies only spoke to each other, each would be responsible for 24,000 calls a year, or 65 a day, which is certainly doable, but would mean our terrorist suspects and their friends all spent a lot of time calling each other.

The number becomes less surprising when you remember that even with traditional telephony, call records can capture both calls and texts. All of a sudden 65 becomes a lot more doable, and a lot more likely to include lots of perfectly duplicative records as terrorists and their buddies spend afternoons texting back and forth with each other.

Still, it may mean that 65 totally innocent people a day get sucked up by NSA.

All that said, there’s no reason to believe we’re dealing just with texts and calls.

As the report reminds us, we’re actually talking about session identifying information, which the I Con the Record report pretends is “commonly referred to” as “call event metadata.”

Call Detail Records (CDR) – commonly referred to as “call event metadata” – may be obtained from telecommunications providers pursuant to 50 U.S.C. §1861(b)(2)(C). A CDR is defined as session identifying information (including an originating or terminating telephone number, an International Mobile Subscriber Identity (IMSI) number, or an International Mobile Station Equipment Identity (IMEI) number), a telephone calling card number, or the time or duration of a call. See 50 U.S.C. §1861(k)(3)(A). CDRs do not include the content of any communication, the name, address, or financial information of a subscriber or customer, or cell site location or global positioning system information. See 50 U.S.C. §1861(k)(3)(B). CDRs are stored and queried by the service providers. See 50 U.S.C. §1861(c)(2).

Significantly, this parenthesis — “(including an originating or terminating telephone number, an International Mobile Subscriber Identity (IMSI) number, or an International Mobile Station Equipment Identity (IMEI) number)” — suggests that so long as something returns a phone number, a SIM card number, or a handset number, that can be a “call event.” That is, a terrorist using his cell phone to access a site, generating a cookie, would have the requisite identifiers for his phone as well as a time associated with it. And I Con the Record’s transparency report says it is collecting these “call event” records from “telecommunications” firms, not phone companies, meaning a lot more kinds of things might be included — certainly iMessage and WhatsApp, possibly Signal. Indeed, that’s necessarily true given repeated efforts in Congress to get a list of all electronic communications service providers that don’t keep their “call records” for 18 months and to track any changes in retention policies. It’s also necessarily true given Marco Rubio’s claim that we’re sending requests out to a “large and significant number of companies” under the new phone dragnet.
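To make the statutory point concrete, here is an illustrative sketch of what §1861(k)(3) permits and excludes. The field names and example values are hypothetical, not drawn from any actual schema; only the permitted and excluded elements come from the statute quoted above.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CallDetailRecord:
    """Session-identifying fields that 50 U.S.C. 1861(k)(3)(A) permits.
    Field names here are hypothetical; the elements come from the statute."""
    originating_number: Optional[str] = None  # telephone number
    terminating_number: Optional[str] = None
    imsi: Optional[str] = None                # SIM card identifier
    imei: Optional[str] = None                # handset identifier
    calling_card_number: Optional[str] = None
    event_time: Optional[str] = None          # time or duration of a "call"
    # 1861(k)(3)(B) excludes content, subscriber name/address/financial
    # information, cell site location, and GPS, so no such fields appear.

# The parenthetical's upshot: an event need only return one identifier plus
# a time. A phone hitting a website, say, could yield a record like this:
web_hit = CallDetailRecord(imei="35-209900-176148-1",
                           event_time="2017-01-01T12:00:00Z")
```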

The fine print provides further elements that suggest both that the 151 million events collected last year are not that high. First, it suggests a significant number of CDRs fail validation at some point in the process.

This metric represents the number of records received from the provider(s) and stored in NSA repositories (records that fail at any of a variety of validation steps are not included in this number).

At one level, this means NSA actually received well more than 151 million events. But it also means they may be getting junk. One thing that in the past might have represented a failed validation is the target no longer using the selector, though the reference to failure at “a variety of validation steps” suggests there may be other, technically more interesting reasons records get thrown out.
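The report doesn’t say what those validation steps are. Conceptually, though, the counting works like a filter chain: the stored total understates what providers actually returned. A purely hypothetical sketch, with made-up validators:

```python
def count_stored(records, validators):
    """Count only records passing every validation step, mirroring the
    report's note that failed records are excluded from the 151M figure."""
    return sum(1 for r in records if all(v(r) for v in validators))

# Hypothetical validators: illustrations only, not NSA's actual checks.
has_identifier = lambda r: bool(r.get("imsi") or r.get("imei") or r.get("number"))
selector_still_tasked = lambda r: r.get("selector") in {"+1-555-0100"}

received = [
    {"number": "+1-555-0199", "selector": "+1-555-0100"},  # passes
    {"selector": "+1-555-0100"},                           # no identifier: fails
    {"imei": "35-209900-176148-1", "selector": "old"},     # detasked: fails
]
stored = count_stored(received, [has_identifier, selector_still_tasked])
# stored (1) is smaller than len(received) (3): NSA got more than it counted.
```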

In addition, the fine print notes that the 151 million call events include both historical events collected with the first order as well as the prospective events collected each day.

CDRs covered by § 501(b)(2)(C) include call detail records created before, on, or after the date of the application relating to an authorized investigation.

So these events weren’t all generated last year — if they’re from AT&T they could have been generated decades ago. Remember that Verizon and T-Mobile made a handshake agreement to keep their call records for two years as part of USAF, so for major providers providing just traditional telephony, a request will include at least two years of historical data, plus the prospective collection. That means our 3,192 targets and friends might only have had 48 calls or texts a day, without any duplication.
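This matters for the per-day math: spreading the same 151 million records over a longer window drives the daily rate down. A sketch, assuming purely for illustration that a request reaches back two years in addition to the prospective year (the exact figure depends entirely on the assumed window):

```python
TOTAL_RECORDS = 151_000_000
PEOPLE = 3_192   # 42 targets plus 75 contacts each
DEDUP = 2        # each call is counted once per participant

def calls_per_day(span_days: int) -> float:
    """Per-person daily rate if the records span span_days."""
    return TOTAL_RECORDS / PEOPLE / DEDUP / span_days

one_year = calls_per_day(365)          # ~65 a day if all generated last year
three_years = calls_per_day(3 * 365)   # ~22 a day with two years of history
```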

Finally, there’s one more thing that suggests this huge number isn’t that huge, and that it may be a totally irrelevant measure of the privacy impact anyway. In its document on implementing the program from last year, NSA described first querying the NSA Enterprise Architecture to find query results, and then sending out selectors for more data.

Once the one-hop results are retrieved from the NSA’s internal holdings, the list of FISC-approved specific selection terms, along with NSA’s internal one-hop results, are submitted to the provider(s).

In other words — and this is a point that was clear about the old phone dragnet but which most people simply refused to understand — this program is not only designed to interact seamlessly with EO 12333 collected data (NSA’s report says so explicitly, as did the USAF report), but many of the selectors involved are already in NSA’s maw.
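The two-stage flow the implementation paper describes can be sketched as follows: NSA queries its own (largely EO 12333) holdings first, then submits both the FISC-approved selectors and those internal one-hop results to the providers. The names and data here are illustrative, not real selectors.

```python
def build_provider_request(fisc_selectors, internal_holdings):
    """Two-stage request: gather one-hop results for each FISC-approved
    selector from NSA's own holdings, then send the selectors AND those
    results out to the providers, per the implementation paper."""
    one_hop = set()
    for selector in fisc_selectors:
        one_hop |= internal_holdings.get(selector, set())
    return fisc_selectors | one_hop

# Hypothetical data: target "t1" already has known contacts in internal
# (largely EO 12333) holdings; "t2" does not.
holdings = {"t1": {"a", "b"}}
request = build_provider_request({"t1", "t2"}, holdings)
# The providers are asked about the targets and their known contacts alike.
```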

Under the old phone dragnet, a great proportion of the phone records in question came from EO 12333. NSA preferred then — and I’m sure still prefers now — to rely on queries run on EO 12333 because they came with fewer limits on dissemination.

Which means we need to understand the 65 additional texts — or anything else available only in the US from a large number of electronic communications service providers that might be deemed a session identifier — a day from 42 terrorists and their 3,150 buddies on top of the vast store of EO 12333 records that form the primary basis here.

Because (particularly as the rest of the report shows continually expanding metadata analysis and collection) this is literally just the tip of an enormous iceberg, 151 million edge cases to a vast sea of data.

Update: Charlie Savage, who has a really thin skin, wrote me an email trying to dispute this post. In the past, his emails have almost universally devolved into him being really defensive while insisting over and over that stuff I’ve written doesn’t count as reporting (he likes to do this, especially, with stuff he claims a scoop for three years after I’ve written about it). So I told him I would only engage publicly, which he does here.

Fundamentally, Charlie disputes whether Section 215 is getting anything that’s not traditional telephony (he says my texts point is “likely right,” apparently unaware that a document he obtained in FOIA reveals an issue that almost certainly shows they were getting texts years ago). Fair enough: the law is written to define CDRs as session identifiers, not telephony calls; we’ll see whether the government is obtaining things that are session identifiers. The I Con the Record report is obviously misleading on other points, but Charlie relies on language from it rather than the actual law. Charlie ignores the larger point, that any discussion of this needs to engage with how Section 215 requests interact with EO 12333, which was always a problem with the reporting on the topic and remains a problem now.

So, perhaps I’m wrong that it is “necessarily” the case that they’re getting non-telephony calls. The law is written such that they can do so (though the bill report limits it to “phone companies,” which would make WhatsApp but not iMessage a stretch).

What’s remarkable about Charlie’s piece, though, is that he utterly and completely misreads this post, “About half” of which, he says, “is devoted to showing how the math to generate 151 million call events within a year is implausible.”

The title of this post says, “151 Million Call Events Can Look Reasonable.” I then say, “But as I’ll show, the impact of the 131 [sic, now corrected] million records collected last year is in some ways far lower than collecting 65 calls a day, which is a good thing!” I then say, “The number becomes less surprising when you remember that even with traditional telephony call records can capture calls and texts. All of a sudden 65 becomes a lot more doable, and a lot more likely to have lots of perfectly duplicative records as terrorists and their buddies spend afternoons texting back and forth with each other.” I go on to say, “The fine print provides further elements that suggest both that the 151 million events collected last year are not that high.” I then go on to say, “So these events weren’t all generated last year — if they’re from AT&T they could have been generated decades ago.”

That is, in the title, and at least four times after that, I point out that 151 million is not that high. Yet he claims that my post aims to show that the math is implausible, not totally plausible.  (He also seems to think I’ve not accounted for the duplicative nature of this, which is curious, since I quote that and incorporate it into my math.)

In his email, I noted that this post replied not just to him, but to others who were alarmed by the number. I said specifically with regard to the number, “yes, you were among the people I subtweeted there. But not the only one and some people did take this as just live calls. It’s not all about you, Charlie.”

Yet having been told that that part of the post was not a response to him, Charlie nevertheless persisted in completely misunderstanding the post.

I guess he still believed it was all about him.

Maybe Charlie should spend his time reading the documents he gets in FOIA more attentively rather than writing thin-skinned emails assuming everything is about him?

Update: Once I pointed out that Charlie totally misread this post he told me to go back on my meds.

Since he’s being such a douche, I’ll give you two more pieces of background. First, after I said that I knew CIA wasn’t tracking metadata (because it’s all over public records), Charlie suggested he knew better.

Here’s me twice pointing out that the number of call events was not (just) calls (as he had claimed in his story), a point he mostly concedes in his response.

Here’s the lead of his story: