On Metadata and Manipulation: the First Guccifer 2.0 Documents
The AP’s (very worthwhile) coverage of the data it obtained from Secureworks reveals at least a fifth piece of deception pertaining to the first documents released by Guccifer 2.0 on June 15, 2016: Guccifer 2.0 added the word “confidential” (possibly both as the watermark shown on the front page and in the footer) to this document.
But there were signs of dishonesty from the start. The first document Guccifer 2.0 published on June 15 came not from the DNC as advertised but from Podesta’s inbox, according to a former DNC official who spoke on condition of anonymity because he was not authorized to speak to the press.
The official said the word “CONFIDENTIAL” was not in the original document.
Guccifer 2.0 had airbrushed it to catch reporters’ attention.
Here’s that watermark, which would have led reporters who obtained the document to ascribe more value to it than it had.
On top of that change, we know that Guccifer 2.0 deliberately used the name Felix Edmundovich in the metadata of the document, invoking Iron Felix, the founder of the Cheka, the Soviet secret police that preceded the KGB (though another document invoked Che Guevara in the same way).
This analysis and this analysis compellingly show, in my opinion, that the other Russian metadata in the documents was also deliberately placed there.
Finally, I believe that the addition of Warren Flood as author was also deliberate.
In addition, Guccifer 2.0 released these documents as DNC documents when in fact they are either Podesta documents or have not yet been sourced.
Now, Guccifer 2.0 in fact didn’t hide some of these alterations. Some were identified the same day the documents were released. But at the time they were interpreted as OpSec failures, rather than intentional deception. To this day, skeptics try to argue that the intentional deception of the rest of the metadata is somehow different than the tribute to Iron Felix (which is a mirror to the assumption in the early days that the Iron Felix was deliberate but the other Russian metadata was not, which I criticized here), without explaining why that would be the case.
In this post, I talked about how some of the other deception — pitching these Podesta (and other) documents as DNC documents — would have been a way to taunt the DNC and Crowdstrike for their false claims downplaying the hack. (Note, in the post, I ask why Guccifer 2.0 harped on VAN so much; the AP piece reveals that VAN officials and those working on voter registration were targeted, which suggests maybe the Russians did get VAN data and we simply don’t know about it.)
So contrary to the belief of some commentators, it has long been known that Guccifer 2.0 altered these documents. But I don’t think there has been a full accounting of all the ways that it worked (it’s not even clear we know the full extent of the deception).
For now, I’m going to leave these multiple layers of deception laid out. (I’d add that whatever cutout led Julian Assange to believe, or at least to claim, that the documents were sourced to Americans is another layer of deception, a different kind of metadata.)
There were multiple layers of deception built into these first documents, alternately taunting the Democrats who would have known them to be deception, the analysts who mistook them as mistakes, and the press who took them to indicate real value. I suspect there are at least two more layers of deception here.
But it’s worth noting that no one was immune from this deception, and it’s likely there are still a few layers that we’re missing here.
Update: As Thomas Rid notes on Twitter, one of the first five documents Guccifer 2.0 released is a version of one that Guccifer 1.0 had released.
As I said OT in the previous thread, I’d assume that the Bears would have or want some kind of toolkit to collate separately-gathered but overlapping email corpuses, harvest email addresses, reassemble threads from multiple sides, etc. Stuff like that exists in open source form, but I’m sure there are more complex forensic packages around that assist paralegals and others trying to reconstruct email chains from disparate sources for cases, and that the Bears would have access to them if necessary.
That’s what got my ears twitching about the supposed outreach from Cambridge Analytica to Assange for “help with indexing”. Maybe that was garbled, because subsequent reporting said Mercer wanted a public searchable database of released HRC emails, contacted Nix, and Nix had said he’d already reached out to Assange in June about the deleted HRC server mails, supposedly before the first G2.0 release.
I’ve actually got a half finished thread laying out why I think the current reporting on the CA outreach is unreliable. Your theory might very much align with that.
That said, I think there were two production strains: the G2 one, which did alter documents, and the Assange one, which is not known to have done so (though it is known to have parts of datasets missing, which I suspect may suggest a separate sourcing of those documents as well).
There are, I think, still two more layers of plausible deniability built into these documents that haven’t publicly been sorted out yet. And that is all built in from these production levels.
Yeah, the CNN follow-up last week seems plausible on what Nix actually did and to what he was responding and ungarbled some of the earlier garble.
I’m still trying to think through what the Bears actually do with the material they acquire, and what kind of workflow they’d adopt to get maximum value out of it (which would require some kind of collation as well as obvious stuff like identifying accounts as priority phishing targets, as Secureworks described) while maintaining plausible deniability along the way.
I like your term “production strain” a lot; it reminds me of the scholarship that tries to explain why different early editions of Shakespeare plays often have very different texts.
a couple of guccifer traits stand out, whether of an actual individual or of a puppet some other person(s) constructed.
– he is very loquacious. i personally can’t imagine doing what he has done and not keeping quiet as a mouse to avoid detection (snowden does, and put cell phones in the fridge). and may be lonely.
– along with this, is remarkably highly risk taking (and probably willing to be caught eventually) .
– is attention craving and fame craving.
– genuinely loves to taunt.
– loves to brag about exploits
– per adam carter’s analysis, works hard to deceive.
i just don’t have the background or patience to go thru adam carter’s work critically, but i read it top to bottom and it had the feel of a meticulous, unbiased, unsparing (hard-nosed) analysis. guccifer, carter believes, is a puppet construct of crowdstrike honchos henry and alperovitch. the motive was to counter or compete in the media with an impending wikileaks disclosure (which i assume, based on other ew posts, was the state dept emails released thru judicial watch foia’s).
i just don’t know if it is possible to construct a puppet which would read (only) as complexly human as guccifer reads.
apart from the character, the one weakness in proof, though it is a plausible assumption, is that these two were in fact hired by dnc or clinton campaign to carry out the guccifer deception. that would seem like an enormously risky, stupid actually, move by any campaign organization so early in a campaign (initiated in april or may, 2016; active beginning early june, 2016):
“…As of June 12th, they were in a position where Julian Assange had just announced WikiLeaks’ upcoming release of Clinton’s emails, Clinton was still under FBI investigation, Trump was attacking Clinton for her use of a private server with his supporters frequently chanting “lock her up!” at rallies).
The campaign and the DNC were in a desperate position and really needed something similar to a Russian hacker narrative (something that leaks have since shown the DNC had started building a month or two prior to the hacking claims) and one where they would be fortunate to have a seemingly clumsy hacker that leaves lots of ‘fingerprints’ tainting files and bringing the reputation of leaks into question. – Sure enough, 2-3 days later, Guccifer2.0, the world’s weirdest hacker was spawned and started telling lies in an effort to attribute himself to the malware discoveries & to Wikileaks… ”
anyway, thanks emptywheel for providing this material to readers.
ew –
“… This analysis and this analysis compellingly shows, in my opinion, that the other Russian metadata in the documents was also deliberately placed there.
Finally, I believe that the addition of Warren Flood as author was also deliberate…”
yours is a more limited conclusion than carter’s – no names, no organization mentioned. you don’t say whether you think the crowdstrike honchos were involved, only that there was manipulation of the guccifer docs. do you believe that part of carter’s analysis? or believe that some other individuals/organization tied to the clinton/sanders/trump campaigns were involved?
why was the addition of flood deliberate? to cast suspicion on biden or his group? any guesses as to how flood’s laptop would have ended up in the hands of guccifer’s puppeteers?
finally, apart from carter’s analysis, why have none of the perpetrators or guccifer been tracked down on the internet and exposed in the last 1 1/2 yrs? surely nsa can do that.
Adam’s conclusions are nuts (and, as I’ve pointed out to him repeatedly, that’s based in part on reporting that he seems to claim doesn’t exist).
My conclusions will have to wait though.
whew! thanks for taking time to respond. i woke up at 5:30 this morning with my head spinning around this stuff. whatever else this guccifer investigation is, it is a post-graduate level study in misdirection – an ascending and descending hall of mirrors.
adam carter sure sounds like a straightforward, nothing-to-hide investigator-analyst laying everything out for all to review. i guess it matters more whether one has a good hypothesis to start with.
i can appreciate you would want to be deadsure about conclusions. now i’m getting curiouser and curiouser about how the guccifer matter will end up.
i also am coming to appreciate the inordinate amount of time and careful thinking spent on unravelling this mass of threads.
re: “Adam Carter”
Is that a website? an actual person? or a pseudonym named after the fictional character on the BBC’s series “Spooks” (also known as “MI-5” with BBC America in the U.S.)?
In the series the character Adam Carter is a ‘good guy’ who goes under cover to save England each week.
i’m confident adam carter is a real person, not a character or pseudonym.
see ew’s highlighted cites in the text – very detailed, logical, open, and valuable analysis.
“This analysis and this analysis compellingly shows, in my opinion, that the other Russian metadata in the documents was also deliberately placed there….”
check out the highlighted “this” and “this”.
orion whiffed :)
adam carter is a pseudonym.
Blog is g-2.space, Twitter is https://twitter.com/with_integrity, Reddit is https://www.reddit.com/user/d3fi4nt; the Twitter profile says Pseudonymous. IMHO he seems to have some agendas but is more grounded by far than most. He has been picking bones with EW recently, and recently criticized some contents of the AP article.
It’s a pseudonym. Someone with very similar views occasionally comments here under a different name.
That’ll be me, but I’m not Adam. He’s real, a one-time Blackhat hacker turned Whitehat more than a decade ago. ‘Adam Carter’ is a pseudonym but not modelled on the British TV series Spooks; it’s just a common English name. ‘Adam’ didn’t even realise he’d chosen the same name as the television character until someone pointed it out on Twitter.
I’m British, too. Way, way, way more attention needs to be given to the UK’s election meddling in 2016, especially as to why the head of GCHQ stepped down “for personal reasons” three days after Trump was inaugurated. Very sudden, with no prior notice and after only a year and a half in the job. There was Director to Director-level (ie Robert Hannigan to John Brennan) handling of the Steele dossier.
arbed –
“… There was Director to Director-level (ie Robert Hannigan to John Brennan) handling of the Steele dossier…”
that doesn’t surprise me.
of interest, might there have been collaboration between the two orgs in initially recruiting a competent individual to work for fusion gps on russia/trump?
“the other Russian”
“the Russians”
“other Russian”
So much Russian dressing, so little salad.
Always good to hear from the croutons!
crunch!
:)))
“reveals that VAN officials and those working on voter registration were targeted, which suggests maybe the Russians did get VAN data and we simply don’t know about it” – what’s the chance that happened with the database glitch, when Bernie’s team found themselves with access? Would be a twofer – a data leak and a political fight between the 2 candidates…
Any balanced discussion of metadata alteration by Guccifer 2 has to begin with the fact that 99.9% of the thousands of documents eventually released (most in two zip files) were unaltered in any way.
Some documents included in later G2 blogposts can be matched in the zipfiles. There is also convincing evidence that the underlying archive from which the G2 zipfiles were constructed was much, much larger than presented to the public. While the Trump dossier was also an attachment in a Podesta email, it is logical that it would also have been on the DNC server. It’s thinner evidence than you interpret.
In the first G2 tranche on June 15, 2016, five documents were altered to add “Russian” metadata. The metadata so altered was not DIRECTORY metadata, but metadata internal to Word and/or Excel documents. To alter such metadata, the documents have to be opened and edited, not simply uploaded. In the case of the Trump oppo research, it was cut-and-pasted from its original Word document into a second document, originally created by Warren Flood but emptied of its original contents. The metadata options of this document had been altered prior to the pasting to “Russify” the metadata: the language option in Word was set to Russian. The addition of a CONFIDENTIAL watermark could have been done at the same time. These changes were done by a user identifying himself, as you observe, as Felix Edmundovich, i.e. Felix Dzerzhinsky, the founder of the Cheka.
If a user self-identified as J.Edgar Hoover (exactly equivalent), no sane analyst would interpret that as an “OpSec failure” by an FBI agent and evidence that the document had been altered by the FBI. To say it is to laugh. A more plausible interpretation of the user name and Russified metadata is that someone is playing – to what end is hard to say.
It was ludicrous that so many specialists, e.g. Matt Tait and Thomas Rid, interpreted the “Russian” metadata of five Guccifer 2 documents as “OpSec failure”. It is even more ludicrous that such analyses were adopted by the intel community in their attribution of G2 with “high confidence”.
Also, more coherent analyses of metadata indicate that Guccifer 2 was operating in an Eastern timezone. It is obviously possible that this metadata has been gamed. However, one of the key lines of evidence that Fancy Bear/APT28 was Russian was an analysis of timezones indicated by metadata. If such timezone information is unreliable, then logically it should impact both attributions, not just one.
Steve,
Thanks for joining us.
I trust most of your comment doesn’t apply to me because it reflects no awareness of either what I say in this post or my year of work on this (including, way earlier than any of the skeptics, pointing out that it’s silly to consider the metadata accidental).
As for the East Coast timezone, I invite you to read the WSJ article from earlier in the week. There’s a nugget in there that indicates the government — the one that is sure Russians did the hack — doesn’t disagree about some metadata deriving from ET.
The notion that metadata from ET means not Russian is every bit as idiotic as those who say Russian metadata is accidental. Both sides should think a lot more about what those datapoints say.
As I’ve noted before, you can spin up an east coast VPS (with default TZ settings reflecting the physical location) in seconds at a reputable provider* if you want to use a US backbone to minimise network transfer time and the possibility of dropped connections: say, for pulling files from an east coast server. It adds the risk of US-based logging if you’re up to no good, but it’s the sort of thing I’d do to pull files in a hurry for an offsite backup even if the intended final destination was elsewhere.
All that’s to say that there are more plausible technical explanations for the presence of ET metadata than it having been manipulated into place.
* or minutes at a less reputable one, god I can recognise their IP ranges at one glance now.
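To make the point about default TZ settings concrete, here is a minimal sketch in Python. It assumes nothing about how any real file was produced: the date, the machine labels, and the fixed offsets are all illustrative stand-ins. It only shows that the same instant gets stamped differently depending on how the machine's clock is configured, so an Eastern offset in metadata is a statement about a machine setting, not about where the operator sits.

```python
from datetime import datetime, timedelta, timezone

# One fixed instant, expressed in UTC (the date is only illustrative).
instant = datetime(2016, 6, 15, 17, 0, tzinfo=timezone.utc)

# Hypothetical machine settings; fixed offsets stand in for real zone rules.
machine_settings = {
    "US East Coast VPS (UTC-4 in summer)": timezone(timedelta(hours=-4)),
    "Moscow workstation (UTC+3)": timezone(timedelta(hours=3)),
}

def local_stamp(moment, tz):
    """Render the instant the way a machine configured for `tz` would record it."""
    return moment.astimezone(tz).isoformat()

stamps = {name: local_stamp(instant, tz) for name, tz in machine_settings.items()}
for name, stamp in stamps.items():
    print(f"{name}: {stamp}")
```

Both stamps describe the same moment; only the configured offset differs. That is why a timezone in metadata cannot, on its own, settle an attribution question in either direction.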
i’ll just toss this in with respect to time zones. i have no idea if it is relevant. it is an item that has been discussed at emptywheel before:
http://www.baltimoresun.com/news/maryland/bs-md-russian-hacking-maryland-base-20161229-story.html
When I made this comment, I had discourteously not read your earlier comment, to which you clearly linked. You have clearly (and anomalously) been attentive to this issue. I apologize for any slight.
Ah, thanks. In any case, welcome to our discussions.
I love you guys. Ever since Sweet Judy blew Lies. Absolutely the best sleuthing in da worl’!!!
The other two documents constructed from the Warren Flood template (see Adam Carter, g-2.space on this) also contain the CONFIDENTIAL watermark (as well as being set to Russian language default). So it’s a property of the fake template, not the specific memo.
Makes sense. Have we found the code for that?
Really? Where? I found one of the docs had the word “confidential” in it, but it wasn’t a watermark.
When 1.doc, 2.doc and presumably 3.doc are rendered in Word, there is a watermark CONFIDENTIAL (UPPERCASE) and footer Confidential (TitleCase). Did you look at both occurrences?
https://i.imgur.com/KqjYki0.png
The code rendering CONFIDENTIAL is shown here (following Adam Carter’s recipe for locating metadata via .txt):
https://i.imgur.com/Lne9BzL.png
The code for rendering the footer “Confidential”:
https://i.imgur.com/qysBNTy.png
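For anyone who wants to repeat the "view it as text" check without opening Word, a rough sketch in Python follows. It is not Adam Carter's recipe itself, just one way to look for both spellings in a file's raw bytes. The `sample` bytes and the `find_token` helper are hypothetical stand-ins, not the contents or tooling of any released document, and an ASCII scan like this would miss text stored in another encoding.

```python
import re

def find_token(raw: bytes, token: str, context: int = 20):
    """Return (offset, matched text, surrounding bytes) for each case-insensitive hit."""
    pattern = re.compile(re.escape(token.encode("ascii")), re.IGNORECASE)
    hits = []
    for m in pattern.finditer(raw):
        start = max(0, m.start() - context)
        end = min(len(raw), m.end() + context)
        hits.append((m.start(), m.group().decode("ascii"), raw[start:end]))
    return hits

# Hypothetical stand-in bytes, NOT the contents of any real released file.
sample = b"...shape text: CONFIDENTIAL ...body text... footer: Confidential..."

hits = find_token(sample, "confidential")
for offset, text, around in hits:
    # Distinguish the all-caps form from the title-case form.
    kind = "all caps" if text.isupper() else "title case" if text.istitle() else "other"
    print(offset, text, kind, around)
```

To try it on a real file, replace `sample` with the result of `open(path, "rb").read()`. Reporting the case of each hit matters here because the watermark and the footer differ only in capitalization.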
TY
2.doc’s original https://wikileaks.org/podesta-emails/fileid/55782/15051
No watermarks.
I only diffed the raw text, not the code/metadata because it was all altered by the template. Looks like “confidential” was a part of that alteration.
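A text-only comparison of that kind can be sketched with Python's standard `difflib`. The two line lists below are hypothetical stand-ins, not text from any real document. The sketch also shows the limitation just described: a diff of extracted text reports changed wording or links, but formatting-level additions such as a watermark never appear in it.

```python
import difflib

# Hypothetical stand-in text, not taken from any real document.
original = ["Section 1", "See the linked source.", "Section 2"]
released = ["Section 1", "See http://example.com/source", "Section 2"]

# Unified diff of the extracted text only; formatting and metadata are not compared.
diff = list(difflib.unified_diff(original, released,
                                 fromfile="original", tofile="released",
                                 lineterm=""))
for line in diff:
    print(line)

# Skip the two file-header lines, then keep only added/removed lines.
changed = [line for line in diff[2:] if line.startswith(("+", "-"))]
```

Anything that lives in the template rather than in the body text has to be checked separately, for example by searching the raw file.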
4.doc couldn’t be found on Wikileaks. The confidential watermark I completely missed. Also in 1.doc those links were naked and some broken, and this made me think it might have been an earlier draft. ??? Many (most) of the other documents couldn’t be found on the Podesta archive for comparison. To suppose that only a few came from the Podesta leak and were then altered (why include 4.doc?) doesn’t make much sense when the Trump op research would have been so widely available. Possibly maybe.
BTW 1.doc was the only one textually altered, and those alterations were index refactoring and naked urls (I missed the confidential bit because it was buried in a bunch of other smaller formatting tags I dismissed as being part of the template). Other docs had “secret” and “confidential” headers, and were not altered. I think the idea that “confidential” was added to the first one seems a little flimsy, but then again the rest of the copy-pasted docs could have been a cover for that one small alteration. I’mma sticking to my blowback absorption hypotheses (that seems to have misfired–I seem to be the only one on the entire internet to have put this forward; Carter is NOT a fan of it)
Obviously the Warren addition could have been some other signal. I went full tinfoil.
Hey, good to see you.
Did you see Tom Rid’s Tweet about Doc 4 being a different version of a doc released by G1?
I think it’s about adding layers of plausible deniability to the operation. Which is a similar motive.
Thanks for the links to Matt Tait’s 15 June 2016 tweets on “Russian” metadata and your 25 July 2016 criticism of the attributions based on metadata. Tait’s timeline has Peter W Smith contacting him around the time of the Wikileaks DNC reveal on July 22 and Trump’s July 27 call for Russia to reveal more Clinton emails. Now how does that relate to Roger Stone’s August 5 Breitbart article ….”stop blaming Russia”, which does not mention the “fake Russian” metadata? Did team Trump execute two different scripts, or were they just not coordinated/micromanaged? In other words, was Charles Johnson’s “firewall” so effective that Stone had not understood Smith’s concerns, or did he not even think about the metadata to deny it? In a world where even Guccifer 2.0 is reading EW, how come the ratfuckers pretend to know nothing?
Aw cmon. All the best read me. The ratfuckers are scared of me.
Answering
OrionATL at November 4, 2017 at 10:10 am
“might there have been collaboration between the two orgs in initially recruiting a competent individual to work for fusion gps on russia/trump?”
No, that’s not what I meant. It was Fusion/Perkins Coie who hired Steele, perhaps at the recommendation of the FBI as he’d already worked with the latter.
Brennan had already seen the Steele Dossier around mid-summer of 2016 (I believe he testified to this early timing but perhaps didn’t specify exactly which report he’d seen). What I’m suggesting instead is that Brennan laundered what he realised was just unverified and unverifiable opposition research by getting Steele to submit it to his ex-bosses at MI6 (even though officially ‘retired’ he’d still maintain many contacts), then it was passed across to Hannigan at GCHQ to be passed back to Brennan as – Voila! – “classified 5eyes intelligence”. Laundering opposition research into classified intel in this way was what necessitated the Director-level-only handling within GCHQ.
i understood your initial point. i was just inquiring whether you thought american intell might have early-on involved british intell to begin the process of checking out russia/trump at arm’s length from u. s.
separately.
brennan is (was) irish. are you saying that at one time he worked for mi6?
Brennan has been clear that the June report is something else. Which I’ve heard via other sources, plural.
I think the concern that MI6 laundered Steele claims back into the CIA is real. But I also know there are a goodly number of other claims that don’t match the (pretty delayed and inaccurate) Steele ones.
emptywheel wrote:
“… November 4, 2017 at 2:45 pm
Brennan has been clear that the June report is something else. Which I’ve heard via other sources, plural.
I think the concern that MI6 laundered Steele claims back into the CIA is real. But I also know there are a goodly number of other claims that don’t match the (pretty delayed and inaccurate) Steele ones… ”
emptywheel,
i did not know there were other “reports”, and possibly earlier reports known to brennan, et al., that said roughly what steele said or sort of corroborated steele, especially not officially recognized reports.
probably you’ve covered this. could you please tell me where and when. i would love to have this as background.
Regarding the use of the dossier for getting FISA order(s). http://www.cnn.com/2017/04/18/politics/fbi-dossier-carter-page-donald-trump-russia-investigation/ https://www.washingtonpost.com/world/national-security/fbi-obtained-fisa-warrant-to-monitor-former-trump-adviser-carter-page/2017/04/11/620192ea-1e0e-11e7-ad74-3a742a6e93a7_story.html This story reads like a Trump/Putin.org press release.
I see the Trump Tower meeting and dossier as both agitprop initiatives/responses of the Putin/oligarchy attempts to compromise all and poison all wells. Coincidence that Akhmetshin and Veselnitskaya are tied to the birthings of both?
Ugh, the trope that the dossier was the basis for FISA warrants is just bullshit and needs to die. First off, the first warrant was long before the “dossier”. Secondly, if you know anything about the FISA app process, and court scrutiny, you know it is laughable that it was the basis for even the second warrant.
Twitter profile photo is of young Dzierżyński:
https://twitter.com/Guccifer2
Yup. Another level of the game.
I couldn’t get the Reply button to work for a comment upthread. Thanks for the info re: Adam Carter.
I think I miss a lot of the conversation by not being on Twitter. Playing catch up here.
pdaly,
How I get the comment section to work.
Also, a lot goes on on Twitter, and I’m also not on it, but you can still read people’s feeds [correct word?]. I regularly have about 15 tabs open. [@bmaz and @emptywheel are always the first I check in the AM] When there’s a new tweet, it will register on the tab. [ie: (2) means two new tweets.] Quite often it’s very frustrating not to be able to add my thoughts, but I just don’t think I’d be able to survive on there.