On Gauss

If you pay attention to this sort of thing, you’ve probably heard about the new state-sponsored malware that’s making the rounds in the Middle East. It’s called ‘Gauss’, and like its big brother Flame, it was discovered by Kaspersky Labs (analysis at the link).

I don’t have much to say about Gauss that hasn’t been covered elsewhere. Still, for those who don’t follow this stuff routinely, I thought I might describe a couple of the neat things we’ve learned about it.

Here’s the nutshell summary: Gauss is your basic run-of-the-mill government-issued malware, highly modularized and linked to the same C&C infrastructure that Flame used. It seems mainly focused on capturing banking data, but (as I’ll mention in a second) it may do other things. Unlike Flame, there’s no evidence that Gauss uses colliding MD5 certificates to get itself onto a host system. Though, in fairness, we may not yet have the complete picture at this point.

So if there are no colliding certificates, what’s interesting about Gauss? So far as I can tell, only two things. First, it installs a mystery font. Second — and far more interesting — it contains an encrypted payload.

Palida Narrow. Every Gauss-infected system gets set up with a new font called Palida Narrow, which appears to be a custom-generated variant of Lucida Bright with some unusual glyphs in it. From Kaspersky’s report:

[Gauss] creates a new TrueType font file “%SystemRoot%\fonts\pldnrfn.ttf” (62 668 bytes long) from a template and using randomized data from the ShutdownInterval key.

Now this looks exciting! Unfortunately Kaspersky has not explained how the randomization works, or indeed if the data is truly random. This leaves us with nothing to do but speculate.

And plenty of folks have. Theories range from the practical (remote host detection) to the slightly wild (on-site vulnerability fuzzing). My favorite is the speculation that Palida is used to steganographically fingerprint the author of certain printed materials. While this theory is almost certainly wrong, it’s not completely nuts, and even has some precedent in the research literature.

The ‘Godel’ payload. For all the excitement about fonts, the big news of Gauss is the presence of an encrypted module called ‘Godel’.

Godel should be setting your hair on fire, if only because it attempts to replicate itself via a vulnerability in the code that Windows uses to handle USB sticks. This the very same vector that Stuxnet used to infect the air-gapped centrifuge controllers at Natanz. It’s a good indicator that Godel is targeted at a similarly air-gapped system.

Of course the question is: which system? Godel goes to great lengths to ensure that we don’t know.

Presumably the designers made this decision based on some bitter experience with Stuxnet, which didn’t protect its code at all. The result in Stuxnet’s case was that researchers quickly decompiled the payload and identified the parameters that it looked for in a target systems. Somebody — presumably Stuxnet’s handlers — were unhappy about this: on July 15, 2010 a distributed denial of service attack crippled the industrial control mailing listservs where this was being discussed.

To avoid a repeat of this episode, Gauss’s designers chose to encrypt the Godel payload under a key derived from a specific configuration on the targeted computer.

The details can be found in this Kaspersky post. To make a long story short, Godel derives an encryption key by repeatedly MD5 hashing a series of (salted) executable filenames and paths located on the target system. Only a valid entry will unlock the program sections, allowing Godel to do its job.

gauss
Example of the salted file path/executable name pair. From Kaspersky.

The key derivation process is performed using 1000 10,000 iterations of MD5. The resulting key is fed to  RC4. While the use of RC4 and MD5 may seem a little bit archaic (c’mon guys, get with the 21st century!), it likely reflects a decision to use the Microsoft CryptoAPI across a broad range of Windows versions rather than some sort cryptographic retro-fetish.

The real question is: how well does it work?

Probably very well, with caveats. Kaspersky says they’re looking for a world-class cryptographer to help them crack the code. What they should really be looking for is someone with a world-class GPU.

As best I can see, the only limitation of Gauss’s approach is that the designers should have used a more time-intensive function to derive their keys. 1000 10,000 iterations of MD5 sounds like a lot, but really it isn’t; not in a world with efficient GPUs that you can rent. This code won’t be broken based on weaknesses in RC4 or MD5. It will be broken by exhaustively searching the file path/name space using a GPU (or even FPGA)-based system, or possibly just getting lucky.

No doubt Kaspersky is working on such a project right now. If we learn anything more about the mystery of Godel, it will almost certainly come from that work.

Flame, certificates, collisions. Oh my.

Update 6/6: Microsoft has given us more details on what’s going on here. If these collision attacks are truly novel, this tells us a lot about who worked on Flame and how important it is. 

Update 6/7: Yes, it’s officially a novel MD5 collision attack, with the implication that top cryptographers were involved in Flame’s creation. We truly are living in the future.

See detailed updates (and a timeline) at the bottom of this post.

If you pay attention to these things, you’ve heard that there’s a new piece of government-issued malware making the rounds. It’s called Flame (or sometimes Skywiper), and it’s basically the most sophisticated — or most bloated — piece of malicious software ever devised. Kaspersky gets the credit for identifying Flame, but the task of tearing it to pieces has fallen to a whole bunch of different people.

Normally I don’t get too excited about malware, cause, well, that kind of thing is somebody else’s problem. But Flame is special: if reports are correct, this is the first piece of malware ever to take advantage of an MD5 certificate collision to enable code signing. Actually, it may be the first real anything to use MD5 certificate collisions. Neat stuff.

What we know is pretty sketchy, and is based entirely on the information-sparse updates coming from Microsoft’s security team. Here’s the story so far:

Sometime yesterday (June 3), Microsoft released an urgent security advisory warning administrators to revoke two Intermediate certificates hanging off of the Microsoft root cert. These belong to the Terminal Services Licensing service. Note that while these are real certificates — with keys and everything — they actually have nothing to do with security: their sole purpose was to authorize, say, 20 seats on your Terminal Services machine.

Despite the fact that these certs don’t serve any real security purpose, and shouldn’t be confused with anything that matters, Microsoft hung them off of the same root as the real certificates it uses for, well, important stuff. Stuff like signing Windows Update packages. You see where this is going?

Actually, this shouldn’t have been a real problem, because these licensing certs (and any new certificates beneath them) should have been recognizably different from code-signing certificates. Unfortunately, someone in Product Licensing appears to have really screwed the pooch. These certificates were used to generate Licensing certificates for customers. And each of those certificates had the code signing bit set.

The result? Anyone who paid for a Terminal Services license could (apparently) sign code as if they were Microsoft. If you’re wondering what that looks like, it looks like this.

This is obviously a bad thing. For many reasons. Mikko Hypponen points out that many organizations whitelist software signing by Microsoft, so even if institutions were being paranoid, this malware basically had carte blanche.

Ok, so far this is really bad, but just a garden-variety screwup. I mean, a huge one, to be sure. But nothing really interesting.

Here’s where that changes. Just today, Microsoft released a new update. This is the new piece:

The Flame malware used a cryptographic collision attack in combination with the terminal server licensing service certificates to sign code as if it came from Microsoft. However, code-signing without performing a collision is also possible.

Now, I’m not sure why the ‘collision attack’ is necessary if code-signing is possible without a collision. But who cares! A collision attack! In the wild! On top-secret government-issued Malware! Dan Brown couldn’t hold a candle to this.

If we look at the Licensing certificates in question, we do indeed see that at least one — created in 2009 — uses the MD5 hash function, something everyone knows is a bad idea:

Certificate:    Data:        Version: 3 (0x2)        Serial Number:            3a:ab:11:de:e5:2f:1b:19:d0:56    Signature Algorithm: md5WithRSAEncryption        Issuer: OU=Copyright (c) 1997 Microsoft Corp., OU=Microsoft Corporation, CN=Microsoft Root Authority        Validity            Not Before: Dec 10 01:55:35 2009 GMT            Not After : Oct 23 08:00:00 2016 GMT        Subject: C=US, ST=Washington, L=Redmond, O=Microsoft Corporation, OU=Copyright (c) 1999 Microsoft Corp., CN=Microsoft Enforced Licensing Intermediate PCA        Subject Public Key Info:            Public Key Algorithm: rsaEncryption                Public-Key: (2048 bit)

And so… Well, I don’t really know. That’s where we run out of information. And end up with a lot of questions.

For one thing, why did the Flame authors need a collision? What extra capabilities did it get them? (Update: the best theory I’ve heard comes from Nate Lawson, who theorizes that the collision might have allowed the attackers to hide their identity. Update 6/6: The real reason has to do with certificate extensions and compatibility with more recent Windows versions. See the end of this post.)

More importantly, what kind of resources would it take to find the necessary MD5 collision? That would tell us a lot about the capabilities of the people government contractors who write malware like this. In 2008 it took one day on a supercomputer, i.e., several hundred PS3s linked together.

And just out of curiosity, what in the world was Microsoft doing issuing MD5-based certificates in 2009? I mean, the code signing bit is disastrous, but at least that’s a relatively subtle issue — I can understand how someone might have missed it. But making MD5 certificates one year after they were definitively shown to be insecure, that’s hard to excuse. Lastly, why do licensing certificates even need to involve a Certificate Signing Request at all, which is the vector you’d use for a collision attack?

I hope that we’ll learn the answers to these questions over the next couple of days. In the mean time, it’s a fascinating time to be in security. If not, unfortunately, a very good time for anyone else.

***

Update 6/5: For those who aren’t familiar with certificate collision attacks, I should clear up the basic premise. Most Certificate Authorities sign a Certificate Signing Request (CSR) issued by a possibly untrustworthy customer. The idea of an MD5 collision-finding attack is to come up with two different CSRs that hash to the same thing. One would be legitimate and might contain your Terminal Services licensing number. The other would contain… something else. By signing the first CSR, Microsoft would be implicitly creating a certificate on the second batch of information.

Finding collisions is a tricky process, since it requires you to muck with the bits of the public key embedded in the certificate (see this paper for more details). Also, some CAs embed a random serial number into the certificate, which really messes with the attack. Microsoft did not.

Finally, some have noticed that Microsoft is still using a root certificate based on MD5, one that doesn’t expire until 2020, and hasn’t been revoked. What makes this ok? The simple answer is that it’s not ok. However, Microsoft probably does not sign arbitrary CSRs with that root certificate, meaning that collision attacks are not viable against it. It’s only the Intermediate certs, the ones that actually sign untrusted CSRs you need to worry about — today.

***

Update 6/6: Microsoft has finally given us some red meat. Short summary: the Terminal Services Licensing certs do work to sign code on versions of Windows prior to Vista, with no collision attack needed. All in all, that’s a heck of a bad thing. But it seems that more recent versions of Windows objected to the particular X.509 extensions that were in the TS licensing certs. The government (or their contractors) therefore used a collision attack to make a certificate that did not have these extensions. From what I’m reading on mailing lists, the collision appears to be “new, and of scientific interest”.

If this pans out, it’s a big deal. Normally the government doesn’t blow a truly sophisticated cryptanalytic attack on some minor spying effort. Getting MD5 collisions up and running is a big effort; it’s not something you can do from a cookbook. But developing novel collision techniques is a step beyond that. I take this to mean that Flame’s authors were breaking out the big guns, which tells us something about its importance to our government. It also tells us that the creation of Flame may have involved some of the top mathematicians and cryptographers in (our?) intelligence services. This is an important datapoint.

***

Update 6/7: It looks like it does pan out. Marc Stevens — one of the authors of the earlier MD5 certificate collision papers — confirms that Flame’s certificate uses a novel collision-finding technique.

“Flame uses a completely new variant of a ‘chosen prefix collision attack’ to impersonate a legitimate security update from Microsoft. The design of this new variant required world-class cryptanalysis”

We can only speculate about why this would be necessary, but a good guess is that Flame is old (or, at very least, the crypto techniques are). If the collision technique was in the works prior to 2009 (and why wouldn’t it be?), Stevens et al. wouldn’t have had much relevance. For those who like details, a (very incomplete) timeline looks something like this:

  1. Mid-1990s: MD5 is effectively obsolete. Switch to a better hash function!
  2. August 2004: Wang and Yu present first (‘random’), manually-found collision on MD5.
  3. 2004-2007: Collisions extended to ‘meaningful’ documents. Lots of stuff omitted here.
  4. May 2007: (CWI) Stevens et al. publish a first impractical technique for colliding certificates.
  5. Late 2008: (CWI) Stevens et al. develop a practical certificate collision.
  6. December 2008: Vendors notified.
  7. Feburary 2009: Paper submitted to CRYPTO.
  8. Summer 2009: Source code released.

Even assuming that CWI team had perfect security with their result, conference program committees are hardly built for secrecy — at least, not from goverments. So it’s reasonable to assume that the Stevens et al. technique was known by February 2009, possibly earlier. Which means that the Flame developers may have had their own approach under development sometime before that date.

The really interesting question is: when did secret collision research get ahead of the public, academic stuff? Post-2007? Post-2004? Pre-2004? I doubt we’ll ever know the answer to this question. I sure would love to.

Update 6/12: Alex Sotirov has a great Summercon presentation with many of the details. The word from those who know is: don’t take his $20,000 figure too literally. We know nothing about the techniques used to find this collision.