A tale of two patches

Nick Mathewson over at the Tor project points me to the latest vulnerability in OpenSSL’s implementation of TLS and DTLS (though only in some OpenSSL versions, YMMV). For those of you keeping score at home, this makes two serious flaws in the same small chunk of DTLS code in about four months.

I should note that Tor doesn’t use DTLS. In fact, based on a quick Google search, there may be more people with radioactive-spider-based superpowers than there are projects actually using OpenSSL DTLS (these folks being one possible exception). Moreover, it’s not clear that this bug (unlike the previous one) is useful for anything more than simple DoS.

But that doesn’t make it useless. Even if these things aren’t exploitable, bugs like this one provide us with the opportunity to learn something about how even ‘simple’ decryption code can go wrong. And that’s what we’re going to do in this post.

So let’s go take a look at the unpatched code in question.

This code handles the decryption of DTLS packets, and comes in a slightly older OpenSSL version, v1.0.0e (release 9/2011). To give you a flavor of how one could miss the first (less serious) issue, I was going to reproduce the entire 161-line megafunction in which the bug occurs. But it’s just too huge and awkward. Here’s the relevant bit:

int dtls1_enc(SSL *s, int send)
{
SSL3_RECORD *rec;
EVP_CIPHER_CTX *ds;
unsigned long l;
int bs,i,ii,j,k,n=0;
const EVP_CIPHER *enc; 

        
if ((bs != 1) && !send)
  {
ii=i=rec->data[l-1]; /* padding_length */
i++;
if (s->options&SSL_OP_TLS_BLOCK_PADDING_BUG)
   {
/* First packet is even in size, so check */
if ((memcmp(s->s3->read_sequence,
                            “\0\0\0\0\0\0\0\0”,8) == 0) && !(ii & 1))
s->s3->flags|=TLS1_FLAGS_TLS_PADDING_BUG;
if (s->s3->flags & TLS1_FLAGS_TLS_PADDING_BUG)
i–;
   }
/* TLS 1.0 does not bound the number of padding bytes by the block size.
* All of them must have value ‘padding_length’. */
if (i > (int)rec->length)
   {
/* Incorrect padding. SSLerr() and ssl3_alert are done
* by caller: we don’t want to reveal whether this is
* a decryption error or a MAC verification failure
* (see http://www.openssl.org/~bodo/tls-cbc.txt
*/
return -1;
   }
for (j=(int)(l-i); j<(int)l; j++)
   {
if (rec->data[j] != ii)
      {
/* Incorrect padding */
return -1;
      }
   }
rec->length-=i;
            
rec->data += bs;    /* skip the implicit IV */
rec->input += bs;
rec->length -= bs;
   }
 }
return(1);
}

Now I could waste this whole post ranting about code quality, pointing out that variable names like ‘l’ and ‘bs’ were cute when Kernighan and his buddies were trying to get AWK small enough so they could write it down and type it in every day (or whatever they did for storage back then), but not so cute in 2012 inside of the most widely-used crypto library on earth.

I could also ask why this function is named “dtls1_enc” when in fact it actually contains decryption code as well as encryption code. I might even suggest that combining both operations into one massive, confusing function might have sounded like a better idea than it actually is.

But I’m not going to do all that, since that would be dull and the point of this post is to find a bug. So, do you see it? Hint: it’s in the boldfaced part.

Maybe I should give you a little more. Every DTLS packet begins with an ‘implicit’ Initialization Vector of length ‘bs’. The packet also comes with a bit of padding at the end, to bring the packet up to an even multiple of the cipher’s block size. Let ‘Mx’ represent message bytes and let ‘LL’ be the number of padding bytes expressed as a byte. Then the tail of the packet will look something like:

0x M1 M2 M3 M4 LL LL LL LL LL LL LL LL LL LL LL LL

An important note is that TLS 1.0 allows up to 255 padding bytes even if the blocksize is much less than that. So the last twenty-something lines of the boldfaced decryption code are basically checking the last byte of the decrypted message, then working backward through (possibly) multiple blocks of ‘bs’ bytes each, in order to strip off all of that ugly padding.

Do you see it now?

Ok, I’ll spoil it for you. The entirety of the bug comes in the statement “if (i > (int)rec->length)”, which ensures that the message is as long as the padding claims to be. What this doesn’t do is check that the Initialization Vector is there as well (update: I’ve edited this post a bit, since I got confused the first time!). Because of the missing check it’s possible to submit a short message that consists of only padding, in which case rec->length will be less than ‘bs’. At which point we flow through and hit this statement:

rec->length -= bs;

Oops. See what happens there? It might be helpful if I told you that rec->length is an unsigned long. So predictable consequences.

The OpenSSL folks describe this as a DoS vulnerability, not an exploitable one. I’m perfectly happy to take them at their word. But, yuck. This sort of stuff happens, I realize, but should it have to?

Since you’re still reading this, you’re obviously some kind of glutton for punishment. That probably means you’re going to stick around for the second (and somewhat older) issue, which was patched back in January. In fact, I wrote about it on this blog. To get there, you have to climb one level up to the caller of the function we just looked at.

This is easier said than done since — this being OpenSSL — of course the previous function isn’t called directly, it’s accessed via a function pointer! Oh lord. Ah, here we are:

static int
dtls1_process_record(SSL *s)
{


/* decrypt in place in ‘rr->input’ */
rr->data=rr->input;


enc_err = s->method->ssl3_enc->enc(s,0);  <<<<****** Call to the previous function!
if (enc_err <= 0)
 {
/* decryption failed, silently discard message */
if (enc_err < 0)
  {
rr->length = 0;
s->packet_length = 0;
  }
goto err;  <<<<****** Send an error message
}

/* check the MAC for rr->input (it’s in mac_size bytes at the tail) */
if (rr->length < mac_size)
  {
#if 0 /* OK only for stream ciphers */
al=SSL_AD_DECODE_ERROR;
SSLerr(SSL_F_DTLS1_PROCESS_RECORD,SSL_R_LENGTH_TOO_SHORT);
goto f_err;
#else
goto err;
#endif
  }
rr->length-=mac_size;
i=s->method->ssl3_enc->mac(s,md,0);
if (i data[rr->length]),mac_size) != 0)
  {
goto err;  <<<<****** Send an error message
  }
}
}
See it? Oh, maybe it will help if we revisit one of the bigger comments in the first function, the one that gets called:
 

/* Incorrect padding. SSLerr() and ssl3_alert are done
* by caller: we don’t want to reveal whether this is
* a decryption error or a MAC verification failure
* (see http://www.openssl.org/~bodo/tls-cbc.txt
*/
return -1;

Does that help? The people who wrote the first function knew full well how dangerous it is to reveal whether you’re failing due to a padding verification error or a MAC verification error. They even provided a link to the reason why!

Unfortunately the people writing the one-level up code didn’t understand the full implications of this, or didn’t realize that the information could leak out even if you don’t reveal exactly why you’ve barfed. People can learn why you failed even if there’s just a timing difference in when you give them the error code! This was known and mitigated in the TLS code, but not in the DTLS version.

And the end result is that there are two error conditions, occurring a significant distance from one another in the code. An attacker can monitor the decryption of chosen packets and use this information to slowly decrypt any plaintext. Ouch.

So what does it all mean?

You wanted meaning? I just wanted to look at some code.

But if I was to assign some arbitrary meaning to these mistakes, it’s that cryptographic code is frigging hard to write. And that more eyes does not necessarily mean fewer bugs, particularly if the code is messy and you have two versions doing substantially similar things (DTLS and TLS). It doesn’t help if your coding standard was basically invented on a dare.

OpenSSL has gotten to where it is today not for being a great library, but for basically being the library. And it’s certainly a good one, now that it’s been subjected to several years of unrelenting scrutiny. But the DTLS code is new and hasn’t received that same level of attention, which is why it’s fun to watch the bugs get shaken out all over again (unfortunately the first bug also appears in TLS code, which is harder to explain). I’m sure we haven’t seen the last of it.