Tuesday, April 8, 2014

Attack of the week: OpenSSL Heartbleed

Ouch. (Logo from heartbleed.com)
I start every lecture in my security class by asking the students to give us any interesting security or crypto news they've seen recently, preferably with a focus on vulnerabilities. The start of my last class was pretty lame, which meant either (1) we'd finally learned how to make our crypto software secure, or (2) something terrible was about to happen.

I'll let you take a guess which one it was.

The news today is all about the Heartbleed bug in OpenSSL (nb: a truly awful name that makes me glad there are no other vital organs referenced in the TLS code). Unlike all the fancy crypto-related attacks we've seen recently -- BEAST, CRIME, Lucky13, etc. -- Heartbleed doesn't require any interesting crypto. In fact it's the result of a relatively mundane coding error. And predictably, this makes it more devastating than all of those fancy attacks put together.

I'm going to talk about Heartbleed using the 'fun' question/answer format I save for this kind of post.

What's Heartbleed and why should I care about OpenSSL?

In case you haven't read the Heartbleed website, go do that. Here I'll just give a quick overview.

Heartbleed is a surprisingly small bug in a piece of logic that relates to OpenSSL's implementation of the TLS 'heartbeat' mechanism. The bug is present in OpenSSL versions 1.0.1 through 1.0.1f (and not in other versions). Sadly, these versions have seen a great deal of adoption lately, because security professionals have been urging developers to implement more recent versions of TLS (1.1 and 1.2). Deployment of TLS 1.2 currently stands at about 30% of the SSL Pulse data set. Many of those servers are likely vulnerable.

The problem is fairly simple: there's a tiny vulnerability -- a simple missing bounds check -- in the code that handles TLS 'heartbeat' messages. By abusing this mechanism, an attacker can request that a running TLS server hand over a relatively large slice (up to 64KB) of its private memory space. Since this is the same memory space where OpenSSL also stores the server's private key material, an attacker can potentially obtain (a) long-term server private keys, (b) TLS session keys, (c) confidential data like passwords, (d) session ticket keys.

Alleged Yahoo user credentials visible due to Heartbleed (source: Mark Loman).
Any of the above may allow an attacker to decrypt ongoing TLS sessions or steal useful information. However item (a) above is by far the worst, since an attacker who obtains the server's main private keys can potentially decrypt past sessions (if made using the non-PFS RSA handshake) or impersonate the server going forward. Worst of all, the exploit leaves no trace.

You should care about this because -- whether you realize it or not -- a hell of a lot of the security infrastructure you rely on is dependent in some way on OpenSSL. This includes many of the websites that store your personal information. And for better or for worse, industry's reliance on OpenSSL is only increasing.

What's the remedy?

Unfortunately it's pretty nasty.

You can test if a given server is vulnerable using one of these tools (note: use at your own risk). Having identified a problem, the first step is to patch OpenSSL. Fortunately this is relatively easy. The 1.0.1g version is not vulnerable, and Debian has a patch. You can also recompile OpenSSL with the -DOPENSSL_NO_HEARTBEATS option.

Sadly, this is only the beginning. Since there's no way to tell whether a server has been exploited (and exploit code is now in the wild) you need to assume that it is. This means the safe move is to revoke your certificate and get a new one. Have fun.

What's the bug?

The TLS Heartbeat mechanism is designed to keep connections alive even when no data is being transmitted. Heartbeat messages sent by one peer contain random data and a payload length. The other peer is suppose to respond with a mirror of exactly the same data.

   When a HeartbeatRequest message is received and sending a
   HeartbeatResponse is not prohibited as described elsewhere in this
   document, the receiver MUST send a corresponding HeartbeatResponse
   message carrying an exact copy of the payload of the received
   HeartbeatRequest.

The data structure of the request looks like this. Note the two-byte payload length:

   struct {
      HeartbeatMessageType type;
      uint16 payload_length;
      opaque payload[HeartbeatMessage.payload_length];
      opaque padding[padding_length]; 
   } HeartbeatMessage;

Which brings us to the bug. The original bug was introduced in this Git commit. The code appears in different files (for DTLS and TLS). Here we'll look at the file t1_lib.c:

2412         /* Read type and payload length first */
2413         hbtype = *p++;
2414         n2s(p, payload);
2415         pl = p;
2417         if (s->msg_callback)
2418                 s->msg_callback(0, s->version, TLS1_RT_HEARTBEAT,
2419                         &s->s3->rrec.data[0], s->s3->rrec.length,
2420                         s, s->msg_callback_arg);
2422         if (hbtype == TLS1_HB_REQUEST)
2423                 {
2424                 unsigned char *buffer, *bp;
2425                 int r;
2427                 /* Allocate memory for the response, size is 1 bytes
2428                  * message type, plus 2 bytes payload length, plus
2429                  * payload, plus padding
2430                  */
2431                 buffer = OPENSSL_malloc(1 + 2 + payload + padding);
2432                 bp = buffer;
2433                 
2434                 /* Enter response type, length and copy payload */
2435                 *bp++ = TLS1_HB_RESPONSE;
2436                 s2n(payload, bp);
2437                 memcpy(bp, pl, payload);
2438                 
2439                 r = ssl3_write_bytes(s, TLS1_RT_HEARTBEAT, buffer, 3 + payload + padding);

As you can see, the incoming (adversarially-generated) data contains a payload length ("payload") which is trusted without bounds checks. OpenSSL then allocates a buffer for its response, and copies "payload" data bytes from the pointer "pl" into it. Unfortunately, there's no check to make sure that there are actually "payload" bytes in data, or that this is in bounds. Hence the attacker gets a slice of data from main memory -- one that's up to 64KB in length.

The fix is equally simple. Just add a bounds check:

+    /* Read type and payload length first */
+    if (1 + 2 + 16 > s->s3->rrec.length)
+        return 0; /* silently discard */
+    hbtype = *p++;
+    n2s(p, payload);
+    if (1 + 2 + payload + 16 > s->s3->rrec.length)
+        return 0; /* silently discard per RFC 6520 sec. 4 */
+    pl = p;
+

Wasn't that dull?

Come on guys, next time please: use a cache timing weakness in AES-GCM encryption. This whole 'effectively breaking OpenSSL by finding simple C coding bugs' isn't even challenging.

Should we rail against the OpenSSL developers for this?

Don't even think about it. The OpenSSL team, which is surprisingly small, has been given the task of maintaining the world's most popular TLS library. It's a hard job with essentially no pay. It involves taking other folks' code (as in the case of Heartbeat) and doing a best-possible job of reviewing it. Then you hope others will notice it and disclose it responsibly before disasters happen.

The OpenSSL developers have a pretty amazing record considering the amount of use this library gets and the quantity of legacy cruft and the number of platforms (over eighty!) they have to support. Maybe in the midst of patching their servers, some of the big companies that use OpenSSL will think of tossing them some real no-strings-attached funding so they can keep doing their job.

92 comments:

  1. Thanks for the explanation. I read somewhere this heartbeat stuff of OpenSSL isn't a widely used feature. If this were true then I wonder why is a rarely used feature enabled by default? Wouldn't it be easy -- and prudent -- to only enable features by default which are used most often? Just to reduce the attack surface.

    ReplyDelete
  2. I whole heartily agree with the last comment. Major companies that rely on the use of OpenSSL (and not one of the other implementations i.e. PolarSSL) should start to contribute financially to its development, maintenance, and evaluation! From all the outcomes of the heartbleed fiasco, this would be the best.

    ReplyDelete
    Replies
    1. OpenSSL should pull the plug of free commercial use, now that they have a case where they clearly were not up to the task of integrating a rather mundane software component in a secure fashion with the resources available to them.

      Not to mention that heartbeat services in general are bandwidth abuse.

      Delete
    2. I do not see the "clearly" part, neither the "not up to the task" one. I heard Microsoft had bugs (would you had guessed..?), Linux has exploits (NSA benefits), so why would you smear a team of dedicated professionals?

      Delete
    3. @Serban: don't taken each chunk by itself and misinterpret the original intent. I read that paragraph as the team can't do the right thing with the very, very limited resources they have.

      Delete
    4. JoeOvercoat, you missed the boat...go back to Microsoft.

      Delete
    5. @justaSQLguy: Serban was replying to JoeOvercoat, not to the OP.

      Delete
    6. I agree Joe. They should ask for money for commercial use coz they need some money to live, right

      Delete
  3. Thanks for all the great reporting and explanations!

    ReplyDelete
  4. "This whole 'effectively breaking OpenSSL by finding simple C coding bugs' isn't even challenging."

    Yet it took a few years to find...

    ReplyDelete
    Replies
    1. We don't know that it took that long to find this exploit. We only know that it took that long to find-and-admit-finding it.

      Delete
    2. You can bet that whatever hackers and government agencies have *not* done this before, they're doing it now. They are going to go over all that code with a fine comb to find more exploits.

      Seems to be a general problem with open source and crypto: The incentives and rewards for finding and using exploits are *much* higher than those for finding and publishing exploits.

      As an independent, I can probably sell any exploit for good money. As a hacker group, I can turn it into cash. As an intelligence service, that's what I do. As a security researcher, I get a pat on the shoulder, well done, thanks.

      This is the problem.

      Delete
    3. Do we really know it took a few years to find? Are we sure it wasn't already known by the bad guys like those hack-for-hire or NSA types? It took years to be come public but I really doubt it was previously completely unknown.

      Delete
    4. I agree on the comment on "This whole 'effectively breaking OpenSSL by finding simple C coding bugs' isn't even challenging." it is very easy now to make this statement, when somebody else has revealed the bug. remember, it is just one in among millions of lines. finding such errors has always been very hard and hence a challenging task for program analysis people.

      Delete
  5. Hopefully agnesaduboaten at yahoo.com is not getting to many emails due to this post.

    ReplyDelete
  6. Perhaps, heartbleed is how NSA was breaking SSL?

    http://blog.cryptographyengineering.com/2013/12/how-does-nsa-break-ssl.html

    ReplyDelete
  7. Why do we even need an explicit heartbeat record/protocol in TLS? Wouldn't sending zero-length fragments of application data (which is explicitly allowed by the RFC) essentially solve the same problem, e.g. firewalls dropping idle TCP/SSL connections?

    I can see that for DTLS, heartbeats can be used for path MTU discovery, but for regular, TCP-based TLS, I don't see any benefit.

    ReplyDelete
    Replies
    1. Yes. This.

      Shouldn't security-oriented protocols be based on minimalist designs? This doesn't strike me as at all minimalist.

      Delete
    2. I'm not convinced heartbeat is that useful as well, but the RFC does describe some scenarios:

      https://tools.ietf.org/html/rfc6520

      Some of those scenarios do require non-zero payload. Heartbeat is not only used for keep-alive.

      Delete
  8. When the machine breaks down, WE break down.
    -- Barnes, Platoon

    https://www.youtube.com/watch?v=6cyXO5tO6kw

    ReplyDelete
  9. Great explanation. Loved the code snippets. It took me back to when I was a systems programmer in straight C. K & R are my heroes ... but now I have revealed my age ;)

    ReplyDelete
    Replies
    1. And I know immediately who you mean by K&R. So I am just as old. Still have a copy of their book on my shelf, though after 25 years of C programming, I don't much need it.

      Delete
  10. NSA guys must be furious right now

    ReplyDelete
    Replies
    1. All apart from the guy who just won the "when will they spot this one?" betting pool ...

      Delete
  11. Excellent article. Is there any discussion regarding big companies starting funding OpenSSL?

    ReplyDelete
  12. Time to start using better tools for C/C++ software development, e.g http://msdn.microsoft.com/en-us/library/hh916382.aspx would have caught this.

    ReplyDelete
    Replies
    1. Well, IIRC Microsoft actually expanded their impl of the CRT to include non-standard function extensions for routines like strcpy_s, memcpy_s, etc that were intended to greatly reduce memory overrun issues by bound checking parameters. That, along with what you point out, are well-intentioned proposals, but it's a little hard to take MS seriously these days.

      Delete
    2. That's a ridiculous statement. This bug (as well as Apple's recent SSL bug) simply would not have made it through MS's dev process. What's hard to take seriously is hackers like yourself who seem to think your above making these kinds of errors while refusing to learn from MS who has taken this kind of bug very seriously for a long time.

      Delete
    3. Holy Smokes! Did you just both defend Microsoft AND use common sense in one paragraph?? Don't you know that's not allowed in the current technosphere??!

      Seriously, congrats. The incessant regurgitation that MS screws up absolutely everything is starting to wear thin. Something about those who ignore history and something about repeating it...

      Delete
    4. For the benefit of the MS fanboy above:

      Blaster, Sasser, Conficker, SQL Slammer, Stux on and on and on.

      For you to claim this defect wouldn't have made it through the MS dev process is completely blind and deaf to past history. Many, many security breaches have slipped by MS' control processes, and they will again.

      Look, I use MS dev tools every single day. They do an OK job. My broader issue with them is that they can't stay focused on a platform, and I will no longer invest financial & intellectual capital in the future to keep up with their race from Win32 to COM to .NET to Silverlight to Metro to God knows what next. They are like a dead, dying star, collapsing under their own bloated mass, desperately trying to push out offerings as fast as possible to sustain themselves. Their OS portfolio is going nowhere fast.

      It's not that I don't think MS doesn't take QC seriously, it's just that I don't want them driving standards & practices of languages and dev platforms. I give them credit for their "safe" extensions to the CRT, but what they should do is lobby to get these forms of safeguards through the standards committees instead of going it alone to try to drive revenue.

      Delete
  13. How about a Kickstarter to fund someone really good to do a full code review of the OpenSSL codebase?

    ReplyDelete
  14. Good guy Matthew Green. Bug discovered in OpenSSL, suggests funding.

    ReplyDelete
  15. Can someone explain the 1 + 2 + 16 stuff for me?

    ReplyDelete
    Replies
    1. All of that comes from RFC 6520:
      [...] the length of the type field is 1 byte, and the length of the payload_length is 2. [...]. The padding_length MUST be at least 16.

      Delete
    2. Thanks for the explanation as I was thinking why not just use 19

      Delete
  16. I use lint for my development, in particular PC-Lint by Gimpel.
    This bug and the "goto fail" bug are both trivial bugs found by lint.
    Doesn't any one use Lint any more?

    ReplyDelete
    Replies
    1. I fail to see how -- maybe I'm dumb.

      As I read the code, it seems the issue is not with the bp argument in the "memcpy(bp, pl, payload);" statement, rather its with pl.

      A static code analyzer would know bp can't be over-flowed by copying payload bytes to it because it can look at how it was allocated using payload -- no warnings/bugs there to warn about.

      However, I fail to see how it could tell with certainty that pl will be over-read. In fact, pl may not even be over-read (I didn't check to see how it was allocated), it may just be "under-filled". Moreover, the value of the variable (outside the scope of the method) controlling how much data was read into pl is unknown at compile time. Rather it is provided in the packet sent to the server (and ultimately stored in the length member of the SSL3_RECORD type). No static analyzer could tell that payload > rec.length (more accurately "s->s3->rrec.length - 19") to issue a warning.

      I suppose, though, a really good static analyzer may be able to see the entire code-flow and recognize that memcpy/read into the buffer to which pl points has acted with variables that may not equal payload and issue a warning....?

      That'd be a damn impressive static analyzer of a quality I'm unfamiliar with. Though, maybe valgrind would find it, lol.

      Delete
    2. an analyser can find that the copy's source may cross from the area occupied by one structure/variable into another area, which may belong to another variable or be unallocated. indeed that'd likely need dynamic analysis during a run, and/or advanced reasoning about how the two structures and length fields may interact to define the actual sizes.

      my girlfriend comments: "This advanced reasoning is also lacking in many of those commenting here." i'm a lucky girl (=

      Delete
    3. > In fact, pl may not even be over-read (I didn't check to see how it was allocated), it may just be "under-filled".

      oops, i forgot to account for this. it's true. my point about the interaction of length fields holds, but due to this, the reasoning has to find out that (as i described) areas may be crossed, or not properly initialised data may be read by the copy. trickier, but can be found still. even without the full abstract understanding of how the length fields define the possible/claimed lengths of the heartbeat structure.

      (if that over-read data is still in the SSL3_RECORD structure's allocation *and* was eg zero-initialised, then it'll be harder to detect, but then again this won't leak any interesting data if it always covered the full length of what can leak (nearly 64 KiB here).)

      Delete
    4. > if that over-read data is still in the SSL3_RECORD structure's allocation
      If you read the code more thoroughly, you'll see that it's never going to be within the SSL3_RECORD structure -- it's a separate buffer pointed at by the SSL3_RECORD's data member. Whether or not that buffer is allocated every time or pooled I didn't bother to dig deeply enough to tell. In either case, where the over-read data comes from is certainly not compile-time deterministic. Not that this matters to the static analyzer so much as to better answering the "what could be leaked" question.

      > *and* was eg zero-initialised
      If that were the case there'd be nothing to talk about here at all, lol -- if they zero-init'd the buffer every time before reading into it the only thing that could every be "leaked" would be extra zeros. Of course, they don't do this; presumably for performance reasons.

      > my girlfriend comments: "This advanced reasoning is also lacking in many of those commenting here." i'm a lucky girl (=
      I don't know if that's directed at me, others, or anything else -- I'm stuck on whether you're a lucky girl with a girlfriend or you made a typo, lol.

      Delete
    5. first point: yes, you're right. despite my error, the basic crosses allocation (over-read of area) and/or reads not initialised data (in allocation but can leak sensitive data from prev usage) remains. i think most of what i commented on reasoning holds, too. alas, i believe i'd noticed the pointing-to-buffer briefly before, but at least failed to keep that in mind.

      second point: yes, that's precisely what i wrote. "this won't leak any interesting data". and yes, for performance reasons it won't be done - it'd use a 64 KiB part to an allocation where only the actual length of the heartbeat packet's transported data would be needed, and needs initialisation .

      however, if you have plenty of address space and ideally fancy byte-granular memory protection for your process's memory, this is feasible. that'd be one measure (not perfect but something) that the software stack may take if the application's programming language is compiled to a vm's input. the vm implementation can easily provide fancy address space tricks.

      third point: her comment was directed at others' basic misunderstanding (eg thinking that this was an malloc-usage error when the inner length field claim and over-read by the copy is the exploit), and i guess retrospectively at my wrong assumption as well. (in her defence, i'd mostly only told her about this page.) so, in case you're the previous standard-language anon in this chain, not at you.

      and yes, we're both women and in a relationship with one another. almost seems rude to consider a typo next to the possibility of queer women with some understanding of program logic =p

      (obviously i hope to be stereotyped as making "a good impression" of my skills now that i acknowledged, you corrected me on the one point and restated something else i'd stated before.)

      Delete
    6. Well played, :-).

      Delete
  17. Very interesting, thanks. So organisations are renewing certificates all around the world - but there is nothing stopping them from using the same private key as before is there? Meaning you could still be compromised if an attacker has previously been able to obtain said private key before.

    Is there any way of telling (from the client side) whether the server's private key has been changed?

    ReplyDelete
    Replies
    1. You are not "choosing" private key when creating certificate. Renewing certificate means that you always get a new private key.

      Delete
    2. In Windows it is totally possible to renew a certificate using the same private key.

      What little I know about using OpenSSL is you can use the genrsa option to create a private key, from this you can generate a CSR using the req option.

      If you renew the certificate using the same key as before you are potentially still compromised???

      Delete
    3. AFAICT the best practice is to re-key the certificates, which will create new public/private key pairs for the certificate, and depending on the CA, may also cause revocation of the old cert.

      Delete
  18. Hello,

    I have a problem understanding the statement about vulnerable versions. Maybe it's the problem is that English is not my native language, but still...

    https://www.openssl.org/news/secadv_20140407.txt

    says "Only 1.0.1 and 1.0.2-beta releases of OpenSSL are affected including
    1.0.1f and 1.0.2-beta1."
    I don't understand that all version from 1.0.1 through 1.0.1f are vulnerable, but just 1.0.1 and 1.0.1f.

    More on this, I have 1.0.1e on a Debian 7.0.4 (installed as it come from the repository + nginx webserver)

    root@~# openssl version
    OpenSSL 1.0.1e 11 Feb 2013
    root@~# cat /etc/debian_version
    7.4

    Testing with check-ssl-hearbleed.pl (https://github.com/noxxi/p5-scripts/blob/master/check-ssl-heartbleed.pl) and ssltest.py shows that the server is most probably not vulnerable.

    Did I understand something wrong?

    Thanks!

    ReplyDelete
    Replies
    1. Călin, de vreme ce scrie versiunile 1.01 INCLUSIV 1.0.1f, înseamnă că le include pe toate pînă la F. Ceea ce înseamnă că şi versiunea E e vulnerabilă.

      As long as it writes "1.0.1 releases including 1.0.1f", it means that all versions are included. Which means that release E is also vulnerable.

      Delete
  19. Thanks for such a nice explanation !!

    I think that the Private Keys stored in FIPS 140-1/2 level 2 or 3 modules will remain protected and such servers do not need to get new Certificates.(though other information from memory may have leaked)

    Is my understanding correct?

    ReplyDelete
  20. This is not the first time OpenSSL team fails, they start to fail badly every 2 years or so now, and this last one is simply not using coding rules and/or reviews ... Yes errors do happen but a simple rule of boundary check is no surprise at all should they be telling devs every memory boundary should be checked before copy-move, this is very basic and simple to follow rule... And they have to check it. and yes someone financially good should fork the project do a good code review on it because nearly half of the servers are dependent on a handful devs which is no good ... not enforcing simple rules by say 10 people, now really makes half of the internet renew their ssl certs ... There is a really bad attitude in open source community : programming>engineering approach ...

    ReplyDelete
    Replies
    1. Why just blame OpenSSL Team which has done a great job.... FIPS had also validated OpenSSL Software... how this went through their review process?

      Delete
    2. I said errors do happen but a well documented c/c++ memory alloc-copy-move etc. problems should not be seen in a project this big.The problem is the attitude towards the software, if you are doing a security software you need to secure it first before implementing new stuff or racing for performance. I am pretty sure if they didn't neglect the "process" part of software engineering this wouldn't happen. This bug is worse than Windows-XP bug, in deed this is the biggest impact bug in software history... thx to what??? The earlier critical bugs were a sign to them but they wait the big hit like microsoft, will they do anything for the real cause, i dont see them doing-saying anything about that ... BTW validation does the validation nothing more ...

      Delete
    3. FIPS validation only checks the crypto routines, not the entire implementation. This bug is not part of the crypto package and so it was outside the scope of FIPS. Think about it - there's no way anyone could provide validation for all implementations.

      Delete
    4. This comment has been removed by the author.

      Delete
  21. I think OPENSSL_malloc should clean the memory after allocation, returning a piece of memory with nothing but 0's. It would be slower, but more secure.

    An OPENSSL_malloc_insecure could be added in addition, the name indicating the programmer should "read the manual" before using in an implementation.

    ReplyDelete
    Replies
    1. that doesn't affect this bug; the allocation size matches the actually used size. the entire allocation is initialised with data read from the received structure, whereever that currently resides. the problem is that the copy will read source data not actually belonging to the received structure, if that's shorter than the received length field indicates.

      Delete
    2. that's would not fix the bug! the problem is that in C/C++ you can read beyond a given struct

      Delete
  22. You say don't take it out on the OpenSSL team. But the bug was first known about two years ago.

    ReplyDelete
    Replies
    1. No, it was introduced two years ago.

      Delete
  23. Maybe if the payload buffer was named "payload" and the payload length was named "payload_len" the programmer would have said to himself, "Hey, maybe I should check the length before malloc'ing." Swapped-meaning names like "pl" and "payload" make it hard for reviewers to spot bugs like these.

    ReplyDelete
    Replies
    1. no. what you call payload_len is properly being checked by the code with bug. the problem is that this is read from only the received structure (as filled in by the other end of this heartbeat request). and that the length thus indicated in the heartbeat request may mismatch the length as indicated by an outer packet (which transports the heartbeat packet as its data payload).

      the outer packet length indicates the actual length that this heartbeat packet may take up. but the manipulated heartbeat packet claims in its own (inner packet) length field that it is longer than it can be (by the outer packet length specification). thus the fix's check doesn't trust only the inner packet length in the heartbeat structure; instead, it verifies that the length claimed by the heartbeat inner packet (set by sender) actually fits within the length defined by outer packet.

      Delete
    2. Exactly -- and if you really want to take shots at people, gaze towards the authors of the RFC. Having a length inside the heartbeat packet is a WTF of its own. In fact, the whole method upon which the extension is built is hokey, IMO. So, if you ask me, part of the blame lies on the protocol's designers.

      Delete
    3. RFC author and OpenSSL implementer are the same

      Delete
    4. > take shots at people, gaze towards the authors of
      > the RFC (who also wrote the iniital implementation):

      The RFC implements a function (HEARTBEAT) that SSL doesn't need. UDP path discovery is hardly as main stream and wouldn't be enabled by default in HTTPS. The only possible (and probably un-necessary) uses for this RFC have nothing to do with connection keep-alive.

      The RFC REQUIRES copying a variable-length data payload (with length stored in an internal 2-byte 64k max value field) sent by a protocol with a 16k maximum length message size. Gee, that's not asking for trouble.

      The RFC REQUIRES repeating back what was "sent in" (or the hidden stuff next to it). Why wouldn't a useful heartbeat require a newly generated signed message? What's the point of echoing next to a variable buffer with a brand-new unique bounds-check?!!

      The RFC REQUIRES DISCARDING whole packets when SOMETHING IS WRONG!!!

      The RFC REQUIRES DISCARDING extra information in EVERY SINGLE PACKET.

      Who the hell requires hiding/discarding/ignoring information in an RFC? Particularly a security related RFC?

      I should go on, but I'm already at overload. RFC 6520 is wrong in every way and an obvious plant in retrospect. I suggest a new RFC to simply force-remove everything about RFC 6520. If UDP needs PATH discovery and/or ECHO write a separate extension for that!!!

      Delete
  24. dont you think that an equally efficent,but more secure in the sense more easy to program language, should be found now days? C is very prone to this type of errors.. the problem is unchecked memory access... language based on VM as Java or C# simply dont allow these trivial but very easy to make errors..

    ReplyDelete
  25. People still write security software in C?

    ReplyDelete
  26. For me this raises more questions than it answers.

    Why is this function accepting a "payload length" value at all? Don't network I/O functions tell you how many bytes they passed to your buffer when you read the data?

    Isn't the OPENSSL_malloc() function grabbing old, i.e., unused memory? Why wasn't that memory securely zeroed before being freed? Why isn't OPENSSL_malloc() zeroing the memory before passing the pointer back to the caller?

    ReplyDelete
    Replies
    1. yes, on your first point. this protocol (!) design is a cause of the problem. and as others have pointed out, this double-length-specification and trusting only the sender-supplied length is the cause of the problem.

      no on your second set of questions. zeroing allocated memory in malloc or to-be-freed memory in free will *not* fix this bug. (slightly less data may leak especially with zero-before-free, but it's still about as severe an exploit.)

      the buffer allocated in that function is completely filled (written): the problem is that it is written, by the copy, from a source which may stretch beyond the sender-supplied heartbeat payload data (read beyond that buffer). and that it may thus copy other data from the heap, both from currently free or allocated areas.

      Delete
    2. Yes, I see what you mean. The issue isn't with malloc() allocating too much space, but with memcpy() copying too much data. It's the classic problem that there's no real way to tell where a memory buffer ends, so the caller is responsible for checking the bounds. An example of a buffer "underflow" attack, as it were.

      Thanks for clearing that up.

      Delete
  27. Why isn't OPENSSL_free() zeroing the memory before returning it to the heap.
    this would fix this kind of bugs from the ground

    ReplyDelete
    Replies
    1. no, it wouldn't. it may lessen the amount of data that will leak, but the read-over-buffer in the copy that's at the /heart/ of this matter isn't /beat/ by your suggestion. this read can still leak sensitive data in other current *allocations* on the heap or stack, whereever that buffer resides.

      (and don't even think of suggesting better malloc or better malloc usage, also a common misunderstanding. the malloc in this code is perfectly correct. it's plainly wrong to at all be allocating, filling, and returning a payload of the alleged size when a shorter payload was actually supplied by the sender in their malformed heartbeat.)

      Delete
  28. The concept of "security" and "open source" is an oxymoron. I encourage any interested readers to reread Ken Thompson's 1983 Turing Award lecture. No stored-program architecture can be secure.

    If I was a manager in the NSA, and I had the funds and people that the NSA has had for YEARS, one of the first things I would do is insinuate my guys into the open-source community. I would task them to use absolutely legitimate processes to install undetectable back-doors in as many "security" protocols as possible.

    I will not be surprised if we learn that heartbleed was NOT accidental. In my opinion it is likely to be just one of several or even many.

    ReplyDelete
    Replies
    1. oops, meant to put this as reply here: http://blog.cryptographyengineering.com/2014/04/attack-of-week-openssl-heartbleed.html?showComment=1397239232227&m=1#c511601402816295200

      Delete
  29. obviously "heartbleed", this bug, is not and was not "undetectable".

    thus the hint about needed funding and review. (i think that fine-grained tests may possibly have found this bug as well, if a test happened to enter an exploit-like set up to the function and expected an error return for that.)

    to mention, funding and payment to work on code _is_ compatible to open source and to free software (software-libre). the service of paid work related to software doesn't necessitate proprietary software. further, as has been demonstrated generally, a software-libre project can be run to pick and choose, both developers and contributions.

    (and with a governmental agency kind of sway, hardware backdoors yield better results in any case. and generally need work beyond software review to detect, and of course may be unfixable for a certain architecture even if suspected or detected. thus, it's easiest to remove your mobiles from your person when going for conspiratorial walks.)

    ReplyDelete
    Replies
    1. is a reply to the previous sub-thread, http://blog.cryptographyengineering.com/2014/04/attack-of-week-openssl-heartbleed.html?showComment=1397221114471&m=1#c8527366691738074657

      Delete
  30. Keep writing these, This was a great read. I'm sure you spend plenty of time on this article. Cheers man!

    ReplyDelete
  31. Wouldn't a simple memset to 0 of the allocated response buffer also prevent any information from being leaked?

    ReplyDelete
    Replies
    1. Nevermind. The problem is that pl only has data which is less than payload.

      memcpy(bp, pl, payload);

      Delete
    2. the problem is *not* the data you find when you allocate the response buffer. The problem is that you read beyond the hb request struct. Clearing the response buffer doesn't help as you'll fill such buffer with data coming from the hb struct and what resided beyond the struct itself

      Delete
  32. Don't you know if there is any possibility that password manager I am using (http://www.stickypassword.com/) has been affected? Of course they ensured me my data are safe but maybe there can be a chance...

    ReplyDelete
  33. First, thank you for the article! I have a few comments/questions. Admittedly, I have not the time to read the source code or the RFC other than what is here and other articles, which is why I'm asking.

    First, the check 1 + 2 + 16 > s->s3->rrec.length seems superfluous if payload is defined as uint16 as in the struct. Since if the first check fails (returns 0) so will the second regardless of the value of payload.

    Is it a fair assessment of the threat scenario that an intruder will send a malformed heartbeat as a first packet of the communication with the server (or otherwise most likely the socket memory copied from will be from his own previous packet data) and by chance land on sensitive data from other users? It's hard for me to imagine that server certificate keys can be stored as a query string like in the example right next to the socket data.

    Lastly, I don't know how the padding length is defined (or read) but unless the OPENSSL_malloc call does some zeroing, allocating payload+padding space but copying only payload space seems like still a security risk (the whole buffer is written to the socket), especially if padding can be much larger than 16.

    ReplyDelete
  34. In the last days, everybody talks about Heartbleed bug... I read someplace that users passwords of Sticky Password not affected by Heartbleed... what do you think about it? Is really safe when I change my password in this vulnerable time? I mean, is possible to "lost" again my passwords while I change?

    ReplyDelete
  35. With my team, at TrustInSoft, we have just performed a mathematical validation of an open source SSL stack. This SSL stack is now immune to security flaws similar to heartbleed. I encourage you to have a look at formal methods and see how they can help to build a more reliable internet: http://trust-in-soft.com/no-more-heartbleed

    ReplyDelete
  36. 977linux... A Ph.D computer Science major from Deutschland who has coded bug fixes in the past for openSSL, drafted RFC6520 heartbeat and cleverly prefixed the SSL private key with the heartbeat payload to render it visible via classic buffer overrun which is ubiquitous to junior coders. Those conditions above beg one to pose the question, how one can treat the coding for RFC6520 in openSSL like a first semester programming class of adding two numbers to print the sum? the answer is simple. It was designed to behave precisely as it manifested and it was a sneaky try to backdoor all SSL traffic to be visible to the 5 Eyes monster. Next question is. When was Seggelmann was ordered by Bundesnachrichtendienst whose was ordered by NSA to introduce RFC6520 and the coding to implement it which is useless anyway?

    ReplyDelete
  37. Why such a delicate and important library uses only simple plain C structs instead of OOP container object? Instead of using C structs we could use C++ objects with appropriate getters/setters, implement boundary checks, and *not* let the remote party to declare the length of the data coming in... Much much much simpler would be to impose the hearbeat payload data to 128bit which is long enough to detect if the remote server is alive or not...

    ReplyDelete
    Replies
    1. the bug was obviously placed on purpose... 1) an heartbeat protocol isn't concern of a crypto library and 2) it seems the classic error that every rookie programmer does.

      Delete
  38. simple programming rules:
    #1 never trust the client data - validate what client tells you
    #2 never have same thing in 2 different places
    ->result #2 protocol packets should not declare its size, #1 packet size should be validated... you dont need funding for that, i guess it is most likely "error for funding"...

    ReplyDelete
    Replies
    1. #1 is a very good principle for programming.
      #2 is not always possible: How should the recepient of a heartbeat message determine the end of the payload and the start of the padding without an explicit length field?

      Delete
    2. well.. there are many ways to enforce that... fixed field lengths, checksums and so on... however, if the remote party can request up to 2^16 bytes (as in the heartbeat ssl protocol) expect that! (as a worst scenario). The heartbeat struct should have been 64kb long irrespective of the actual user payload size.

      Delete
  39. The latest Heart Bleed has affected a large number of web services. I had a very good discussion on Quora. I would like to share some interesting points that may help in solving this issue.
    • Assess the data
    • Coordinate with others facing the same bug
    • Gather Information
    • Re-issue SSL certificates
    • Re-set confidential information
    Source: http://www.cloudways.com/blog/how-to-fix-openssl-heartbleed-bug/

    ReplyDelete
  40. Some time ago, a vulnerability was revealed in OpenSSL, and I guess there's no programmer who hasn't been talking about it since then. I knew that PVS-Studio could not catch the bug leading to this particular vulnerability, so I saw no reason for writing about OpenSSL. Besides, quite a lot of articles have been published on the subject recently. However, I received a pile of e-mails, people wanting to know if PVS-Studio could detect that bug. So I had to give in and write this article: http://www.viva64.com/en/b/0250/

    ReplyDelete
  41. Very interesting. Thanks for all the info. I will be able to hack millions now.

    ReplyDelete