We found a heap memory corruption vulnerability in Dovecot, the world’s most widely deployed email server implementation. Dovecot has a market share of about 75%, with ~5 million live hosts. The vulnerability can be triggered remotely, does not require authentication on the target server, and has been present for at least 30 years. Like so many others, this vulnerability results from a relatively simple string manipulation error in C.
The vulnerability was publicly disclosed in August 2019. It has been fixed in Dovecot version 2.3.7.2 and Pigeonhole version 0.5.7.2. The vulnerability is server-side, which means only your email provider needs to patch their software. Dovecot awarded us a $5k bounty on HackerOne for this vulnerability; it is rated 9.8/10 in severity by the NVD. This project was joint work with Rafi Rubin.
We found CVE-2019-11500 by fuzzing IMAP sessions with AFL. Our fuzzing sessions were run on a machine with 48 cores and 396GB RAM. We modified both AFL and Dovecot to build a workable and efficient fuzzing pipeline. Each fuzzing thread’s mail directory was mounted on a RAM disk that reset after each execution so that IMAP sessions were deterministic. For improved performance, we piped the output of the fuzzer directly to the IMAP parser to bypass the networking layer.
Part 1: The Vulnerability
The bug behind the vulnerability is a relatively simple memory error related to handling strings. C commonly represents both strings and binary data with the char* type. Strings are null-terminated by convention, while binary data (which might contain embedded null bytes) typically requires a programmer to maintain a separate length variable. In the code below, the data variable is treated as a string in some contexts, but is loaded as simple binary data with no check for internal null bytes. This conflation, as we’ll see, results in an exploitable vulnerability.
This function scans the data array, and if an escape character (backslash) is found, it saves the index of the first escape character found into the str_first_escape field for later use.
The key here is that if an escape character is present in the data array after a null byte, then the saved str_first_escape index will be set to a value larger than the actual length of the data array, if and when it is actually treated as a string.
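The effect described above can be reproduced with a small, self-contained model. This is a simplified sketch and not Dovecot's source: the names toy_parser, toy_scan_quoted and toy_escape_past_terminator are hypothetical, and the loop keeps only the behaviour discussed here (record the first backslash, never reject an embedded null byte).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified parser state; only the field discussed above. */
struct toy_parser {
    ptrdiff_t str_first_escape; /* -1 until a backslash has been seen */
};

/* Scan `size` raw bytes of a quoted-string body and remember the index of
 * the first backslash. Returns the index of the closing quote, or the index
 * where scanning stopped if no closing quote was found.
 * Note: nothing here rejects an embedded '\0'. */
size_t toy_scan_quoted(struct toy_parser *p,
                       const unsigned char *data, size_t size)
{
    size_t i;

    assert(p != NULL && data != NULL);
    for (i = 0; i < size; i++) {
        if (data[i] == '"')
            break;                      /* closing quote ends the string */
        if (data[i] == '\\') {
            if (i + 1 == size)
                break;                  /* lone trailing backslash */
            if (p->str_first_escape < 0)
                p->str_first_escape = (ptrdiff_t)i;
            i++;                        /* skip the escaped byte */
        }
    }
    return i;
}

/* Returns 1 if the saved index lies beyond the C-string length of `data`.
 * Assumes `data` contains at least one '\0' so that strlen() is safe. */
int toy_escape_past_terminator(const struct toy_parser *p,
                               const unsigned char *data)
{
    return p->str_first_escape >= 0 &&
           (size_t)p->str_first_escape > strlen((const char *)data);
}
```

Feeding this model the bytes a, b, \0, c, d, \, ", e, " records index 5 as the first escape, while strlen() on the same bytes reports a length of 2.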
Part 2: Manipulating Memory
We’ve now seen how we can set the str_first_escape index of the data array to a value larger than the location of the first null byte. In the function below, we can see how this particular system state leads to problems. A copy of data is allocated on the heap using p_strndup(). This duplication function treats the data as a string, i.e., it calculates the size of the string based on the location of the first null byte, allocates that much memory, then copies the source string into the new allocation.
After that, the copied string is unescaped. If str_first_escape is not set, then there is nothing to unescape and this step is skipped. If str_first_escape is set, then str_unescape() is called on the copy; to save time, the saved str_first_escape index is added to the start of the string so that the unescape logic is already positioned at the first escape character. However, as noted above, we’ve set this value larger than the actual length of the string, thus driving the sum of str + parser->str_first_escape out-of-bounds! Memory unsafety achieved. An attacker can set the distance between the null byte and the first escape character inside data as desired for precise control over this invalid pointer.
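To make the mismatch concrete, here is a minimal sketch of the copy-then-offset step. It is an illustrative model under our own assumptions, not Dovecot's code: toy_strndup stands in for a string-style bounded duplicate such as p_strndup(), and toy_save_arg and struct toy_saved are hypothetical names. The model only reports whether the offset falls outside the copy; it never forms or dereferences the bad pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Outcome of the toy "save argument" step. */
struct toy_saved {
    char  *copy;        /* heap copy, sized by the first null byte */
    size_t copy_len;    /* string length of the copy */
    size_t unescape_at; /* offset where unescaping would begin */
    int    in_bounds;   /* 1 if unescape_at lies within the copy */
};

/* String-style bounded duplicate: copies at most `max` bytes and stops at
 * the first '\0', so bytes after an embedded null are silently dropped. */
char *toy_strndup(const unsigned char *data, size_t max)
{
    size_t len = 0;
    char *copy;

    assert(data != NULL);
    while (len < max && data[len] != '\0')
        len++;
    copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, data, len);
    copy[len] = '\0';
    return copy;
}

/* Duplicate `data` as a string, then work out where unescaping would start.
 * The vulnerable pattern would pass copy + first_escape straight to the
 * unescape routine; this model only records whether that is in bounds. */
struct toy_saved toy_save_arg(const unsigned char *data, size_t size,
                              ptrdiff_t first_escape)
{
    struct toy_saved s = { NULL, 0, 0, 1 };

    s.copy = toy_strndup(data, size);
    if (s.copy == NULL)
        return s;                       /* allocation failed */
    s.copy_len = strlen(s.copy);
    if (first_escape >= 0) {
        s.unescape_at = (size_t)first_escape;
        s.in_bounds = s.unescape_at <= s.copy_len;
    }
    return s;
}
```

With the bytes a, b, \0, c, d, \, ", e and a saved index of 5, the copy holds only "ab" (a 3-byte allocation), yet unescaping would begin at offset 5, past the end of the allocation.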
We have now gained the ability to invoke str_unescape() on a controlled, out-of-bounds heap pointer. Let’s look at this function to assess what kind of damage we can do:
This code scans through bytes, performing a single level of unescaping (escaped escape characters simply become single escape characters) until a null byte is encountered. Each such escape character is removed by effectively shifting all subsequent bytes one byte to the left (towards lower addresses). This means that if N escape characters are present in memory, then an entire region of memory will be destructively shifted to the left by N bytes; the shift operation ends only when a null byte is encountered.
An example of this corruption is depicted in the figure below with a shift of N = 4 bytes. The top row shows the initial memory layout, and the bottom row shows the result of running str_unescape() on the first byte. As can be seen, the 4-byte region containing “BBBB” is corrupted to “CCCC” in a controlled fashion.
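The same shift can be reproduced safely inside a single local buffer. The routine below is a sketch of in-place, single-level unescaping as described above; toy_unescape is a hypothetical name and the code is an illustration, not Dovecot's implementation. The buffer layout (four escaped "A" bytes followed by "BBBB" and "CCCC") is our own choice to produce a shift of N = 4.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* In-place, single-level unescape: each backslash is dropped and the byte
 * after it is kept. Every dropped backslash moves all following bytes one
 * position toward lower addresses, until a '\0' is reached. */
char *toy_unescape(char *str)
{
    char *start = str;
    char *dest;

    assert(str != NULL);

    /* Fast path: nothing to do until the first backslash. */
    while (*str != '\\') {
        if (*str == '\0')
            return start;
        str++;
    }

    for (dest = str; *str != '\0'; str++) {
        if (*str == '\\') {
            str++;                      /* drop the backslash */
            if (*str == '\0')
                break;                  /* dangling backslash at the end */
        }
        *dest++ = *str;                 /* shift the byte left */
    }
    *dest = '\0';
    return start;
}
```

For example, with char mem[] = "\\A\\A\\A\\ABBBBCCCC", bytes 8 to 11 hold "BBBB" before the call. After toy_unescape(mem), the string reads "AAAABBBBCCCC" and bytes 8 to 11 hold "CCCC": four removed backslashes have shifted everything after them left by four bytes.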
We’d like to thank Aki Tuomi of the Dovecot team for quick and professional responses.