A Remote Pre-Auth Memory Corruption Vulnerability in Dovecot

Summary
We found a heap memory corruption vulnerability in Dovecot, the world’s most widely deployed IMAP email server. Dovecot has a market share of about 75%, with roughly 5 million live hosts [1]. The vulnerability can be triggered remotely and does not require authentication on the target server, and it has been present in the codebase for many years. Like so many others, this vulnerability results from a relatively simple string manipulation error in C.

The vulnerability, CVE-2019-11500, was publicly disclosed in August 2019. It has been fixed in Dovecot versions 2.3.7.2 and 2.2.36.4 and in Pigeonhole version 0.5.7.2. The vulnerability is server-side, which means only your email provider needs to patch their software. Dovecot awarded us a $5k bounty on HackerOne for this vulnerability; it is rated 9.8/10 in severity by the NVD. This project was joint work with Rafi Rubin.

Fuzzing Harness
We found CVE-2019-11500 by fuzzing IMAP sessions with AFL. Our fuzzing sessions were run on a machine with 48 cores and 396GB RAM. We modified both AFL and Dovecot to build a workable and efficient fuzzing pipeline. Each fuzzing thread’s mail directory was mounted on a RAM disk that reset after each execution so that IMAP sessions were deterministic. For improved performance, we piped the output of the fuzzer directly to the IMAP parser to bypass the networking layer.

Part 1: The vulnerability
The bug behind the vulnerability is a relatively simple memory error related to handling strings. C commonly represents both strings and binary data with the char* type. Strings are null-terminated by convention, while binary data (which might contain embedded null bytes) needs a separate length variable. In the code below, the data variable is treated as a string in some contexts, but is loaded as simple binary data with no check for internal null bytes, resulting in an exploitable vulnerability.
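
To make the string-versus-binary distinction concrete, here is a minimal C sketch. It is our own illustration, not Dovecot code: the byte_span struct and the bytes_hidden_after_nul() helper are hypothetical names introduced only for this example. The helper counts the bytes that a length-based reader sees but a string function such as strlen() never reaches.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical type: a byte buffer with an explicit length, as a
 * parser would receive it from the network. */
struct byte_span {
	const char *data;
	size_t size;
};

/* Returns how many bytes follow the first NUL in the buffer, i.e. the
 * bytes that strlen() and strcpy() would never look at.
 * Returns 0 when the buffer contains no NUL, or when the NUL is the
 * last byte. */
static size_t bytes_hidden_after_nul(struct byte_span span)
{
	const char *nul;

	assert(span.data != NULL);
	nul = memchr(span.data, '\0', span.size);
	if (nul == NULL)
		return 0;
	return span.size - (size_t)(nul - span.data) - 1;
}
```

For the 7-byte buffer a b \0 c d \ e, strlen() reports a length of 2, while this helper reports 4 hidden bytes, including a backslash that a length-based scan would still find.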

This function scans the data array. When it finds an escape character (a backslash), it saves the index of the first one in the str_first_escape field for later use. The key point is that the scan does not stop at null bytes: if the first escape character appears after a null byte, the saved str_first_escape index will be larger than the length of data once data is later treated as a string.

static bool imap_parser_read_string(struct imap_parser *parser,
                                    const unsigned char *data, size_t data_size)
{
	size_t i;

	/* read until we've found non-escaped ", CR or LF */
	for (i = parser->cur_pos; i < data_size; i++) {
		...
		if (data[i] == '\\') {
			/* save the first escaped char */
			if (parser->str_first_escape < 0)
				parser->str_first_escape = i;

			/* skip the escaped char */
			i++;
		}
		...
	...
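
The effect of this scan can be sketched with a simplified stand-in. The two functions below are hypothetical helpers written for this post, not part of Dovecot, and they model only the one behavior that matters here: the search for the first backslash is bounded by the buffer size and ignores null bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model of the scan: returns the index of the first backslash in
 * data[0..size), or -1 if there is none. Like the excerpt above, it
 * does not stop at NUL bytes. */
static long first_escape_index(const char *data, size_t size)
{
	size_t i;

	assert(data != NULL);
	for (i = 0; i < size; i++) {
		if (data[i] == '\\')
			return (long)i;
	}
	return -1;
}

/* Returns 1 when the recorded escape index lies beyond the end of the
 * buffer as string functions would see it, 0 otherwise. */
static int escape_is_past_string_end(const char *data, size_t size)
{
	long esc = first_escape_index(data, size);
	const char *nul = memchr(data, '\0', size);
	/* Length of the data when interpreted as a C string. */
	size_t visible = nul != NULL ? (size_t)(nul - data) : size;

	return esc >= 0 && (size_t)esc > visible;
}
```

For the 9-byte input a b \0 c c c c \ d, the first escape is found at index 7 even though the string view of the same bytes ends at index 2.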

Part 2: Manipulating Memory
We’ve now seen how we can set the str_first_escape index of the data array to a value larger than the location of the first null byte. In the function below, we can see how this particular system state leads to problems. A copy of data is allocated on the heap using p_strndup(). This duplication function treats the data as a string, i.e., it calculates the size of the string based on the location of the first null byte (capped at the given maximum of size-1 bytes), allocates that much memory, then copies the source string into the new allocation.

After that, the copied string is unescaped. If str_first_escape is not set, then there is nothing to unescape and this step is skipped. If str_first_escape is set, then str_unescape() is called on the copy; to save time, the saved str_first_escape index (minus one, for the skipped opening quote) is added to the start of the string so that the unescape logic is already positioned at the first escape character. However, as noted above, we’ve set this value larger than the actual length of the string, thus driving the pointer str + parser->str_first_escape - 1 out of bounds! Memory unsafety achieved. An attacker can set the distance between the null byte and the first escape character inside data as desired for precise control over this invalid pointer.

case ARG_PARSE_STRING:
        /* data is quoted and may contain escapes. */
        i_assert(size > 0);

        arg->type = IMAP_ARG_STRING;
        str = p_strndup(parser->pool, data+1, size-1);

        /* remove the escapes */
        if (parser->str_first_escape >= 0 &&
            (parser->flags & IMAP_PARSE_FLAG_NO_UNESCAPE) == 0) {
                /* -1 because we skipped the '"' prefix */
                (void)str_unescape(str + parser->str_first_escape-1);
        }
        arg->_data.str = str;
        arg->str_len = strlen(str);
        break;
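
The pointer arithmetic in this case can be modeled without touching any out-of-bounds memory. The sketch below is ours and rests on assumptions: unescape_overshoot() is a hypothetical helper, token stands in for data (pointing at the opening quote), and first_escape stands in for parser->str_first_escape. It reports how far past the copy’s terminating null byte the pointer handed to str_unescape() would land.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Models the arithmetic of the ARG_PARSE_STRING case.
 * token        points at the opening '"' of the quoted string
 * size         number of bytes in the token, including that quote
 * first_escape index of the first backslash, relative to token
 * Returns the number of bytes by which (copy + first_escape - 1)
 * overshoots the copy's terminating NUL, or 0 if it stays in bounds. */
static size_t unescape_overshoot(const char *token, size_t size,
				 size_t first_escape)
{
	const char *body;
	const char *nul;
	size_t max, copy_len, offset;

	assert(token != NULL && size > 0);
	assert(first_escape >= 1 && first_escape < size);

	body = token + 1;  /* skip the '"' prefix, as data+1 does */
	max = size - 1;    /* upper bound given to the string copy */

	/* A strndup-style copy stops at the first NUL or after max bytes. */
	nul = memchr(body, '\0', max);
	copy_len = nul != NULL ? (size_t)(nul - body) : max;

	/* The offset added to the copy before unescaping starts. */
	offset = first_escape - 1;
	return offset > copy_len ? offset - copy_len : 0;
}
```

For the 10-byte token " a b \0 c c c c \ d with the first escape at index 8, the copy holds only "ab", the offset is 7, and the pointer lands 5 bytes past the copy’s terminator. A well-formed token with no embedded null byte always yields 0, which is why rejecting null bytes while scanning the quoted string is one plausible way to close the gap.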

We have now gained the ability to invoke str_unescape() on a controlled, out-of-bounds heap pointer. Let’s look at this function to assess what kind of damage we can do:

char *str_unescape(char *str)
{
	/* @UNSAFE */
	char *dest, *start = str;

	while (*str != '\\') {
		if (*str == '\0')
			return start;
		str++;
	}

	for (dest = str; *str != '\0'; str++) {
		if (*str == '\\') {
			str++;
			if (*str == '\0')
				break;
		}

		*dest++ = *str;
	}

	*dest = '\0';
	return start;
}

This code scans through bytes, performing a single level of unescaping (escaped escape characters simply become single escape characters) until a null byte is encountered. Each such escape character is removed by effectively shifting all subsequent bytes one byte to the left (towards lower addresses). This means that if N escape characters are present in memory, then an entire region of memory will be destructively shifted to the left by N bytes; the shift operation ends only when a null byte is encountered.
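
The left shift can be reproduced in isolation on a small, fully in-bounds buffer. The sketch below repeats the str_unescape() logic from the listing above so that it compiles on its own; shift_demo() is a hypothetical helper we introduce here, and the 17-byte layout with four escape characters is an arbitrary example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same logic as the str_unescape() listing above, repeated so that
 * this example is self-contained. */
static char *str_unescape(char *str)
{
	char *dest, *start = str;

	/* Skip ahead to the first escape character, if any. */
	while (*str != '\\') {
		if (*str == '\0')
			return start;
		str++;
	}

	/* Copy the rest in place, dropping one backslash per escape. */
	for (dest = str; *str != '\0'; str++) {
		if (*str == '\\') {
			str++;
			if (*str == '\0')
				break;
		}
		*dest++ = *str;
	}

	*dest = '\0';
	return start;
}

/* Fills buf with the 17 bytes \A\A\A\ABBBBCCCC\0, unescapes it in
 * place, and returns how many bytes shorter the string became, which
 * equals the number of escape characters removed. */
static size_t shift_demo(char buf[17])
{
	static const char before[17] = "\\A\\A\\A\\ABBBBCCCC";
	size_t old_len;

	assert(buf != NULL);
	memcpy(buf, before, sizeof(before));
	old_len = strlen(buf);
	str_unescape(buf);
	return old_len - strlen(buf);
}
```

Calling shift_demo() returns 4, and the buffer afterwards holds AAAABBBBCCCC\0CCC\0: everything after the escapes has moved 4 bytes toward lower addresses, and the tail bytes that were not overwritten keep their old contents.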

An example of this corruption is depicted in the figure below with a shift of N = 4 bytes. The top row shows the initial memory layout, and the bottom row shows the result of running str_unescape() on the first byte. As can be seen, the 4-byte region containing “BBBB” is corrupted to “CCCC” in a controlled fashion.

\A\A\A\ABBBBCCCC\0
AAAABBBBCCCC\0CCC\0

By combining heap arrangement techniques with the rich set of IMAP commands that are available to prepare live and freed heap memory, a clever attacker can arrange escape characters to achieve very sophisticated memory corruption capabilities. The vulnerability can be used to corrupt allocator metadata, code or data pointers, or any other memory structures in the heap. Additionally, the vulnerability can be repeatedly triggered within a single IMAP session. As a result, this is a very powerful memory corruption primitive. Remote code execution should be assumed to be possible, although we did not build a full end-to-end attack based on this vulnerability.

Thanks
We’d like to thank Aki Tuomi of the Dovecot team for quick and professional responses.

Thanks for reading!

Sources:
[1] https://www.open-xchange.com/about-ox/ox-blog/article/dovecots-global-market-share-grows-to-76/

Published by

Nick Roessler

I'm a PhD student doing research on computer security