Given the popularity of GDI Bitmap objects for the exploitation of kernel vulnerabilities (almost any kind of memory corruption vulnerability, except for NULL-writes, could be used to reliably gain arbitrary R/W primitives over kernel memory by abusing Bitmaps), Microsoft decided to kill exploitation techniques based on Bitmaps. To do this, Windows 10 Fall Creators Update (also known as Windows 10 1709) introduced the Type Isolation feature, an exploitation mitigation in the Win32k subsystem that splits the memory layout of SURFACE objects, the internal representation of Bitmaps on the kernel side. This blog post takes a deep dive into the details of how Type Isolation is implemented.
Analysis Notes
This analysis was initially performed using win32kbase.sys version 10.0.16288.1 on Windows 10 Fall Creators Update x64, which was one of the last Insider Preview builds before the general availability of Windows 10 1709 in October 2017. Doing a binary diff between said version and the latest win32kbase.sys version 10.0.16299.125 available as of this writing (end of January 2018), shows that the functions relevant to Type Isolation remain unchanged.
Context
Since mid-2015, Bitmaps [1], a type of GDI object, have been the preferred choice of exploit developers when exploiting kernel vulnerabilities on Windows. The data structure representing this type of object in the Windows kernel turned out to have some very handy members which, when corrupted via a memory safety vulnerability, could provide an attacker with full-blown R/W access to the kernel address space.
On the kernel side, a Bitmap is represented by a SURFACE object using the following structure:
typedef struct _SURFACE {
BASEOBJECT BaseObject;
SURFOBJ surfobj;
[...]
} SURFACE, *PSURFACE;
BASEOBJECT is common to several types of objects, and it's defined this way:
typedef struct _BASEOBJECT {
HANDLE hHmgr;
ULONG ulShareCount;
USHORT cExclusiveLock;
USHORT BaseFlags;
PVOID Tid;
} BASEOBJECT, *PBASEOBJECT;
But we are interested in the SURFACE-specific structure, which is called SURFOBJ, and it's defined like this:
typedef struct _SURFOBJ {
DHSURF dhsurf;
HSURF hsurf;
DHPDEV dhpdev;
HDEV hdev;
SIZEL sizlBitmap;
ULONG cjBits;
PVOID pvBits;
PVOID pvScan0;
LONG lDelta;
ULONG iUniq;
ULONG iBitmapFormat;
USHORT iType;
USHORT fjBitmap;
} SURFOBJ, *PSURFOBJ;
The two most interesting members of this structure are pvScan0, which points to the buffer holding the pixel data of the Bitmap, and sizlBitmap, which holds the dimensions (width and height) of the Bitmap.
There are two main ways to take advantage of SURFACE objects for exploitation purposes, by corrupting the members mentioned before:
Both the GetBitmapBits and SetBitmapBits GDI APIs operate on the pixel data buffer pointed to by the pvScan0 member of the SURFOBJ structure. Overwriting this pointer provides arbitrary R/W of kernel memory from user mode.
The sizlBitmap member of the SURFOBJ structure holds the width and height properties of the Bitmap. By overwriting sizlBitmap.cx or sizlBitmap.cy it's possible to "enlarge" the pixel data buffer. This provides R/W access to kernel memory beyond the end of the pixel data buffer.
Both ways end up setting up a Manager/Worker scheme where 2 Bitmap objects are involved; if you want to study this topic in depth, I recommend reading the slides from the 2016 Ekoparty conference presentation titled "Abusing GDI for ring0 exploit primitives: Reloaded" [2], by Diego Juárez and Nicolás Economou.
The first technique provides full R/W capabilities. Its disadvantage is that it requires a "good" vulnerability (such as a write-what-where) that allows overwriting the pvScan0 pointer with an arbitrary value.
The second technique, although less powerful at first, since it initially provides R/W access to the memory located right after the end of the pixel data buffer, has the advantage of being usable even with limited vulnerabilities; simple arbitrary decrements/increments, or writing non-arbitrary values will do the trick. The exploitation strategy when overwriting the sizlBitmap member of the SURFOBJ structure is to make two SURFACE objects (let's call them SURFACE1 and SURFACE2) adjacent in memory; by corrupting the sizlBitmap of SURFACE1, its pixel data buffer is "enlarged", thus overlapping with the adjacent SURFACE2 object. From that point, further operations on the (now enlarged) SURFACE1 can arbitrarily overwrite members of the SURFACE2 header, effectively transforming the limited R/W beyond the bounds of the SURFACE1 pixel buffer into fully arbitrary R/W capabilities.
As you may have noticed, this second exploitation approach has been possible because, until now, the pixel data buffer of a Bitmap was typically contiguous to the SURFACE header: the whole SURFACE object was created through a single memory allocation, large enough to hold both the SURFACE header and the pixel data buffer. It didn't necessarily have to be like that, but it was implemented that way. This made it possible for exploit developers to obtain advantageous memory layouts, where the pixel data buffer of one SURFACE object is followed by the header of another one.
Given that this second exploitation approach allows turning almost any kind of memory corruption vulnerability (except for NULL-writes) into arbitrary R/W over kernel memory by abusing GDI Bitmap objects, Microsoft decided to kill it. To do this, Windows 10 Fall Creators Update introduced the Type Isolation feature, an exploitation mitigation in the Win32k subsystem that splits the memory layout of SURFACE objects.
Type Isolation
Data Structures
Type Isolation is implemented through a number of linked structures. There are 4 main data structures involved in this mitigation (you can find below a cute lil' diagram that explains everything in a graphical way):
CTypeIsolation
CSectionEntry
CSectionBitmapAllocator
RTL_BITMAP (although not unique to Type Isolation, this well-known Windows kernel opaque structure plays an important role here)
All of them are allocated from the PagedPoolSession pool using ExAllocatePoolWithTag. They all share a new 4-byte pool tag, which is 'Uiso'. Also, the pool tag used for the pixel data buffer of SURFACE objects has changed, from 'Gh?5' to 'Gpbm'.
The win32kbase!gpTypeIsolation static variable is a pointer to another pointer, which in turn points to a global CTypeIsolation structure. This CTypeIsolation is the head of a circular doubly-linked list of CSectionEntry objects. Each CSectionEntry manages 0xF0 SURFACE headers. Each CSectionEntry owns a CSectionBitmapAllocator object, which maintains two main objects in sync: an array of 0x28 Views over a Section [3], with each View being able to hold 6 SURFACE headers, and a map of bits (RTL_BITMAP) that keeps track of the busy or free state of each one of the 0x28 * 6 == 0xF0 available slots in the Views. The doubly-linked list of CSectionEntry objects can grow as needed.
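The capacity figures above follow from simple arithmetic. Here is a minimal sketch of it; the constant and function names are mine and purely illustrative (they are not symbols from win32kbase.sys), while the values are the ones observed in this analysis.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the analysis; the names are illustrative only. */
#define SECTION_SIZE 0x28000u /* size of the Section behind one CSectionEntry */
#define VIEW_SIZE    0x1000u  /* size of one View */
#define SLOT_SIZE    0x280u   /* space reserved per SURFACE header */

/* Number of Views a single Section is divided into. */
uint32_t views_per_section(void)
{
    return SECTION_SIZE / VIEW_SIZE;
}

/* Number of whole SURFACE header slots that fit in one View. */
uint32_t slots_per_view(void)
{
    return VIEW_SIZE / SLOT_SIZE;
}

/* Total number of slots managed by one CSectionEntry. */
uint32_t slots_per_entry(void)
{
    return views_per_section() * slots_per_view();
}

/* Bytes left unused at the end of every View. */
uint32_t spare_bytes_per_view(void)
{
    uint32_t used = slots_per_view() * SLOT_SIZE;
    assert(used <= VIEW_SIZE);
    return VIEW_SIZE - used;
}
```

With these values, a Section splits into 0x28 Views, each View holds 6 slots with 0x100 bytes to spare, and one CSectionEntry therefore covers 0xF0 slots.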
The reverse-engineered definitions of the 4 mentioned data structures follow, along with their sizes and the offsets of their members:
CTypeIsolation (size = 0x20 bytes)
typedef struct _CTYPEISOLATION {
PCSECTIONENTRY next; // + 0x00
PCSECTIONENTRY previous; // + 0x08
PVOID pushlock; // + 0x10
ULONG64 size; // + 0x18
} CTYPEISOLATION, *PCTYPEISOLATION;
CSectionEntry (size = 0x28 bytes)
typedef struct _CSECTIONENTRY CSECTIONENTRY, *PCSECTIONENTRY;
struct _CSECTIONENTRY {
CSECTIONENTRY *next; // + 0x00
CSECTIONENTRY *previous; // + 0x08
PVOID section; // + 0x10
PVOID view; // + 0x18
PCSECTIONBITMAPALLOCATOR bitmap_allocator; // + 0x20
};
CSectionBitmapAllocator (size = 0x28 bytes)
typedef struct _CSECTIONBITMAPALLOCATOR {
PVOID pushlock; // + 0x00
ULONG64 xored_view; // + 0x08
ULONG64 xor_key; // + 0x10
ULONG64 xored_rtl_bitmap; // + 0x18
ULONG bitmap_hint_index; // + 0x20
ULONG num_commited_views; // + 0x24
} CSECTIONBITMAPALLOCATOR, *PCSECTIONBITMAPALLOCATOR;
RTL_BITMAP (size = 0x10 bytes)
typedef struct _RTL_BITMAP {
ULONG64 size; // + 0x00
PVOID bitmap_buffer; // + 0x08
} RTL_BITMAP, *PRTL_BITMAP;
The following diagram tries to clarify the relationship between all the involved data structures.
This figure represents a hypothetical state of the Type Isolation structures with 3 CSectionEntry instances, each one with its associated CSectionBitmapAllocator and RTL_BITMAP instances. Since each instance of CSectionEntry manages 0xF0 SURFACE headers, the size member of the CTypeIsolation object is set to 0xF0 * 3 == 0x2D0.
The 0x28 Views of size 0x1000 backing the first CSectionEntry are also represented. In this case, only 2 out of 0x28 Views are committed; the rest of them remain unmapped until needed. The first View is full: all the 6 slots of size 0x280 are in use by SURFACE headers (the 0x100 spare bytes at the end of the page are not pictured here). The second View is only half full: 3 0x280-byte slots are in use, while the last 3 slots remain unused. At the same time, the free/busy status of each slot is kept in sync with the map of bits in the RTL_BITMAP belonging to the same CSectionEntry. In this hypothetical situation where the first 9 slots are in use and the rest are free, the map of bits would look like this: 11111111 00000001 00000000 00000000 ....
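To make the bit pattern concrete, the following sketch rebuilds the map of bits for that hypothetical state in a plain byte array. It is a user-mode model, not the kernel routines: the helper names are hypothetical, and the bit numbering (bit N lives in byte N / 8, at position N % 8 counting from the least significant bit) is my assumption about how the RTL_BITMAP buffer is laid out on a little-endian machine.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SLOT_COUNT 0xF0u /* bits tracked per CSectionEntry */
#define MAP_BYTES  0x20u /* size of the bit buffer in bytes */

/* Mark slot `index` as busy: bit (index % 8) of byte (index / 8). */
void mark_busy(uint8_t *map, uint32_t index)
{
    assert(index < SLOT_COUNT);
    map[index / 8] |= (uint8_t)(1u << (index % 8));
}

/* Fill `map` with the state where only the first `count` slots are in use. */
void fill_first_slots(uint8_t map[MAP_BYTES], uint32_t count)
{
    memset(map, 0, MAP_BYTES);
    for (uint32_t i = 0; i < count; i++)
        mark_busy(map, i);
}
```

Calling fill_first_slots(map, 9) leaves 0xFF in the first byte, 0x01 in the second one and zeroes everywhere else, which matches the 11111111 00000001 00000000 ... pattern shown above.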
Also, notice that, for the sake of simplicity, this figure doesn't picture the Section object backing the Views, since the Section is never accessed directly (all accesses go through the Views).
As a separate remark, the win32kbase!SURFACE::tSize static variable, which holds the size in bytes of a SURFACE header, has the value 0x278. However, throughout the code analyzed here, calculations are done for a size of 0x280 bytes per SURFACE header, probably just for alignment purposes.
.data:00000001C0196110 ; Exported entry 387. ?tSize@SURFACE@@0_KA
.data:00000001C0196110 public private: static unsigned __int64 SURFACE::tSize
.data:00000001C0196110 private: static unsigned __int64 SURFACE::tSize dq 278h
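If the extra 8 bytes are indeed alignment padding, the usual round-up idiom reproduces the 0x280 figure. This is only a sketch of that guess: the helper name is hypothetical, and the 16-byte alignment used in the example is my assumption, since I didn't find the alignment value spelled out anywhere in the code.

```c
#include <assert.h>
#include <stdint.h>

/* Round `value` up to the next multiple of `alignment` (must be a power of two). */
uint64_t align_up(uint64_t value, uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}
```

For example, align_up(0x278, 0x10) gives 0x280, while a value that is already aligned, such as 0x280, is returned unchanged.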
Initialization
Initialization of the Type Isolation structures happens inside win32kbase!HmgCreate(), which gets called during the initialization of the win32kbase.sys driver. It starts by allocating a pointer to the future head NSInstrumentation::CTypeIsolation structure, and saving it to the win32kbase!gpTypeIsolation global variable. Then it calls the CTypeIsolation::Create() method, which allocates the head CTypeIsolation structure.
HmgCreate+397 mov edx, 'osiU'
HmgCreate+39C mov rcx, r14 ; size = 8 (ptr to CTypeIsolation)
HmgCreate+39F call Win32AllocPool
HmgCreate+3A4 mov cs:uchar * * gpTypeIsolation, rax
HmgCreate+3AB test rax, rax
HmgCreate+3AE jz short loc_1C0012561
HmgCreate+3B0 xor ecx, ecx
HmgCreate+3B2 mov [rax], rcx
HmgCreate+3B5 call TypeIsolationFactory<NSInstrumentation::CTypeIsolation<163840,640>>::Create(uchar * *)
CTypeIsolation::Create() allocates 0x20 bytes for the CTypeIsolation object, and then calls CTypeIsolation::Initialize() to initialize it. If everything went fine, the address of the CTypeIsolation object is saved to the pointer referenced by win32kbase!gpTypeIsolation.
.text:00000001C001263C public: static bool TypeIsolationFactory<class NSInstrumentation::CTypeIsolation<163840, 640>>::Create(unsigned char * *) proc near
[...]
.text:00000001C001264D mov edx, 20h ; NumberOfBytes
.text:00000001C0012652 mov r8d, 'osiU' ; Tag
.text:00000001C0012658 lea ecx, [rdx+1] ; PoolType
.text:00000001C001265B call cs:__imp_ExAllocatePoolWithTag ; allocates a NSInstrumentation::CTypeIsolation object
.text:00000001C0012661 mov rbx, rax ; rbx = CTypeIsolation object
.text:00000001C0012664 test rax, rax
.text:00000001C0012667 jz short loc_1C0012699
.text:00000001C0012669 and qword ptr [rax+10h], 0 ; CTypeIsolation->pushlock = NULL
.text:00000001C001266E mov rcx, rax
.text:00000001C0012671 and dword ptr [rax+18h], 0 ; CTypeIsolation->size = 0
.text:00000001C0012675 mov [rax+8], rax ; CTypeIsolation->previous = this
.text:00000001C0012679 mov [rax], rax ; CTypeIsolation->next = this
.text:00000001C001267C call NSInstrumentation::CTypeIsolation<163840,640>::Initialize(void)
.text:00000001C0012681 test al, al
.text:00000001C0012683 jz loc_1C00BA344
.text:00000001C0012689 mov [rdi], rbx ; *win32kbase!gpTypeIsolation = CTypeIsolation
Most notably, CTypeIsolation::Initialize() creates a CSectionEntry structure by calling CSectionEntry::Create(), and assigns it to both the next and previous members of the CTypeIsolation object:
.text:00000001C0039A34 private: bool NSInstrumentation::CTypeIsolation<163840, 640>::Initialize(void) proc near
[...]
.text:00000001C0039A5E call NSInstrumentation::CSectionEntry<163840,640>::Create(void)
.text:00000001C0039A63 test rax, rax ; rax == CSectionEntry object
.text:00000001C0039A66 jz short loc_1C0039A92
.text:00000001C0039A68 mov rcx, [rbx+8] ; rcx = CTypeIsolation->previous
.text:00000001C0039A6C mov dword ptr [rbx+18h], 0F0h ; CTypeIsolation->size = 0xF0
.text:00000001C0039A73 cmp [rcx], rbx ; CTypeIsolation->previous->next == CTypeIsolation?
.text:00000001C0039A76 jnz FatalListEntryError_10
.text:00000001C0039A7C mov [rax], rbx ; CSectionEntry->next = CTypeIsolation
.text:00000001C0039A7F mov [rax+8], rcx ; CSectionEntry->previous = CTypeIsolation->previous
.text:00000001C0039A83 mov [rcx], rax ; CTypeIsolation->previous->next = CSectionEntry
.text:00000001C0039A86 mov [rbx+8], rax ; CTypeIsolation->previous = CSectionEntry
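The pointer updates in this snippet are a classic tail insertion into a circular doubly-linked list, guarded by an integrity check. The following C sketch models it in user mode; struct links and both function names are hypothetical stand-ins, and instead of bugchecking like FatalListEntryError presumably does, the sketch simply returns false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for the two list pointers shared by the head and the entries. */
struct links {
    struct links *next;
    struct links *previous;
};

/* Make `head` an empty circular list: both pointers refer to the head itself. */
void list_init(struct links *head)
{
    assert(head != NULL);
    head->next = head;
    head->previous = head;
}

/*
 * Append `entry` at the tail of the list. Returns false, leaving the list
 * untouched, when head->previous->next does not point back to the head.
 */
bool list_insert_tail(struct links *head, struct links *entry)
{
    assert(head != NULL && entry != NULL);
    struct links *tail = head->previous;

    if (tail->next != head)
        return false; /* corrupted list */

    entry->next = head;
    entry->previous = tail;
    tail->next = entry;
    head->previous = entry;
    return true;
}
```

On an empty list the tail is the head itself, so the first insertion sets both the next and previous pointers of the head to the new entry, which is exactly what happens during initialization.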
In turn, CSectionEntry::Create() calls CSectionEntry::Initialize(), which creates a Section object by calling nt!MmCreateSection(). The size of this Section is 0x28000 bytes; this Section will be accessed through 0x28 Views, each one 0x1000 bytes in size. A pointer to this Section object is stored in the CSectionEntry structure.
.text:00000001C0099E5C lea r9, [rbp+arg_0] ; MaximumSize
.text:00000001C0099E60 xor eax, eax
.text:00000001C0099E62 mov rdi, rcx ; rdi = CSectionEntry object
.text:00000001C0099E65 and [r11-10h], rax
.text:00000001C0099E69 lea rcx, [rbp+SectionHandle] ; SectionHandle
.text:00000001C0099E6D and [r11-18h], rax
.text:00000001C0099E71 xor r8d, r8d ; ObjectAttributes
.text:00000001C0099E74 mov [rbp+arg_0], rax
.text:00000001C0099E78 mov edx, 0F001Fh ; DesiredAccess = SECTION_ALL_ACCESS
.text:00000001C0099E7D mov [rsp+40h+var_18], SEC_RESERVE ; AllocationAttributes
.text:00000001C0099E85 mov [rsp+40h+var_20], PAGE_READWRITE ; SectionPageProtection
.text:00000001C0099E8D mov dword ptr [rbp+arg_0], 28000h ; size for the Section
.text:00000001C0099E94 call cs:__imp_MmCreateSection
Then it maps a View of this Section. A pointer to the view is also saved in the CSectionEntry structure.
.text:00000001C0099EB8 mov [rdi+10h], rcx ; CSectionEntry->section = section
.text:00000001C0099EBC test rcx, rcx
.text:00000001C0099EBF jz short loc_1C0099F0F
.text:00000001C0099EC1 and [rbp+arg_0], 0
.text:00000001C0099EC6 lea rbx, [rdi+18h] ; rbx = ptr to output view
.text:00000001C0099ECA mov rdx, rbx
.text:00000001C0099ECD lea r8, [rbp+arg_0]
.text:00000001C0099ED1 call cs:__imp_MmMapViewInSessionSpace ; populates CSectionEntry->view
Finally, CSectionEntry::Initialize() creates a CSectionBitmapAllocator object by calling CSectionBitmapAllocator::Create(). A pointer to this object is stored in the CSectionEntry structure.
.text:00000001C0099EED mov rcx, [rbx] ; rcx = CSectionEntry->view
.text:00000001C0099EF0 call NSInstrumentation::CSectionBitmapAllocator<163840,640>::Create(uchar * const)
.text:00000001C0099EF5 test rax, rax ; rax = CSectionBitmapAllocator
.text:00000001C0099EF8 mov [rdi+20h], rax ; CSectionEntry->bitmap_allocator = CSectionBitmapAllocator
As expected, CSectionBitmapAllocator::Create() calls CSectionBitmapAllocator::Initialize(). This method allocates a pool buffer of size 0x30, which is used to hold an RTL_BITMAP structure. Note that in this context, we are not talking about the GDI Bitmap objects, but about a general-purpose map of bits, which is typically used to keep track of a set of reusable items. The first 0x10 bytes of that pool buffer are used to hold the header of the bitmap, while the remaining 0x20 bytes are used to store the map of bits itself. A buffer of 0x20 bytes can hold 0x100 bits; however, only 0xF0 is specified as the number of bits when calling nt!RtlInitializeBitMap, to match the number of SURFACE slots that are handled by a CSectionEntry. Then, all the bits in the bitmap are initialized to 0 by calling nt!RtlClearAllBits.
.text:00000001C009E324 allocate_rtl_bitmap proc near
[...]
.text:00000001C009E333 mov ecx, 21h ; PoolType = PagedPoolSession
.text:00000001C009E338 cmp edx, edi
.text:00000001C009E33A mov r8d, 'osiU' ; Tag = 'Uiso'
.text:00000001C009E340 cmovnb edi, edx ; edi = 0xF0
.text:00000001C009E343 mov edx, edi
.text:00000001C009E345 shr edx, 3 ; edx = 0x1e
.text:00000001C009E348 add edx, 7 ; edx = 0x25
.text:00000001C009E34B and edx, 0FFFFFFF8h ; edx = 0x20
.text:00000001C009E34E add edx, 10h ; NumberOfBytes = 0x30
.text:00000001C009E351 call cs:__imp_ExAllocatePoolWithTag ; allocs 0x30 bytes for a RTL_BITMAP
.text:00000001C009E357 mov rbx, rax
.text:00000001C009E35A test rax, rax
.text:00000001C009E35D jz short loc_1C009E386
.text:00000001C009E35F lea rdx, [rax+10h] ; BitMapBuffer (0x30 - 0x10 bytes)
.text:00000001C009E363 mov r8d, edi ; SizeOfBitMap (number of bits) = 0xF0
.text:00000001C009E366 mov rcx, rax ; BitMapHeader
.text:00000001C009E369 call cs:__imp_RtlInitializeBitMap
.text:00000001C009E36F mov rcx, rbx ; BitMapHeader
.text:00000001C009E372 call cs:__imp_RtlClearAllBits
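The size calculation performed before the call to ExAllocatePoolWithTag can be written in C as follows. This is a direct transcription of the shr/add/and/add sequence above, with hypothetical function names; I have only checked it against the 0xF0-bit case that the code actually uses.

```c
#include <assert.h>
#include <stdint.h>

#define RTL_BITMAP_HEADER_SIZE 0x10u /* header stored at the start of the buffer */

/* Bytes reserved for the bits: bit_count / 8, rounded up to a multiple of 8. */
uint32_t bit_buffer_bytes(uint32_t bit_count)
{
    return ((bit_count >> 3) + 7) & 0xFFFFFFF8u;
}

/* Total pool allocation size: header followed by the bit buffer. */
uint32_t rtl_bitmap_alloc_size(uint32_t bit_count)
{
    uint32_t total = bit_buffer_bytes(bit_count) + RTL_BITMAP_HEADER_SIZE;
    assert(total >= RTL_BITMAP_HEADER_SIZE); /* guards against 32-bit wrap-around */
    return total;
}
```

For 0xF0 bits this gives a 0x20-byte bit buffer (room for 0x100 bits) and a total allocation of 0x30 bytes.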
Besides allocating this RTL_BITMAP structure, CSectionBitmapAllocator::Initialize() also generates a 64-bit random number, which is used as a XOR key to encode pointers to the View and RTL_BITMAP objects that were previously allocated:
.text:00000001C002DE38 private: bool NSInstrumentation::CSectionBitmapAllocator<163840, 640>::Initialize(unsigned char *) proc near
[...]
.text:00000001C002DE48 rdtsc ; source for RtlRandomEx
.text:00000001C002DE4A shl rdx, 20h
.text:00000001C002DE4E lea rcx, [rsp+28h+arg_0]
.text:00000001C002DE53 or rax, rdx
.text:00000001C002DE56 mov [rsp+28h+arg_0], eax
.text:00000001C002DE5A call cs:__imp_RtlRandomEx ; get a 32-bit random number
.text:00000001C002DE60 mov eax, eax
.text:00000001C002DE62 lea rcx, [rsp+28h+arg_0]
.text:00000001C002DE67 shl rax, 20h ; shift eax to the higher part of RAX
.text:00000001C002DE6B mov [rbx+10h], rax ; CSectionBitmapAllocator->xor_key = random
.text:00000001C002DE6F call cs:__imp_RtlRandomEx ; get another 32-bit random number
.text:00000001C002DE75 mov eax, eax
.text:00000001C002DE77 or [rbx+10h], rax ; CSectionBitmapAllocator->xor_key |= another_random
The XORed pointers to the View and RTL_BITMAP objects are stored in the CSectionBitmapAllocator structure.
.text:00000001C002DEB8 mov rdx, [rbx+10h] ; rdx = CSectionBitmapAllocator->xor_key
.text:00000001C002DEBC mov rcx, rdx
.text:00000001C002DEBF xor rcx, rax ; rcx = CSectionBitmapAllocator->xor_key ^ RTL_BITMAP
.text:00000001C002DEC2 mov al, 1
.text:00000001C002DEC4 xor rdx, rdi ; rdx = CSectionBitmapAllocator->xor_key ^ CSectionEntry->view
.text:00000001C002DEC7 mov [rbx+18h], rcx ; CSectionBitmapAllocator->xored_rtl_bitmap = CSectionBitmapAllocator->xor_key ^ RTL_BITMAP
.text:00000001C002DECB mov [rbx+8], rdx ; CSectionBitmapAllocator->xored_view = CSectionBitmapAllocator->xor_key ^ CSectionEntry->view
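In C terms, the key generation and pointer encoding boil down to the sketch below. The function names are hypothetical, and the two random values are passed in as parameters instead of being obtained from nt!RtlRandomEx, so this models only the bit manipulation, not the randomness source.

```c
#include <assert.h>
#include <stdint.h>

/* Combine two 32-bit random values into one 64-bit key; the first goes in the high half. */
uint64_t make_xor_key(uint32_t first_random, uint32_t second_random)
{
    return ((uint64_t)first_random << 32) | second_random;
}

/* Encoding and decoding are the same operation: XOR with the key. */
uint64_t xor_pointer(uint64_t value, uint64_t key)
{
    return value ^ key;
}
```

Since XOR is its own inverse, xor_pointer(xor_pointer(p, key), key) == p for any p. This is why the allocation code can recover the View and RTL_BITMAP addresses by XORing the stored values with xor_key again.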
Allocation
The win32kfull!NtGdiCreateBitmap() system call is in charge of creating GDI Bitmap objects. win32kfull!NtGdiCreateBitmap() calls win32kbase!GreCreateBitmap(), which in turn calls win32kbase!SURFMEM::bCreateDIB(). The job of win32kbase!SURFMEM::bCreateDIB() is to allocate memory for the SURFACE object. In previous versions of Windows, the pixel data buffer of a Bitmap was typically contiguous to the SURFACE header; it didn't necessarily have to be like that, but it was done that way. This made it possible to "extend" a pixel data buffer by corrupting the sizlBitmap member of the SURFACE header, as explained before, making it overlap with the SURFACE header of an adjacent Bitmap.
Starting from Windows 10 Fall Creators Update, win32kbase!SURFMEM::bCreateDIB ensures that the SURFACE header and the pixel data buffer are allocated separately, using the Type Isolation mitigation.
The pixel data buffer is allocated on the PagedPoolSession pool in a straightforward way, by calling a wrapper of nt!ExAllocatePoolWithTag:
SURFMEM::bCreateDIB+10B sub r15d, r12d ; alloc_size = requested_size - sizeof(SURFACE)
SURFMEM::bCreateDIB+10E jz short loc_1C0038F91
SURFMEM::bCreateDIB+110 call cs:__imp_IsWin32AllocPoolImplSupported
SURFMEM::bCreateDIB+116 test eax, eax
SURFMEM::bCreateDIB+118 js loc_1C00C54D6
SURFMEM::bCreateDIB+11E mov r8d, 'mbpG' ; Tag = 'Gpbm'
SURFMEM::bCreateDIB+124 mov edx, r15d ; NumberOfBytes = requested_size - sizeof(SURFACE)
SURFMEM::bCreateDIB+127 mov ecx, 21h ; PoolType = PagedPoolSession
SURFMEM::bCreateDIB+12C call cs:__imp_Win32AllocPoolImpl ; <<< allocation! only for the pixel_data_buffer
On the other hand, the SURFACE header is now allocated from the CTypeIsolation structures described earlier, by calling CTypeIsolation::AllocateType(). To be precise, this allocation returns a buffer located on a View of a Section object:
SURFMEM::bCreateDIB+16C mov rax, cs:uchar * * gpTypeIsolation
SURFMEM::bCreateDIB+173 mov rcx, [rax]
SURFMEM::bCreateDIB+176 test rcx, rcx
SURFMEM::bCreateDIB+179 jz loc_1C00C579D
SURFMEM::bCreateDIB+17F call NSInstrumentation::CTypeIsolation<163840,640>::AllocateType(void)
SURFMEM::bCreateDIB+184 mov rsi, rax ; rsi = buffer for the SURFACE header
SURFMEM::bCreateDIB+187 test rax, rax ; the returned buffer is a View of a Section object
SURFMEM::bCreateDIB+18A jz loc_1C00C5791
By digging into the CTypeIsolation::AllocateType() function, we can see how the allocation algorithm works.
CTypeIsolation::AllocateType() traverses the list of CSectionEntry objects; for each CSectionEntry, it checks if its CSectionBitmapAllocator contains a clear bit in its backing RTL_BITMAP structure, by calling nt!RtlFindClearBits. It makes use of the bitmap_hint_index member of the CSectionBitmapAllocator to try and speed up the lookup.
.text:00000001C0039863 mov r8d, ebp ; HintIndex = 0
.text:00000001C0039866 cmp eax, 0F0h ; bitmap_hint_index >= RTL_BITMAP->size?
.text:00000001C003986B jnb short loc_1C0039870
.text:00000001C003986D mov r8d, eax ; HintIndex = bitmap_hint_index
.text:00000001C0039870
.text:00000001C0039870 loc_1C0039870: ; CODE XREF: NSInstrumentation::CTypeIsolation<163840,640>::AllocateType(void)+6Bj
.text:00000001C0039870 mov rcx, [rsi+18h] ; rcx = CSectionBitmapAllocator->xored_rtl_bitmap
.text:00000001C0039874 mov edx, 1 ; NumberToFind
.text:00000001C0039879 xor rcx, [rsi+10h] ; BitMapHeader = CSectionBitmapAllocator->xored_rtl_bitmap
.text:00000001C0039879 ; ^ CSectionBitmapAllocator->xor_key
.text:00000001C003987D call cs:__imp_RtlFindClearBits
.text:00000001C0039883 mov r12d, eax ; r12 = free_bit_index
.text:00000001C0039886 cmp eax, 0FFFFFFFFh ; free_bit_index == -1?
.text:00000001C0039889 jz short loc_1C00398D6 ; if so, RTL_BITMAP is full, check another CSectionEntry
If nt!RtlFindClearBits returns -1, indicating that all the bits in the RTL_BITMAP are set to 1 (that is, the RTL_BITMAP is full), then it tries to repeat the operation with the next CSectionEntry on the list. We'll explore this case later. Otherwise, if nt!RtlFindClearBits returns a value other than -1, the RTL_BITMAP had at least 1 clear bit, and therefore the Section memory on the current CSectionEntry has at least 1 free slot for a SURFACE header.
So we need to map the index of the clear bit in the RTL_BITMAP, as returned by nt!RtlFindClearBits(), into the corresponding memory address of a free slot in a Section View. For this purpose, the index of the clear bit is divided by 6, since each 0x1000-byte View of the Section is capable of holding 6 SURFACE headers of size 0x280. The result is an index, which I call view_index in the disassembly snippets below, and it's used to address one of the 0x28 possible Views of a Section. This view_index will be in the range [0, 0x27], since each Section is 0x28000 bytes in size and can therefore be divided into 0x28 Views of size 0x1000.
This view_index is compared against the count of actually committed Views for the current Section, as held in the num_commited_views member of the CSectionBitmapAllocator object. As explained in MSDN [4], "no physical memory is allocated for a view until the virtual memory range is accessed". If view_index is smaller than the count of committed Views, then we don't need to commit a new View and we can go straight to the allocation. Otherwise, the address of the corresponding View is calculated (first_view + view_index * 0x1000) and committed to physical memory by calling nt!MmCommitSessionMappedView.
.text:00000001C003988B mov eax, 0AAAAAAABh
.text:00000001C0039890 mul r12d
.text:00000001C0039893 mov eax, [rsi+24h] ; eax = CSectionBitmapAllocator->num_commited_views
.text:00000001C0039896 mov r15d, edx ; HI_DWORD(free_bit_index * 0xaaaaaaab) / 4 == free_bit_index / 6
.text:00000001C0039899 shr r15d, 2 ; r15d = view_index = free_bit_index / 6 (6 SURFACE headers fit in 0x1000 bytes)
.text:00000001C003989D cmp r15d, eax ; view_index < num_commited_views ?
.text:00000001C00398A0 jb loc_1C003998A ; if so, no need to commit a new 0x1000-byte chunk from the View
.text:00000001C00398A6 cmp eax, 28h ; num_commited_views >= MAX_VIEW_INDEX ?
.text:00000001C00398A9 jnb loc_1C003998A
.text:00000001C00398AF mov rbp, [rsi+8]
.text:00000001C00398AF ; rbp = CSectionBitmapAllocator->xored_view
.text:00000001C00398B3 mov edx, r15d ; edx = view_index
.text:00000001C00398B6 xor rbp, [rsi+10h] ; CSectionBitmapAllocator->xored_view ^ CSectionBitmapAllocator->xor_key
.text:00000001C00398BA shl edx, 0Ch ; view_index * 0x1000
.text:00000001C00398BD add rbp, rdx ; rbp = view + view_index * 0x1000
.text:00000001C00398C0 mov edx, 1000h ; edx = size to commit
.text:00000001C00398C5 mov rcx, rbp ; rcx = addr of view to commit
.text:00000001C00398C8 call cs:__imp_MmCommitSessionMappedView
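The multiplication by 0xAAAAAAAB followed by a shift is just the compiler's way of dividing by 6 without a div instruction. The sketch below reproduces that computation and the two comparisons that decide whether a View has to be committed. Function names are hypothetical, and the commit itself (nt!MmCommitSessionMappedView) is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_VIEWS 0x28u /* Views per Section */

/* free_bit_index / 6, computed the way the compiler emitted it. */
uint32_t view_index_from_bit(uint32_t free_bit_index)
{
    uint64_t product = (uint64_t)free_bit_index * 0xAAAAAAABu;
    uint32_t high = (uint32_t)(product >> 32); /* edx after the mul */
    return high >> 2;                          /* shr r15d, 2 */
}

/* True when the View holding the slot still has to be committed. */
bool needs_commit(uint32_t view_index, uint32_t num_committed_views)
{
    assert(view_index < MAX_VIEWS);
    return view_index >= num_committed_views && num_committed_views < MAX_VIEWS;
}
```

For every possible bit index in the range [0, 0xEF], view_index_from_bit() returns the same value as a plain integer division by 6.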
After a successful commit, the 0x1000-byte View is initialized to 0 (this write operation ends up doing the actual commit), and the num_commited_views member of the CSectionBitmapAllocator is updated accordingly.
.text:00000001C0039975 loc_1C0039975: ; CODE XREF: NSInstrumentation::CTypeIsolation<163840,640>::AllocateType(void)+D0j
.text:00000001C0039975 xor edx, edx ; Val
.text:00000001C0039977 mov r8d, 1000h ; Size
.text:00000001C003997D mov rcx, rbp ; Dst
.text:00000001C0039980 call memset ; this memset actually commits the memory
.text:00000001C0039985 inc dword ptr [rsi+24h] ; CSectionBitmapAllocator->num_commited_views++
.text:00000001C0039988 xor ebp, ebp
Whether or not a new View had to be committed, the index of the clear bit of the RTL_BITMAP is then set to 1 by calling nt!RtlSetBit(), in order to mark that bit as busy. Curiously enough, the code calls nt!RtlTestBit() before setting the bit to 1, but the return value is not checked at all. Also, the bitmap_hint_index member of the CSectionBitmapAllocator is incremented by 1, resetting it to 0 if it happens to exceed the maximum value of 0xF0 - 1.
.text:00000001C003998A mov rcx, [rsi+18h] ; rcx = CSectionBitmapAllocator->xored_rtl_bitmap
.text:00000001C003998E mov edx, r12d ; BitNumber = free bit index
.text:00000001C0039991 xor rcx, [rsi+10h] ; BitMapHeader = CSectionBitmapAllocator->xored_rtl_bitmap
.text:00000001C0039991 ; ^ CSectionBitmapAllocator->xor_key
.text:00000001C0039995 call cs:__imp_RtlTestBit ; [!] return value not checked
.text:00000001C003999B mov rcx, [rsi+18h] ; rcx = CSectionBitmapAllocator->xored_rtl_bitmap
.text:00000001C003999F mov edx, r12d ; BitNumber
.text:00000001C00399A2 xor rcx, [rsi+10h] ; BitMapHeader = xored_rtl_bitmap ^ xor_key
.text:00000001C00399A6 call cs:__imp_RtlSetBit
.text:00000001C00399AC inc dword ptr [rsi+20h] ; CSectionBitmapAllocator->bitmap_hint_index++
.text:00000001C00399AF cmp dword ptr [rsi+20h], 0F0h ; CSectionBitmapAllocator->bitmap_hint_index >= bitmap size?
.text:00000001C00399B6 jnb short loc_1C0039A27
[...]
.text:00000001C0039A27 loc_1C0039A27: ; CODE XREF: NSInstrumentation::CTypeIsolation<163840,640>::AllocateType(void)+1B6j
.text:00000001C0039A27 mov [rsi+20h], ebp ; CSectionBitmapAllocator->bitmap_hint_index = 0
.text:00000001C0039A2A jmp short loc_1C00399B8
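The handling of bitmap_hint_index, both when it is passed to nt!RtlFindClearBits and when it is updated after the allocation, can be summarized with these two small helpers. They are a sketch with hypothetical names, operating on plain integers instead of on the CSectionBitmapAllocator object.

```c
#include <assert.h>
#include <stdint.h>

#define BITMAP_BITS 0xF0u /* number of bits in the RTL_BITMAP */

/* Hint passed to the search: the stored hint, or 0 when it is out of range. */
uint32_t search_hint(uint32_t bitmap_hint_index)
{
    return bitmap_hint_index < BITMAP_BITS ? bitmap_hint_index : 0;
}

/* Hint stored after a successful allocation: incremented, wrapping to 0. */
uint32_t advance_hint(uint32_t bitmap_hint_index)
{
    uint32_t next = bitmap_hint_index + 1;
    if (next >= BITMAP_BITS)
        next = 0;
    assert(next < BITMAP_BITS);
    return next;
}
```

Note that the hint simply advances by one after every allocation; it is not set to the index of the bit that was just taken.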
Now that we have mapped our clear bit into its corresponding View, we need to select a 0x280-byte chunk within that View. Each View can hold 6 SURFACE headers (0x1000 / 0x280 == 6, using integer division), so the slot within the View is calculated as free_bit_index - view_index * 6, which simply equals free_bit_index % 6.
.text:00000001C00399B8 mov rax, [rsi+10h] ; rax = CSectionBitmapAllocator->xor_key
.text:00000001C00399BC mov ecx, r15d ; ecx = view_index
.text:00000001C00399BF mov rsi, [rsi+8] ; rsi = CSectionBitmapAllocator->xored_view
.text:00000001C00399C3 xor edx, edx
.text:00000001C00399C5 shl ecx, 0Ch ; ecx = view_index * 0x1000
.text:00000001C00399C8 xor rsi, rax ; rsi = xored_view ^ xor_key
.text:00000001C00399CB add rsi, rcx ; rsi = view + view_index * 0x1000
.text:00000001C00399CE mov rcx, rbx ; rcx = CSectionBitmapAllocator->pushlock
.text:00000001C00399D1 call cs:__imp_ExReleasePushLockExclusiveEx
.text:00000001C00399D7 call cs:__imp_KeLeaveCriticalRegion
.text:00000001C00399DD lea eax, [r15+r15*2] ; r15 == view_index
.text:00000001C00399E1 add eax, eax
.text:00000001C00399E3 sub r12d, eax ; r12d = free_bit_index - view_index * 6 == free_bit_index % 6
.text:00000001C00399E6 lea ebx, [r12+r12*4]
.text:00000001C00399EA shl ebx, 7 ; ebx = r12 * 0x5 * 0x80 == r12 * 0x280
.text:00000001C00399ED add rbx, rsi ; rbx += view + view_index * 0x1000
The value that RBX gets at 0x1C00399ED is the address of the newly allocated SURFACE header, and this is the value that will be returned by CTypeIsolation::AllocateType().
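Putting the last two steps together, the address returned for a given bit index can be expressed as a single function. This is a sketch with a hypothetical name; first_view stands for the decoded (un-XORed) address of the first View, passed in as a plain integer.

```c
#include <assert.h>
#include <stdint.h>

#define VIEW_SIZE      0x1000u
#define SLOT_SIZE      0x280u
#define SLOTS_PER_VIEW 6u
#define SLOT_COUNT     0xF0u

/* Address of the SURFACE header slot that corresponds to `free_bit_index`. */
uint64_t slot_address(uint64_t first_view, uint32_t free_bit_index)
{
    assert(free_bit_index < SLOT_COUNT);
    uint32_t view_index = free_bit_index / SLOTS_PER_VIEW;
    uint32_t slot_in_view = free_bit_index - view_index * SLOTS_PER_VIEW;

    return first_view
         + (uint64_t)view_index * VIEW_SIZE
         + (uint64_t)slot_in_view * SLOT_SIZE;
}
```

For instance, bit index 9 lands in the second View (view_index 1) at slot 3, that is, at offset 0x1000 + 3 * 0x280 == 0x1780 from the first View.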
For the sake of completeness, and as promised, here's what happens when nt!RtlFindClearBits() returns -1, meaning that the RTL_BITMAP of the current CSectionEntry is full. In that case, the following conditional jump is taken:
.text:00000001C0039870 mov rcx, [rsi+18h] ; rcx = CSectionBitmapAllocator->xored_rtl_bitmap
.text:00000001C0039874 mov edx, 1 ; NumberToFind
.text:00000001C0039879 xor rcx, [rsi+10h] ; BitMapHeader = xored_rtl_bitmap ^ xor_key
.text:00000001C003987D call cs:__imp_RtlFindClearBits
.text:00000001C0039883 mov r12d, eax ; r12 = free_bit_index
.text:00000001C0039886 cmp eax, 0FFFFFFFFh ; free_bit_index == -1?
.text:00000001C0039889 jz short loc_1C00398D6 ; if so, RTL_BITMAP is full, check another CSectionEntry
That jump takes us here, where it checks if CSectionEntry->next == CTypeIsolation, meaning that we've reached the end of the list of CSectionEntry objects. If that's not the case, it loops and repeats the process with the next CSectionEntry object.
.text:00000001C00398D6 loc_1C00398D6: ; CODE XREF: NSInstrumentation::CTypeIsolation<163840,640>::AllocateType(void)+89j
.text:00000001C00398D6 lea rcx, [rsp+48h+arg_0]
.text:00000001C00398DB call NSInstrumentation::CAutoExclusiveCReaderWriterLock<NSInstrumentation::CPlatformReaderWriterLock>::~CAutoExclusiveCReaderWriterLock<NSInstrumentation::CPlatformReaderWriterLock>(void)
.text:00000001C00398E0 loc_1C00398E0: ; CODE XREF: NSInstrumentation::CTypeIsolation<163840,640>::AllocateType(void)+1F0j
.text:00000001C00398E0 mov r14, [r14] ; r14 = CSectionEntry->next
.text:00000001C00398E3 mov ebp, 0
.text:00000001C00398E8 cmp r14, r13 ; CSectionEntry->next == CTypeIsolation ?
.text:00000001C00398EB jnz loc_1C0039843 ; if not, keep traversing the list
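The traversal logic can be modelled like this. It is a simplified sketch: the struct and function names are hypothetical, the list head is represented with the same struct as the entries for brevity (the real head is the CTypeIsolation object), and a plain counter stands in for the RTL_BITMAP lookup.

```c
#include <assert.h>
#include <stddef.h>

#define SLOTS_PER_ENTRY 0xF0u

/* Simplified stand-in for a CSectionEntry; `used_slots` replaces the RTL_BITMAP. */
struct entry {
    struct entry *next;
    struct entry *previous;
    unsigned used_slots;
};

/*
 * Walk the circular list starting right after the head. Returns the first
 * entry with a free slot, or NULL when the walk arrives back at the head.
 */
struct entry *find_entry_with_free_slot(struct entry *head)
{
    assert(head != NULL);
    for (struct entry *e = head->next; e != head; e = e->next) {
        if (e->used_slots < SLOTS_PER_ENTRY)
            return e;
    }
    return NULL;
}
```

A NULL result corresponds to the case described next, where every CSectionEntry is full and a new one has to be created.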
Otherwise, if we've reached the end of the list of CSectionEntry objects without finding an empty slot (that is, every CSectionEntry is holding its maximum of 0xF0 SURFACE headers), the following code is reached. As shown below, it creates a new CSectionEntry, and it calls CSectionBitmapAllocator::Allocate() on the CSectionBitmapAllocator member of this new CSectionEntry. As expected, CSectionBitmapAllocator::Allocate() mostly duplicates the procedure explained before: it finds a clear bit in the RTL_BITMAP, it commits the 0x1000-byte View corresponding to said free bit, it marks that bit as busy in the RTL_BITMAP, and finally it returns the address of the newly created SURFACE header within the committed View.
.text:00000001C00398F1 loc_1C00398F1: ; CODE XREF: NSInstrumentation::CTypeIsolation<163840,640>::AllocateType(void)+3Dj
.text:00000001C00398F1 xor edx, edx ; if we land here, that means that we finished traversing
.text:00000001C00398F1 ; the list of CSectionEntry, without finding an empty slot
.text:00000001C00398F3 mov rcx, rdi
.text:00000001C00398F6 call cs:__imp_ExReleasePushLockSharedEx
.text:00000001C00398FC call cs:__imp_KeLeaveCriticalRegion
.text:00000001C0039902 call NSInstrumentation::CSectionEntry<163840,640>::Create(void)
.text:00000001C0039907 mov rdi, rax ; rdi = new CSectionEntry
.text:00000001C003990A test rax, rax
.text:00000001C003990D jz short loc_1C003996D
.text:00000001C003990F mov rcx, [rax+20h] ; rcx = CSectionEntry->bitmap_allocator
.text:00000001C0039913 call NSInstrumentation::CSectionBitmapAllocator<163840,640>::Allocate(void) ; *** do the actual SURFACE header allocation
.text:00000001C0039918 mov rbp, rax ; rbp = return value, allocated SURFACE header
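The slot search and the address computation can be modeled in user-mode C. This is only a rough sketch: the `section_model` type, the `find_clear_bit` helper and the linear scan are assumptions made for illustration, whereas the real code calls nt!RtlFindClearBits on the RTL_BITMAP, commits the View and takes locks, none of which is modeled here. The constants (0x1000-byte Views, 0x280-byte headers, 0xF0 slots per CSectionEntry) come from the analysis.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VIEW_SIZE        0x1000u  /* bytes per View */
#define SLOT_SIZE        0x280u   /* bytes per SURFACE header */
#define SLOTS_PER_VIEW   6u       /* 6 * 0x280 = 0xF00 bytes used per View */
#define SLOTS_PER_ENTRY  0xF0u    /* 0x28 Views * 6 slots */
#define NO_FREE_SLOT     0xFFFFFFFFu

/* Hypothetical user-mode stand-in for one CSectionEntry's allocator state. */
typedef struct {
    uint64_t view_base;                  /* address of the first View */
    uint8_t  bits[SLOTS_PER_ENTRY / 8];  /* one bit per slot, 1 = busy */
} section_model;

/* Linear scan for the first clear bit; returns NO_FREE_SLOT when full. */
static uint32_t find_clear_bit(const section_model *s)
{
    for (uint32_t i = 0; i < SLOTS_PER_ENTRY; i++) {
        if (!(s->bits[i / 8] & (1u << (i % 8))))
            return i;
    }
    return NO_FREE_SLOT;
}

/* Marks the first free slot as busy and returns its address, or 0 if full. */
static uint64_t allocate_slot(section_model *s)
{
    assert(s != NULL);
    uint32_t bit = find_clear_bit(s);
    if (bit == NO_FREE_SLOT)
        return 0;
    s->bits[bit / 8] |= (uint8_t)(1u << (bit % 8));
    return s->view_base
         + (uint64_t)(bit / SLOTS_PER_VIEW) * VIEW_SIZE
         + (uint64_t)(bit % SLOTS_PER_VIEW) * SLOT_SIZE;
}
```

With an empty model, the first six calls return addresses inside the first View, and the seventh returns the start of the second View.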
Finally, the newly created CSectionEntry is inserted at the end of the doubly linked list, as detailed below. Notice that there is an integrity check before operating with the pointers of the list: the code verifies if the next pointer of CTypeIsolation->previous points to the CTypeIsolation head.
.text:00000001C0039939 mov rcx, [r13+8] ; rcx = CTypeIsolation->previous
.text:00000001C003993D cmp [rcx], r13 ; CTypeIsolation->previous->next == CTypeIsolation ?
.text:00000001C0039940 jnz FatalListEntryError_9 ; if not, the list is corrupted
.text:00000001C0039946 mov [rdi+8], rcx ; CSectionEntry->previous = CTypeIsolation->previous
.text:00000001C003994A xor edx, edx
.text:00000001C003994C mov [rdi], r13 ; CSectionEntry->next = CTypeIsolation
.text:00000001C003994F mov [rcx], rdi ; CTypeIsolation->previous->next = CSectionEntry
.text:00000001C0039952 mov rcx, rbx
.text:00000001C0039955 add dword ptr [r13+18h], 0F0h ; CTypeIsolation->size += 0xF0
.text:00000001C003995D mov [r13+8], rdi ; CTypeIsolation->previous = CSectionEntry
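The traversal and the checked tail insertion follow the usual pattern for a circular doubly linked list. The sketch below is a user-mode illustration, not the kernel code: `list_node` and the three helpers are hypothetical names, and the sketch returns false where the kernel jumps to the fatal list entry error path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node: the CTypeIsolation head and each CSectionEntry both
 * begin with a next pointer at +0 and a previous pointer at +8. */
typedef struct list_node {
    struct list_node *next;
    struct list_node *previous;
} list_node;

/* An empty circular list has the head pointing to itself. */
static void list_init(list_node *head)
{
    head->next = head;
    head->previous = head;
}

/* Appends entry at the tail. Returns false if head->previous->next does not
 * point back to head, which is the corruption check made in the listing. */
static bool list_insert_tail(list_node *head, list_node *entry)
{
    assert(head != NULL && entry != NULL);
    list_node *tail = head->previous;
    if (tail->next != head)
        return false;              /* list is corrupted */
    entry->previous = tail;
    entry->next = head;
    tail->next = entry;
    head->previous = entry;
    return true;
}

/* Counts entries by following next pointers until the head is reached. */
static size_t list_count(const list_node *head)
{
    size_t n = 0;
    for (const list_node *p = head->next; p != head; p = p->next)
        n++;
    return n;
}
```

The loop condition `p != head` corresponds to the `cmp r14, r13` test that ends the search in AllocateType().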
Deallocation
Deallocation of SURFACE objects is done in the win32kbase!SURFACE::Free() function. This function starts by freeing the pool allocation that holds the pixel data buffer:
.text:00000001C002DC9A cmp byte ptr [rbp+270h], 0 ; boolean is_kernel_mode_pixel_data_buffer
.text:00000001C002DCA1 loc_1C002DCA1: ; DATA XREF: .rdata:00000001C017D540o
.text:00000001C002DCA1 mov [rsp+48h+arg_8], rbx
.text:00000001C002DCA6 jz short loc_1C002DCCC ; if byte[SURFACE+0x270] == 0, the pixel data buffer is not freed
.text:00000001C002DCA8 mov rbx, [rbp+48h] ; rbx = SURFACE->pvScan0
.text:00000001C002DCAC test rbx, rbx
.text:00000001C002DCAF jz short loc_1C002DCCC
.text:00000001C002DCB1 call cs:__imp_IsWin32FreePoolImplSupported
.text:00000001C002DCB7 test eax, eax
.text:00000001C002DCB9 js short loc_1C002DCC4
.text:00000001C002DCBB mov rcx, rbx
.text:00000001C002DCBE call cs:__imp_Win32FreePoolImpl ; frees the pixel data buffer
After that, it takes the CTypeIsolation head and starts traversing the doubly linked list of CSectionEntry objects, trying to determine which CSectionEntry contains the SURFACE header that it's trying to free. To do this, it checks whether CSectionEntry->view <= SURFACE <= CSectionEntry->view + 0x28000. This check looks off by one: the section covers the addresses from view up to view + 0x27FFF, so the second comparison should probably be strict (< instead of <=).
.text:00000001C002DCCC mov rax, cs:uchar * * gpTypeIsolation
.text:00000001C002DCD3 mov rsi, [rax] ; rsi = CTypeIsolation head
[...]
.text:00000001C002DD08 mov rbx, [rsi] ; rbx = CTypeIsolation->next
.text:00000001C002DD0B cmp rbx, rsi ; next == CTypeIsolation ?
.text:00000001C002DD0E jz loc_1C002DDFF ; if so, there's no CSectionEntry
.text:00000001C002DD14 mov r12, 0CCCCCCCCCCCCCCCDh
.text:00000001C002DD1E xchg ax, ax
.text:00000001C002DD20 loc_1C002DD20: ; CODE XREF: SURFACE::Free(SURFACE *)+C5j
.text:00000001C002DD20 mov r14, [rbx+20h] ; r14 = CSectionEntry->bitmap_allocator
.text:00000001C002DD24 mov r8, [r14+10h] ; r8 = bitmap_allocator->xor_key
.text:00000001C002DD28 mov rax, r8
.text:00000001C002DD2B xor rax, [r14+8] ; rax = xor_key ^ xored_view
.text:00000001C002DD2F cmp rbp, rax ; SURFACE < view?
.text:00000001C002DD32 jb short loc_1C002DD3F ; ...if so, skip to the next CSectionEntry
.text:00000001C002DD34 add rax, 28000h ; view += section_size
.text:00000001C002DD3A cmp rbp, rax ; SURFACE <= end of last view?
.text:00000001C002DD3D jbe short loc_1C002DD4C ; if so, we found the view containing the SURFACE header
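The difference between the two bounds is easy to see side by side. In this sketch both function names are made up for illustration; the first mirrors the jb/jbe pair from the listing, and the second is the half-open variant that would exclude the first byte past the section.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SECTION_SIZE 0x28000u  /* 0x28 Views of 0x1000 bytes each */

/* Mirrors the jb/jbe pair in the listing: the upper bound is inclusive. */
static bool in_section_as_implemented(uint64_t surface, uint64_t view)
{
    assert(view <= UINT64_MAX - SECTION_SIZE);  /* no wraparound in the sum */
    return surface >= view && surface <= view + SECTION_SIZE;
}

/* Half-open variant: the upper bound is exclusive. */
static bool in_section_half_open(uint64_t surface, uint64_t view)
{
    assert(view <= UINT64_MAX - SECTION_SIZE);
    return surface >= view && surface < view + SECTION_SIZE;
}
```

The two functions only disagree for an address exactly equal to view + 0x28000, which the implemented check accepts even though it lies one byte past the end of the section.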
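When these conditions are satisfied, meaning that we've found the CSectionEntry containing the SURFACE header to be freed, the index of that SURFACE within its container View (called here index_within_view) is calculated by taking the 3 lower nibbles (the low 12 bits) of the address of the SURFACE and dividing them by 0x280. The compiler implements this division as a multiplication by the constant 0xCCCCCCCCCCCCCCCD followed by a right shift. The code then verifies that the offset is an exact multiple of 0x280; if it isn't, it skips to the next CSectionEntry: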
.text:00000001C002DD4C loc_1C002DD4C: ; CODE XREF: SURFACE::Free(SURFACE *)+BDj
.text:00000001C002DD4C mov rcx, rbp ; rcx = SURFACE header
.text:00000001C002DD4F mov rax, r12
.text:00000001C002DD52 and ecx, 0FFFh
.text:00000001C002DD58 mul rcx
.text:00000001C002DD5B mov r15, rdx
.text:00000001C002DD5E shr r15, 9 ; r15 = (SURFACE & 0xfff) / 0x280 == index_within_view
.text:00000001C002DD62 lea rax, [r15+r15*4]
.text:00000001C002DD66 shl rax, 7 ; rax = r15 * 0x5 * 0x80 == r15 * 0x280
.text:00000001C002DD6A sub rcx, rax ; if rcx == rax, it's ok
.text:00000001C002DD6D jnz short loc_1C002DD3F
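The multiply-and-shift sequence is a standard compiler replacement for division by a constant: 0xCCCCCCCCCCCCCCCD is the reciprocal of 5 scaled by 2^66, so taking the high 64 bits of the product and shifting right by 9 divides by 5 * 128 = 0x280. The following sketch reproduces the computation in portable C; `mul_high64` and `index_within_view` are illustrative names, and returning -1 for a misaligned address is a convention of this sketch, not of the kernel code.

```c
#include <assert.h>
#include <stdint.h>

#define SLOT_SIZE 0x280u
#define MAGIC     0xCCCCCCCCCCCCCCCDull  /* constant loaded into r12 */

/* Upper 64 bits of the 128-bit product a * b (what MUL leaves in rdx). */
static uint64_t mul_high64(uint64_t a, uint64_t b)
{
    uint64_t a_lo = a & 0xFFFFFFFFu, a_hi = a >> 32;
    uint64_t b_lo = b & 0xFFFFFFFFu, b_hi = b >> 32;
    uint64_t p0 = a_lo * b_lo, p1 = a_lo * b_hi;
    uint64_t p2 = a_hi * b_lo, p3 = a_hi * b_hi;
    uint64_t carry = ((p0 >> 32) + (p1 & 0xFFFFFFFFu) + (p2 & 0xFFFFFFFFu)) >> 32;
    return p3 + (p1 >> 32) + (p2 >> 32) + carry;
}

/* Returns the slot index inside the View, or -1 if the address does not
 * point at the start of a 0x280-byte slot. */
static int index_within_view(uint64_t surface)
{
    uint64_t offset = surface & 0xFFFu;               /* and ecx, 0FFFh */
    uint64_t index = mul_high64(MAGIC, offset) >> 9;  /* mul rcx; shr r15, 9 */
    assert(index == offset / SLOT_SIZE);              /* same as plain division */
    if (offset - index * SLOT_SIZE != 0)              /* sub rcx, rax; jnz */
        return -1;
    return (int)index;
}
```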
Then, the address of the SURFACE needs to be mapped to the bit index that represents it in the RTL_BITMAP. To obtain that bit index, the code first computes the view_index (that is, which 0x1000-byte View this SURFACE object is located in). Since each View holds 6 SURFACE headers of 0x280 bytes, the bit index is simply view_index * 6 + index_within_view.
.text:00000001C002DD72 mov eax, ebp ; eax = lo_dword(SURFACE)
.text:00000001C002DD74 xor ecx, [r14+8] ; ecx = lo_dword(xor_key) ^ lo_dword(xored_view)
.text:00000001C002DD78 sub eax, ecx ; eax = lo_dword(SURFACE) - lo_dword(view)
.text:00000001C002DD7A mov rcx, [r14+18h] ; rcx = CSectionBitmapAllocator->xored_rtl_bitmap
.text:00000001C002DD7E shr eax, 0Ch ; eax /= 0x1000 == view_index
.text:00000001C002DD81 xor rcx, r8 ; BitMapHeader = xored_rtl_bitmap ^ xor_key
.text:00000001C002DD84 lea eax, [rax+rax*2]
.text:00000001C002DD87 lea edx, [r15+rax*2] ; BitNumber = view_index * 6 + index_within_view
.text:00000001C002DD8B call cs:__imp_RtlTestBit
.text:00000001C002DD91 test al, al
.text:00000001C002DD93 jz short loc_1C002DD3F ; bit is turned off?
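As a worked example of this mapping, the sketch below computes the bit number for a slot-aligned address. The function name `surface_to_bit` is hypothetical, and the sketch assumes the View base is page-aligned, as it is in the debugger output of this post.

```c
#include <assert.h>
#include <stdint.h>

#define SLOT_SIZE      0x280u
#define SLOTS_PER_VIEW 6u

/* Maps a slot-aligned SURFACE header address to its bit number. */
static uint32_t surface_to_bit(uint64_t surface, uint64_t view)
{
    /* Like the listing, subtract only the low dwords: that is enough
     * because the whole section spans just 0x28000 bytes. */
    uint32_t delta = (uint32_t)surface - (uint32_t)view;
    uint32_t view_index = delta >> 12;                    /* delta / 0x1000 */
    uint32_t offset = (uint32_t)surface & 0xFFFu;
    assert(offset % SLOT_SIZE == 0);                      /* slot-aligned only */
    uint32_t index_within_view = offset / SLOT_SIZE;
    return view_index * SLOTS_PER_VIEW + index_within_view;
}
```

For instance, a header located at view + 0x20000 sits in View 0x20, slot 0, and therefore maps to bit 0x20 * 6 = 0xC0.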
The bit at the calculated index is tested via nt!RtlTestBit(); if it's set to 1, as expected, the execution flow continues in the code snippet below. As shown here, it calls CSectionBitmapAllocator::ContainsAllocation() (although the boolean value returned by this function is not checked at all), and then it clears the proper bit in the RTL_BITMAP by calling nt!RtlClearBit(), marking the slot as free. Finally, it clears the memory of the freed SURFACE header by calling memset(), and the bit index of the freed slot is saved as the bitmap_hint_index, in order to speed up future operations.
.text:00000001C002DDA9 mov rdx, rbp ; rdx = SURFACE header
.text:00000001C002DDAC mov rcx, r14 ; rcx = bitmap_allocator
.text:00000001C002DDAF call NSInstrumentation::CSectionBitmapAllocator<163840,640>::ContainsAllocation(void const *)
.text:00000001C002DDB4 mov ecx, [r14+8] ; ecx = CSectionBitmapAllocator->xored_view
.text:00000001C002DDB8 mov eax, ebp ; [!] return value from ContainsAllocation() is not checked
.text:00000001C002DDBA xor ecx, [r14+10h] ; CSectionBitmapAllocator->xored_view ^ CSectionBitmapAllocator->xor_key
.text:00000001C002DDBE sub eax, ecx ; eax = lo_dword(SURFACE) - lo_dword(view)
.text:00000001C002DDC0 mov rcx, [r14+18h] ; rcx = CSectionBitmapAllocator->xored_rtl_bitmap
.text:00000001C002DDC4 xor rcx, [r14+10h] ; BitMapHeader = xored_rtl_bitmap ^ xor_key
.text:00000001C002DDC8 shr eax, 0Ch ; eax /= 0x1000 == view_index
.text:00000001C002DDCB lea eax, [rax+rax*2]
.text:00000001C002DDCE lea esi, [r15+rax*2]
.text:00000001C002DDD2 mov edx, esi ; BitNumber = view_index * 6 + index_within_view
.text:00000001C002DDD4 call cs:__imp_RtlClearBit ; mark the slot as available
.text:00000001C002DDDA xor edx, edx ; Val
.text:00000001C002DDDC mov r8d, 280h ; Size
.text:00000001C002DDE2 mov rcx, rbp ; Dst
.text:00000001C002DDE5 call memset ; null-out the freed SURFACE header in the view
.text:00000001C002DDEA xor edx, edx
.text:00000001C002DDEC mov [r14+20h], esi ; bitmap_allocator->bitmap_hint_index = index of freed slot
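The final steps of the free path (test the bit, clear it, zero the header, remember the hint) can be summarized with a small user-mode model. Everything here is an illustrative assumption: `allocator_model`, `test_bit` and `free_slot` are made-up names, a plain byte array stands in for the RTL_BITMAP, and returning false stands in for the jump that skips to the next CSectionEntry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SLOT_SIZE   0x280u
#define SLOT_COUNT  0xF0u

/* Hypothetical user-mode stand-in for the allocator state touched here. */
typedef struct {
    uint8_t  bits[SLOT_COUNT / 8];  /* 1 = busy, 0 = free */
    uint32_t bitmap_hint_index;
} allocator_model;

static bool test_bit(const allocator_model *a, uint32_t bit)
{
    return (a->bits[bit / 8] >> (bit % 8)) & 1u;
}

/* Frees one slot. Returns false if the bit was already clear; otherwise
 * clears the bit, zeroes the header and records the hint. */
static bool free_slot(allocator_model *a, uint32_t bit, void *header)
{
    assert(bit < SLOT_COUNT);
    if (!test_bit(a, bit))
        return false;
    a->bits[bit / 8] &= (uint8_t)~(1u << (bit % 8));  /* RtlClearBit */
    memset(header, 0, SLOT_SIZE);                     /* wipe freed header */
    a->bitmap_hint_index = bit;                       /* next search hint */
    return true;
}
```

Because the bit is tested before it is cleared, freeing the same slot twice fails the second time in this model.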
WinDbg extension
While reverse engineering Win32k Type Isolation I developed a little WinDbg extension to help me dump the state of the Type Isolation structures. It is available at https://github.com/fdfalcon/TypeIsolationDbg.
The WinDbg extension provides the following commands:
!gptypeisolation [address] : prints the top-level CTypeIsolation structure (default address: win32kbase!gpTypeIsolation)
!typeisolation [address] : prints a NSInstrumentation::CTypeIsolation structure
!sectionentry [address] : prints a NSInstrumentation::CSectionEntry structure
!sectionbitmapallocator [address] : prints a NSInstrumentation::CSectionBitmapAllocator structure
!rtlbitmap [address] : prints an RTL_BITMAP structure
The output of the extension includes some clickable links to help you follow the Type Isolation data structures. It also decodes XORed pointers to save you a step. The following snippet shows the output of TypeIsolationDbg when dumping the global CTypeIsolation object and following the data structures for a single CSectionEntry, all the way down to the map of bits representing the busy/free state of the CSectionEntry's slots:
kd> !gptypeisolation
win32kbase!gpTypeIsolation is at address 0xffffe6cf95138a98.
Pointer [1] stored at win32kbase!gpTypeIsolation: 0xffffe6a4400006b0.
Pointer [2]: 0xffffe6a440000680.
NSInstrumentation::CTypeIsolation
+0x000 next : 0xffffe6a440000620
+0x008 previous : 0xffffe6a441d8ca20
+0x010 pushlock : 0xffffe6a440000660
+0x018 size : 0xF00 [number of section entries: 0x10]
kd> !sectionentry ffffe6a440000620
NSInstrumentation::CSectionEntry
+0x000 next : 0xffffe6a441ca2470
+0x008 previous : 0xffffe6a440000680
+0x010 section : 0xffff86855f09f260
+0x018 view : 0xffffe6a4403a0000
+0x020 bitmap_allocator : 0xffffe6a4400005e0
kd> !sectionbitmapallocator ffffe6a4400005e0
NSInstrumentation::CSectionBitmapAllocator
+0x000 pushlock : 0xffffe6a4400005c0
+0x008 xored_view : 0xa410b31c3f332f4c [decoded: 0xffffe6a4403a0000]
+0x010 xor_key : 0x5bef55b87f092f4c
+0x018 xored_rtl_bitmap : 0xa410b31c3f092acc [decoded: 0xffffe6a440000580]
+0x020 bitmap_hint_index : 0xC0
+0x024 num_commited_views : 0x27
kd> !rtlbitmap ffffe6a440000580
RTL_BITMAP
+0x000 size : 0xF0
+0x008 bitmap_buffer : 0xffffe6a440000590
kd> dyb ffffe6a440000590 L20
76543210 76543210 76543210 76543210
-------- -------- -------- --------
ffffe6a4`40000590 00000101 00000000 00000110 10110000 05 00 06 b0
ffffe6a4`40000594 00011100 10000000 11011011 11110110 1c 80 db f6
ffffe6a4`40000598 01111101 11111111 11111111 11111111 7d ff ff ff
ffffe6a4`4000059c 11111111 11011111 11110111 01111111 ff df f7 7f
ffffe6a4`400005a0 11111111 11111111 11111111 01111111 ff ff ff 7f
ffffe6a4`400005a4 11111101 11111001 11111111 01101111 fd f9 ff 6f
ffffe6a4`400005a8 11111110 11111111 11111111 11111111 fe ff ff ff
ffffe6a4`400005ac 11111111 00000011 00000000 00000000 ff 03 00 00
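The decoding that the extension performs on the CSectionBitmapAllocator fields is a single XOR with the per-allocator key. The sketch below reproduces it by hand; the `allocator_fields` struct is a hypothetical mirror of the three fields at offsets 0x08, 0x10 and 0x18, with names taken from the extension's output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the obfuscated fields; names follow the
 * TypeIsolationDbg output. */
typedef struct {
    uint64_t xored_view;        /* +0x08 */
    uint64_t xor_key;           /* +0x10 */
    uint64_t xored_rtl_bitmap;  /* +0x18 */
} allocator_fields;

/* Recovers the plain pointer to the first View. */
static uint64_t decode_view(const allocator_fields *f)
{
    assert(f != NULL);
    return f->xored_view ^ f->xor_key;
}

/* Recovers the plain pointer to the RTL_BITMAP. */
static uint64_t decode_rtl_bitmap(const allocator_fields *f)
{
    assert(f != NULL);
    return f->xored_rtl_bitmap ^ f->xor_key;
}
```

Feeding it the values from the session above (xored_view 0xa410b31c3f332f4c, xor_key 0x5bef55b87f092f4c, xored_rtl_bitmap 0xa410b31c3f092acc) yields 0xffffe6a4403a0000 and 0xffffe6a440000580, matching the decoded pointers printed by the extension.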
Conclusion
The Type Isolation mitigation implemented in the Win32k component of Windows 10 1709 modifies the way GDI Bitmap objects are allocated in kernel space: the SURFACE header gets allocated on a Section View, while the pixel data buffer is allocated on the PagedPoolSession pool. This definitely eliminates the commodity exploitation technique of using Bitmaps as targets for limited memory corruption vulnerabilities, since it's not possible anymore to make an aligned spray of adjacent Bitmaps where the end of a pixel data buffer is immediately followed by the header of the next SURFACE object.
Meanwhile, exploit writers have already transitioned to other useful kernel objects, such as Palettes [5] [6] [7].
As a curiosity, the CSectionBitmapAllocator object keeps both the pointer to the Section Views and the pointer to the RTL_BITMAP obfuscated via a XOR operation. However, the parent CSectionEntry structure stores the same pointer to the Views unobfuscated, which largely defeats the purpose of obfuscating it.
Thanks
A big thanks goes to my colleagues at Quarkslab for proof-reading this blogpost and providing feedback about it.
References
[1] https://msdn.microsoft.com/en-us/library/dd183377(v=vs.85).aspx
[2] https://www.coresecurity.com/system/files/publications/2016/10/Abusing-GDI-Reloaded-ekoparty-2016_0.pdf
[3] https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/section-objects-and-views
[4] https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/managing-memory-sections
[5] https://sensepost.com/blog/2017/abusing-gdi-objects-for-ring0-primitives-revolution/
[6] https://labs.bluefrostsecurity.de/files/Abusing_GDI_for_ring0_exploit_primitives_Evolution_Slides.pdf
[7] http://theevilbit.blogspot.com/2017/10/abusing-gdi-objects-for-kernel.html