Reverse Engineering the Win32k Type Isolation Mitigation

Posted Fri 02 February 2018
Author Francisco Falcon
Category Reverse-Engineering
Tags Windows, exploitation, reverse-engineering, 2018

GDI Bitmap objects became extremely popular for exploiting kernel vulnerabilities, because almost any kind of memory corruption vulnerability (except for NULL-writes) could be used to reliably gain arbitrary R/W primitives over kernel memory by abusing Bitmaps. Microsoft therefore decided to kill exploitation techniques based on Bitmaps. To do this, Windows 10 Fall Creators Update (also known as Windows 10 1709) introduced the Type Isolation feature, an exploitation mitigation in the Win32k subsystem that splits the memory layout of SURFACE objects, the internal representation of Bitmaps on the kernel side. This blog post takes a deep dive into the details of how Type Isolation is implemented.

Analysis Notes

This analysis was initially performed using win32kbase.sys version 10.0.16288.1 on Windows 10 Fall Creators Update x64, which was one of the last Insider Preview builds before the general availability of Windows 10 1709 in October 2017. A binary diff between that version and win32kbase.sys version 10.0.16299.125, the latest available as of this writing (end of January 2018), shows that the functions relevant to Type Isolation remain unchanged.

Context

Since mid-2015, Bitmaps [1], a type of GDI object, have been the preferred choice of exploit developers when exploiting kernel vulnerabilities on Windows. The data structure representing this type of object in the Windows kernel turned out to have some very handy members which, when corrupted via a memory safety vulnerability, could provide an attacker with full-blown R/W access to the kernel address space.

On the kernel side, a Bitmap is represented by a SURFACE object using the following structure:

typedef struct _SURFACE {
    BASEOBJECT BaseObject;
    SURFOBJ surfobj;
    [...]
} SURFACE;

BASEOBJECT is common to several types of objects, and it's defined this way:

typedef struct _BASEOBJECT {
    HANDLE hHmgr;
    ULONG  ulShareCount;
    USHORT cExclusiveLock;
    USHORT BaseFlags;
    PVOID  Tid;
} BASEOBJECT, *PBASEOBJECT;

But we are interested in the SURFACE-specific structure, which is called SURFOBJ, and it's defined like this:

typedef struct _SURFOBJ {
    DHSURF dhsurf;
    HSURF  hsurf;
    DHPDEV dhpdev;
    HDEV   hdev;
    SIZEL  sizlBitmap;
    ULONG  cjBits;
    PVOID  pvBits;
    PVOID  pvScan0;
    LONG   lDelta;
    ULONG  iUniq;
    ULONG  iBitmapFormat;
    USHORT iType;
    USHORT fjBitmap;
} SURFOBJ, *PSURFOBJ;

The two most interesting members of this structure are pvScan0, which points to the buffer holding the pixel data of the Bitmap, and sizlBitmap, which holds the dimensions (width and height) of the Bitmap.

There are two main ways to take advantage of SURFACE objects for exploitation purposes, by corrupting the members mentioned before:

Both the GetBitmapBits and SetBitmapBits GDI APIs operate on the pixel data buffer pointed to by the pvScan0 member of the SURFOBJ structure. Overwriting this pointer provides arbitrary R/W of kernel memory from user mode.

The sizlBitmap member of the SURFOBJ structure holds the width and height properties of the Bitmap. By overwriting sizlBitmap.cx or sizlBitmap.cy it's possible to "enlarge" the pixel data buffer. This provides R/W access to kernel memory beyond the end of the pixel data buffer.

Both ways end up setting up a Manager/Worker scheme where 2 Bitmap objects are involved; if you want to study this topic in depth, I recommend reading the slides from the 2016 Ekoparty conference presentation titled "Abusing GDI for ring0 exploit primitives: Reloaded" [2], by Diego Juárez and Nicolás Economou.

The first technique provides full R/W capabilities. Its disadvantage is that it requires a "good" vulnerability (such as a write-what-where) that allows overwriting the pvScan0 pointer with an arbitrary value.

The second technique, although less powerful at first, since it initially provides R/W access to the memory located right after the end of the pixel data buffer, has the advantage of being usable even with limited vulnerabilities; simple arbitrary decrements/increments, or writing non-arbitrary values will do the trick. The exploitation strategy when overwriting the sizlBitmap member of the SURFOBJ structure is to make two SURFACE objects (let's call them SURFACE1 and SURFACE2) adjacent in memory; by corrupting the sizlBitmap of SURFACE1, its pixel data buffer is "enlarged", thus overlapping with the adjacent SURFACE2 object. From that point, further operations on the (now enlarged) SURFACE1 can arbitrarily overwrite members of the SURFACE2 header, effectively transforming the limited R/W beyond the bounds of the SURFACE1 pixel buffer into fully arbitrary R/W capabilities.
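
To make the second technique more tangible, here is a toy user-mode model of it. It is only a sketch: TOY_HEADER, TOY_SURFACE, the arena and all the function names are hypothetical stand-ins that I made up for illustration, the layout has nothing to do with the real SURFACE structure, and pvScan0 is modeled as an offset into a small array rather than a kernel pointer. The only point is to show how enlarging the size of the first object lets a bounded write reach the header of the adjacent one.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_PIXELS 16

/* Hypothetical, heavily simplified header (NOT the real SURFOBJ). */
typedef struct {
    uint32_t cx, cy;    /* stands in for sizlBitmap */
    uint64_t pvScan0;   /* modeled as an offset into the arena */
} TOY_HEADER;

/* Header immediately followed by its pixel data, as in the old layout. */
typedef struct {
    TOY_HEADER hdr;
    uint8_t    pixels[TOY_PIXELS];
} TOY_SURFACE;

/* SURFACE1 and SURFACE2, adjacent in memory. */
static TOY_SURFACE arena[2];

static void toy_init(void)
{
    for (int i = 0; i < 2; i++) {
        arena[i].hdr.cx = TOY_PIXELS;
        arena[i].hdr.cy = 1;
        arena[i].hdr.pvScan0 = (uint64_t)i * sizeof(TOY_SURFACE)
                             + offsetof(TOY_SURFACE, pixels);
        memset(arena[i].pixels, 0, TOY_PIXELS);
    }
}

/* Bounded write, loosely in the spirit of SetBitmapBits: the bound
 * comes from the (corruptible) size stored in the header. */
static void toy_set_bits(int i, const uint8_t *src, size_t len)
{
    size_t limit = (size_t)arena[i].hdr.cx * arena[i].hdr.cy;
    size_t start = (size_t)arena[i].hdr.pvScan0;

    if (len > limit)
        len = limit;
    if (start > sizeof arena)
        return;                         /* keep the toy itself memory safe */
    if (len > sizeof arena - start)
        len = sizeof arena - start;
    memcpy((uint8_t *)arena + start, src, len);
}

/* Writes through SURFACE1 and returns SURFACE2's pvScan0 afterwards. */
uint64_t toy_overwrite_neighbor(uint64_t new_pvscan0, int corrupt_size)
{
    uint8_t payload[TOY_PIXELS + sizeof(TOY_HEADER)] = {0};
    TOY_HEADER fake = { TOY_PIXELS, 1, new_pvscan0 };

    toy_init();
    if (corrupt_size)   /* the "limited" bug: only the width changes */
        arena[0].hdr.cx = (uint32_t)(TOY_PIXELS + sizeof(TOY_HEADER));
    memcpy(payload + TOY_PIXELS, &fake, sizeof fake);
    toy_set_bits(0, payload, sizeof payload);
    return arena[1].hdr.pvScan0;
}
```

Without the size corruption the write is clamped to the 16 pixel bytes and the neighbor's header is untouched; with it, the same call replaces the neighbor's pvScan0 with an attacker-chosen value.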

As you may have noticed, this second exploitation approach has been possible because, until now, the pixel data buffer of a Bitmap was typically contiguous to the SURFACE header; the whole SURFACE object was created through a single memory allocation, big enough to hold both the SURFACE header and the pixel data buffer. It didn't necessarily have to be like that, but it was implemented that way. This made it possible for exploit developers to obtain advantageous memory layouts, where the pixel data buffer of one SURFACE object is followed by the header of another one.

Given this second exploitation approach, which allows turning almost any kind of memory corruption vulnerability (except for NULL-writes) into arbitrary R/W over kernel memory by abusing GDI Bitmap objects, Microsoft decided to kill it. To do this, Windows 10 Fall Creators Update introduced the Type Isolation feature, an exploitation mitigation in the Win32k subsystem that splits the memory layout of SURFACE objects.

Type Isolation

Data Structures

Type Isolation is implemented through a number of linked structures. There are 4 main data structures involved in this mitigation (you can find below a cute lil' diagram that explains everything in a graphical way):

CTypeIsolation
CSectionEntry
CSectionBitmapAllocator
RTL_BITMAP (although not unique to Type Isolation, this well-known Windows kernel opaque structure plays an important role here)

All of them are allocated from the PagedPoolSession pool using ExAllocatePoolWithTag. They all share a new 4-byte pool tag, which is 'Uiso'. Also, the pool tag used for the pixel data buffer of SURFACE objects has changed, from 'Gh?5' to 'Gpbm'.

The win32kbase!gpTypeIsolation static variable is a pointer to another pointer, which in turn points to a global CTypeIsolation structure. This CTypeIsolation is the head of a circular doubly-linked list of CSectionEntry objects. Each CSectionEntry manages 0xF0 SURFACE headers. Each CSectionEntry owns a CSectionBitmapAllocator object, which maintains two main objects in sync: an array of 0x28 Views over a Section [3], with each View being able to hold 6 SURFACE headers, and a map of bits (RTL_BITMAP) that keeps track of the busy or free state of each one of the 0x28 * 6 == 0xF0 available slots in the Views. The doubly-linked list of CSectionEntry objects can grow as needed.
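
The numbers in the previous paragraph all derive from three values: the Section size (0x28000), the View size (0x1000) and the slot size (0x280). Note that 0x28000 and 0x280 are 163840 and 640 in decimal, which are exactly the template arguments that show up in the CTypeIsolation<163840,640> symbol names in the disassembly. The following sketch just spells out the arithmetic; the constant names and the type_isolation_capacity() helper are mine, not names from the binary.

```c
#include <assert.h>
#include <stdint.h>

enum {
    SECTION_SIZE      = 0x28000,  /* 163840 bytes per Section      */
    VIEW_SIZE         = 0x1000,   /* one page per View             */
    SLOT_SIZE         = 0x280,    /* 640 bytes per SURFACE header  */
    VIEWS_PER_SECTION = SECTION_SIZE / VIEW_SIZE,
    SLOTS_PER_VIEW    = VIEW_SIZE / SLOT_SIZE,   /* integer division */
    SLOTS_PER_SECTION = VIEWS_PER_SECTION * SLOTS_PER_VIEW
};

static_assert(VIEWS_PER_SECTION == 0x28, "0x28 Views per Section");
static_assert(SLOTS_PER_VIEW == 6, "6 SURFACE headers per View");
static_assert(SLOTS_PER_SECTION == 0xF0, "0xF0 slots per CSectionEntry");

/* Total number of slots managed by a list of CSectionEntry objects. */
uint32_t type_isolation_capacity(uint32_t section_entries)
{
    return section_entries * SLOTS_PER_SECTION;
}
```

For example, type_isolation_capacity(3) yields 0x2D0, the value that the size member of CTypeIsolation would hold with 3 CSectionEntry instances.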

The reverse-engineered definitions of the 4 mentioned data structures follow, along with their sizes and the offsets of their members:

CTypeIsolation (size = 0x20 bytes)

typedef struct _CTYPEISOLATION {
    PCSECTIONENTRY  next;           // + 0x00
    PCSECTIONENTRY  previous;       // + 0x08
    PVOID           pushlock;       // + 0x10
    ULONG64         size;           // + 0x18
} CTYPEISOLATION, *PCTYPEISOLATION;

CSectionEntry (size = 0x28 bytes)

typedef struct _CSECTIONENTRY CSECTIONENTRY, *PCSECTIONENTRY;

struct _CSECTIONENTRY {
    CSECTIONENTRY   *next;          // + 0x00
    CSECTIONENTRY   *previous;      // + 0x08
    PVOID           section;        // + 0x10
    PVOID           view;           // + 0x18
    PCSECTIONBITMAPALLOCATOR bitmap_allocator;  // + 0x20
};

CSectionBitmapAllocator (size = 0x28 bytes)

typedef struct _CSECTIONBITMAPALLOCATOR {
    PVOID           pushlock;           // + 0x00
    ULONG64         xored_view;         // + 0x08
    ULONG64         xor_key;            // + 0x10
    ULONG64         xored_rtl_bitmap;   // + 0x18
    ULONG           bitmap_hint_index;  // + 0x20
    ULONG           num_commited_views; // + 0x24
} CSECTIONBITMAPALLOCATOR, *PCSECTIONBITMAPALLOCATOR;

RTL_BITMAP (size = 0x10 bytes)

typedef struct _RTL_BITMAP {
    ULONG64         size;               // + 0x00
    PVOID           bitmap_buffer;      // + 0x08
} RTL_BITMAP, *PRTL_BITMAP;
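
As a quick sanity check of the sizes and offsets listed above, the same layouts can be modeled with fixed-width integers and verified at compile time. This is only a sketch: the *_X64 type names are hypothetical, and every pointer is replaced by a uint64_t so that the check gives the x64 results on any host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pointers are modeled as uint64_t to mimic the x64 layout. */
typedef struct {
    uint64_t next, previous, pushlock, size;
} CTYPEISOLATION_X64;

typedef struct {
    uint64_t next, previous, section, view, bitmap_allocator;
} CSECTIONENTRY_X64;

typedef struct {
    uint64_t pushlock;
    uint64_t xored_view;
    uint64_t xor_key;
    uint64_t xored_rtl_bitmap;
    uint32_t bitmap_hint_index;
    uint32_t num_commited_views;
} CSECTIONBITMAPALLOCATOR_X64;

typedef struct {
    uint64_t size;
    uint64_t bitmap_buffer;
} RTL_BITMAP_X64;

static_assert(sizeof(CTYPEISOLATION_X64) == 0x20, "CTypeIsolation size");
static_assert(sizeof(CSECTIONENTRY_X64) == 0x28, "CSectionEntry size");
static_assert(sizeof(CSECTIONBITMAPALLOCATOR_X64) == 0x28,
              "CSectionBitmapAllocator size");
static_assert(sizeof(RTL_BITMAP_X64) == 0x10, "RTL_BITMAP size");

static_assert(offsetof(CSECTIONENTRY_X64, bitmap_allocator) == 0x20,
              "bitmap_allocator offset");
static_assert(offsetof(CSECTIONBITMAPALLOCATOR_X64, xored_rtl_bitmap) == 0x18,
              "xored_rtl_bitmap offset");
static_assert(offsetof(CSECTIONBITMAPALLOCATOR_X64, num_commited_views) == 0x24,
              "num_commited_views offset");
```

If any size or offset disagreed with the listing above, the translation unit would fail to compile.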

The following diagram tries to clarify the relationship between all the involved data structures.

This figure represents a hypothetical state of the Type Isolation structures with 3 CSectionEntry instances, each one with its associated CSectionBitmapAllocator and RTL_BITMAP instances. Since each instance of CSectionEntry manages 0xF0 SURFACE headers, the size member of the CTypeIsolation object is set to 0xF0 * 3 == 0x2D0.

The 0x28 Views of size 0x1000 backing the first CSectionEntry are also represented. In this case, only 2 out of 0x28 Views are committed; the rest of them remain unmapped until needed. The first View is full: all the 6 slots of size 0x280 are in use by SURFACE headers (the 0x100 spare bytes at the end of the page are not pictured here). The second View is only half full: 3 0x280-byte slots are in use, while the last 3 slots remain unused. At the same time, the free/busy status of each slot is kept in sync with the map of bits in the RTL_BITMAP belonging to the same CSectionEntry. In this hypothetical situation where the first 9 slots are in use and the rest are free, the map of bits would look like this: 11111111 00000001 00000000 00000000 ....
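
The bit pattern above can be reproduced with a few lines of C. This is a sketch with a hypothetical helper, mark_first_slots_busy(), that sets bits the way a byte-addressed map of bits does (bit i lives in byte i / 8, at position i % 8); it is not code from win32kbase.sys.

```c
#include <stdint.h>
#include <string.h>

enum { SLOTS_PER_SECTION = 0xF0, BITMAP_BYTES = 0x20 };

/* Marks the first `count` slots as busy in a 0x20-byte map of bits. */
void mark_first_slots_busy(uint8_t bits[BITMAP_BYTES], unsigned count)
{
    memset(bits, 0, BITMAP_BYTES);
    for (unsigned i = 0; i < count && i < SLOTS_PER_SECTION; i++)
        bits[i / 8] |= (uint8_t)(1u << (i % 8));
}
```

With count == 9, the first byte becomes 0xFF (11111111), the second one becomes 0x01 (00000001), and the rest stay at 0, matching the pattern shown above when each byte is printed most significant bit first.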

Also, notice that for the sake of simplicity this figure doesn't picture the Section object backing the View objects, since no direct access is made to the Section (all the accesses are done through the Views).

As a separate remark, the win32kbase!SURFACE::tSize static variable, which holds the size in bytes of a SURFACE header, has value 0x278. However, through all the code analyzed here, calculations are done for a size of 0x280 bytes per SURFACE header, probably just for alignment purposes.

.data:00000001C0196110 ; Exported entry 387. ?tSize@SURFACE@@0_KA
.data:00000001C0196110                 public private: static unsigned __int64 SURFACE::tSize
.data:00000001C0196110 private: static unsigned __int64 SURFACE::tSize dq 278h
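
One possible explanation for the 0x278 versus 0x280 difference, and this is purely an assumption on my side, is that the slot size is the header size rounded up to a 16-byte boundary. The generic align_up() helper below (my own name, not something found in the binary) shows that the numbers are at least consistent with that guess.

```c
#include <stdint.h>

/* Rounds `value` up to the next multiple of `alignment`,
 * which must be a power of two. */
uint64_t align_up(uint64_t value, uint64_t alignment)
{
    return (value + alignment - 1) & ~(alignment - 1);
}
```

align_up(0x278, 16) gives 0x280, and 6 of those slots take 0xF00 bytes, leaving the 0x100 spare bytes at the end of each page that were mentioned before.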

Initialization

Initialization of the Type Isolation structures happens inside win32kbase!HmgCreate(), which gets called during the initialization of the win32kbase.sys driver. It starts by allocating a pointer to the future head NSInstrumentation::CTypeIsolation structure, and saving it to the win32kbase!gpTypeIsolation global variable. Then it calls the CTypeIsolation::Create() method, which allocates the head CTypeIsolation structure.

HmgCreate+397                  mov     edx, 'osiU'
HmgCreate+39C                  mov     rcx, r14        ; size = 8 (ptr to CTypeIsolation)
HmgCreate+39F                  call    Win32AllocPool
HmgCreate+3A4                  mov     cs:uchar * * gpTypeIsolation, rax
HmgCreate+3AB                  test    rax, rax
HmgCreate+3AE                  jz      short loc_1C0012561
HmgCreate+3B0                  xor     ecx, ecx
HmgCreate+3B2                  mov     [rax], rcx
HmgCreate+3B5                  call    TypeIsolationFactory<NSInstrumentation::CTypeIsolation<163840,640>>::Create(uchar * *)

CTypeIsolation::Create() allocates 0x20 bytes for the CTypeIsolation object, and then calls CTypeIsolation::Initialize() to initialize it. If everything went fine, the address of the CTypeIsolation object is saved to the pointer referenced by win32kbase!gpTypeIsolation.

.text:00000001C001263C public: static bool TypeIsolationFactory<class NSInstrumentation::CTypeIsolation<163840, 640>>::Create(unsigned char * *) proc near
[...]
.text:00000001C001264D                 mov     edx, 20h        ; NumberOfBytes
.text:00000001C0012652                 mov     r8d, 'osiU'     ; Tag
.text:00000001C0012658                 lea     ecx, [rdx+1]    ; PoolType
.text:00000001C001265B                 call    cs:__imp_ExAllocatePoolWithTag ; allocates a NSInstrumentation::CTypeIsolation object
.text:00000001C0012661                 mov     rbx, rax        ; rbx = CTypeIsolation object
.text:00000001C0012664                 test    rax, rax
.text:00000001C0012667                 jz      short loc_1C0012699
.text:00000001C0012669                 and     qword ptr [rax+10h], 0 ; CTypeIsolation->pushlock = NULL
.text:00000001C001266E                 mov     rcx, rax
.text:00000001C0012671                 and     dword ptr [rax+18h], 0 ; CTypeIsolation->size = 0
.text:00000001C0012675                 mov     [rax+8], rax    ; CTypeIsolation->previous = this
.text:00000001C0012679                 mov     [rax], rax      ; CTypeIsolation->next = this
.text:00000001C001267C                 call    NSInstrumentation::CTypeIsolation<163840,640>::Initialize(void)
.text:00000001C0012681                 test    al, al
.text:00000001C0012683                 jz      loc_1C00BA344
.text:00000001C0012689                 mov     [rdi], rbx      ; *win32kbase!gpTypeIsolation = CTypeIsolation

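Most notably, CTypeIsolation::Initialize() creates a CSectionEntry structure by calling CSectionEntry::Create(), and links it at the tail of the list headed by the CTypeIsolation object. Since the list is still empty at this point, both the next and previous members of the CTypeIsolation object end up pointing to the new CSectionEntry: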

.text:00000001C0039A34 private: bool NSInstrumentation::CTypeIsolation<163840, 640>::Initialize(void) proc near
[...]
.text:00000001C0039A5E                 call    NSInstrumentation::CSectionEntry<163840,640>::Create(void)
.text:00000001C0039A63                 test    rax, rax        ; rax == CSectionEntry object
.text:00000001C0039A66                 jz      short loc_1C0039A92
.text:00000001C0039A68                 mov     rcx, [rbx+8]    ; rcx = CTypeIsolation->previous
.text:00000001C0039A6C                 mov     dword ptr [rbx+18h], 0F0h ; CTypeIsolation->size = 0xF0
.text:00000001C0039A73                 cmp     [rcx], rbx      ; CTypeIsolation->previous->next == CTypeIsolation?
.text:00000001C0039A76                 jnz     FatalListEntryError_10
.text:00000001C0039A7C                 mov     [rax], rbx      ; CSectionEntry->next = CTypeIsolation
.text:00000001C0039A7F                 mov     [rax+8], rcx    ; CSectionEntry->previous = CTypeIsolation->previous
.text:00000001C0039A83                 mov     [rcx], rax      ; CTypeIsolation->previous->next = CSectionEntry
.text:00000001C0039A86                 mov     [rbx+8], rax    ; CTypeIsolation->previous = CSectionEntry
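
In C terms, the list handling seen in Create() and Initialize() boils down to the classic circular doubly-linked list pattern, including the integrity check on the tail before linking. The sketch below uses hypothetical names (LIST_NODE, list_init_head, list_insert_tail) and returns false where the real code jumps to FatalListEntryError.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct LIST_NODE {
    struct LIST_NODE *next;      /* + 0x00 */
    struct LIST_NODE *previous;  /* + 0x08 */
} LIST_NODE;

/* What Create() does: an empty list is a head pointing to itself. */
void list_init_head(LIST_NODE *head)
{
    head->next = head;
    head->previous = head;
}

/* What Initialize() does with the new CSectionEntry: tail insertion,
 * preceded by a check that the current tail still points to the head. */
bool list_insert_tail(LIST_NODE *head, LIST_NODE *entry)
{
    LIST_NODE *tail = head->previous;

    if (tail->next != head)
        return false;            /* real code: FatalListEntryError */

    entry->next = head;
    entry->previous = tail;
    tail->next = entry;
    head->previous = entry;
    return true;
}
```

When the list is empty, tail is the head itself, so tail->next = entry is the statement that sets the next member of the head.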

In turn, CSectionEntry::Create() calls CSectionEntry::Initialize(), which creates a Section object by calling nt!MmCreateSection(). The size of this Section is 0x28000 bytes; this Section will be accessed through 0x28 Views, each one 0x1000 bytes in size. A pointer to this Section object is stored in the CSectionEntry structure.

.text:00000001C0099E5C                 lea     r9, [rbp+arg_0] ; MaximumSize
.text:00000001C0099E60                 xor     eax, eax
.text:00000001C0099E62                 mov     rdi, rcx        ; rdi = CSectionEntry object
.text:00000001C0099E65                 and     [r11-10h], rax
.text:00000001C0099E69                 lea     rcx, [rbp+SectionHandle] ; SectionHandle
.text:00000001C0099E6D                 and     [r11-18h], rax
.text:00000001C0099E71                 xor     r8d, r8d        ; ObjectAttributes
.text:00000001C0099E74                 mov     [rbp+arg_0], rax
.text:00000001C0099E78                 mov     edx, 0F001Fh    ; DesiredAccess = SECTION_ALL_ACCESS
.text:00000001C0099E7D                 mov     [rsp+40h+var_18], SEC_RESERVE ; AllocationAttributes
.text:00000001C0099E85                 mov     [rsp+40h+var_20], PAGE_READWRITE ; SectionPageProtection
.text:00000001C0099E8D                 mov     dword ptr [rbp+arg_0], 28000h ; size for the Section
.text:00000001C0099E94                 call    cs:__imp_MmCreateSection

Then it maps a View of this Section. A pointer to the view is also saved in the CSectionEntry structure.

.text:00000001C0099EB8                 mov     [rdi+10h], rcx  ; CSectionEntry->section = section
.text:00000001C0099EBC                 test    rcx, rcx
.text:00000001C0099EBF                 jz      short loc_1C0099F0F
.text:00000001C0099EC1                 and     [rbp+arg_0], 0
.text:00000001C0099EC6                 lea     rbx, [rdi+18h]  ; rbx = ptr to output view
.text:00000001C0099ECA                 mov     rdx, rbx
.text:00000001C0099ECD                 lea     r8, [rbp+arg_0]
.text:00000001C0099ED1                 call    cs:__imp_MmMapViewInSessionSpace ; populates CSectionEntry->view

Finally, CSectionEntry::Initialize() creates a CSectionBitmapAllocator object by calling CSectionBitmapAllocator::Create(). A pointer to this object is stored in the CSectionEntry structure.

.text:00000001C0099EED                 mov     rcx, [rbx]      ; rcx = CSectionEntry->view
.text:00000001C0099EF0                 call    NSInstrumentation::CSectionBitmapAllocator<163840,640>::Create(uchar * const)
.text:00000001C0099EF5                 test    rax, rax        ; rax = CSectionBitmapAllocator
.text:00000001C0099EF8                 mov     [rdi+20h], rax  ; CSectionEntry->bitmap_allocator = CSectionBitmapAllocator

As expected, CSectionBitmapAllocator::Create() calls CSectionBitmapAllocator::Initialize(). This method allocates a pool buffer of size 0x30, which is used to hold an RTL_BITMAP structure. Note that in this context we are not talking about GDI Bitmap objects, but about a general-purpose map of bits, which is typically used to keep track of a set of reusable items. The first 0x10 bytes of that pool buffer hold the header of the bitmap, while the remaining 0x20 bytes store the map of bits itself. A buffer of 0x20 bytes can hold 0x100 bits; however, only 0xF0 is specified as the number of bits when calling nt!RtlInitializeBitMap, to match the number of SURFACE slots that are handled by a CSectionEntry. Then, all the bits in the bitmap are initialized to 0 by calling nt!RtlClearAllBits.

.text:00000001C009E324 allocate_rtl_bitmap proc near
[...]
.text:00000001C009E333                 mov     ecx, 21h        ; PoolType = PagedPoolSession
.text:00000001C009E338                 cmp     edx, edi
.text:00000001C009E33A                 mov     r8d, 'osiU'     ; Tag = 'Uiso'
.text:00000001C009E340                 cmovnb  edi, edx        ; edi = 0xF0
.text:00000001C009E343                 mov     edx, edi
.text:00000001C009E345                 shr     edx, 3          ; edx = 0x1e
.text:00000001C009E348                 add     edx, 7          ; edx = 0x25
.text:00000001C009E34B                 and     edx, 0FFFFFFF8h ; edx = 0x20
.text:00000001C009E34E                 add     edx, 10h        ; NumberOfBytes = 0x30
.text:00000001C009E351                 call    cs:__imp_ExAllocatePoolWithTag ; allocs 0x30 bytes for a RTL_BITMAP
.text:00000001C009E357                 mov     rbx, rax
.text:00000001C009E35A                 test    rax, rax
.text:00000001C009E35D                 jz      short loc_1C009E386
.text:00000001C009E35F                 lea     rdx, [rax+10h]  ; BitMapBuffer (0x30 - 0x10 bytes)
.text:00000001C009E363                 mov     r8d, edi        ; SizeOfBitMap (number of bits) = 0xF0
.text:00000001C009E366                 mov     rcx, rax        ; BitMapHeader
.text:00000001C009E369                 call    cs:__imp_RtlInitializeBitMap
.text:00000001C009E36F                 mov     rcx, rbx        ; BitMapHeader
.text:00000001C009E372                 call    cs:__imp_RtlClearAllBits
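
The size calculation performed by the shr/add/and/add sequence above can be written in C like this. It is a sketch that mirrors the instructions one by one; rtl_bitmap_pool_size() and RTL_BITMAP_HEADER_SIZE are names I chose, and the cmp/cmovnb pair that picks the number of bits is left out.

```c
#include <stdint.h>

enum { RTL_BITMAP_HEADER_SIZE = 0x10 };

/* Pool bytes needed for an RTL_BITMAP header plus its map of bits. */
uint32_t rtl_bitmap_pool_size(uint32_t number_of_bits)
{
    uint32_t bytes = number_of_bits >> 3;    /* shr edx, 3          */
    bytes += 7;                              /* add edx, 7          */
    bytes &= 0xFFFFFFF8u;                    /* and edx, 0FFFFFFF8h */
    return bytes + RTL_BITMAP_HEADER_SIZE;   /* add edx, 10h        */
}
```

For 0xF0 bits this gives 0x1E bytes, rounded up to 0x20, plus the 0x10-byte header: 0x30 bytes in total.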

Besides allocating this RTL_BITMAP structure, CSectionBitmapAllocator::Initialize() also generates a 64-bit random number, which is used as a XOR key to encode pointers to the View and RTL_BITMAP objects that were previously allocated:

.text:00000001C002DE38 private: bool NSInstrumentation::CSectionBitmapAllocator<163840, 640>::Initialize(unsigned char *) proc near
[...]
.text:00000001C002DE48                 rdtsc                   ; source for RtlRandomEx
.text:00000001C002DE4A                 shl     rdx, 20h
.text:00000001C002DE4E                 lea     rcx, [rsp+28h+arg_0]
.text:00000001C002DE53                 or      rax, rdx
.text:00000001C002DE56                 mov     [rsp+28h+arg_0], eax
.text:00000001C002DE5A                 call    cs:__imp_RtlRandomEx ; get a 32-bit random number
.text:00000001C002DE60                 mov     eax, eax
.text:00000001C002DE62                 lea     rcx, [rsp+28h+arg_0]
.text:00000001C002DE67                 shl     rax, 20h        ; shift eax to the higher part of RAX
.text:00000001C002DE6B                 mov     [rbx+10h], rax  ; CSectionBitmapAllocator->xor_key = random
.text:00000001C002DE6F                 call    cs:__imp_RtlRandomEx ; get another 32-bit random number
.text:00000001C002DE75                 mov     eax, eax
.text:00000001C002DE77                 or      [rbx+10h], rax  ; CSectionBitmapAllocator->xor_key |= another_random

The XORed pointers to the View and RTL_BITMAP objects are stored in the CSectionBitmapAllocator structure.

.text:00000001C002DEB8                 mov     rdx, [rbx+10h]  ; rdx = CSectionBitmapAllocator->xor_key
.text:00000001C002DEBC                 mov     rcx, rdx
.text:00000001C002DEBF                 xor     rcx, rax        ; rcx = CSectionBitmapAllocator->xor_key ^ RTL_BITMAP
.text:00000001C002DEC2                 mov     al, 1
.text:00000001C002DEC4                 xor     rdx, rdi        ; rdx = CSectionBitmapAllocator->xor_key ^ CSectionEntry->view
.text:00000001C002DEC7                 mov     [rbx+18h], rcx  ; CSectionBitmapAllocator->xored_rtl_bitmap = CSectionBitmapAllocator->xor_key ^ RTL_BITMAP
.text:00000001C002DECB                 mov     [rbx+8], rdx    ; CSectionBitmapAllocator->xored_view = CSectionBitmapAllocator->xor_key ^ CSectionEntry->view
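
Putting the last two snippets together: the key is built from two 32-bit random values, and each stored pointer is simply the real pointer XORed with that key. A minimal sketch follows, with hypothetical helper names and with the two RtlRandomEx results passed in as plain parameters.

```c
#include <stdint.h>

/* First random value goes to the high half, second one to the low half. */
uint64_t make_xor_key(uint32_t first_random, uint32_t second_random)
{
    return ((uint64_t)first_random << 32) | second_random;
}

/* XOR is its own inverse, so the same helper encodes and decodes. */
uint64_t xor_pointer(uint64_t value, uint64_t key)
{
    return value ^ key;
}
```

This is why every access to the View or to the RTL_BITMAP in the code that follows is preceded by an xor with the field at offset 0x10.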

Allocation

The win32kfull!NtGdiCreateBitmap() system call is in charge of creating GDI Bitmap objects. win32kfull!NtGdiCreateBitmap() calls win32kbase!GreCreateBitmap(), which in turn calls win32kbase!SURFMEM::bCreateDIB(). The job of win32kbase!SURFMEM::bCreateDIB() is to allocate memory for the SURFACE object. In previous versions of Windows, the pixel data buffer of a Bitmap was typically contiguous to the SURFACE header; it didn't necessarily have to be like that, but it was done that way. This made it possible to "extend" a pixel data buffer by corrupting the sizlBitmap member of the SURFACE header, as explained before, and make it overlap with the SURFACE header of an adjacent Bitmap.

Starting from Windows 10 Fall Creators Update, win32kbase!SURFMEM::bCreateDIB ensures that the SURFACE header and the pixel data buffer are allocated separately, using the Type Isolation mitigation.

The pixel data buffer is allocated on the PagedPoolSession pool in a straightforward way, by calling a wrapper of nt!ExAllocatePoolWithTag:

SURFMEM::bCreateDIB+10B                  sub     r15d, r12d      ; alloc_size = requested_size - sizeof(SURFACE)
SURFMEM::bCreateDIB+10E                  jz      short loc_1C0038F91
SURFMEM::bCreateDIB+110                  call    cs:__imp_IsWin32AllocPoolImplSupported
SURFMEM::bCreateDIB+116                  test    eax, eax
SURFMEM::bCreateDIB+118                  js      loc_1C00C54D6
SURFMEM::bCreateDIB+11E                  mov     r8d, 'mbpG'                 ; Tag = 'Gpbm'
SURFMEM::bCreateDIB+124                  mov     edx, r15d                   ; NumberOfBytes = requested_size - sizeof(SURFACE)
SURFMEM::bCreateDIB+127                  mov     ecx, 21h                    ; PoolType = PagedPoolSession
SURFMEM::bCreateDIB+12C                  call    cs:__imp_Win32AllocPoolImpl ; <<< allocation! only for the pixel_data_buffer

On the other hand, the SURFACE header is now allocated from the CTypeIsolation structures described earlier, by calling CTypeIsolation::AllocateType(). To be precise, this allocation returns a buffer located on a View of a Section object:

SURFMEM::bCreateDIB+16C                  mov     rax, cs:uchar * * gpTypeIsolation
SURFMEM::bCreateDIB+173                  mov     rcx, [rax]
SURFMEM::bCreateDIB+176                  test    rcx, rcx
SURFMEM::bCreateDIB+179                  jz      loc_1C00C579D
SURFMEM::bCreateDIB+17F                  call    NSInstrumentation::CTypeIsolation<163840,640>::AllocateType(void)
SURFMEM::bCreateDIB+184                  mov     rsi, rax        ; rsi = buffer for the SURFACE header
SURFMEM::bCreateDIB+187                  test    rax, rax        ; the returned buffer is a View of a Section object
SURFMEM::bCreateDIB+18A                  jz      loc_1C00C5791

By digging into the CTypeIsolation::AllocateType() function, we can see how the allocation algorithm works.

CTypeIsolation::AllocateType() traverses the list of CSectionEntry objects; for each CSectionEntry, it checks if its CSectionBitmapAllocator contains a clear bit in its backing RTL_BITMAP structure, by calling nt!RtlFindClearBits. It makes use of the bitmap_hint_index member of the CSectionBitmapAllocator to try and speed up the lookup.

.text:00000001C0039863                 mov     r8d, ebp        ; HintIndex = 0
.text:00000001C0039866                 cmp     eax, 0F0h       ; bitmap_hint_index >= RTL_BITMAP->size?
.text:00000001C003986B                 jnb     short loc_1C0039870
.text:00000001C003986D                 mov     r8d, eax        ; HintIndex = bitmap_hint_index
.text:00000001C0039870
.text:00000001C0039870 loc_1C0039870:                          ; CODE XREF: NSInstrumentation::CTypeIsolation<163840,640>::AllocateType(void)+6Bj
.text:00000001C0039870                 mov     rcx, [rsi+18h]  ; rcx = CSectionBitmapAllocator->xored_rtl_bitmap
.text:00000001C0039874                 mov     edx, 1          ; NumberToFind
.text:00000001C0039879                 xor     rcx, [rsi+10h]  ; BitMapHeader = CSectionBitmapAllocator->xored_rtl_bitmap
.text:00000001C0039879                                         ; ^ CSectionBitmapAllocator->xor_key
.text:00000001C003987D                 call    cs:__imp_RtlFindClearBits
.text:00000001C0039883                 mov     r12d, eax       ; r12 = free_bit_index
.text:00000001C0039886                 cmp     eax, 0FFFFFFFFh ; free_bit_index == -1?
.text:00000001C0039889                 jz      short loc_1C00398D6 ; if so, RTL_BITMAP is full, check another CSectionEntry

If nt!RtlFindClearBits returns -1, indicating that all the bits in the RTL_BITMAP are set to 1 (that is, the RTL_BITMAP is full), then it tries to repeat the operation with the next CSectionEntry on the list. We'll explore this case later. Otherwise, if nt!RtlFindClearBits returns a value other than -1, the RTL_BITMAP had at least 1 clear bit, and therefore the Section memory of the current CSectionEntry has at least 1 free slot for a SURFACE header.
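
The lookup can be approximated in user mode as follows. Keep in mind that find_clear_bit() is a simplified stand-in that I wrote for the NumberToFind == 1 case, not the real nt!RtlFindClearBits; in particular, starting at the hint and wrapping around to the beginning of the bitmap is an assumption of this sketch. The clamping of the hint to 0 when it is out of range does come from the disassembly above.

```c
#include <stdint.h>

#define SLOTS_PER_SECTION 0xF0u
#define NO_CLEAR_BIT      0xFFFFFFFFu   /* the -1 return value */

/* Returns the index of a clear bit, searching from `hint` and wrapping
 * around, or NO_CLEAR_BIT if all 0xF0 bits are set. */
uint32_t find_clear_bit(const uint8_t *bits, uint32_t hint)
{
    if (hint >= SLOTS_PER_SECTION)      /* cmp eax, 0F0h / jnb */
        hint = 0;

    for (uint32_t n = 0; n < SLOTS_PER_SECTION; n++) {
        uint32_t i = (hint + n) % SLOTS_PER_SECTION;
        if ((bits[i / 8] & (1u << (i % 8))) == 0)
            return i;
    }
    return NO_CLEAR_BIT;
}
```

A caller would move on to the next CSectionEntry whenever this returns NO_CLEAR_BIT, and otherwise use the returned index as free_bit_index.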

So we need to map the index of the clear bit in the RTL_BITMAP, as returned by nt!RtlFindClearBits(), into the corresponding memory address of a free slot in a Section View. For this purpose, the index of the clear bit is divided by 6, since each 0x1000-byte View of the Section is capable of holding 6 SURFACE headers of size 0x280. The result is an index, which I call view_index in the disassembly snippets below, and it's used to address one of the 0x28 possible Views of a Section. This view_index will always be in the range [0, 0x27], since each Section is 0x28000 bytes in size and so can be divided into 0x28 Views of size 0x1000.

This view_index is compared against the count of actually committed Views for the current Section, as held in the num_commited_views member of the CSectionBitmapAllocator object. As explained in MSDN [4], "no physical memory is allocated for a view until the virtual memory range is accessed". If view_index is smaller than the count of committed Views, then we don't need to commit a new View and we can go straight to the allocation. Otherwise, the address of the corresponding View is calculated (first_view + view_index * 0x1000) and committed to physical memory by calling nt!MmCommitSessionMappedView.

.text:00000001C003988B                 mov     eax, 0AAAAAAABh
.text:00000001C0039890                 mul     r12d
.text:00000001C0039893                 mov     eax, [rsi+24h]  ; eax = CSectionBitmapAllocator->num_commited_views
.text:00000001C0039896                 mov     r15d, edx       ; HI_DWORD(free_bit_index * 0xaaaaaaab) / 4 == free_bit_index / 6
.text:00000001C0039899                 shr     r15d, 2         ; r15d = view_index = free_bit_index / 6 (6 SURFACE headers fit in 0x1000 bytes)
.text:00000001C003989D                 cmp     r15d, eax       ; view_index < num_commited_views ?
.text:00000001C00398A0                 jb      loc_1C003998A   ; if so, no need to commit a new 0x1000-byte chunk from the View
.text:00000001C00398A6                 cmp     eax, 28h        ; num_commited_views >= MAX_VIEW_INDEX ?
.text:00000001C00398A9                 jnb     loc_1C003998A
.text:00000001C00398AF                 mov     rbp, [rsi+8]
.text:00000001C00398AF                                         ; rbp = CSectionBitmapAllocator->xored_view
.text:00000001C00398B3                 mov     edx, r15d       ; edx = view_index
.text:00000001C00398B6                 xor     rbp, [rsi+10h]  ; CSectionBitmapAllocator->xored_view ^ CSectionBitmapAllocator->xor_key
.text:00000001C00398BA                 shl     edx, 0Ch        ; view_index * 0x1000
.text:00000001C00398BD                 add     rbp, rdx        ; rbp = view + view_index * 0x1000
.text:00000001C00398C0                 mov     edx, 1000h      ; edx = size to commit
.text:00000001C00398C5                 mov     rcx, rbp        ; rcx = addr of view to commit
.text:00000001C00398C8                 call    cs:__imp_MmCommitSessionMappedView
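
The mul/shr pair at the beginning of this snippet is the usual compiler trick for an unsigned division by a constant: multiply by 0xAAAAAAAB, keep the high 32 bits of the 64-bit product, and shift right by 2. The sketch below reproduces it, together with the View address calculation; the function names are mine.

```c
#include <stdint.h>

/* free_bit_index / 6, computed the way the compiled code does it. */
uint32_t view_index_from_bit(uint32_t free_bit_index)
{
    uint64_t product = (uint64_t)free_bit_index * 0xAAAAAAABu; /* mul r12d    */
    uint32_t high = (uint32_t)(product >> 32);                 /* edx         */
    return high >> 2;                                          /* shr r15d, 2 */
}

/* first_view + view_index * 0x1000 */
uint64_t view_address(uint64_t first_view, uint32_t view_index)
{
    return first_view + ((uint64_t)view_index << 12);          /* shl edx, 0Ch */
}
```

For every possible bit index in [0, 0xEF] the result equals a plain integer division by 6, so the highest view_index is 0x27.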

After a successful commit, the 0x1000-byte View is initialized to 0 (this write operation ends up doing the actual commit), and the num_commited_views member of the CSectionBitmapAllocator is updated accordingly.

.text:00000001C0039975 loc_1C0039975:                          ; CODE XREF: NSInstrumentation::CTypeIsolation<163840,640>::AllocateType(void)+D0j
.text:00000001C0039975                 xor     edx, edx        ; Val
.text:00000001C0039977                 mov     r8d, 1000h      ; Size
.text:00000001C003997D                 mov     rcx, rbp        ; Dst
.text:00000001C0039980                 call    memset          ; this memset actually commits the memory
.text:00000001C0039985                 inc     dword ptr [rsi+24h] ; CSectionBitmapAllocator->num_commited_views++
.text:00000001C0039988                 xor     ebp, ebp
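
The commit-on-demand logic can be summarized with the following toy model. It is a sketch under several assumptions: TOY_ALLOCATOR and ensure_view_committed() are hypothetical names, a plain array stands in for the mapped Section, no View is considered committed at the start, and the call to nt!MmCommitSessionMappedView is reduced to a comment.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define VIEW_SIZE 0x1000u
#define MAX_VIEWS 0x28u

typedef struct {
    uint8_t  views[MAX_VIEWS][VIEW_SIZE]; /* stand-in for the mapped Section */
    uint32_t num_commited_views;
} TOY_ALLOCATOR;

/* Returns true if a new View had to be committed for view_index. */
bool ensure_view_committed(TOY_ALLOCATOR *a, uint32_t view_index)
{
    if (view_index >= MAX_VIEWS)
        return false;                     /* guard for the toy model only */
    if (view_index < a->num_commited_views)
        return false;                     /* already committed */
    if (a->num_commited_views >= MAX_VIEWS)
        return false;                     /* nothing left to commit */

    /* The real code calls nt!MmCommitSessionMappedView(view, 0x1000) here. */
    memset(a->views[view_index], 0, VIEW_SIZE);
    a->num_commited_views++;
    return true;
}
```

Since the counter only grows by one per commit, this scheme relies on Views being requested roughly in order, which is what the hint-based bit lookup tends to produce.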

Whether or not a new View had to be committed, the clear bit found in the RTL_BITMAP is then set to 1 by calling nt!RtlSetBit(), in order to mark the corresponding slot as busy. Curiously enough, the code calls nt!RtlTestBit() before setting the bit to 1, but the return value is not checked at all. Also, the bitmap_hint_index member of the CSectionBitmapAllocator is incremented by 1, and reset to 0 if it happens to exceed the maximum value of 0xF0 - 1.

.text:00000001C003998A                 mov     rcx, [rsi+18h]  ; rcx = CSectionBitmapAllocator->xored_rtl_bitmap
.text:00000001C003998E                 mov     edx, r12d       ; BitNumber = free bit index
.text:00000001C0039991                 xor     rcx, [rsi+10h]  ; BitMapHeader = CSectionBitmapAllocator->xored_rtl_bitmap
.text:00000001C0039991                                         ; ^ CSectionBitmapAllocator->xor_key
.text:00000001C0039995                 call    cs:__imp_RtlTestBit ; [!] return value not checked
.text:00000001C003999B                 mov     rcx, [rsi+18h]  ; rcx = CSectionBitmapAllocator->xored_rtl_bitmap
.text:00000001C003999F                 mov     edx, r12d       ; BitNumber
.text:00000001C00399A2                 xor     rcx, [rsi+10h]  ; BitMapHeader = xored_rtl_bitmap ^ xor_key
.text:00000001C00399A6                 call    cs:__imp_RtlSetBit
.text:00000001C00399AC                 inc     dword ptr [rsi+20h] ; CSectionBitmapAllocator->bitmap_hint_index++
.text:00000001C00399AF                 cmp     dword ptr [rsi+20h], 0F0h ; CSectionBitmapAllocator->bitmap_hint_index >= bitmap size?
.text:00000001C00399B6                 jnb     short loc_1C0039A27
[...]
.text:00000001C0039A27 loc_1C0039A27:                          ; CODE XREF: NSInstrumentation::CTypeIsolation<163840,640>::AllocateType(void)+1B6j
.text:00000001C0039A27                 mov     [rsi+20h], ebp  ; CSectionBitmapAllocator->bitmap_hint_index = 0
.text:00000001C0039A2A                 jmp     short loc_1C00399B8
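
The bookkeeping done by this snippet can be modeled in a few lines of C. This is only a sketch: the allocator_model structure and the function names are hypothetical stand-ins, and a plain byte array plays the role of the RTL_BITMAP buffer that the real code reaches through nt!RtlSetBit().

```c
#include <assert.h>
#include <stdint.h>

#define SLOTS_PER_SECTION 0xF0u  /* bits in the RTL_BITMAP, one per SURFACE header */

/* Simplified, hypothetical stand-in for CSectionBitmapAllocator. */
typedef struct {
    uint8_t  bits[SLOTS_PER_SECTION / 8]; /* plays the role of the RTL_BITMAP buffer */
    uint32_t bitmap_hint_index;
} allocator_model;

/* Equivalent of testing one bit of the bitmap. */
static int slot_is_busy(const allocator_model *a, uint32_t bit_index)
{
    assert(bit_index < SLOTS_PER_SECTION);
    return (a->bits[bit_index / 8] >> (bit_index % 8)) & 1;
}

/* Mirrors the RtlSetBit() call followed by the hint update at 0x1C00399AC. */
static void mark_slot_busy(allocator_model *a, uint32_t bit_index)
{
    assert(bit_index < SLOTS_PER_SECTION);
    a->bits[bit_index / 8] |= (uint8_t)(1u << (bit_index % 8));

    a->bitmap_hint_index++;
    if (a->bitmap_hint_index >= SLOTS_PER_SECTION)  /* the jnb at 0x1C00399B6 */
        a->bitmap_hint_index = 0;
}
```

Note that the hint is incremented regardless of which bit was just set, exactly as in the disassembly; it is not derived from the index of the allocated slot.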

Now that we have mapped our clear bit into its corresponding View, we need to select a 0x280-byte chunk within that View. Each View can hold 6 SURFACE headers (0x1000 / 0x280 == 6, using integer division), so the slot within the View is calculated as free_bit_index - view_index * 6, which simply equals free_bit_index % 6.

.text:00000001C00399B8                 mov     rax, [rsi+10h]  ; rax = CSectionBitmapAllocator->xor_key
.text:00000001C00399BC                 mov     ecx, r15d       ; ecx = view_index
.text:00000001C00399BF                 mov     rsi, [rsi+8]    ; rsi = CSectionBitmapAllocator->xored_view
.text:00000001C00399C3                 xor     edx, edx
.text:00000001C00399C5                 shl     ecx, 0Ch        ; ecx = view_index * 0x1000
.text:00000001C00399C8                 xor     rsi, rax        ; rsi = xored_view ^ xor_key
.text:00000001C00399CB                 add     rsi, rcx        ; rsi = view + view_index * 0x1000
.text:00000001C00399CE                 mov     rcx, rbx        ; rcx = CSectionBitmapAllocator->pushlock
.text:00000001C00399D1                 call    cs:__imp_ExReleasePushLockExclusiveEx
.text:00000001C00399D7                 call    cs:__imp_KeLeaveCriticalRegion
.text:00000001C00399DD                 lea     eax, [r15+r15*2] ; r15 == view_index
.text:00000001C00399E1                 add     eax, eax
.text:00000001C00399E3                 sub     r12d, eax       ; r12d = free_bit_index - view_index * 6 == free_bit_index % 6
.text:00000001C00399E6                 lea     ebx, [r12+r12*4]
.text:00000001C00399EA                 shl     ebx, 7          ; ebx = r12 * 0x5 * 0x80 == r12 * 0x280
.text:00000001C00399ED                 add     rbx, rsi        ; rbx += view + view_index * 0x1000

The value that RBX gets at 0x1C00399ED is the address of the newly allocated SURFACE header, and this is the value that will be returned by CTypeIsolation::AllocateType().
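
The address arithmetic above can be condensed into a small C function. This is a minimal sketch, assuming that view_index was obtained earlier as free_bit_index / 6 (consistent with the remark that the subtraction equals free_bit_index % 6); the function and macro names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define VIEW_SIZE        0x1000u
#define SURFACE_HDR_SIZE 0x280u
#define SLOTS_PER_VIEW   (VIEW_SIZE / SURFACE_HDR_SIZE)  /* integer division: 6 */

/* Maps a free bit index of the RTL_BITMAP to the address of its SURFACE header slot. */
static uint64_t slot_address(uint64_t view_base, uint32_t free_bit_index)
{
    assert(free_bit_index < 0xF0u);

    /* Assumption: view_index is free_bit_index / 6 (held in r15 in the listing). */
    uint32_t view_index = free_bit_index / SLOTS_PER_VIEW;

    /* r12d = free_bit_index - view_index * 6 == free_bit_index % 6 */
    uint32_t slot = free_bit_index - view_index * SLOTS_PER_VIEW;

    /* rbx = view + view_index * 0x1000 + slot * 0x280 */
    return view_base + (uint64_t)view_index * VIEW_SIZE
                     + (uint64_t)slot * SURFACE_HDR_SIZE;
}
```

For example, bit 7 lands in the second View (view_index 1) at slot 1, that is, at view_base + 0x1000 + 0x280. The last 0x100 bytes of every View (0x1000 - 6 * 0x280) are never used by a slot.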

For the sake of completeness, and as promised, here's what happens when nt!RtlFindClearBits() returns -1, meaning that the RTL_BITMAP of the current CSectionEntry is full. In that case, the following conditional jump is taken:

.text:00000001C0039870                 mov     rcx, [rsi+18h]  ; rcx = CSectionBitmapAllocator->xored_rtl_bitmap
.text:00000001C0039874                 mov     edx, 1          ; NumberToFind
.text:00000001C0039879                 xor     rcx, [rsi+10h]  ; BitMapHeader = xored_rtl_bitmap ^ xor_key
.text:00000001C003987D                 call    cs:__imp_RtlFindClearBits
.text:00000001C0039883                 mov     r12d, eax       ; r12 = free_bit_index
.text:00000001C0039886                 cmp     eax, 0FFFFFFFFh ; free_bit_index == -1?
.text:00000001C0039889                 jz      short loc_1C00398D6 ; if so, RTL_BITMAP is full, check another CSectionEntry

That jump takes us here, where it checks if CSectionEntry->next == CTypeIsolation, meaning that we've reached the end of the list of CSectionEntry objects. If that's not the case, it loops and repeats the process with the next CSectionEntry object.

.text:00000001C00398D6 loc_1C00398D6:                          ; CODE XREF: NSInstrumentation::CTypeIsolation<163840,640>::AllocateType(void)+89j
.text:00000001C00398D6                 lea     rcx, [rsp+48h+arg_0]
.text:00000001C00398DB                 call    NSInstrumentation::CAutoExclusiveCReaderWriterLock<NSInstrumentation::CPlatformReaderWriterLock>::~CAutoExclusiveCReaderWriterLock<NSInstrumentation::CPlatformReaderWriterLock>(void)
.text:00000001C00398E0 loc_1C00398E0:                          ; CODE XREF: NSInstrumentation::CTypeIsolation<163840,640>::AllocateType(void)+1F0j
.text:00000001C00398E0                 mov     r14, [r14]      ; r14 = CSectionEntry->next
.text:00000001C00398E3                 mov     ebp, 0
.text:00000001C00398E8                 cmp     r14, r13        ; CSectionEntry->next == CTypeIsolation ?
.text:00000001C00398EB                 jnz     loc_1C0039843   ; if not, keep traversing the list

Otherwise, if we've reached the end of the list of CSectionEntry objects without finding an empty slot (that is, every CSectionEntry is holding its maximum of 0xF0 SURFACE headers), the following code is reached. As shown below, it creates a new CSectionEntry, and it calls CSectionBitmapAllocator::Allocate() on the CSectionBitmapAllocator member of this new CSectionEntry. As expected, CSectionBitmapAllocator::Allocate() mostly duplicates the procedure explained before: it finds a clear bit in the RTL_BITMAP, commits the 0x1000-byte View corresponding to that bit, marks the bit as busy in the RTL_BITMAP, and finally returns the address of the newly created SURFACE header within the committed View.

.text:00000001C00398F1 loc_1C00398F1:                          ; CODE XREF: NSInstrumentation::CTypeIsolation<163840,640>::AllocateType(void)+3Dj
.text:00000001C00398F1                 xor     edx, edx        ; if we land here, that means that we finished traversing
.text:00000001C00398F1                                         ; the list of CSectionEntry, without finding an empty slot
.text:00000001C00398F3                 mov     rcx, rdi
.text:00000001C00398F6                 call    cs:__imp_ExReleasePushLockSharedEx
.text:00000001C00398FC                 call    cs:__imp_KeLeaveCriticalRegion
.text:00000001C0039902                 call    NSInstrumentation::CSectionEntry<163840,640>::Create(void)
.text:00000001C0039907                 mov     rdi, rax        ; rdi = new CSectionEntry
.text:00000001C003990A                 test    rax, rax
.text:00000001C003990D                 jz      short loc_1C003996D
.text:00000001C003990F                 mov     rcx, [rax+20h]  ; rcx = CSectionEntry->bitmap_allocator
.text:00000001C0039913                 call    NSInstrumentation::CSectionBitmapAllocator<163840,640>::Allocate(void) ; *** do the actual SURFACE header allocation
.text:00000001C0039918                 mov     rbp, rax        ; rbp = return value, allocated SURFACE header

Finally, the newly created CSectionEntry is inserted at the end of the doubly linked list, as detailed below. Notice that there is an integrity check before the list pointers are modified: the code verifies that the next pointer of CTypeIsolation->previous points to the CTypeIsolation head.

.text:00000001C0039939                 mov     rcx, [r13+8]    ; rcx = CTypeIsolation->previous
.text:00000001C003993D                 cmp     [rcx], r13      ; CTypeIsolation->previous->next == CTypeIsolation ?
.text:00000001C0039940                 jnz     FatalListEntryError_9 ; if not, the list is corrupted
.text:00000001C0039946                 mov     [rdi+8], rcx    ; CSectionEntry->previous = CTypeIsolation->previous
.text:00000001C003994A                 xor     edx, edx
.text:00000001C003994C                 mov     [rdi], r13      ; CSectionEntry->next = CTypeIsolation
.text:00000001C003994F                 mov     [rcx], rdi      ; CTypeIsolation->previous->next = CSectionEntry
.text:00000001C0039952                 mov     rcx, rbx
.text:00000001C0039955                 add     dword ptr [r13+18h], 0F0h ; CTypeIsolation->size += 0xF0
.text:00000001C003995D                 mov     [r13+8], rdi    ; CTypeIsolation->previous = CSectionEntry
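
In C terms, the insertion is a checked tail insert on a circular doubly linked list. The sketch below uses hypothetical type and function names, leaves out the pushlock member (so the field offsets differ from the real object), and returns false at the point where the kernel code would jump to FatalListEntryError_9.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct list_node {
    struct list_node *next;
    struct list_node *previous;
} list_node;

/* Hypothetical model of the CTypeIsolation head: list links plus slot count. */
typedef struct {
    list_node head;
    uint32_t  size;   /* total number of slots across all section entries */
} type_isolation_model;

static void isolation_init(type_isolation_model *iso)
{
    assert(iso != NULL);
    iso->head.next = &iso->head;      /* an empty list points to itself */
    iso->head.previous = &iso->head;
    iso->size = 0;
}

/* Returns false where the real code would raise a fatal list entry error. */
static bool insert_tail_checked(type_isolation_model *iso, list_node *entry)
{
    assert(iso != NULL && entry != NULL);
    list_node *tail = iso->head.previous;

    if (tail->next != &iso->head)     /* CTypeIsolation->previous->next == CTypeIsolation ? */
        return false;

    entry->previous = tail;
    entry->next = &iso->head;
    tail->next = entry;
    iso->size += 0xF0;                /* each new section entry adds 0xF0 slots */
    iso->head.previous = entry;
    return true;
}
```

The check only validates one link (the old tail's next pointer), which is the same amount of validation visible in the disassembly above.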

Deallocation

Deallocation of SURFACE objects is done in the win32kbase!SURFACE::Free() function. This function starts by freeing the pool allocation that holds the pixel data buffer:

.text:00000001C002DC9A                 cmp     byte ptr [rbp+270h], 0 ; boolean is_kernel_mode_pixel_data_buffer
.text:00000001C002DCA1 loc_1C002DCA1:                          ; DATA XREF: .rdata:00000001C017D540o
.text:00000001C002DCA1                 mov     [rsp+48h+arg_8], rbx
.text:00000001C002DCA6                 jz      short loc_1C002DCCC    ; if byte[SURFACE+0x270] == 0, the pixel data buffer is not freed
.text:00000001C002DCA8                 mov     rbx, [rbp+48h]  ; rbx = SURFACE->pvScan0
.text:00000001C002DCAC                 test    rbx, rbx
.text:00000001C002DCAF                 jz      short loc_1C002DCCC
.text:00000001C002DCB1                 call    cs:__imp_IsWin32FreePoolImplSupported
.text:00000001C002DCB7                 test    eax, eax
.text:00000001C002DCB9                 js      short loc_1C002DCC4
.text:00000001C002DCBB                 mov     rcx, rbx
.text:00000001C002DCBE                 call    cs:__imp_Win32FreePoolImpl ; frees the pixel data buffer

After that, it takes the CTypeIsolation head and traverses the doubly linked list of CSectionEntry objects to determine which CSectionEntry contains the SURFACE header being freed. To do this, it simply checks whether CSectionEntry->view <= SURFACE <= CSectionEntry->view + 0x28000, where 0x28000 is the size of the section (0x28 Views of 0x1000 bytes each). Notice that there may be an error in this check: the second comparison should probably use < instead of <=, that is, CSectionEntry->view <= SURFACE < CSectionEntry->view + 0x28000, since view + 0x28000 is the first address past the end of the section.

.text:00000001C002DCCC                 mov     rax, cs:uchar * * gpTypeIsolation
.text:00000001C002DCD3                 mov     rsi, [rax]      ; rsi = CTypeIsolation head
[...]
.text:00000001C002DD08                 mov     rbx, [rsi]      ; rbx = CTypeIsolation->next
.text:00000001C002DD0B                 cmp     rbx, rsi        ; next == CTypeIsolation ?
.text:00000001C002DD0E                 jz      loc_1C002DDFF   ; if so, there's no CSectionEntry
.text:00000001C002DD14                 mov     r12, 0CCCCCCCCCCCCCCCDh
.text:00000001C002DD1E                 xchg    ax, ax
.text:00000001C002DD20 loc_1C002DD20:                          ; CODE XREF: SURFACE::Free(SURFACE *)+C5j
.text:00000001C002DD20                 mov     r14, [rbx+20h]  ; r14 = CSectionEntry->bitmap_allocator
.text:00000001C002DD24                 mov     r8, [r14+10h]   ; r8 = bitmap_allocator->xor_key
.text:00000001C002DD28                 mov     rax, r8
.text:00000001C002DD2B                 xor     rax, [r14+8]    ; rax = xor_key ^ xored_view
.text:00000001C002DD2F                 cmp     rbp, rax        ; SURFACE < view?
.text:00000001C002DD32                 jb      short loc_1C002DD3F ; ...if so, skip to the next CSectionEntry
.text:00000001C002DD34                 add     rax, 28000h     ; view += section_size
.text:00000001C002DD3A                 cmp     rbp, rax        ; SURFACE <= end of last view?
.text:00000001C002DD3D                 jbe     short loc_1C002DD4C ; if so, we found the view containing the SURFACE header
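
To make the boundary condition concrete, here is a small C sketch comparing the range check as it appears in the disassembly (jb / jbe) with the half-open variant suggested above. Both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SECTION_SIZE 0x28000u  /* 0x28 Views of 0x1000 bytes */

/* The check as disassembled: view <= surface <= view + 0x28000. */
static bool in_section_as_disassembled(uint64_t view, uint64_t surface)
{
    assert(view <= UINT64_MAX - SECTION_SIZE);
    return surface >= view && surface <= view + SECTION_SIZE;
}

/* The half-open range that matches the actual extent of the section. */
static bool in_section_half_open(uint64_t view, uint64_t surface)
{
    assert(view <= UINT64_MAX - SECTION_SIZE);
    return surface >= view && surface < view + SECTION_SIZE;
}
```

The two functions only disagree for surface == view + 0x28000. With the formulas used later in SURFACE::Free(), that address would map to bit 0x28 * 6 + 0 == 0xF0, which is one past the last valid bit of a 0xF0-bit bitmap.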

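When these conditions are satisfied, meaning that we've found the CSectionEntry containing the SURFACE header to be freed, the index of that SURFACE within its containing View (called index_within_view here) is calculated by taking the 3 lower nibbles of the address of the SURFACE and dividing them by 0x280. The division is implemented as a multiplication by the constant 0xCCCCCCCCCCCCCCCD followed by a right shift. The code then verifies that the offset is an exact multiple of 0x280; if it isn't, it skips to the next CSectionEntry.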

.text:00000001C002DD4C loc_1C002DD4C:                          ; CODE XREF: SURFACE::Free(SURFACE *)+BDj
.text:00000001C002DD4C                 mov     rcx, rbp        ; rcx = SURFACE header
.text:00000001C002DD4F                 mov     rax, r12
.text:00000001C002DD52                 and     ecx, 0FFFh
.text:00000001C002DD58                 mul     rcx
.text:00000001C002DD5B                 mov     r15, rdx
.text:00000001C002DD5E                 shr     r15, 9          ; r15 = (SURFACE & 0xfff) / 0x280 == index_within_view
.text:00000001C002DD62                 lea     rax, [r15+r15*4]
.text:00000001C002DD66                 shl     rax, 7          ; rax = r15 * 0x5 * 0x80 == r15 * 0x280
.text:00000001C002DD6A                 sub     rcx, rax        ; if rcx == rax, it's ok
.text:00000001C002DD6D                 jnz     short loc_1C002DD3F
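
The mul / shr pair is the usual compiler idiom for dividing by a constant without a div instruction. The following C sketch reproduces it under the assumption that the high half of the 128-bit product (RDX after mul) is what gets shifted; mul_high_u64() is a portable helper written for this example, not something present in win32kbase.sys.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SURFACE_HDR_SIZE 0x280u

/* Upper 64 bits of a 64x64-bit product, i.e. what RDX holds after `mul rcx`. */
static uint64_t mul_high_u64(uint64_t a, uint64_t b)
{
    uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
    uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;
    uint64_t lo_lo = a_lo * b_lo;
    uint64_t hi_lo = a_hi * b_lo;
    uint64_t lo_hi = a_lo * b_hi;
    uint64_t hi_hi = a_hi * b_hi;
    uint64_t cross = (lo_lo >> 32) + (uint32_t)hi_lo + lo_hi; /* cannot overflow */
    return hi_hi + (hi_lo >> 32) + (cross >> 32);
}

/* x / 0x280 computed as high64(x * 0xCCCCCCCCCCCCCCCD) >> 9. */
static uint64_t div_by_0x280(uint64_t x)
{
    const uint64_t magic = 0xCCCCCCCCCCCCCCCDull; /* ceil(2^66 / 5) */
    return mul_high_u64(x, magic) >> 9;
}

/* Returns 1 and stores the slot index if the address is 0x280-aligned
 * within its 0x1000-byte View, 0 otherwise (the jnz at 0x1C002DD6D). */
static int index_within_view(uint64_t surface, uint32_t *index_out)
{
    assert(index_out != NULL);
    uint64_t offset = surface & 0xFFF;          /* 3 lower nibbles */
    uint64_t index = div_by_0x280(offset);
    if (offset - index * SURFACE_HDR_SIZE != 0) /* not a multiple of 0x280 */
        return 0;
    *index_out = (uint32_t)index;
    return 1;
}
```

Since 0x280 == 5 * 0x80, the constant encodes the division by 5 (scaled by 2^66) and the shift by 9 takes care of the remaining scaling together with the division by 0x80.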

Then, the address of SURFACE needs to be mapped into the bit index that represents it in the RTL_BITMAP. In order to obtain the corresponding bit index, it obtains the view_index (that is, in which 0x1000-byte View this SURFACE object is located), and then it simply performs this calculation: view_index * 6 + index_within_view.

.text:00000001C002DD72                 mov     eax, ebp        ; eax = lo_dword(SURFACE)
.text:00000001C002DD74                 xor     ecx, [r14+8]    ; ecx = lo_dword(xor_key) ^ lo_dword(xored_view)
.text:00000001C002DD78                 sub     eax, ecx        ; eax = lo_dword(SURFACE) - lo_dword(view)
.text:00000001C002DD7A                 mov     rcx, [r14+18h]  ; rcx = CSectionBitmapAllocator->xored_rtl_bitmap
.text:00000001C002DD7E                 shr     eax, 0Ch        ; eax /= 0x1000 == view_index
.text:00000001C002DD81                 xor     rcx, r8         ; BitMapHeader = xored_rtl_bitmap ^ xor_key
.text:00000001C002DD84                 lea     eax, [rax+rax*2]
.text:00000001C002DD87                 lea     edx, [r15+rax*2] ; BitNumber = view_index * 6 + index_within_view
.text:00000001C002DD8B                 call    cs:__imp_RtlTestBit
.text:00000001C002DD91                 test    al, al
.text:00000001C002DD93                 jz      short loc_1C002DD3F ; bit is turned off?
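
Put together, the address-to-bit mapping is the inverse of the calculation performed at allocation time. A minimal C sketch follows; the function name is hypothetical, and the slot index is computed with a plain division instead of the multiply-and-shift sequence seen in the listing.

```c
#include <assert.h>
#include <stdint.h>

/* Maps the address of a SURFACE header to its bit index in the RTL_BITMAP. */
static uint32_t bit_index_for(uint64_t view, uint64_t surface)
{
    assert(surface >= view);

    /* The listing subtracts only the low dwords of both addresses. */
    uint32_t delta = (uint32_t)surface - (uint32_t)view;
    uint32_t view_index = delta >> 12;                     /* delta / 0x1000 */
    uint32_t index_within_view = (uint32_t)((surface & 0xFFF) / 0x280);

    return view_index * 6 + index_within_view;
}
```

For instance, a SURFACE header located at view + 0x1280 lives in View 1, slot 1, so it is tracked by bit 1 * 6 + 1 == 7.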

The bit at the calculated index is tested via nt!RtlTestBit(); if it's set to 1, as expected, execution continues in the code snippet below. As shown here, it calls CSectionBitmapAllocator::ContainsAllocation() (although the boolean value returned by this function is not checked at all), and then it clears the corresponding bit in the RTL_BITMAP by calling nt!RtlClearBit(), marking the slot as free. Finally, it zeroes the memory of the freed SURFACE header by calling memset(), and the bit index of the freed slot is saved as the bitmap_hint_index, in order to speed up future operations.

.text:00000001C002DDA9                 mov     rdx, rbp        ; rdx = SURFACE header
.text:00000001C002DDAC                 mov     rcx, r14        ; rcx = bitmap_allocator
.text:00000001C002DDAF                 call    NSInstrumentation::CSectionBitmapAllocator<163840,640>::ContainsAllocation(void const *)
.text:00000001C002DDB4                 mov     ecx, [r14+8]    ; ecx = CSectionBitmapAllocator->xored_view
.text:00000001C002DDB8                 mov     eax, ebp        ; [!] return value from ContainsAllocation() is not checked
.text:00000001C002DDBA                 xor     ecx, [r14+10h]  ; CSectionBitmapAllocator->xored_view ^ CSectionBitmapAllocator->xor_key
.text:00000001C002DDBE                 sub     eax, ecx        ; eax = lo_dword(SURFACE) - lo_dword(view)
.text:00000001C002DDC0                 mov     rcx, [r14+18h]  ; rcx = CSectionBitmapAllocator->xored_rtl_bitmap
.text:00000001C002DDC4                 xor     rcx, [r14+10h]  ; BitMapHeader = xored_rtl_bitmap ^ xor_key
.text:00000001C002DDC8                 shr     eax, 0Ch        ; eax /= 0x1000 == view_index
.text:00000001C002DDCB                 lea     eax, [rax+rax*2]
.text:00000001C002DDCE                 lea     esi, [r15+rax*2]
.text:00000001C002DDD2                 mov     edx, esi        ; BitNumber = view_index * 6 + index_within_view
.text:00000001C002DDD4                 call    cs:__imp_RtlClearBit ; mark the slot as available
.text:00000001C002DDDA                 xor     edx, edx        ; Val
.text:00000001C002DDDC                 mov     r8d, 280h       ; Size
.text:00000001C002DDE2                 mov     rcx, rbp        ; Dst
.text:00000001C002DDE5                 call    memset          ; null-out the freed SURFACE header in the view
.text:00000001C002DDEA                 xor     edx, edx
.text:00000001C002DDEC                 mov     [r14+20h], esi  ; bitmap_allocator->bitmap_hint_index = index of freed slot
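
The release sequence can be summarized with the following C model. As before, allocator_model and free_slot() are hypothetical names, a byte array stands in for the RTL_BITMAP buffer, and the unchecked ContainsAllocation() call is left out because its result does not influence the flow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SLOTS_PER_SECTION 0xF0u
#define SURFACE_HDR_SIZE  0x280u

/* Simplified, hypothetical stand-in for CSectionBitmapAllocator. */
typedef struct {
    uint8_t  bits[SLOTS_PER_SECTION / 8];
    uint32_t bitmap_hint_index;
} allocator_model;

/* Returns false if the slot was not marked busy (RtlTestBit() returned 0),
 * in which case the real code moves on to the next CSectionEntry. */
static bool free_slot(allocator_model *a, uint32_t bit_index, void *surface_header)
{
    assert(bit_index < SLOTS_PER_SECTION);
    uint8_t mask = (uint8_t)(1u << (bit_index % 8));

    if (!(a->bits[bit_index / 8] & mask))
        return false;

    a->bits[bit_index / 8] &= (uint8_t)~mask;      /* RtlClearBit(): slot is free again */
    memset(surface_header, 0, SURFACE_HDR_SIZE);   /* null-out the freed header */
    a->bitmap_hint_index = bit_index;              /* remember the freed slot */
    return true;
}
```

Calling free_slot() twice on the same slot returns false the second time, because the bit has already been cleared by the first call.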

WinDbg extension

While reverse engineering Win32k Type Isolation I developed a little WinDbg extension to help me dump the state of the Type Isolation structures. It is available at https://github.com/fdfalcon/TypeIsolationDbg.

The WinDbg extension provides the following commands:

!gptypeisolation [address] : prints the top-level CTypeIsolation structure (default address: win32kbase!gpTypeIsolation)
!typeisolation [address] : prints a NSInstrumentation::CTypeIsolation structure
!sectionentry [address] : prints a NSInstrumentation::CSectionEntry structure
!sectionbitmapallocator [address] : prints a NSInstrumentation::CSectionBitmapAllocator structure
!rtlbitmap [address] : prints an RTL_BITMAP structure

The output of the extension includes some clickable links to help you follow the Type Isolation data structures. It also decodes XORed pointers to save you a step. The following snippet shows the output of TypeIsolationDbg when dumping the global CTypeIsolation object and following the data structures for a single CSectionEntry, all the way down to the map of bits representing the busy/free state of the CSectionEntry's slots:

kd> !gptypeisolation
win32kbase!gpTypeIsolation is at address 0xffffe6cf95138a98.
Pointer [1] stored at win32kbase!gpTypeIsolation: 0xffffe6a4400006b0.
Pointer [2]: 0xffffe6a440000680.
NSInstrumentation::CTypeIsolation
      +0x000 next                                : 0xffffe6a440000620
      +0x008 previous                            : 0xffffe6a441d8ca20
      +0x010 pushlock                            : 0xffffe6a440000660
      +0x018 size                                : 0xF00 [number of section entries: 0x10]

kd> !sectionentry ffffe6a440000620
NSInstrumentation::CSectionEntry
      +0x000 next                                : 0xffffe6a441ca2470
      +0x008 previous                            : 0xffffe6a440000680
      +0x010 section                             : 0xffff86855f09f260
      +0x018 view                                : 0xffffe6a4403a0000
      +0x020 bitmap_allocator                    : 0xffffe6a4400005e0

kd> !sectionbitmapallocator ffffe6a4400005e0
NSInstrumentation::CSectionBitmapAllocator
      +0x000 pushlock                            : 0xffffe6a4400005c0
      +0x008 xored_view                          : 0xa410b31c3f332f4c [decoded: 0xffffe6a4403a0000]
      +0x010 xor_key                             : 0x5bef55b87f092f4c
      +0x018 xored_rtl_bitmap                    : 0xa410b31c3f092acc [decoded: 0xffffe6a440000580]
      +0x020 bitmap_hint_index                   : 0xC0
      +0x024 num_commited_views                  : 0x27

kd> !rtlbitmap ffffe6a440000580
RTL_BITMAP
      +0x000 size                                : 0xF0
      +0x008 bitmap_buffer                       : 0xffffe6a440000590

kd> dyb ffffe6a440000590 L20
                   76543210 76543210 76543210 76543210
                   -------- -------- -------- --------
ffffe6a4`40000590  00000101 00000000 00000110 10110000  05 00 06 b0
ffffe6a4`40000594  00011100 10000000 11011011 11110110  1c 80 db f6
ffffe6a4`40000598  01111101 11111111 11111111 11111111  7d ff ff ff
ffffe6a4`4000059c  11111111 11011111 11110111 01111111  ff df f7 7f
ffffe6a4`400005a0  11111111 11111111 11111111 01111111  ff ff ff 7f
ffffe6a4`400005a4  11111101 11111001 11111111 01101111  fd f9 ff 6f
ffffe6a4`400005a8  11111110 11111111 11111111 11111111  fe ff ff ff
ffffe6a4`400005ac  11111111 00000011 00000000 00000000  ff 03 00 00
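
The values in this dump can be cross-checked by hand. The C sketch below decodes the two XORed pointers shown by !sectionbitmapallocator and counts the busy slots in the bitmap bytes printed by dyb. The helper names are made up for this example, and the byte array is simply a transcription of the dump above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Obfuscated pointers are recovered by XORing them with xor_key. */
static uint64_t decode_pointer(uint64_t xored, uint64_t xor_key)
{
    return xored ^ xor_key;
}

/* Counts the bits set to 1 (busy slots) among the first size_in_bits bits. */
static unsigned count_busy_slots(const uint8_t *buffer, unsigned size_in_bits)
{
    assert(buffer != NULL);
    unsigned busy = 0;
    for (unsigned bit = 0; bit < size_in_bits; bit++) {
        if (buffer[bit / 8] & (1u << (bit % 8)))
            busy++;
    }
    return busy;
}

/* Bytes transcribed from the `dyb ffffe6a440000590 L20` output. */
static const uint8_t bitmap_dump[32] = {
    0x05, 0x00, 0x06, 0xb0, 0x1c, 0x80, 0xdb, 0xf6,
    0x7d, 0xff, 0xff, 0xff, 0xff, 0xdf, 0xf7, 0x7f,
    0xff, 0xff, 0xff, 0x7f, 0xfd, 0xf9, 0xff, 0x6f,
    0xfe, 0xff, 0xff, 0xff, 0xff, 0x03, 0x00, 0x00
};
```

Only the first 0xF0 bits (30 bytes) belong to the RTL_BITMAP, so count_busy_slots(bitmap_dump, 0xF0) is the call that matches the size field shown by !rtlbitmap; the last two bytes of the dump fall outside the bitmap.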

Conclusion

The Type Isolation mitigation implemented in the Win32k component of Windows 10 1709 modifies the way GDI Bitmap objects are allocated in kernel space: the SURFACE header gets allocated on a Section View, while the pixel data buffer is allocated on the PagedPoolSession pool. This definitely eliminates the commodity exploitation technique of using Bitmaps as targets for limited memory corruption vulnerabilities, since it is no longer possible to make an aligned spray of adjacent Bitmaps where the end of a pixel data buffer is immediately followed by the header of the next SURFACE object.

Meanwhile, exploit writers have already transitioned to other useful kernel objects, such as Palettes [5] [6] [7].

As a curiosity, the CSectionBitmapAllocator object keeps both the pointer to the Section Views and the pointer to the RTL_BITMAP obfuscated via a XOR operation. However, the parent CSectionEntry structure keeps the same pointer to the Views unobfuscated.

Thanks

A big thanks goes to my colleagues at Quarkslab for proof-reading this blogpost and providing feedback about it.

References

[1]	https://msdn.microsoft.com/en-us/library/dd183377(v=vs.85).aspx

[2]	https://www.coresecurity.com/system/files/publications/2016/10/Abusing-GDI-Reloaded-ekoparty-2016_0.pdf

[3]	https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/section-objects-and-views

[4]	https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/managing-memory-sections

[5]	https://sensepost.com/blog/2017/abusing-gdi-objects-for-ring0-primitives-revolution/

[6]	https://labs.bluefrostsecurity.de/files/Abusing_GDI_for_ring0_exploit_primitives_Evolution_Slides.pdf

[7]	http://theevilbit.blogspot.com/2017/10/abusing-gdi-objects-for-kernel.html
