Breaking Secure Boot on the Silicon Labs Gecko platform

In this blog post, we present a new vulnerability in the Gecko Bootloader from Silicon Labs, more precisely in its OTA parser.

Introduction

Silicon Labs is a chip manufacturer whose products offer several networking features such as Bluetooth and Zigbee. These chips are the basis of a large number of connected objects, so compromising them means compromising every connected object that uses the vulnerable functionality.

We decided to look into the open source SDK offered by Silicon Labs, the Gecko SDK (GSDK), and in particular its OTA functionality, which appears to be a state-of-the-art implementation of secure over-the-air updates.

This R&D work was carried out during Sami Babigeon's internship at Quarkslab, as part of his Master's degree program at the University of Rouen Normandie.

Background

The Gecko SDK provides many features such as radio abstraction layers for Bluetooth or other radio protocols like Zigbee. It also provides the Gecko Bootloader as a common bootloader for all the newer MCUs and wireless MCUs from Silicon Labs. The Gecko Bootloader can be configured to perform a variety of bootload functions, from device initialization to firmware upgrades. It uses a proprietary (but documented) format for its upgrade images, called GBL (Gecko Bootloader).

Overview of the OTA feature

The Gecko Bootloader has a two-stage design, where a minimal first stage bootloader is used to upgrade the main one. The first stage bootloader only contains functionality to read from and write to fixed addresses in internal flash. To perform a main bootloader upgrade, the running main bootloader verifies the integrity and authenticity of the upgrade image file. The running main bootloader then writes the upgrade image to a fixed location in flash and issues a reboot into the first stage bootloader. The first stage bootloader verifies the integrity of the main bootloader firmware upgrade image before copying the upgrade image to the main bootloader location.

Most of the OTA functionality is built into the Bluetooth stack and the Gecko Bootloader is only involved in verification and copying data from download area to its final destination in flash. When the OTA functionality is enabled the chip exposes a GATT service described by the following table:

Description	UUID	Length
OTA service	`1D14D6EE-FD63-4FA1-BFA4-8F47B42119F0`	-
OTA Control Attribute	`F7BF3564-FB6D-4E53-88A4-5E37E0326063`	1 byte
OTA Data Attribute	`984227F3-34FC-4045-A5D0-2C581F81A153`	up to 244 bytes

The OTA procedure is triggered when the client writes the value 0x00 to the OTA control attribute, which will reboot the device in DFU mode. Once in DFU mode, a remote Bluetooth device can upload a new firmware image by sending chunks of at most 244 bytes to the OTA data attribute. After the entire GBL file has been uploaded the client writes the value 0x03 to indicate that upload is finished. The device then reboots and if the image is correct, it installs the new firmware.
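To make the sequence concrete, here is a minimal sketch of how a client could plan the GATT writes for an upload. It only builds the ordered list of writes and does not talk to a real device; the function name `plan_ota_upload` is hypothetical, and the reconnection that follows the reboot into DFU mode is not modelled.

```python
# UUIDs and control values are taken from the table and description above.
OTA_CONTROL = "F7BF3564-FB6D-4E53-88A4-5E37E0326063"
OTA_DATA = "984227F3-34FC-4045-A5D0-2C581F81A153"
MAX_CHUNK = 244  # maximum payload accepted by the OTA data attribute


def plan_ota_upload(gbl_image, chunk_size=MAX_CHUNK):
    """Return the ordered list of (characteristic UUID, payload) writes.

    Hypothetical helper: a real client would send these writes over a
    Bluetooth connection and reconnect once the device is in DFU mode.
    """
    if not 1 <= chunk_size <= MAX_CHUNK:
        raise ValueError("chunk_size must be between 1 and 244 bytes")
    writes = [(OTA_CONTROL, b"\x00")]  # trigger the OTA procedure
    for offset in range(0, len(gbl_image), chunk_size):
        writes.append((OTA_DATA, gbl_image[offset:offset + chunk_size]))
    writes.append((OTA_CONTROL, b"\x03"))  # signal that the upload is finished
    return writes
```

For example, a 500-byte image would be planned as one control write, three data writes (244, 244 and 12 bytes) and a final control write.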

For additional security, Silicon Labs recommends configuring the Gecko Bootloader to use Secure Boot and signed GBL images, but this is not done by default.

GBL File Format

The GBL file format is a custom file format used by Silicon Labs for the firmware upgrade. It’s a TLV (Tag Length Value) format and contains several tags such as the ones described below:

GblTagHeader_t: GBL tag header. Must be the first element in all GBL tags.
GblHeader_t: GBL header tag type.
GblApplication_t: GBL application tag type.
GblBootloader_t: GBL bootloader tag type.
GblEnd_t: GBL end tag type.
GblEncryptionHeader_t: GBL encryption header tag type.
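As an illustration of the TLV layout, the following sketch walks a byte string tag by tag. It assumes that each tag starts with two little-endian 32-bit fields (tag ID, then length of the rest of the tag); the helper names `iter_tags` and `make_tag` are ours and are not part of the Gecko SDK. The example tag IDs are the header and end tag IDs from the crashing input shown later in this post.

```python
import struct

# Assumed tag header layout: tagId (uint32) followed by length (uint32),
# both little-endian.
TAG_HEADER = struct.Struct("<II")


def make_tag(tag_id, value):
    """Serialize one tag: header followed by its value bytes."""
    return TAG_HEADER.pack(tag_id, len(value)) + value


def iter_tags(blob):
    """Yield (tag_id, value) pairs, rejecting lengths that overrun the data."""
    offset = 0
    while offset < len(blob):
        if len(blob) - offset < TAG_HEADER.size:
            raise ValueError("truncated tag header")
        tag_id, length = TAG_HEADER.unpack_from(blob, offset)
        offset += TAG_HEADER.size
        if length > len(blob) - offset:
            raise ValueError("tag length exceeds remaining data")
        yield tag_id, blob[offset:offset + length]
        offset += length


if __name__ == "__main__":
    blob = make_tag(0x03A617EB, struct.pack("<II", 0x03000000, 0))
    blob += make_tag(0xFC0404FC, struct.pack("<I", 0xDEADBEEF))
    for tag_id, value in iter_tags(blob):
        print(hex(tag_id), len(value))
```

Note that this walker checks the declared length against the remaining input before using it, which is the kind of sanity check a TLV parser needs for every tag.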

A valid GBL file must contain certain tags in a specific order. Some tags can appear multiple times, while others must appear only once. Below is a schema explaining the order of the tags inside a GBL file:

Vulnerability research

For the vulnerability research phase of our project we decided to adopt a fuzzing approach, as it makes it possible to discover bugs very efficiently. For this purpose we used AFL++ with Unicorn emulation support to run the code responsible for parsing and flashing GBL files. AFL++ offers the possibility to define custom mutators, which can be used to generate inputs for guided fuzzing.

We decided to use libprotobuf-mutator to generate GBL inputs with mutated fields. We created a set of messages corresponding to the tags of the GBL parser, then defined a set of mutations that the fuzzer can apply according to preset weights. There are several possible mutations:

Add a new randomly generated valid tag
Remove an existing tag at random
Mutate a field of a randomly chosen tag (it can be an invalid value)
Reset the field of a randomly chosen tag to a valid value
Reset the GBL file

Here is the GBL message type definition:

syntax = "proto2";

message Gbl {
    // GBL tag header. Must be the first element in all GBL tags
    message EblTagHeader_t {
        // Tag ID
        optional uint32 tagId = 1;
        // Length (in bytes) of the rest of the tag
        optional uint32 length = 2;
    }
    // GBL header tag type
    message GblHeader_t {
        optional EblTagHeader_t header = 1;
        // Version of the GBL spec used in this file
        optional uint32 version = 2;
        // Type of GBL
        optional uint32 type = 3;
    }
    ...
    // List of all the GBL tags
    message GblTag_t {
        oneof tag {
            // GBL header tag type
            GblHeader_t header = 1;
            // GBL application tag type
            GblApplication_t app = 2;
            ... 
            // GBL ECDSA secp256r1 signature tag type.
            GblSignatureEcdsaP256_t sig = 11;
        }
    }
    repeated GblTag_t tags = 2;
}

As you can see, a GBL message is a list (repeated) of tags (oneof tag). We used this message definition to create and mutate the inputs. Here is the Python program that handles the mutations:

import numpy

import gbl_pb2  # Python module generated by protoc from the Gbl message definition

# Mutate a GBL input
def mutate_gbl(gbl):
    # List of all the mutations we can do
    mutations = [
        mutate_add_tag,
        mutate_remove_tag,
        mutate_tag,
        mutate_reset_tag,
        mutate_reset_gbl
    ]
    # Probability of each mutation, in the same order (must sum to 1)
    weights = [
        0.15,   # mutate_add_tag
        0.15,   # mutate_remove_tag
        0.55,   # mutate_tag
        0.1,    # mutate_reset_tag
        0.05    # mutate_reset_gbl
    ]
    # Choose a random mutation
    mutation = numpy.random.choice(mutations, p=weights)
    # Apply mutation
    return mutation(gbl)

def fuzz(buf, add_buf, max_size):
    """
        Called per fuzzing iteration.
    """
    gbl = gbl_pb2.Gbl()
    try:
        # Parse protobuf
        gbl.ParseFromString(bytes(buf))
    except Exception:
        # Invalid serialized protobuf data. Don't mutate, return a zero-length buffer
        return bytearray(b'')

    # Mutate protobuf
    gbl = mutate_gbl(gbl)
    # Convert protobuf to raw data, truncated to the size allowed by the fuzzer
    raw = bytearray(gbl.SerializeToString()[0:max_size])
    return raw

We determined the weight of each mutation empirically, trying to avoid applying costly mutations such as mutate_reset_gbl too often.

Using this mutator, we were able to fuzz the file format efficiently and avoided losing time on inputs that would have been rejected by the parser. This approach is very efficient thanks to the coverage AFL++ is able to get from the emulator, which helps prioritize inputs leading to new paths.

However, this technique has a big drawback: we can't fully emulate the code, as it tries to access low-level hardware peripherals, which makes the Unicorn emulator crash. To overcome this problem we used Unicorn hooks to prevent the execution of some portions of the code, such as those related to cryptography, at the price of degraded performance.
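The mutation helpers themselves are not listed in this post. As a rough idea of what such helpers can look like, here is a sketch that operates on a simplified model where a GBL file is a plain list of dictionaries rather than a protobuf message. All names below (`pick_mutation`, `remove_random_tag`, `mutate_length_field`) are hypothetical, and the standard `random` module stands in for numpy.

```python
import random

# Same weights as in the mutator above, keyed by a descriptive name.
MUTATION_WEIGHTS = {
    "add_tag": 0.15,
    "remove_tag": 0.15,
    "mutate_tag": 0.55,
    "reset_tag": 0.10,
    "reset_gbl": 0.05,
}


def pick_mutation(rng):
    """Pick a mutation name according to the configured weights."""
    names = list(MUTATION_WEIGHTS)
    weights = [MUTATION_WEIGHTS[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


def remove_random_tag(tags, rng):
    """Return a copy of the tag list with one randomly chosen tag removed."""
    if not tags:
        return []
    index = rng.randrange(len(tags))
    return tags[:index] + tags[index + 1:]


def mutate_length_field(tags, rng):
    """Return a copy where one tag's length is set to any 32-bit value.

    The new value is deliberately not kept consistent with the tag's
    real payload size, so it may be invalid.
    """
    if not tags:
        return []
    mutated = [dict(tag) for tag in tags]
    index = rng.randrange(len(mutated))
    mutated[index]["length"] = rng.getrandbits(32)
    return mutated
```

Passing an explicit `random.Random` instance keeps the mutations reproducible for a given seed, which helps when replaying an interesting input.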

The vulnerability (CVE-2023-4041)

During the fuzzing campaign we discovered an interesting crash caused by an unmapped read during the parsing of the Application Info tag. The original crash found by AFL++ corrupted the r4 and r5 registers, which then led to an unmapped memory access. The vulnerability allowed us to overwrite the PC register and, by doing so, to gain remote code execution on the device.

The vulnerability is a classic buffer overflow on the stack, easy to exploit because there are no exploit mitigations, such as stack cookies, address space randomization, etc.

Now let's look at the crash in detail. The original GBL input looks like this:

[header {
  header {
    tagId: 61216747
    length: 8
  }
  version: 50331648
  type: 0
}
, app {
  header {
    tagId: 4094298868
    length: 90
    ^^^^^^^^^^
  }
  appInfo {
    type: 128
    version: 40125
    capabilities: 42508
    productId_upper: 2863265858
    productId_lower: 71213172157017827
  }
}
, boot {
  header {
    tagId: 4111010293
    length: 108
  }
  bootloaderVersion: 33685506
  address: 0
  data: "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA\307AAAAAAA"
}
, end {
  header {
    tagId: 4228121852
    length: 4
  }
  gblCrc: 3735928559
}
]

The interesting field is app.header.length, which is set to 90, meaning that the rest of the tag data following this field should be 90 bytes long. In reality it is only 28 bytes long.
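The mismatch can be checked with a few lines of arithmetic. The sketch below assumes that the application info is laid out as three 32-bit fields (type, version, capabilities) followed by a 16-byte product ID, which is consistent with the 28 bytes mentioned above; the exact structure definition is not reproduced in this post.

```python
import struct

# Assumed layout: type, version, capabilities (uint32 each) + 16-byte product ID.
APP_INFO = struct.Struct("<III16s")

declared_length = 90           # app.header.length in the crashing input
actual_length = APP_INFO.size  # bytes of application info really present
excess = declared_length - actual_length

print(f"actual={actual_length} declared={declared_length} excess={excess}")
# prints: actual=28 declared=90 excess=62
```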

The code that parses the application info tag does the following:

static int32_t parser_parseApplicationInfo(ParserContext_t   *parserContext,
                                           GblInputBuffer_t  *input,
                                           ImageProperties_t *imageProperties)
{
  volatile int32_t retval;
  uint8_t tagBuffer[GBL_PARSER_BUFFER_SIZE];   // #define GBL_PARSER_BUFFER_SIZE 64UL

  while (parserContext->offsetInTag < parserContext->lengthOfTag) {
    // gbl_getData copies parserContext->lengthOfTag bytes of data from input into tagBuffer
    retval = gbl_getData(parserContext,
                         input,
                         tagBuffer,
                         parserContext->lengthOfTag,
                         true,
                         true);

This is one of the rare places in the source tree where the length passed to a parsing function is neither hard-coded nor checked for sanity. Since we control the lengthOfTag field of the parserContext (it is read from the file we upload), we can overwrite data on the stack up to the saved program counter (PC), thus gaining code execution on the device.
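The effect of the unchecked length can be modelled in a few lines. The sketch below represents the stack frame as a 64-byte tagBuffer followed by 32 marker bytes standing in for the saved registers and return address. This frame layout is an assumption made for illustration only; the real layout depends on the compiler and is not shown in this post.

```python
GBL_PARSER_BUFFER_SIZE = 64
SAVED_AREA = bytes(range(0xE0, 0x100))  # 32 marker bytes (hypothetical layout)


def unchecked_copy(frame, buffer_offset, data):
    """memcpy-like copy into the frame with no check against the buffer size."""
    frame[buffer_offset:buffer_offset + len(data)] = data


def clobbered_bytes(length_of_tag):
    """Count how many bytes after tagBuffer are overwritten for a given length."""
    frame = bytearray(GBL_PARSER_BUFFER_SIZE) + bytearray(SAVED_AREA)
    unchecked_copy(frame, 0, b"A" * length_of_tag)
    after_buffer = frame[GBL_PARSER_BUFFER_SIZE:]
    return sum(1 for got, want in zip(after_buffer, SAVED_AREA) if got != want)


if __name__ == "__main__":
    for length in (28, 64, 90):
        print(length, clobbered_bytes(length))
```

With a length of 90, the copy runs 26 bytes past the end of the 64-byte buffer, which in this model is enough to overwrite part of the saved area.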

Exploitation

Moving on to the exploitation step, we wanted to demonstrate that it is possible to use this vulnerability to gain persistent code execution on the device. To accomplish that, we have to somehow flash the code we want in a place where it is executed at each reboot. Our initial plan was to overwrite the interrupt vector table located at 0x0.

In embedded devices, there is typically a table at memory offset 0x0 that describes which code to execute on reboot, crash, etc. By overwriting the table we would be able to change the entry point of the existing bootloader to point to our code. Unfortunately, this approach does not work because the board crashes during the flash write operations, as low-level peripherals try to access code that no longer exists.

Here is an overview of the interrupt table:

During our tests we found out that the only region in the flash memory that we could overwrite without crashing the device was the page at 0xa000, and fortunately for us, that's the place where the Reset_Handler (the main function in embedded world) is located. This means that we can replace the page contents with a branch instruction to our bootloader code, padding it with some NOP instructions, as we don't know the exact location of the Reset_Handler.

By leveraging a weakness in the way firmware is updated, it is possible to exploit the vulnerability to upload and run an unverified firmware, even if Secure Boot is enabled. When a new firmware is uploaded using the OTA mechanism, it is stored temporarily by flashing it to a fixed location, and its first 4 bytes are overwritten with the value 0xffffffff. This prevents the new firmware from being runnable in case of a power failure or an abnormal reboot condition. Once the firmware's signature is verified, the firmware is moved to its permanent location, the overwritten bytes are restored and a reboot is triggered, so that upon reboot it is verified and run.
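The staging logic described above can be summarized with a small model. This is only a sketch of the described behaviour, not the bootloader's code: the function names are hypothetical, and where the original first word is kept during staging is an assumption.

```python
ERASED_WORD = b"\xff\xff\xff\xff"


def stage(image):
    """Return (staged image with its first word masked, original first word)."""
    if len(image) < 4:
        raise ValueError("image too short")
    return ERASED_WORD + image[4:], image[:4]


def install(staged, saved_word, signature_ok):
    """Restore the first word only when the signature check succeeded."""
    if not signature_ok:
        return None  # the staged image stays masked and is never run
    return saved_word + staged[4:]
```

The important point for the attack is that the staged bytes are already written to flash before any signature check takes place, even though the masked first word keeps them from being booted directly.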

We can take advantage of the firmware update mechanism described above to gain persistent code execution of an unsigned firmware on the device by doing the following:

First, we use the OTA feature to upload a custom-crafted GBL file that contains the firmware we want to boot, compiled with the correct base address (0x1a000 in our case). The file also contains a small shellcode, placed after the firmware, that performs the following steps:
1. Disable all interrupts using CORE_EnterCritical. This function blocks all interrupts except fault handlers, which ensures that nothing disturbs the execution of the rest of the shellcode.
2. Erase the flash page at 0xa000, where the Reset_Handler is located.
3. Write a NOP sled to this page, so that when the board reboots it executes NOP instructions until it reaches a real instruction. The first instruction after the NOP sled is a call to the Reset_Handler of the uploaded firmware.
4. Call the reset_withReason function to reset the board.
Finally, we use the buffer overflow vulnerability to redirect code execution to our shellcode.
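Step 3 can be illustrated by building the page contents offline. The sketch below fills a page with 16-bit Thumb NOP instructions (encoding 0xBF00, stored little-endian) and appends caller-supplied bytes at the end. The page size and the trailing bytes are assumptions: the actual flash page size depends on the chip, and the placeholder used in the example is not a real branch instruction.

```python
THUMB_NOP = b"\x00\xbf"  # 16-bit Thumb NOP (0xBF00), little-endian


def build_sled_page(page_size, tail):
    """Return page contents: a NOP sled followed by the given tail bytes."""
    if page_size % 2 or len(tail) % 2:
        raise ValueError("sizes must be multiples of 2 bytes")
    if len(tail) > page_size:
        raise ValueError("tail does not fit in the page")
    sled_len = page_size - len(tail)
    return THUMB_NOP * (sled_len // 2) + tail


if __name__ == "__main__":
    # Hypothetical 0x800-byte page and placeholder tail (not a real instruction).
    page = build_sled_page(0x800, b"\xde\xad\xbe\xef")
    print(len(page), page[-4:].hex())
```

Because every halfword before the tail is a NOP, execution can start at any 2-byte-aligned offset in the page and still slide to the final instruction, which is why the exact address of the Reset_Handler does not need to be known.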

Here is a schema showing on the left the content of the flash after sending the unsigned firmware and on the right the content of the flash after the shellcode is executed:

Impact

The vulnerability impacts all the devices that are using the Gecko SDK version v4.3.0 and prior.

Mitigations

One existing mitigation is the use of encrypted firmware. When the OTA parser enforces encryption, the Application Tag is first decrypted using an AES key that the attacker doesn't know, so the payload results in scrambled shellcode, preventing the redirection of control flow.

Silicon Labs has already patched the vulnerability. The fix is quite simple: the following check on the parserContext->lengthOfTag field was added:

static int32_t parser_parseApplicationInfo(ParserContext_t   *parserContext,
                                           GblInputBuffer_t  *input,
                                           ImageProperties_t *imageProperties)
{
  volatile int32_t retval;
  uint8_t tagBuffer[GBL_PARSER_BUFFER_SIZE];   // #define GBL_PARSER_BUFFER_SIZE 64UL

  if (parserContext->lengthOfTag != sizeof(ApplicationData_t)) {
      return BOOTLOADER_ERROR_PARSER_UNEXPECTED;
  }

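The behaviour of this check can be mirrored in a short model. It assumes that sizeof(ApplicationData_t) is 28 bytes, in line with the size of the application info discussed earlier; the function name is hypothetical.

```python
APPLICATION_DATA_SIZE = 28  # assumed value of sizeof(ApplicationData_t)


def length_is_accepted(length_of_tag):
    """Mirror the patched check: only the exact structure size is accepted."""
    return length_of_tag == APPLICATION_DATA_SIZE


if __name__ == "__main__":
    for length in (27, 28, 64, 90):
        print(length, length_is_accepted(length))
```

Using strict equality rather than an upper bound of 64 bytes means the parser rejects not only the oversized tag from the crashing input, but also any other malformed length that would still fit in the buffer.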

Conclusion

We discovered a bug in the parser of the OTA feature which can be used in combination with a weakness in the update mechanism to gain persistent code execution on the device, bypassing Secure Boot enforcement and firmware signature verification.

Disclosure timeline

2023-07-20 Quarkslab sent report via email to Silicon Labs.
2023-07-27 Quarkslab requested an acknowledgement of the report.
2023-07-28 Silicon Labs acknowledged the report and asked if there is a planned date for publication.
2023-07-28 Quarkslab replied that a date for publication is not set but it will surely be in the second week of September.
2023-07-28 Silicon Labs informed that it had triaged the bug and that a fix and security advisory would be released on August 7th. Asked if Quarkslab would like them to delay their publication.
2023-07-29 Quarkslab replied that it does not want to delay publication of Silicon Labs advisory and fixes in any way.
2023-07-29 Silicon Labs informed that they assigned CVE ID CVE-2023-4041 to the vulnerability.
2023-08-08 Quarkslab asked if the security advisory was published.
2023-08-08 Silicon Labs informed that it had delayed publication and requested to coordinate publication of Quarkslab's blog post.
2023-08-08 Quarkslab agreed to postpone publication until August 15th, pending more information.
2023-08-14 Silicon Labs informed that the fix is included in a new GSDK package that has other improvements and bug fixes and is being tested. They requested to delay publication of Quarkslab's blog post to have more time for testing.
2023-08-15 Quarkslab agreed to postpone publication until August 21st, pending more information.
2023-08-16 Silicon Labs released GSDK version 4.2.4.0 that fixes the issue and published a security advisory.
2023-08-21 Blog post is published.
