An Experimental Study of Different Binary Exporters

This blog post presents a comparison between various disassembled binary exporters.

Disclaimer

All the tools presented in this blog post have been tested in accordance with the knowledge we had of them. We do not claim at all that our results are an accurate view of the state of the tools, and we probably missed features we did not know about. The figures should be seen as indicators and not as ground truth.

Introduction

Analyzing binary programs often requires disassembling them. The two best-known tools for this task are IDA Pro and Ghidra, the newer one from the NSA. Although powerful, these tools are ill-suited to running custom analyses on a disassembled binary or on several binaries at the same time. Once the disassembly is done, why keep the disassembler open and running in the background? Doing so is costly, as each instance may use up to a few hundred megabytes of RAM. The only element needed is an export of the disassembled binary, and this blog post presents an overview of the different exporters and disassemblers available.

For the rest of this article, an export of a binary is defined as a file which stores various information about the program. This data ranges from meta information (format, architecture, compiler identification) to more specific elements of the disassembled code itself (instructions, mnemonics) and intelligence gathered by the disassembler (cross-references, symbols).

Disassemblers review

Overview

The first step in exporting a disassembled binary is to disassemble it. Numerous tools exist for this task, the best known being IDA, a commercial tool by HexRays. In recent years, other tools have been released (Binary Ninja, radare, Ghidra) with different ranges of features and prices. This blog post does not conduct a complete review of all the existing tools, and does not claim to, as that would be treacherous ground; we still wanted a more informed opinion on the different options.

The table below lists some important tools to disassemble a complete binary and some of their features.

Tool name | Authors | OSS | Lang | Bindings | Exporters | Architectures (i386, ARM, Mips, PowerPC)
angr | UCSB | | Python | - | n/c |
Ghidra | NSA | | Java | Python | XML, BinExport |
IDA Pro | HexRays | | n/c | C, Python | BinExport |
BAP | CMU | | OCaml | Rust, Python, C | n/c | (32)(32)(32)
ddisasm | GrammaTech | | C++ | Bash | In Protobuf | (64)(64)
Macaw | GaloisInc | | Haskell | - | n/c | (32)
radare2 | pancake (and community) | | C | Python | n/c |
Miasm | CEA-SEC | | Python | - | n/c |
Binary Ninja | Vector 35 | | n/c | C++, Python | JSON |

As seen in the table above, where only active projects are listed, there is a broad range of tools available. It was not possible to compare all the binary disassembly tools, as our time was limited. We thus elected not to include the following:

  • Binary Ninja: we had no license for the tool;
  • McSema: it relies on IDA to perform the disassembly;
  • BAP: the Python bindings use a client/server model that is not really practical for our needs;
  • Pharos: tuned for C++ disassembly;
  • Macaw: supports a limited set of architectures.

Note: Even if these tools have been left aside because they did not seem to fit our needs, they are nice pieces of engineering. We still encourage everyone to have a look at them. [1]

Binaries

To test the performance of the disassemblers, three programs were used, one for each of three size categories: small, medium and large. The selected binaries are:

  • Small: elf-Linux-x64-bash (~900KB) ELF file for x86-64 (source: Linux)
  • Medium: delta_generator (17MB) ELF file for x86 (source: Android Open Source Project)
  • Large: llvm-opt (34MB) ELF file for x86-64 (source: LLVM)

These programs were selected at random from programs available on our computers at the time of the tests. They are not supposed to have any outstanding features, just regular programs coming from widely used open-source projects.

Disassembly results

All tests were run on a Dell XPS 15 with an Intel® Core™ i7-6700HQ CPU @ 2.60GHz, an SSD and 16 GB of RAM, running Debian 10 (Buster).

Tool | small (900KB): time | #inst | #funcs | medium (17MB): time | #inst | #funcs | large (34MB): time | #inst | #funcs
angr (8.19.7.25) | 39.20s | 139,607 | 8,470 | | | | | |
Ghidra (9.1-dev) | 23s | 132,463 | 2,281 | 3m9s | 213,584 | 3,073 | 29m41s | 4,935,687 | 31,942
IDA (7.2) | 4.22s | 133,072 | 2,254 | 8.33s | 285,478 | 2,005 | 4m41s | 4,960,290 | 31,924
ddisasm | 5.40s | 31,153 | 1,194 | 44.57s | 65,549 | 689 | 9m43s | 1,946,306 | 24,696
radare2 (4.0.0) | 15.72s | 95,744 | 374 | 30.3s | 19,502 | 761 | 2m19s | 535,044 | 2,090
Miasm (0.1.1) | 3m48s | 54,334 | 2 | 2m30s | 60,650 | 2 | 2h2m | 395,580 | 2

Some notes on the table:

  • These disassembly figures should be handled with care, as there are no ground-truth results (in terms of instruction and function counts). Nonetheless, the IDA and Ghidra results can be considered a close approximation of the correct ones.
  • angr is a complex tool which performs control flow analysis through code emulation to correctly disassemble a binary. However, this requires stubs [2] to be written to perform the emulation. While many of the necessary stubs are already available, some are still missing (hence the errors shown). To our knowledge, there is no method in angr to disassemble a binary without generating a CFG (see #1116).
  • Miasm needs the entry points of a program to start disassembling. It disassembles recursively from each entry point until no new instruction is found. The two functions found by the tool are actually the two entry points we specified (_start and main) [3].
  • The x86_64 version of the delta_generator program was used for ddisasm because x86 is not (yet) supported by the tool.
  • Both IDA and Ghidra stand out for the number of instructions and functions retrieved, with a slight advantage to IDA for its disassembly speed. These results are not surprising, since IDA and Ghidra are major players with decades of experience.
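
To make the note on recursive disassembly more concrete, here is a minimal sketch of the strategy on a made-up instruction table. It does not use Miasm's API: the TOY_PROGRAM layout, the instruction kinds and the function name are all assumptions made for illustration. It only shows why code that is never reached from the given entry points goes uncounted.

```python
from collections import deque

# Hypothetical toy program: address -> (size, kind, target).
# kind is "seq" (falls through), "jmp" (unconditional jump),
# "jcc" (conditional jump), "call" (falls through after the call) or "ret".
TOY_PROGRAM = {
    0x00: (2, "call", 0x10),
    0x02: (2, "jcc", 0x08),
    0x04: (2, "seq", None),
    0x06: (2, "ret", None),
    0x08: (2, "jmp", 0x04),
    0x10: (2, "seq", None),
    0x12: (2, "ret", None),
    0x20: (2, "seq", None),  # never referenced from the entry point
    0x22: (2, "ret", None),
}


def recursive_disassemble(program, entry_points):
    """Follow control flow from each entry point until nothing new is found."""
    visited = set()
    worklist = deque(entry_points)
    while worklist:
        addr = worklist.popleft()
        if addr in visited or addr not in program:
            continue
        visited.add(addr)
        size, kind, target = program[addr]
        if kind in ("jmp", "jcc", "call"):
            worklist.append(target)       # follow the branch or call target
        if kind in ("seq", "jcc", "call"):
            worklist.append(addr + size)  # follow the fall-through path
    return visited


found = recursive_disassemble(TOY_PROGRAM, [0x00])
print(f"{len(found)} of {len(TOY_PROGRAM)} instructions reached")
```

In this toy case the two instructions at 0x20 and 0x22 are never visited, which is the same effect that may explain lower instruction counts when only a couple of entry points are supplied.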

Exporters

The next step is to export the disassembled program into a standalone file. The goal is to close the disassembler after the initial disassembly step, as its features are no longer needed.

Overview

The list of exporters available for the tools tested in the first section is shown below.

Disassembler | Exporter | Project
IDA | BinExport | Exporter from Zynamics (mainly used for BinDiff)
IDA | Ghidra-IDA | Ghidra's plugin to export a project from IDA to Ghidra
IDA | McSema | Exporter for McSema lifter
Ghidra | BinExport-Ghidra | Experimental port of BinExport by C. Blichmann
Ghidra | Ghidra | Built-in exporter from Ghidra
ddisasm | ddisasm | GrammaTech in-house tool to export a binary

  • Ghidra provides an IDA plugin to generate an XML file (and a raw data file) so the user can import them in Ghidra.
  • The widely used tool BinDiff uses BinExport, a Protobuf generated file, exported from IDA as a basis to perform its diffing. One of the authors of BinExport has started a port of the exporting feature on Ghidra (the proof-of-concept is available in his personal project on GitHub and worked really nicely so far).
  • ddisasm is able to parse a binary file and export a lot of information via a Protobuf file. The ultimate goal of the toolchain developed by GrammaTech is to do binary rewriting [4]. As a consequence, the exported features focus on the sole information useful for this task. This represents only a subset of all the available information.

Ignored exporters

We also found other exporters that were left out of this study:

  • Diaphora: The tool exports a binary to a SQLite database and is written in Python. A preliminary study has shown that the SQLite file is much larger (around 4-6 times) than the i64 and thus not compact enough for our needs.
  • YaCo: Plugin developed by the Direction Générale de l'Armement (DGA) for the YaTools suite. It does not export any information below the granularity of basic blocks (and only a hash of them). However, it is worth noting as it is the only tool generating a FlatBuffers file.
  • bnida: Plugin used to port a project from IDA to Binary Ninja. It exports to a JSON file and is written in Python. It does not export any data on the content of the functions (just their names and addresses) nor anything below this granularity.

Exporter features

The table below details the various information exported by the different exporters selected. The results were gathered by analyzing the description of the protocol and actual exported files.

Note

To improve readability, explanations for ambiguous results (orange tildes) are provided as tooltips.

Exporters
BinExport McSema ddisasm Ghidra-XML
Metadata Name
Arch
ISA
Compiler
Layout Segments ~
Code layout ~
Symbols Name
Value
Type
Data Address
Type
Size
Name
Graph Call graph
CFG~
Comments Address
Type
Content
Functions Name ~
Demangled name
Type (I: , G: )
Argument count
Instructions Mnemonic
Operand
Operand type
Bytes
Address
Expressions
Xref (code, data)(, )
Basic block Start address ~
End address ~ (size)
Instructions list
Content ~ (indirect)
Strings Address (data)
Content (data)
Data types Structure
Enumerations

Important notes:

  • The goals of the different exporters are not identical, so they do not export the same type of information from a binary. While BinExport was designed to be part of a diffing engine, ddisasm was designed to be part of a binary rewriting toolchain.
  • In the Ghidra-XML column, when the exported information varies between the IDA and Ghidra implementations, those differences are noted.

Two main strategies exist for exporters. The first one is to export disassembled instructions with information on their content (mnemonic, operands, expressions inside the operands). With this strategy, the export is self-contained and no other tool is required to analyze it. The second strategy is to export only the raw bytes of the instructions and leave the remaining disassembly work to another disassembler (e.g. Capstone). An export using this strategy will be more compact, but at the price of needing a helper tool to understand its content. The choice of strategy obviously depends on the final objective of the tool. It makes sense for Ghidra not to export disassembled instructions because it has its own disassembler, and for BinExport to export everything because BinDiff should be autonomous (and as fast as possible).

Full benchmark

This section compares in more detail the exporters found for IDA and Ghidra. The results of the first section of this article led us to consider only those two disassemblers, as they were more accurate.

We are also interested in comparing the performance of the built-in exporter of Ghidra against the plugin it offers for IDA. However, we chose not to include the experimental port of BinExport for Ghidra because it is still a work in progress and it performs worse than the IDA version while exporting the same features.

Dataset

For the rest of the benchmarks, we gathered a dataset of various binaries coming from different sources. While our dataset is not exhaustive, it tries to mimic the diversity of programs a reverser could encounter. It gathers binaries of various architectures, file formats, sizes and bitness. The sources used are listed below [5]:

  • binary-samples: A test suite for binary analysis tools made by Jonathan Salwan
  • AOSP (Android Open Source Project): An open source operating system for mobile devices
  • LLVM: The compiler infrastructure project

Binary Name | md5sum | Architecture | Format | Binary size
x64_delta_generator | 8ad5f84d44b73289aa863c44aa7619e9 | x86_64 | ELF | 15.28
elf-Linux-x64-bash | 9a99d4a76f3f773f7ab5e9e3e482c213 | x86_64 | ELF | 904.82 KB
pe-Windows-x64-cmd | 5746bd7e255dd6a8afa06f7c42c1ba41 | x86_64 | PE | 337.00 KB
elf-Linux-lib-x64.so | 89a9ff6d56c3ad2ef9a185a17ef9f658 | x86_64 | ELF | 1.09 MB
busybox-mips | b55e00aa275948e6aea776028088c746 | MIPS-32 | ELF | 352.48 KB
clang-check | 4a3aec55b02c6b3fec39d0cdaaca483e | x86_64 | ELF | 46.83 MB
elf-Linux-ARMv7-ls | de9f91f9cd038989fec8abf25031b42b | armv7 | ELF | 88.68 KB
MachO-OSX-x86-ls | df2580eaf51e15e23de3db979992af1e | x86 | MachO | 34.86 KB
ts3server | 3c5c3e83dca78b4602148ce8643521e2 | x86_64 | ELF | 7.73 MB
busybox-powerpc | bcfd1ebe98bf3519c3f2c9c14e0f9cf9 | PPC-32 | ELF | 1.10 MB
dex38.dex | 0acbdd5244d0726d0cbfb2d45d2f95a8 | - | DEX | 11.48 KB
MachO-OSX-x64-ls | d174dcfb35c14d5fcaa086d2c864ae61 | x86_64 | MachO | 38.66 KB
pe-Windows-x86-cmd | e52110456ec302786585656f220405eb | x86 | PE | 294.50 KB
classes.dex | e62eaf49283093501e7c7cbe9743a0f7 | - | DEX | 3.53 MB
wpa_supplicant | aa782fa15d1265b0d8cfc00b6f883187 | x86 | ELF | 21.64 MB
ctags | 48644ed9bbb64c22ee538cbe99481f21 | x86_64 | ELF | 4.59 MB
crackmips | 9416c32035cf2f2da41876e1c9411850 | MIPS-32 | ELF | 25.54 KB
llvm-opt | f0d325ba8ebbe72aad180c8cab6de09c | x86_64 | ELF | 33.83 MB
elf-Linux-x86-bash | b5bfc5bc405340bcc5050756ac92cf45 | x86 | ELF | 792.14 KB
delta_generator | c2bd1c45f4647932e85561a42e0cbbb4 | x86 | ELF | 16.49 MB
mdbook | 9c405c56cf9c05e0a25766f6639cd5ca | x86_64 | ELF | 10.67 MB
elf-Linux-ARM64-bash | 086f3ad932f5b1bcf631b17b33b0bb0a | armv8 | ELF | 827.54 KB
elf-Linux-lib-x86.so | df9fd3ec63ac207b9fa193b8dcea7eb7 | x86 | ELF | 1.08 MB
elf-Linux-Mips4-bash | 628f094cff8ec9d9e36c5b94460c7454 | MIPS-32 | ELF | 882.38 KB
MachO-iOS-armv7-armv7s-arm64-Helloworld | 750338e86da4e5c8c318b885ba341d82 | armv7, armv8 | MachO | 299.06 KB
MachO-iOS-armv7s-Helloworld | 5ae2549bda51d826a51e97c03fb06f73 | armv7 | MachO | 89.64 KB

The graph above shows the number of instructions per program in the dataset. While most of our test suite is made of programs with fewer than a million instructions, a few large binaries were also included to better understand how the exporters and disassemblers scale. As we need to plot large ranges of values in the same graph, most of the curves look flat for the first points. [6]

Disassembly time

The first metric we were interested in is the disassembly time, defined as the duration of the automatic analysis. We knew that IDA was faster than Ghidra, but we wanted to measure to what extent.

The results are striking: Ghidra is much slower than IDA (up to 13 times slower for large binaries). Even if disassembly is a one-time process, Ghidra's performance is problematic for scalability. Nevertheless, it should be noted that the results are biased, because Ghidra performs an additional decompilation step.

Export time and size

The first section helped us draw an overview of the available exporters. Another interesting metric is the export time for the following disassembler/exporter pairs:

  • IDA + BinExport
  • IDA + Ghidra XML
  • Ghidra + XML

We chose to keep only those exporters because they run on the disassemblers we selected and have an interesting set of exported features. They also have good support for Ghidra, and BinDiff has been used for years in the community without issues. Note that they use different exporting strategies: Ghidra does not export any information on instructions, while BinExport decomposes every operand of each instruction and exports them.

The export size of a program is far greater than the program itself for both tools. While BinExport produces a single Protobuf file, Ghidra generates two files, one XML with all the information and a raw byte file containing all the code of the exported binary. The figures on the graph represent the sum of the size of these two files.

Program | Size | i64 | BinExport | IDA-XML | Ghidra-XML
elf-Linux-x64-bash | 908 KB | 11 MB | 4.2 MB | 4.9 MB | 7.1 MB
ts3server | 7.8 MB | 58 MB | 20 MB | 19 MB | 64.8 MB
llvm-opt | 34 MB | 300 MB | 144 MB | 127 MB | 202 MB

We observe that the BinExport and XML exports are roughly the same size. However, BinExport exports much more information on the binary than Ghidra. Remember that Ghidra does not export any information on the instructions themselves, nor on the basic blocks besides their contents (i.e. raw bytes). The sizes of the exported files remain equivalent because of optimizations made by BinExport: the format is specifically designed for compactness (e.g. it makes extensive use of deduplication tables) and the export file uses a binary serialization protocol, namely Protobuf. This will be further discussed in the next section.

The table above also includes the size of the database generated by IDA, the i64 file, which is much larger than any of the exported files considered in this study.
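
The idea behind a deduplication table can be sketched in a few lines of Python. This is a generic illustration and not BinExport's actual schema: the function name and the sample mnemonic stream are hypothetical. Each distinct value is stored once, and every occurrence is replaced by a small index into that table.

```python
def build_dedup_table(values):
    """Return (table, indices): unique values in first-seen order,
    plus one table index per input value."""
    table = []
    index_of = {}
    indices = []
    for value in values:
        if value not in index_of:
            index_of[value] = len(table)
            table.append(value)
        indices.append(index_of[value])
    return table, indices


# Hypothetical mnemonic stream; real programs repeat a small vocabulary.
mnemonics = ["mov", "mov", "call", "mov", "ret", "call", "mov"]
table, indices = build_dedup_table(mnemonics)

# The original stream can be rebuilt from the table and the indices.
assert [table[i] for i in indices] == mnemonics
print(table)    # ['mov', 'call', 'ret']
print(indices)  # [0, 0, 1, 0, 2, 1, 0]
```

The gain grows with repetition: a long program reuses the same few mnemonics and operand expressions many times, so the table stays small while the indices stay cheap to store.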

Full export

To summarize the results from the previous tests, we plot hereafter a graph explaining the time spent in the three phases of the export process:

  • Disassembly phase: disassembling the binary
  • Export phase: generating the export files
  • Deserialization / Loading phase: Importing the exported file in Python

This graph shows that the deserialization time can become non-negligible with the Protobuf format for large binaries (here mdbook). This observation led us to the next section, which explores various binary serialization formats to find which one is the most suitable for our needs.

Experiments on binary serialization formats

Introduction

Numerous formats exist [7] for serialization because not all usages (e.g. persistent storage, RPC communication, data transfer, ...) require the same set of features. One may want the data stored in a "human-readable" way (i.e. as text), a fast access time, or a compact storage size. For program serialization, we need a trade-off between compact disk usage, a reasonable deserialization time and a low memory footprint. Since a readable format is not needed and disk usage is a concern, binary serialization formats seemed more appropriate than text formats (e.g. JSON, XML).

Binary serialization formats

In this section, we will focus on three formats used for binary serialization:

  • Protobuf: A format developed (and extensively used) by Google for serializing structured data.
  • FlatBuffers: Another format developed by Google to serialize data. Mostly used for performance critical applications.
  • Cap'n Proto: A format developed by Kenton Varda (tech lead of Protobuf while he was working at Google) for Sandstorm.

All these formats use a custom schema definition language to explain how the data will be formatted on the wire. Even if this blog post does not intend to be a crash course on data serialization, nor a tutorial on how to write a schema for the three protocols, the syntax of a basic message is shown below.

Protobuf:

message Meta {
  optional string executable_name = 1;
  optional string executable_id = 2;
  optional string architecture_name = 3;
  optional int64 timestamp = 4;
}

Cap'n Proto:

struct Meta {
  executableName @0 :Text;
  executableId @1 :Text;
  architectureName @2 :Text;
  timestamp @3 :UInt64;
}

FlatBuffers:

table Meta {
  executable_name:string;
  executable_id:string;
  architecture_name:string;
  timestamp:long;
}

The main difference between these formats is how they store data on the wire. Protobuf, the oldest one, uses an encoding/packing step which transforms the input before it is written on the wire. This makes Protobuf more compact, because the encoding step reduces the number of bytes needed to store an object (see Encoding in the Protobuf documentation). In contrast, both FlatBuffers and Cap'n Proto use a 'zero-copy' strategy, meaning that the data on the wire is structured the same way as it is in memory. The main advantage of this technique is that it eliminates the time needed to decode the object, because no decoding step is performed.

Another huge difference between FlatBuffers/Cap'n Proto and Protobuf is the ability to perform random access reads (the ability to read a specific part of the message without reading the whole message before). With Protobuf this is not possible because the message needs to be parsed upfront (and memory allocated). However, both FlatBuffers and Cap'n Proto implement this feature using pointers, allowing fast access to part of the message.
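
The difference in access patterns can be illustrated with the standard struct module. The sketch below reproduces neither the FlatBuffers, Cap'n Proto nor Protobuf wire format: the record layout, the 1-byte length prefix and the function names are assumptions chosen to keep the example short. It only contrasts a layout where an element's offset can be computed directly with one where earlier elements must be scanned first.

```python
import struct

RECORD = struct.Struct("<QI")  # hypothetical record: 64-bit address, 32-bit size


def pack_fixed(records):
    """Fixed-width layout: record i lives at offset i * RECORD.size."""
    return b"".join(RECORD.pack(addr, size) for addr, size in records)


def read_fixed(buffer, index):
    """Random access: jump straight to the record without parsing earlier ones."""
    return RECORD.unpack_from(buffer, index * RECORD.size)


def pack_prefixed(blobs):
    """Variable-width layout: each blob is preceded by a 1-byte length."""
    return b"".join(bytes([len(blob)]) + blob for blob in blobs)


def read_prefixed(buffer, index):
    """Sequential access: every earlier length must be read to find the offset."""
    offset = 0
    for _ in range(index):
        offset += 1 + buffer[offset]
    length = buffer[offset]
    return buffer[offset + 1 : offset + 1 + length]


fixed = pack_fixed([(0x1000, 4), (0x1004, 2), (0x1006, 7)])
prefixed = pack_prefixed([b"mov", b"call", b"ret"])
print(read_fixed(fixed, 2))
print(read_prefixed(prefixed, 1))
```

Reading the last record of the fixed layout costs the same as reading the first, whereas the cost of read_prefixed grows with the index, which is the kind of upfront work random access avoids.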

Allocation (i.e. how a message is written) has to be done bottom-up for FlatBuffers because a message must be finished before another one is started. This limitation does not apply to Protobuf (because the whole message is written at the end) or to Cap'n Proto (because the size of an object is known when it is allocated).

The final difference we will go through is how unset fields (i.e. fields with no value for this specific message) are stored on the wire. Neither Protobuf nor FlatBuffers allocates them, while Cap'n Proto still does. This leads to wasted space for Cap'n Proto.

Benchmarks

For these benchmarks, we translated the BinExport Protobuf schema into a FlatBuffers and a Cap'n Proto schema. The translation was done manually for Cap'n Proto and using the --proto option of flatc for FlatBuffers (plus some minor revisions). We do not claim to have fully optimized the new schemas using all the features of the two serialization formats, but we believe this still leads to an informative comparison.

First, we want to compare how big the exported files are compared to the binaries themselves. This size is represented by the dashed line and is linear (\(y=x\)).

We see that the size of the exported file grows non-linearly with the size of the binary. The following graph shows the ratio between the size of the exported file and the size of the binary.

We see that Protobuf is much more compact than the other two (the encoding step is crucial here) and that the ratio skyrockets for specific binaries. There is still room for improvement in the export size for the two other protocols, mostly by better understanding the ranges of the different values. With Protobuf, one may declare every integer as 64 bits wide: the serialization algorithm will only write the varint-encoded value of the number on the wire (a reduction of up to a factor of 8 for values below 128). With Cap'n Proto and FlatBuffers, however, the value would take 64 bits anyway.
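
The size difference is easy to check with a short sketch of base-128 varint encoding for non-negative integers, compared with a fixed 64-bit encoding. The function name is ours, and the sketch leaves out negative numbers and the other field types that Protobuf also handles.

```python
import struct


def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint (7 payload bits per byte)."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


for number in (1, 127, 128, 300, 2**32):
    fixed = struct.pack("<Q", number)  # always 8 bytes
    varint = encode_varint(number)
    print(f"{number}: fixed={len(fixed)} bytes, varint={len(varint)} bytes")
```

Any value below 128 fits in a single byte, which is where the factor of 8 comes from; larger values need one extra byte per additional 7 bits.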

Another interesting point to study is how much memory is used for loading the serialized file in Python. (Note: using the memory_profiler [8] module to retrieve memory usage.)

As expected, the memory needed to load the export of a binary is much larger for Protobuf. For example, for llvm-opt the Protobuf file is around 150MB but loading it takes around 1.8 GB of RAM.

The last metric we want to consider is how much time is needed to load an export file in Python from the three file formats.

As expected, the Protobuf format takes a lot of time to deserialize. Cap'n Proto and FlatBuffers have similar performance, mostly because they are based on the same patterns.
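
For readers who want to reproduce this kind of measurement without installing anything, here is a minimal sketch that uses the standard library's tracemalloc instead of memory_profiler. It is a swapped-in technique, not the method used for the figures above: tracemalloc only sees allocations made through Python's allocator, so memory allocated by native extension code may not be counted. The toy_loader function is a hypothetical stand-in for a real deserialization routine.

```python
import tracemalloc


def peak_memory_of(loader, *args):
    """Run loader(*args) and return (result, peak bytes traced by tracemalloc)."""
    tracemalloc.start()
    try:
        result = loader(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


def toy_loader(count):
    """Hypothetical stand-in for deserializing an export into Python objects."""
    return [{"address": i, "mnemonic": "mov"} for i in range(count)]


objects, peak = peak_memory_of(toy_loader, 100_000)
print(f"{len(objects)} objects, peak of about {peak / 1e6:.1f} MB")
```

Even this toy loader shows how quickly per-object overhead adds up once every instruction becomes a Python object.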

Note

  • We could have reduced the size of the exported file for Cap'n Proto by applying their 'packed' algorithm. However, this removes the interesting property of having a 'zero-copy' protocol. More experiments are still needed to understand if this would be a better option than Protobuf.
  • Compressing the exported file using well-known algorithms could also be a viable strategy for Cap'n Proto and FlatBuffers, as it would also reduce the size of the exported file. However, this option adds some time upfront, as the file must be decompressed before it can be used. It is not applicable to Protobuf because the format is already compact.
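
As a rough illustration of the compression idea, the sketch below compresses a buffer of small values stored as fixed 64-bit integers with zlib. The payload is made up for the example and is not a real Cap'n Proto or FlatBuffers file, and zlib is only one possible choice of algorithm.

```python
import struct
import zlib

# Hypothetical payload: small values stored as fixed 64-bit integers,
# which leaves most bytes at zero.
values = [i % 200 for i in range(10_000)]
raw = struct.pack(f"<{len(values)}Q", *values)

compressed = zlib.compress(raw, 6)
restored = zlib.decompress(compressed)  # the upfront cost before any access

assert restored == raw
print(f"raw: {len(raw)} bytes, compressed: {len(compressed)} bytes")
```

The decompression call is the price to pay: the whole buffer has to be restored in memory before the zero-copy accessors can be used on it.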

Conclusion

Exporting as much data as possible from a binary is interesting not in itself but as a basis for other applications, like feature extraction for machine learning algorithms, graph traversal algorithms, or fast access to functions / blocks / instructions based on user-defined criteria.

This blog post explored different options to export a disassembled program from a disassembler using available exporters. To the best of our knowledge, the most complete exporter available is BinExport as it exports a lot of information while remaining compact thanks to the serialization format used, Protobuf. Nonetheless, there is still room for improvement for binary exporters as none of the explored solutions answered all our scalability needs.

Changelog

  • 09.25.19: Updated the results for radare using the latest version (from 3.2.1 to 4.0.0)

[1] If any mistake were to be found, do not hesitate to contact us.
[2] A stub (the SimProcedure) in angr is a helper function written to emulate an external function (e.g. a library function).
[3] We used a derivative of the script found here: https://github.com/cea-sec/miasm/blob/master/example/disasm/full.py
[4] https://blogs.grammatech.com/open-source-tools-for-binary-analysis-and-rewriting
[5] Although none of the programs used were chosen because of inner specificities, the dataset is available upon request (e.g. if one wants to benchmark another disassembler/exporter).
[6] The graphs presented are interactive: it is possible to zoom on parts of the graph, to change the scale factors or to hover over points to get the precise values.
[7] https://en.wikipedia.org/wiki/Comparison_of_data-serialization_formats
[8] https://github.com/pythonprofilers/memory_profiler
