This blog post presents a comparison of various exporters for disassembled binaries.
Disclaimer
All the tools presented in this blog post have been tested to the best of our knowledge of them. We do not claim that our results are an accurate view of the state of these tools, and we probably missed features we did not know about. The figures should be seen as indicators, not as ground truth.
Introduction
Analyzing binary programs often requires disassembling them first. The two most famous tools for this task are IDA Pro and the more recent Ghidra, released by the NSA. While really powerful, these tools are inadequate for running custom analyses on a disassembled binary, or on multiple binaries at the same time. If the disassembler is not needed anymore, why bother keeping it open and running in the background? This is actually costly, as each instance may eat up a few hundred megabytes of RAM. The only necessary element is an export of the disassembled binary, and this blog post presents an overview of the different exporters and disassemblers available.
For the rest of this article, an export of a binary is defined as a file that stores various pieces of information about the program. This data ranges from metadata (format, architecture, compiler identification) to more specific elements on the disassembled code itself (instructions, mnemonics) and to intelligence gathered by the disassembler (cross-references, symbols).
Disassemblers review
Overview
The first step to export a disassembled binary is to disassemble it. Numerous tools exist for this task, the most famous being IDA, a commercial tool by Hex-Rays. Over the last years, other tools have been released (Binary Ninja, radare2, Ghidra) with different feature sets and price tags. This blog post does not conduct a complete review of all existing tools, nor pretends to, as that would be a slippery exercise, but we still wanted to form a more informed opinion on the different options.
The table below lists some of the main tools able to disassemble a complete binary, along with some of their features.
Tool name | Authors | OSS | Lang | Bindings | Exporters | i386 | ARM | MIPS | PowerPC |
---|---|---|---|---|---|---|---|---|---|
angr | UCSB | ✓ | Python | - | n/c | ✓ | ✓ | ✓ | ✓ |
Ghidra | NSA | ✓ | Java | Python | XML, BinExport | ✓ | ✓ | ✓ | ✓ |
IDA Pro | Hex-Rays | | n/c | C, Python | BinExport | ✓ | ✓ | ✓ | ✓ |
BAP | CMU | ✓ | OCaml | Rust, Python, C | n/c | ✓ | ✓ (32) | ✓ (32) | ✓ (32) |
ddisasm | GrammaTech | ✓ | C++ | Bash | In Protobuf | ✓ (64) | ✓ (64) | | |
Macaw | GaloisInc | ✓ | Haskell | - | n/c | | ✓ (32) | | |
radare2 | pancake (and community) | ✓ | C | Python | n/c | ✓ | ✓ | ✓ | ✓ |
Miasm | CEA-SEC | ✓ | Python | - | n/c | ✓ | ✓ | ✓ | ✓ |
Binary Ninja | Vector 35 | | n/c | C++, Python | JSON | ✓ | ✓ | ✓ | ✓ |
JEB | PNF Software | | n/c | Java, Python | JSON, C | ✓ | ✓ | ✓ | |
As seen in the table above, where only active projects are listed, there is a broad range of tools available. It was not possible to compare all of them, as our time was limited, so we elected not to include the following:
Binary Ninja: we had no license for the tool;
McSema: it relies on IDA to perform the disassembly;
BAP: the Python bindings use a client/server model that is not practical for our needs;
Pharos: tuned for C++ binary analysis;
Macaw: supports a limited set of architectures.
Note: Even though these tools were left aside because they did not seem to fit our needs, they are nice pieces of engineering, and we encourage everyone to have a look at them. [1]
Binaries
To test the performance of the disassemblers, the following three programs were used, classified into three categories: small, medium and large. The selected binaries are:
Small: elf-Linux-x64-bash (~900KB) ELF file for x86-64 (source: Linux)
Medium: delta_generator (17MB) ELF file for x86 (source: Android Open Source Project)
Large: llvm-opt (34MB) ELF file for x86-64 (source: LLVM)
These programs were selected at random from the programs available on our computers at the time of the tests. They are not supposed to have any outstanding features; they are just regular programs coming from widely used open-source projects.
Disassembly results
All tests were run on a Dell XPS 15 with an Intel® Core™ i7-6700HQ CPU @ 2.60GHz, an SSD and 16 GB of RAM, running Debian 10 (Buster).
Tool | small: time | small: #inst | small: #funcs | medium: time | medium: #inst | medium: #funcs | large: time | large: #inst | large: #funcs |
---|---|---|---|---|---|---|---|---|---|
angr (8.19.7.25) | 39.20s | 139,607 | 8,470 | error | - | - | error | - | - |
Ghidra (9.1-dev) | 23s | 132,463 | 2,281 | 3m9s | 213,584 | 3,073 | 29m41s | 4,935,687 | 31,942 |
IDA (7.2) | 4.22s | 133,072 | 2,254 | 8.33s | 285,478 | 2,005 | 4m41s | 4,960,290 | 31,924 |
ddisasm | 5.40s | 31,153 | 1,194 | 44.57s | 65,549 | 689 | 9m43s | 1,946,306 | 24,696 |
radare2 (4.0.0) | 15.72s | 95,744 | 374 | 30.3s | 19,502 | 76 | 12m19s | 535,044 | 2,090 |
Miasm (0.1.1) | 3m48s | 54,334 | 2 | 2m30s | 60,650 | 2 | 2h2m | 395,580 | 2 |
JEB (3.7.0) | 28.34s | 132,628 | 2,809 | 51.58s | 284,936 | 2,323 | 18m43s | 4,963,729 | 51,901 |
Some notes on the table:
These disassembly figures should be handled with care, as there are no ground-truth results (in terms of instruction/function counts). Nonetheless, the IDA/Ghidra results can be considered a close approximation of the correct results.
angr is a complex tool which performs control-flow analysis through code emulation to correctly disassemble a binary. However, this requires stubs [2] to be written to support the emulation. While many of the necessary stubs are already available, some are still missing (hence the errors shown for the medium and large binaries). To our knowledge, there is no way in angr to disassemble a binary without generating a CFG (see #1116); a minimal usage sketch is given after these notes.
Miasm needs the entry points of a program to start disassembling. It disassembles recursively from each entry point until no further instruction is found. The two functions found by the tool are actually the two entry points we specified (_start and main) [3].
The x86_64 version of the delta_generator program was used for ddisasm, because 32-bit x86 is not (yet) supported by the tool.
JEB uses a slightly broader notion of functions, where every instruction is assigned to a function (while IDA leaves orphaned instructions). New functions are also created for exception handlers, and for each switch target when the switch cannot be reconstructed. These factors explain the difference in function counts. In terms of retrieved instructions, JEB performs as well as IDA and Ghidra, while being closer to Ghidra in terms of speed.
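To make the angr note above concrete, here is a minimal usage sketch (not the exact benchmark script; the binary path is only an example): disassembly goes through CFG recovery, which is where the emulation stubs come into play.

```python
# Minimal angr sketch, assuming angr is installed and the "small" test binary
# is in the current directory (the path is an example, not the benchmark setup).
import angr

proj = angr.Project("elf-Linux-x64-bash", auto_load_libs=False)

# CFGFast drives the recursive disassembly and function recovery.
cfg = proj.analyses.CFGFast()

print("functions:", len(proj.kb.functions))
print("basic blocks:", cfg.graph.number_of_nodes())
```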
Both IDA and Ghidra stand out for the number of instructions and functions retrieved, with a slight advantage to IDA for its disassembly speed. These results are not surprising, since IDA and Ghidra are major players with decades of experience behind them.
Exporters
The next step is to export the disassembled program into a standalone file. The goal is to be able to close the disassembler after the initial disassembly step, as its features are not needed anymore.
Overview
The list of exporters available for the tools tested in the first section is shown below.
Disassembler | Exporter | Project |
---|---|---|
IDA | BinExport | Exporter from Zynamics (mainly used for BinDiff) |
IDA | Ghidra-IDA | Ghidra's plugin to export a project from IDA to Ghidra |
IDA | McSema | Exporter for the McSema lifter |
Ghidra | BinExport-Ghidra | Experimental port of BinExport by C. Blichmann |
Ghidra | XML | Built-in exporter from Ghidra |
ddisasm | ddisasm | GrammaTech's in-house tool to export a binary |
Ghidra provides an IDA plugin to generate an XML file (and a raw data file) so the user can import them in Ghidra.
The widely used diffing tool BinDiff uses BinExport, a Protobuf-based file exported from IDA, as the basis for its diffing. One of the authors of BinExport has started a port of the exporting feature to Ghidra (the proof of concept is available in his personal project on GitHub and has worked really nicely so far).
ddisasm is able to parse a binary file and export a lot of information to a Protobuf file. The ultimate goal of the toolchain developed by GrammaTech is to perform binary rewriting [4]. As a consequence, the exported features focus solely on the information useful for this task, which represents only a subset of all the available information.
Ignored exporters
We also found other exporters that were left out of this study:
Diaphora: the tool, written in Python, exports a binary to a SQLite database. A preliminary study showed that the SQLite file is much larger (around 4-6 times) than the i64 database, and thus not compact enough for our needs.
YaCo: a plugin developed by the Direction Générale de l'Armement (DGA) for the YaTools suite. It does not export any information below the granularity of basic blocks (and only a hash of them). It is still worth noting, as it is the only tool generating a FlatBuffers file.
bnida: a plugin used to port a project from IDA to Binary Ninja. It exports to a JSON file and is written in Python. It does not export any data on the content of the functions (just their names and addresses), nor anything below this granularity.
JEB: JEB has a built-in exporter that dumps the disassembled (and decompiled) code as C code files. While interesting, this approach is not really suitable for our purposes.
Exporter features
The table below details the information exported by each of the selected exporters. The results were gathered by analyzing the description of each protocol as well as actual exported files.
Note
To improve readability, explanations for ambiguous results (orange tildes) are provided as tooltips.
| | BinExport | McSema | ddisasm | Ghidra-XML |
---|---|---|---|---|---|
Metadata | Name | | | | |
 | Arch | | | | |
 | ISA | | | | |
 | Compiler | | | | |
Layout | Segments | | | | |
 | Code layout | | | | |
Symbols | Name | | | | |
 | Value | | | | |
 | Type | | | | |
Data | Address | | | | |
 | Type | | | | |
 | Size | | | | |
 | Name | | | | |
Graph | Call graph | | | | |
 | CFG | | | | |
Comments | Address | | | | |
 | Type | | | | |
 | Content | | | | |
Functions | Name | | | | |
 | Demangled name | | | | |
 | Type | | | | |
 | Argument count | | | | |
Instructions | Mnemonic | | | | |
 | Operand | | | | |
 | Operand type | | | | |
 | Bytes | | | | |
 | Address | | | | |
 | Expressions | | | | |
 | Xref (code, data) | | | | |
Basic block | Start address | | | | |
 | End address (size) | | | | |
 | Instructions list | | | | |
 | Content | | | | |
Strings | Address | | | | |
 | Content | | | | |
Data types | Structure | | | | |
 | Enumerations | | | | |
Important notes:
The goals of the different exporters are not identical, so they do not export the same type of information from a binary. While BinExport was designed to be part of a diffing engine, ddisasm was designed to be part of a binary rewriting toolchain.
In the Ghidra-XML column, when the exported information varies between the IDA and Ghidra implementations, those differences are noted.
Two main strategies exist for exporters. The first one is to export disassembled instructions with information on their content (mnemonic, operands, expressions inside the operands). With this strategy, the export is self-contained and no other tool is required to analyze it. The second strategy is to export only the raw bytes of the instructions and leave the remaining disassembly work to another disassembler (e.g. Capstone). An export using this strategy is more compact, but at the price of needing a helper tool to understand its content. The choice of strategy obviously depends on the final objective of the tool: it makes sense for Ghidra not to export disassembled instructions because it has its own disassembler, and for BinExport to export everything because BinDiff should be autonomous (and as fast as possible).
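As an illustration of the second strategy, the sketch below shows how raw bytes stored in an export could be disassembled on demand with Capstone; the bytes and load address are made-up values standing in for data read back from an export file.

```python
# Sketch of the "raw bytes" strategy: the export only carries bytes and a load
# address, and a third-party disassembler (here Capstone) recovers mnemonics
# and operands when they are needed.
from capstone import Cs, CS_ARCH_X86, CS_MODE_64

address = 0x401000                                   # made-up load address
raw_bytes = b"\x55\x48\x89\xe5\x48\x83\xec\x10\xc3"  # push rbp; mov rbp, rsp; sub rsp, 0x10; ret

md = Cs(CS_ARCH_X86, CS_MODE_64)
for insn in md.disasm(raw_bytes, address):
    print(f"0x{insn.address:x}\t{insn.mnemonic}\t{insn.op_str}")
```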
Full benchmark
This section compares in more detail the exporters available for IDA and Ghidra. The results of the first section of this article confirmed our choice to consider only these two disassemblers, as they were the most accurate.
We are also interested in comparing the performance of Ghidra's built-in exporter against the plugin it offers for IDA. However, we chose not to include the experimental port of BinExport for Ghidra, because it is still a work in progress and its performance is below that of the IDA version while exporting the same features.
Dataset
For the rest of the benchmarks, we gathered a dataset of binaries coming from different sources. While our dataset is not exhaustive, it tries to mimic the diversity of programs a reverser could encounter: it gathers binaries of various architectures, file formats, sizes and bitnesses. The sources used are listed below [5]:
binary-samples: A test suite for binary analysis tools made by Jonathan Salwan
AOSP (Android Open Source Project): An open source operating system for mobile devices
LLVM: The compiler infrastructure project
Binary Name | md5sum | Architecture | Format | Binary size |
---|---|---|---|---|
x64_delta_generator | 8ad5f84d44b73289aa863c44aa7619e9 | x86_64 | ELF | 15.28 MB |
elf-Linux-x64-bash | 9a99d4a76f3f773f7ab5e9e3e482c213 | x86_64 | ELF | 904.82 KB |
pe-Windows-x64-cmd | 5746bd7e255dd6a8afa06f7c42c1ba41 | x86_64 | PE | 337.00 KB |
elf-Linux-lib-x64.so | 89a9ff6d56c3ad2ef9a185a17ef9f658 | x86_64 | ELF | 1.09 MB |
busybox-mips | b55e00aa275948e6aea776028088c746 | MIPS-32 | ELF | 352.48 KB |
clang-check | 4a3aec55b02c6b3fec39d0cdaaca483e | x86_64 | ELF | 46.83 MB |
elf-Linux-ARMv7-ls | de9f91f9cd038989fec8abf25031b42b | armv7 | ELF | 88.68 KB |
MachO-OSX-x86-ls | df2580eaf51e15e23de3db979992af1e | x86 | MachO | 34.86 KB |
ts3server | 3c5c3e83dca78b4602148ce8643521e2 | x86_64 | ELF | 7.73 MB |
busybox-powerpc | bcfd1ebe98bf3519c3f2c9c14e0f9cf9 | PPC-32 | ELF | 1.10 MB |
dex38.dex | 0acbdd5244d0726d0cbfb2d45d2f95a8 | - | DEX | 11.48 KB |
MachO-OSX-x64-ls | d174dcfb35c14d5fcaa086d2c864ae61 | x86_64 | MachO | 38.66 KB |
pe-Windows-x86-cmd | e52110456ec302786585656f220405eb | x86 | PE | 294.50 KB |
classes.dex | e62eaf49283093501e7c7cbe9743a0f7 | - | DEX | 3.53 MB |
wpa_supplicant | aa782fa15d1265b0d8cfc00b6f883187 | x86 | ELF | 21.64 MB |
ctags | 48644ed9bbb64c22ee538cbe99481f21 | x86_64 | ELF | 4.59 MB |
crackmips | 9416c32035cf2f2da41876e1c9411850 | MIPS-32 | ELF | 25.54 KB |
llvm-opt | f0d325ba8ebbe72aad180c8cab6de09c | x86_64 | ELF | 33.83 MB |
elf-Linux-x86-bash | b5bfc5bc405340bcc5050756ac92cf45 | x86 | ELF | 792.14 KB |
delta_generator | c2bd1c45f4647932e85561a42e0cbbb4 | x86 | ELF | 16.49 MB |
mdbook | 9c405c56cf9c05e0a25766f6639cd5ca | x86_64 | ELF | 10.67 MB |
elf-Linux-ARM64-bash | 086f3ad932f5b1bcf631b17b33b0bb0a | armv8 | ELF | 827.54 KB |
elf-Linux-lib-x86.so | df9fd3ec63ac207b9fa193b8dcea7eb7 | x86 | ELF | 1.08 MB |
elf-Linux-Mips4-bash | 628f094cff8ec9d9e36c5b94460c7454 | MIPS-32 | ELF | 882.38 KB |
MachO-iOS-armv7-armv7s-arm64-Helloworld | 750338e86da4e5c8c318b885ba341d82 | armv7, armv8 | MachO | 299.06 KB |
MachO-iOS-armv7s-Helloworld | 5ae2549bda51d826a51e97c03fb06f73 | armv7 | MachO | 89.64 KB |
The graph above shows the number of instructions per program in the dataset. While most of our test suite is made of programs with fewer than a million instructions, a few large binaries were also included to better understand how the exporters and disassemblers scale. As we need to plot large ranges of values in the same graph, most of the curves look flat for the first points. [6]
Disassembly time
The first metric we were interested in is the disassembly time, defined as the duration of the automatic analysis. We knew that IDA was faster than Ghidra, but we wanted to measure to what extent.
The results are striking: Ghidra is much slower than IDA (up to 13 times slower for large binaries). Even if the disassembly step is a one-time process, Ghidra's performance is problematic for scalability. Nevertheless, it should be noted that the comparison is biased, because Ghidra performs an additional decompilation step.
Export time and size
The first section helped us draw an overview of the available exporters. Another interesting metric is the export time for the following disassembler/exporter pairs:
IDA + BinExport
IDA + Ghidra XML
Ghidra + XML
We chose to keep only these exporters because they run on the disassemblers we selected and export an interesting set of features. They are also well supported: the XML exporter ships with Ghidra, and BinExport has been used by the BinDiff community for years without issues. We may also note that they use different exporting strategies: Ghidra does not export any information on instructions, while BinExport decomposes every operand of each instruction and exports them.
The export of a program is far larger than the program itself for both tools. While BinExport produces a single Protobuf file, Ghidra generates two files: an XML file with all the information and a raw bytes file containing all the code of the exported binary. The figures on the graph represent the sum of the sizes of these two files.
Program | Size | i64 | BinExport | IDA-XML | Ghidra-XML |
---|---|---|---|---|---|
elf-Linux-x64-bash | 908 KB | 11 MB | 4.2 MB | 4.9 MB | 7.1 MB |
ts3server | 7.8 MB | 58 MB | 20 MB | 19 MB | 64.8 MB |
llvm-opt | 34 MB | 300 MB | 144 MB | 127 MB | 202 MB |
We observe that the export sizes for BinExport and XML are roughly the same. However, BinExport exports a lot more information about the binary than Ghidra. Remember that Ghidra does not export any information on the instructions themselves, nor on the basic blocks besides their content (i.e. raw bytes). The sizes of the exported files remain equivalent because of optimizations made by BinExport: the format is specifically designed for compactness (e.g. there is an extensive usage of deduplication tables) and the export file uses a binary serialization protocol, namely Protobuf. This will be further discussed in the next section.
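The deduplication idea can be illustrated with a small sketch. This is not the actual BinExport2 layout, just the general principle: repeated strings (here mnemonics) are stored once in a table and instructions reference them by index.

```python
# Sketch of a deduplication table (illustrative, not the BinExport2 schema).
instructions = [
    (0x1000, "mov"), (0x1003, "mov"), (0x1006, "add"),
    (0x1009, "mov"), (0x100c, "ret"),
]

mnemonic_table = []   # each distinct mnemonic stored once
mnemonic_index = {}   # mnemonic -> index in the table
encoded = []          # (address, mnemonic index) pairs

for addr, mnemonic in instructions:
    if mnemonic not in mnemonic_index:
        mnemonic_index[mnemonic] = len(mnemonic_table)
        mnemonic_table.append(mnemonic)
    encoded.append((addr, mnemonic_index[mnemonic]))

print(mnemonic_table)  # ['mov', 'add', 'ret']
print(encoded)         # the string is never repeated, only its index
```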
The table above also includes the size of the database generated by IDA, the i64 file, which is much larger than any of the exported files considered in this study.
Full export
To summarize the results from the previous tests, we plot hereafter a graph showing the time spent in the three phases of the export process:
Disassembly phase: disassembling the binary
Export phase: generating the export files
Deserialization / Loading phase: Importing the exported file in Python
This graph shows that the deserialization time can become non-negligible with the Protobuf format for large binaries (here mdbook). This observation leads us to the next section, which explores various binary serialization formats to find which one is the most suitable for our needs.
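For reference, the loading phase of a BinExport file boils down to a full upfront parse, as sketched below. This assumes the BinExport2 schema has been compiled with protoc --python_out (producing a binexport2_pb2 module) and uses an example file name; the field names follow the BinExport2 schema and should be treated as an assumption.

```python
# Sketch of loading a BinExport export in Python (module and file names are
# assumptions; protoc must have generated binexport2_pb2 from the schema).
import binexport2_pb2

export = binexport2_pb2.BinExport2()
with open("mdbook.BinExport", "rb") as f:
    # ParseFromString deserializes the whole message upfront: this is where
    # the deserialization time shown on the graph is spent.
    export.ParseFromString(f.read())

print(export.meta_information.executable_name)
```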
Experiments on binary serialization formats
Introduction
Numerous serialization formats exist [7], because not all usages (e.g. persistent storage, RPC communication, data transfer, ...) require the same set of features. One may want the data stored in a "human-readable" way (i.e. as text), fast access times, or a compact storage size. For program serialization, we need a trade-off between compact disk usage, reasonable deserialization time and a low memory footprint. Since a readable format is not needed and disk usage is a concern, binary serialization formats seemed more appropriate than text formats (e.g. JSON, XML).
Binary serialization formats
In this section, we will focus on three formats used for binary serialization:
Protobuf: A format developed (and extensively used) by Google for serializing structured data.
FlatBuffers: Another format developed by Google to serialize data. Mostly used for performance critical applications.
Cap'n Proto: A format developed by Kenton Varda (tech lead of Protobuf while he was working at Google) for Sandstorm.
All these formats use a custom schema definition language to describe how the data will be formatted on the wire. Even if this blog post does not intend to be a crash course on data serialization, nor a tutorial on how to write a schema for the three protocols, the syntax of a basic message is shown below for each of them.
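// Protobuf (proto2 syntax)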
message Meta {
optional string executable_name = 1;
optional string executable_id = 2;
optional string architecture_name = 3;
optional int64 timestamp = 4;
}
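# Cap'n Proto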
struct Meta {
executableName @0 :Text;
executableId @1 :Text;
architectureName @2 :Text;
timestamp @3 :UInt64;
}
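// FlatBuffers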
table Meta {
executable_name:string;
executable_id:string;
architecture_name:string;
timestamp:long;
}
The main difference between these formats is how they store data on the wire. Protobuf, the oldest one, uses an encoding/packing step which transforms the input before it is written. This allows Protobuf to be more compact, because the encoding step reduces the number of bytes needed to store an object (see Encoding in the Protobuf documentation). In contrast, both FlatBuffers and Cap'n Proto use a 'zero-copy' strategy, meaning that the data on the wire is laid out the same way as it is in memory. The main advantage of this technique is that it nullifies the time needed to decode an object, because no decoding step is performed.
Another huge difference between FlatBuffers/Cap'n Proto and Protobuf is the ability to perform random access reads, i.e. to read a specific part of a message without parsing the whole message first. With Protobuf this is not possible, because the message needs to be parsed upfront (and memory allocated). Both FlatBuffers and Cap'n Proto implement this feature using offsets (pointers), allowing fast access to any part of the message.
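The sketch below illustrates this with pycapnp, assuming the Meta struct shown above is saved in a meta.capnp file (with the mandatory file ID line Cap'n Proto requires); the file names are examples, not part of our benchmark setup.

```python
# Sketch of a zero-copy read with Cap'n Proto (pycapnp). meta.capnp is assumed
# to contain the Meta struct shown earlier plus the required @0x...; file ID.
import capnp

meta_capnp = capnp.load("meta.capnp")

# Write a message to disk.
msg = meta_capnp.Meta.new_message()
msg.executableName = "elf-Linux-x64-bash"
msg.timestamp = 1569369600
with open("meta.bin", "wb") as f:
    msg.write(f)

# Read it back: fields are accessed in place, without an upfront parse of the
# whole message.
with open("meta.bin", "rb") as f:
    loaded = meta_capnp.Meta.read(f)
    print(loaded.executableName)
```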
Allocation (i.e. how messages are written) has to be done bottom-up with FlatBuffers, because a nested object must be finished before another one is started. This limitation applies neither to Protobuf (because the whole message is written at the end) nor to Cap'n Proto (because the size of an object is known when it is allocated).
The final difference we will go through is how unset fields (i.e. fields with no value for a specific message) are stored on the wire. Protobuf and FlatBuffers do not allocate them, while Cap'n Proto still does, which leads to wasted space for Cap'n Proto.
Benchmarks
For these benchmarks, we translated the BinExport Protobuf schema into a FlatBuffers and a Cap'n Proto schema. The translation was done manually for Cap'n Proto and using the --proto option of flatc for FlatBuffers (plus some minor revisions). We do not pretend to have fully optimized the new schemas using all the features of the two serialization formats, but we believe this still leads to an informative comparison.
First, we want to compare how big the exported files are compared to the binaries themselves. The size of the binary itself is represented by the dashed line, which is linear.
We see that the size of the exported file grows non-linearly with the size of the binary. The following graph shows the ratio between the size of the exported file and the size of the binary.
We see that Protobuf is much more compact than the two others (the encoding step is crucial here) and that the ratio skyrockets for specific binaries. There is still room for improvement in the export size for the two other protocols, mostly by having a better understanding of the ranges of the different values. With Protobuf, one may declare every integer as a 64-bit integer: the serialization algorithm will only write the varint-encoded value of the number on the wire (a reduction of up to a factor of 8 for values up to 127). With Cap'n Proto and FlatBuffers, however, the value will occupy 64 bits anyway.
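To make the varint point concrete, the sketch below computes how many bytes Protobuf needs for an unsigned integer: 7 bits of payload per byte, with the high bit flagging that more bytes follow.

```python
# Size of a Protobuf varint for an unsigned integer value.
def varint_size(value: int) -> int:
    size = 1
    while value >= 0x80:
        value >>= 7
        size += 1
    return size

# A field declared as a 64-bit integer still takes a single byte on the wire
# for values below 128, versus a fixed 8 bytes with Cap'n Proto/FlatBuffers.
for v in (0, 127, 128, 2**63 - 1):
    print(v, "->", varint_size(v), "byte(s)")
```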
Another interesting point to study is how much memory is used when loading the serialized file in Python (memory usage was measured with the memory_profiler [8] module).
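The measurement itself can be done as sketched below with memory_usage from memory_profiler; the loader function and file name are placeholders for the per-format deserialization code.

```python
# Sketch of the memory measurement (file name and loader are placeholders).
from memory_profiler import memory_usage

def load_export(path):
    # Stand-in for the actual Protobuf / FlatBuffers / Cap'n Proto loading code.
    with open(path, "rb") as f:
        return f.read()

samples = memory_usage((load_export, ("llvm-opt.BinExport",)), interval=0.1)
print(f"peak memory: {max(samples):.1f} MiB")
```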
As expected, the memory needed to load the export of a binary is much higher for Protobuf. For example, for llvm-opt, the Protobuf file is around 150 MB but loading it takes around 1.8 GB of RAM.
The last metric we want to consider is how much time is needed to load an export file in Python for each of the three file formats.
As expected, the Protobuf format takes a lot of time to deserialize. Cap'n Proto and FlatBuffers have similar performance, mostly because they are based on the same design principles.
Note
We could have reduced the size of the exported file for Cap'n Proto by applying its 'packed' encoding. However, this removes the interesting 'zero-copy' property of the protocol. More experiments are needed to understand whether this would be a better option than Protobuf.
Compressing the exported file using well-known algorithms could also be a viable strategy for Cap'n Proto and FlatBuffers, as it would reduce the size of the exported file. However, this option adds some time upfront, as the file must be decompressed before use. It is not really applicable to Protobuf, because the format is already compact.
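As a rough illustration of this trade-off, the sketch below compresses an already serialized export with zlib and times the decompression that would have to happen before any zero-copy access; the file name is hypothetical.

```python
# Sketch of compressing a Cap'n Proto / FlatBuffers export (hypothetical file).
import time
import zlib

with open("llvm-opt.capnp.bin", "rb") as f:
    raw = f.read()

compressed = zlib.compress(raw, level=6)
print(f"compressed/original ratio: {len(compressed) / len(raw):.2f}")

start = time.perf_counter()
zlib.decompress(compressed)
print(f"decompression time: {time.perf_counter() - start:.2f}s")
```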
Conclusion
Exporting as much data as possible from a binary is interesting not in itself, but as a basis for other applications: feature extraction for machine learning algorithms, graph traversal algorithms, or fast access to functions / blocks / instructions based on user-defined criteria.
This blog post explored different options to export a disassembled program from a disassembler using the available exporters. To the best of our knowledge, the most complete exporter available is BinExport, as it exports a lot of information while remaining compact thanks to its serialization format, Protobuf. Nonetheless, there is still room for improvement for binary exporters, as none of the explored solutions answered all of our scalability needs.
Changelog
09.25.19: Updated the results for radare2 using the latest version (from 3.2.1 to 4.0.0).
10.30.19: Added the results for JEB, a disassembler (and decompiler) by PNF Software.
[1] | If any mistake were to be found, do not hesitate to contact us. |
[2] | A stub (a SimProcedure) in angr is a helper function written to emulate an external function (e.g. a library function). |
[3] | We used a derivative of the script found here: https://github.com/cea-sec/miasm/blob/master/example/disasm/full.py |
[4] | https://blogs.grammatech.com/open-source-tools-for-binary-analysis-and-rewriting |
[5] | Although none of the programs were chosen because of any inner specificity, the dataset is available upon request (e.g. if one wants to benchmark another disassembler/exporter). |
[6] | The graphs presented are interactive: it is possible to zoom on parts of the graph, to change the scale factors or to hover points to have the precise values. |
[7] | https://en.wikipedia.org/wiki/Comparison_of_data-serialization_formats |
[8] | https://github.com/pythonprofilers/memory_profiler |