An Experimental Study of Different Binary Exporters

Posted Tue 24 September 2019
Authors Robin David, Alexis Challande
Category Program Analysis
Tags reverse-engineering, serialization, data analysis, program analysis, 2019

This blog post presents a comparison between various disassembled binary exporters.

Disclaimer

All the tools presented in this blog post have been tested in accordance with the knowledge we had of them. We do not claim at all that our results are an accurate view of the state of the tools, and we probably missed features we did not know about. The figures should be seen as indicators and not as ground truth.

Introduction

Analyzing binary programs often requires disassembling them. The two best-known tools for this task are IDA Pro and the more recent Ghidra, released by the NSA. Powerful as they are, these tools are ill-suited to running custom analyses on a disassembled binary, or on several binaries at the same time. If the disassembler is no longer needed, why keep it open and running in the background? Doing so is costly, as each instance may consume up to a few hundred megabytes of RAM. The only necessary element is an export of the disassembled binary, and this blog post presents an overview of the different exporters and disassemblers available.

For the rest of this article, an export of a binary is defined as a file which stores various information about the program. This information ranges from metadata (format, architecture, compiler identification) to more specific elements of the disassembled code itself (instructions, mnemonics) and intelligence gathered by the disassembler (cross-references, symbols).

Disassemblers review

Overview

The first step to export a disassembled binary is to disassemble it. Numerous tools exist for this task, the best known being IDA, a commercial tool by HexRays. Over the last few years, other tools have been released (Binary Ninja, radare, Ghidra) with different ranges of features and prices. This blog post does not conduct a complete review of all the existing tools, nor does it pretend to, as that would be a slippery exercise. We still wanted to have a more informed opinion on the different options.

The table below lists some important tools to disassemble a complete binary and some of their features.

Tool name	Authors	OSS	Lang	Bindings	Exporters	Arch: i386	Arch: ARM	Arch: Mips	Arch: PowerPC
angr	UCSB		Python	-	n/c
Ghidra	NSA		Java	Python	XML, BinExport
IDA Pro	HexRays		n/c	C, Python	BinExport
BAP	CMU		OCaml	Rust, Python, C	n/c		(32)	(32)	(32)
ddisasm	GrammaTech		C++	Bash	In Protobuf	(64)	(64)
Macaw	GaloisInc		Haskell	-	n/c		(32)
radare2	pancake (and community)		C	Python	n/c
Miasm	CEA-SEC		Python	-	n/c
Binary Ninja	Vector 35		n/c	C++, Python	JSON
JEB	PNF Software		n/c	Java, Python	JSON, C

As seen in the table above, where only active projects are listed, there is a broad range of tools available. It was not possible to compare all the binary disassembly tools, as our time was limited. We thus elected not to include the following:

Binary Ninja: we had no license for the tool;
McSema: it relies on IDA to perform the disassembling;
BAP: the Python bindings use a client/server model that is not really practical for our needs;
Pharos: tuned to be used for C++ disassembly;
Macaw: supports a limited set of architectures.

Note: Even if these tools have been left aside because they did not seem to fit our needs, they are nice pieces of engineering. We still encourage everyone to have a look at them. [1]

Binaries

To test the performance of the disassemblers, three programs were used, one for each size category (small, medium and large). The selected binaries are:

Small: elf-Linux-x64-bash (~900KB) ELF file for x86-64 (source: Linux)
Medium: delta_generator (17MB) ELF file for x86 (source: Android Open Source Project)
Large: llvm-opt (34MB) ELF file for x86-64 (source: LLVM)

These programs were selected at random from programs available on our computers at the time of the tests. They are not supposed to have any outstanding features, just regular programs coming from widely used open-source projects.

Disassembly results

All tests were run on a Dell XPS 15 with an Intel® Core™ i7-6700HQ CPU @ 2.60GHz, an SSD and 16 GB of RAM, running Debian 10 (Buster).

	small (900KB)			medium (17MB)			large (34MB)
	time	#inst	#funcs	time	#inst	#funcs	time	#inst	#funcs
angr (8.19.7.25)	39.20s	139,607	8,470
Ghidra (9.1-dev)	23s	132,463	2,281	3m9s	213,584	3,073	29m41s	4,935,687	31,942
IDA (7.2)	4.22s	133,072	2,254	8.33s	285,478	2,005	4m41s	4,960,290	31,924
ddisasm	5.40s	31,153	1,194	44.57s	65,549	689	9m43s	1,946,306	24,696
radare2 (4.0.0)	15.72s	95,744	374	30.3s	19,502	76	12m19s	535,044	2,090
Miasm (0.1.1)	3m48s	54,334	2	2m30s	60,650	2	2h2m	395,580	2
JEB (3.7.0)	28.34s	132,628	2,809	51.58s	284,936	2,323	18m43s	4,963,729	51,901

Some notes on the table:

These disassembly figures should be handled with care as there are no ground-truth results (in terms of instruction and function counts). Nonetheless, the IDA and Ghidra results can be considered a close approximation of the right results.
angr is a complex tool which performs control flow analysis through code emulation to correctly disassemble a binary. However, this requires some stubs [2] to be written to perform the emulation. While a lot of the necessary stubs are already available, some are still missing (hence the errors shown). To our knowledge, there is no method in angr to disassemble a binary without generating a CFG (see #1116).
Miasm needs the entry points of a program to start disassembling. It performs the disassembly from one entry point recursively until no instruction is found. The two functions found by the tool are actually the two entry points we specified (_start and main) [3].
The x86_64 version of the delta_generator program was used for ddisasm because x86 is not (yet) supported by the tool.
JEB uses a slightly broader notion of functions where every instruction is assigned to a function (while IDA leaves orphaned instructions). New functions are also created for exception handlers and per switch target if it does not succeed in reconstructing it. These are factors which explain the difference in function numbers. In terms of retrieved instruction numbers it performs as well as IDA and Ghidra while being more similar to Ghidra in terms of speed.
Both IDA and Ghidra stand out for the number of instructions and functions retrieved, with a clear advantage to IDA for its disassembly speed. These results are not surprising since IDA and Ghidra are huge players with decades of experience.
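The timing columns are easier to compare once converted to seconds. The following is a minimal sketch that does this for the IDA and Ghidra rows only. The helper names (`to_seconds`, `slowdown`) are illustrative assumptions of ours, and the durations are simply copied from the table above.

```python
import re

# Durations copied from the table above (IDA 7.2 and Ghidra 9.1-dev rows).
TIMES = {
    "small": {"IDA": "4.22s", "Ghidra": "23s"},
    "medium": {"IDA": "8.33s", "Ghidra": "3m9s"},
    "large": {"IDA": "4m41s", "Ghidra": "29m41s"},
}

# Optional hours, minutes and (possibly fractional) seconds, in that order.
_DURATION = re.compile(r"(?:(\d+)h)?(?:(\d+)m)?(?:(\d+(?:\.\d+)?)s)?")


def to_seconds(text):
    """Convert a duration such as '3m9s', '2h2m' or '4.22s' to seconds."""
    match = _DURATION.fullmatch(text)
    if not text or match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    hours, minutes, seconds = match.groups()
    return int(hours or 0) * 3600 + int(minutes or 0) * 60 + float(seconds or 0)


def slowdown(times):
    """Ratio of Ghidra's analysis time to IDA's, per size category."""
    return {
        size: to_seconds(row["Ghidra"]) / to_seconds(row["IDA"])
        for size, row in times.items()
    }


if __name__ == "__main__":
    for size, ratio in slowdown(TIMES).items():
        print(f"{size}: Ghidra took {ratio:.1f}x as long as IDA")
```

On these three binaries, Ghidra's analysis takes roughly 5 to 23 times as long as IDA's, which is only an indicator given the small sample.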

Exporters

The next step is to export the disassembled program into a standalone file. The goal is to close the disassembler after the initial disassembly step, as its features are no longer needed.

Overview

The list of exporters available for the tools tested in the first section is shown below.

Disassembler	Exporter	Project
IDA	BinExport	Exporter from Zynamics (mainly used for BinDiff)
	Ghidra-IDA	Ghidra's plugin to export a project from IDA to Ghidra
	McSema	Exporter for McSema lifter
Ghidra	BinExport-Ghidra	Experimental port of BinExport by C. Blichmann
Ghidra	Ghidra	Built-in exporter from Ghidra
ddisasm	ddisasm	GrammaTech in-house tool to export a binary

Ghidra provides an IDA plugin to generate an XML file (and a raw data file) so the user can import them in Ghidra.
The widely used tool BinDiff uses BinExport, a Protobuf generated file, exported from IDA as a basis to perform its diffing. One of the authors of BinExport has started a port of the exporting feature on Ghidra (the proof-of-concept is available in his personal project on GitHub and worked really nicely so far).
ddisasm is able to parse a binary file and export a lot of information via a Protobuf file. The ultimate goal of the toolchain developed by GrammaTech is to do binary rewriting [4]. As a consequence, the exported features focus on the sole information useful for this task. This represents only a subset of all the available information.

Ignored exporters

We also found other exporters that were left out of this study:

Diaphora: The tool exports a binary to a SQLite database and is written in Python. A preliminary study has shown that the SQLite file is much larger (around 4-6 times) than the i64 and thus not compact enough for our needs.
YaCo: Plugin developed by the Direction Générale de l'Armement (DGA) for the YaTools suite. It does not export any information below the granularity of basic blocks (and only a hash of them). However, it is worth noting as it is the only tool generating a FlatBuffers file.
bnida: Plugin used to port a project from IDA to Binary Ninja. It exports to a JSON file and is written in Python. It does not export any data on the content of the functions (just their names and addresses) nor anything below this granularity.
JEB: JEB has a built-in exporter that exports the disassembled (and decompiled) code as C-code in files. While interesting, this approach is not really suitable for our purposes.

Exporter features

The table below details the various information exported by the different exporters selected. The results were gathered by analyzing the description of the protocol and actual exported files.

Note

To improve readability, explanations for ambiguous results (orange tildes) are provided as tooltips.

		Exporters
		BinExport	McSema	ddisasm	Ghidra-XML
Metadata	Name
	Arch
	ISA
	Compiler
Layout	Segments
Layout	Code layout
Symbols	Name
	Value
	Type
Data	Address
	Type
	Size
	Name
Graph	Call graph
Graph	CFG
Comments	Address
	Type
	Content
Functions	Name
	Demangled name
	Type				(I: , G: )
	Argument count
Instructions	Mnemonic
	Operand
	Operand type
	Bytes
	Address
	Expressions
	Xref (code, data)	(, )
Basic block	Start address
	End address			(size)
	Instructions list
	Content			(indirect)
Strings	Address				(data)
Strings	Content				(data)
Data types	Structure
Data types	Enumerations

Important notes:

The goals of the different exporters are not identical, so they do not export the same type of information from a binary. While BinExport was designed to be part of a diffing engine, ddisasm was designed to be part of a binary rewriting toolchain.
In the Ghidra-XML column, when the exported information varies between the IDA and Ghidra implementations, those differences are noted.

Two main strategies exist for exporters. The first one is to export disassembled instructions with information on their content (mnemonic, operands, expressions inside the operands). With this strategy, the export is self-contained and no other tool is required to analyze it. The second strategy is to export only the raw bytes of the instructions and leave the remaining disassembly work to another disassembler (e.g. Capstone). An export using this strategy is more compact, at the price of needing a helper tool to understand its content. The choice of strategy depends on the final objective of the tool. It makes sense for Ghidra not to export disassembled instructions because it has its own disassembler, and for BinExport to export everything because BinDiff should be autonomous (and as fast as possible).
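To make the two strategies concrete, here is a toy sketch that stores the same three x86-64 instructions both ways. The class names, the addresses and the use of JSON are assumptions made purely for illustration. Neither BinExport nor Ghidra uses this layout.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class DecodedInstruction:
    """Strategy 1: the export carries the decoded form (self-contained)."""
    address: int
    mnemonic: str
    operands: list


@dataclass
class RawInstruction:
    """Strategy 2: the export carries only the bytes (needs a disassembler)."""
    address: int
    raw: bytes


# push rbp ; mov rbp, rsp ; ret  (addresses are hypothetical)
DECODED = [
    DecodedInstruction(0x1000, "push", ["rbp"]),
    DecodedInstruction(0x1001, "mov", ["rbp", "rsp"]),
    DecodedInstruction(0x1004, "ret", []),
]
RAW = [
    RawInstruction(0x1000, bytes.fromhex("55")),
    RawInstruction(0x1001, bytes.fromhex("4889e5")),
    RawInstruction(0x1004, bytes.fromhex("c3")),
]


def dump_decoded(instructions):
    """Serialize the decoded form; a reader needs nothing else to use it."""
    return json.dumps([asdict(i) for i in instructions])


def dump_raw(instructions):
    """Serialize addresses and bytes only; decoding is left to another tool."""
    return json.dumps([{"address": i.address, "raw": i.raw.hex()} for i in instructions])


if __name__ == "__main__":
    print(len(dump_decoded(DECODED)), "characters with decoded instructions")
    print(len(dump_raw(RAW)), "characters with raw bytes only")
```

The raw form is shorter, but a consumer has to run the bytes through a disassembler before it can reason about mnemonics or operands.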

Full benchmark

This section compares in more detail the exporters found for IDA and Ghidra. The results of the first section of this article confirmed our choice to consider only those two disassemblers, as they were the most accurate.

We are also interested in comparing the performance of the built-in exporter of Ghidra against the plugin it offers for IDA. However, we chose not to include the experimental port of BinExport for Ghidra because it is still a work in progress and its performance is below that of the IDA version while exporting the same features.

Dataset

For the rest of the benchmarks, we gathered a dataset of various binaries coming from different sources. While our dataset is not exhaustive, it tries to mimic the diversity of programs a reverser could encounter. It gathers binaries of various architectures, file formats, sizes and bitness. The sources used are listed below [5]:

binary-samples: A test suite for binary analysis tools made by Jonathan Salwan
AOSP (Android Open Source Project): An open source operating system for mobile devices
LLVM: The compiler infrastructure project

Binary Name	md5sum	Architecture	Format	Binary size
x64_delta_generator	8ad5f84d44b73289aa863c44aa7619e9	x86_64	ELF	15.28 MB
elf-Linux-x64-bash	9a99d4a76f3f773f7ab5e9e3e482c213	x86_64	ELF	904.82 KB
pe-Windows-x64-cmd	5746bd7e255dd6a8afa06f7c42c1ba41	x86_64	PE	337.00 KB
elf-Linux-lib-x64.so	89a9ff6d56c3ad2ef9a185a17ef9f658	x86_64	ELF	1.09 MB
busybox-mips	b55e00aa275948e6aea776028088c746	MIPS-32	ELF	352.48 KB
clang-check	4a3aec55b02c6b3fec39d0cdaaca483e	x86_64	ELF	46.83 MB
elf-Linux-ARMv7-ls	de9f91f9cd038989fec8abf25031b42b	armv7	ELF	88.68 KB
MachO-OSX-x86-ls	df2580eaf51e15e23de3db979992af1e	x86	MachO	34.86 KB
ts3server	3c5c3e83dca78b4602148ce8643521e2	x86_64	ELF	7.73 MB
busybox-powerpc	bcfd1ebe98bf3519c3f2c9c14e0f9cf9	PPC-32	ELF	1.10 MB
dex38.dex	0acbdd5244d0726d0cbfb2d45d2f95a8	-	DEX	11.48 KB
MachO-OSX-x64-ls	d174dcfb35c14d5fcaa086d2c864ae61	x86_64	MachO	38.66 KB
pe-Windows-x86-cmd	e52110456ec302786585656f220405eb	x86	PE	294.50 KB
classes.dex	e62eaf49283093501e7c7cbe9743a0f7	-	DEX	3.53 MB
wpa_supplicant	aa782fa15d1265b0d8cfc00b6f883187	x86	ELF	21.64 MB
ctags	48644ed9bbb64c22ee538cbe99481f21	x86_64	ELF	4.59 MB
crackmips	9416c32035cf2f2da41876e1c9411850	MIPS-32	ELF	25.54 KB
llvm-opt	f0d325ba8ebbe72aad180c8cab6de09c	x86_64	ELF	33.83 MB
elf-Linux-x86-bash	b5bfc5bc405340bcc5050756ac92cf45	x86	ELF	792.14 KB
delta_generator	c2bd1c45f4647932e85561a42e0cbbb4	x86	ELF	16.49 MB
mdbook	9c405c56cf9c05e0a25766f6639cd5ca	x86_64	ELF	10.67 MB
elf-Linux-ARM64-bash	086f3ad932f5b1bcf631b17b33b0bb0a	armv8	ELF	827.54 KB
elf-Linux-lib-x86.so	df9fd3ec63ac207b9fa193b8dcea7eb7	x86	ELF	1.08 MB
elf-Linux-Mips4-bash	628f094cff8ec9d9e36c5b94460c7454	MIPS-32	ELF	882.38 KB
MachO-iOS-armv7-armv7s-arm64-Helloworld	750338e86da4e5c8c318b885ba341d82	armv7, armv8	MachO	299.06 KB
MachO-iOS-armv7s-Helloworld	5ae2549bda51d826a51e97c03fb06f73	armv7	MachO	89.64 KB

The graph above shows the number of instructions per program in the dataset. While most of our test suite is made of programs with fewer than a million instructions, a few large binaries were also included to better understand how the exporters and disassemblers scale. As we need to plot large ranges of values in the same graph, most of the curves look flat for the first points. [6]

Disassembly time

The first metric we were interested in is the disassembly time, defined as the duration of the automatic analysis. We knew that IDA was faster than Ghidra, but we wanted to measure to what extent.

The results are striking: Ghidra is much slower than IDA (up to 13 times slower for large binaries). Even if disassembly is a one-time process, the performance of Ghidra is problematic for scalability. Nevertheless, it should be noted that the results are biased, because Ghidra performs an additional decompilation step.

Export time and size

The first section helped us to draw an overview of the available exporters. Another interesting metric is the export time for the following disassembler/exporter pairs:

IDA + BinExport
IDA + Ghidra XML
Ghidra + XML

We chose to keep only those exporters because they run on the disassemblers we selected and have an interesting set of exported features. They also have good support for Ghidra, and BinDiff has been used for years in the community without issues. Note that they use different exporting strategies: Ghidra does not export any information on instructions, while BinExport decomposes every operand of each instruction and exports them.

The export size of a program is far greater than the program itself for both tools. While BinExport produces a single Protobuf file, Ghidra generates two files, one XML with all the information and a raw byte file containing all the code of the exported binary. The figures on the graph represent the sum of the size of these two files.

Program	Size	i64	BinExport	IDA-XML	Ghidra-XML
elf-Linux-x64-bash	908 KB	11 MB	4.2 MB	4.9 MB	7.1 MB
ts3server	7.8 MB	58 MB	20 MB	19 MB	64.8 MB
llvm-opt	34 MB	300 MB	144 MB	127 MB	202 MB

We observe that the size of the export for BinExport and XML is roughly the same. However, BinExport exports a lot more information on the binary than Ghidra. Remember that Ghidra does not export any information on the instructions themselves, nor on the basic blocks besides their contents (i.e. raw bytes). The sizes of the exported files remain equivalent because of optimizations made by BinExport: the format is specifically designed for compactness (e.g. it makes extensive use of deduplication tables) and the export file uses a binary serialization protocol, namely Protobuf. This will be further discussed in the next section.

The table above also includes the size of the database generated by IDA, the i64 file, which is larger than the exported files in every case but one (the Ghidra XML export of ts3server).
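The overhead of each format can be read more easily as a ratio between the export size and the binary size. The sketch below computes it from the figures of the table above. The function name `ratios` is ours, and the conversion of 908 KB to 0.908 MB assumes 1 MB = 1000 KB, which the table does not specify.

```python
# Sizes in MB, copied from the table above. 908 KB is written as 0.908 MB,
# assuming 1 MB = 1000 KB (an assumption, the convention is not stated).
SIZES_MB = {
    "elf-Linux-x64-bash": {
        "binary": 0.908, "i64": 11, "BinExport": 4.2, "IDA-XML": 4.9, "Ghidra-XML": 7.1,
    },
    "ts3server": {
        "binary": 7.8, "i64": 58, "BinExport": 20, "IDA-XML": 19, "Ghidra-XML": 64.8,
    },
    "llvm-opt": {
        "binary": 34, "i64": 300, "BinExport": 144, "IDA-XML": 127, "Ghidra-XML": 202,
    },
}


def ratios(sizes):
    """Return export size divided by binary size, per program and per format."""
    return {
        program: {
            fmt: round(size / row["binary"], 1)
            for fmt, size in row.items()
            if fmt != "binary"
        }
        for program, row in sizes.items()
    }


if __name__ == "__main__":
    for program, row in ratios(SIZES_MB).items():
        print(program, row)
```

For llvm-opt, for instance, the BinExport file is about 4.2 times the size of the binary, and every export in the table is larger than the program it describes.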

Full export

To summarize the results from the previous tests, we plot hereafter a graph explaining the time spent in the three phases of the export process:

Disassembly phase: disassembling the binary
Export phase: generating the export files
Deserialization / Loading phase: Importing the exported file in Python

This graph shows that the deserialization time can become non-negligible with the Protobuf format for large binaries (here mdbook). This observation led us to the next section, which explores various binary serialization formats to find the one most suitable for our needs.

Experiments on binary serialization formats

Introduction

Numerous formats exist [7] for serialization because not all usages (e.g. persistent storage, RPC communication, data transfer) require the same set of features. One may want to have the data stored in a "human-readable" way (i.e. as text), have a fast access time, or a compact storage size. For program serialization, we need a trade-off between compact disk usage, a reasonable deserialization time and a low memory footprint. Since a readable format is not needed and disk usage is a concern, binary serialization formats seemed more appropriate than text formats (e.g. JSON, XML).

Binary serialization formats

In this section, we will focus on three formats used for binary serialization:

Protobuf: A format developed (and extensively used) by Google for serializing structured data.
FlatBuffers: Another format developed by Google to serialize data. Mostly used for performance critical applications.
Cap'n Proto: A format developed by Kenton Varda (tech lead of Protobuf while he was working at Google) for Sandstorm.

All these formats use a custom schema definition language to explain how the data will be formatted on the wire. Even if this blog post does not intend to be a crash course on data serialization, nor a tutorial on how to write a schema for the three protocols, the syntax of a basic message is shown below.

Protobuf:

message Meta {
  optional string executable_name = 1;
  optional string executable_id = 2;
  optional string architecture_name = 3;
  optional int64 timestamp = 4;
}

Cap'n Proto:

struct Meta {
  executableName @0 :Text;
  executableId @1 :Text;
  architectureName @2 :Text;
  timestamp @3 :UInt64;
}

FlatBuffers:

table Meta {
  executable_name:string;
  executable_id:string;
  architecture_name:string;
  timestamp:long;
}

The main difference between these formats is how they store data on the wire. Protobuf, the oldest one, uses an encoding/packing step which transforms the input on the wire. This allows Protobuf to be more compact because the encoding step reduces the number of bytes needed to store an object (see Encoding in the Protobuf documentation). In contrast, both FlatBuffers and Cap'n Proto use a 'zero-copy' strategy, meaning that the data on the wire is structured the same way as it is in memory. The main advantage of this technique is that it eliminates decoding time, because no decoding step is performed.

Another huge difference between FlatBuffers/Cap'n Proto and Protobuf is the ability to perform random access reads (the ability to read a specific part of the message without reading the whole message before). With Protobuf this is not possible because the message needs to be parsed upfront (and memory allocated). However, both FlatBuffers and Cap'n Proto implement this feature using pointers, allowing fast access to part of the message.
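The idea of pointer-based random access can be sketched with a toy layout: a table of offsets at the start of the buffer lets a reader jump straight to one record without decoding the others. This layout and the function names (`build`, `read`) are assumptions made for illustration only. It is not the actual wire format of FlatBuffers or Cap'n Proto.

```python
import struct


def build(records):
    """Toy layout: record count, one 32-bit offset per record, then the payloads."""
    header_size = 4 + 4 * len(records)
    offsets, body = [], bytearray()
    for payload in records:
        # Each offset points at the length prefix of its record.
        offsets.append(header_size + len(body))
        body += struct.pack("<I", len(payload)) + payload
    header = struct.pack(f"<I{len(records)}I", len(records), *offsets)
    return header + bytes(body)


def read(buffer, index):
    """Fetch one record by following its offset; the other records are not decoded."""
    view = memoryview(buffer)
    (count,) = struct.unpack_from("<I", view, 0)
    if not 0 <= index < count:
        raise IndexError(index)
    (offset,) = struct.unpack_from("<I", view, 4 + 4 * index)
    (length,) = struct.unpack_from("<I", view, offset)
    return bytes(view[offset + 4 : offset + 4 + length])


if __name__ == "__main__":
    blob = build([b"_start", b"main", b"memcpy"])
    print(read(blob, 2))
```

Reading the third name only touches the header, one offset and one record, whatever the number of records stored before it. A format that must be parsed upfront has to walk through everything first.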

Allocation (i.e. how a message is written) has to be done bottom-up for FlatBuffers because a message must be finished before another one is started. This limitation does not apply to Protobuf (because the whole message is written at the end) or to Cap'n Proto (because the size of an object is known when it is allocated).

The final difference we will go through is how unset fields (i.e. fields with no value for this specific message) are stored on the wire. Neither Protobuf nor FlatBuffers allocates them, while Cap'n Proto still does. This leads to wasted space for Cap'n Proto.

Benchmarks

For these benchmarks, we translated the BinExport Protobuf into a FlatBuffers and a Cap'n Proto schema. The translation was done manually for Cap'n Proto and using the option --proto of flatc for FlatBuffers (plus some minor revisions). We do not claim to have fully optimized the new schemas using all the features of the two serialization formats, but believe this still leads to an informative comparison.

First, we want to compare how big the exported files are compared to the binaries themselves. This size is represented by the dashed line and is linear ( $y=x$ ).

We see that the size of the exported file grows non-linearly with the size of the binary. The following graph shows the ratio between the size of the exported file and the size of the binary.

We see that Protobuf is much more compact than the two others (the encoding step is crucial for this part) and that the ratio skyrockets for specific binaries. There is still room for improvement in the export size for the two other protocols, mostly by having a better understanding of the ranges of the different values. With Protobuf, one may declare every integer as 64 bits wide: the serialization algorithm will only write the varint-encoded value of the number on the wire (a reduction by a factor of up to 8 for values from 0 to 127, which fit in a single byte). However, with Cap'n Proto and FlatBuffers, the value would occupy 64 bits anyway.
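The size difference comes from the base-128 varint encoding, which can be sketched in a few lines. This is a minimal illustration for unsigned values, not the code of the Protobuf library. The function names `encode_varint` and `fixed64` are ours.

```python
import struct


def encode_varint(value):
    """Base-128 varint: 7 payload bits per byte, high bit set on all but the last byte."""
    if value < 0:
        raise ValueError("this sketch only handles unsigned values")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)  # last byte, high bit clear
            return bytes(out)


def fixed64(value):
    """The same number stored as a fixed-width 64-bit little-endian integer."""
    return struct.pack("<Q", value)


if __name__ == "__main__":
    for number in (1, 127, 128, 300, 2**64 - 1):
        print(number, len(encode_varint(number)), "byte(s) as varint,",
              len(fixed64(number)), "bytes as fixed 64-bit")
```

Small values take one byte instead of eight. The gain is not free for every input: a value close to 2^64 needs ten bytes as a varint, which is more than its fixed-width form.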

Another interesting point to study is how much memory is used for loading the serialized file in Python. (Memory usage was measured with the memory_profiler [8] module.)
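For readers who want to reproduce this kind of measurement without installing anything, a rough approximation is possible with the standard library. The sketch below swaps memory_profiler for `tracemalloc`, so it is not the method used for the figures of this post. The helper name `peak_memory` and the JSON payload are stand-ins chosen for illustration. Note that `tracemalloc` only sees allocations made through Python's allocator, so memory allocated by a native extension may be missed.

```python
import json
import tracemalloc


def peak_memory(loader, payload):
    """Run loader(payload) and return (result, peak traced bytes during the call)."""
    tracemalloc.start()
    try:
        result = loader(payload)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


if __name__ == "__main__":
    # Stand-in for an export file: a JSON list of 100,000 integers.
    payload = json.dumps(list(range(100_000)))
    loaded, peak = peak_memory(json.loads, payload)
    print(f"{len(payload)} bytes on disk, about {peak} bytes at peak once loaded")
```

Swapping `json.loads` for the parsing function of another format gives a comparable figure, as long as the same payload is used for each format.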

As expected, the memory needed to load the export of a binary is much larger for Protobuf. For example, for llvm-opt the Protobuf file is around 150 MB but loading it takes around 1.8 GB of RAM.

The last metric we want to consider is how much time is needed to load an export file in Python for each of the three file formats.

As expected, the Protobuf format takes a lot of time to be deserialized. Cap'n Proto and FlatBuffers have similar performances, mostly because they are based on the same patterns.

Note

We could have reduced the size of the exported file for Cap'n Proto by applying its 'packed' encoding. However, this removes the interesting property of having a 'zero-copy' protocol. More experiments are still needed to understand if this would be a better option than Protobuf.
Compressing the exported file using well-known algorithms could also be a viable strategy for Cap'n Proto and FlatBuffers, as it would reduce the size of the exported file. However, this option adds some time upfront, as the file must be decompressed before it can be used. It is of little use for Protobuf because the format is already compact.
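The compression trade-off can be illustrated with a toy buffer of small integers stored as fixed-width 64-bit values, which mimics the padding found in a zero-copy layout. This is a sketch under that assumption, using zlib from the standard library. The function names and the sample values are ours, and no actual export file is involved.

```python
import struct
import zlib


def fixed_width(values):
    """Store every value as a 64-bit little-endian integer, padding included."""
    return b"".join(struct.pack("<Q", v) for v in values)


def compress(blob, level=6):
    """Compress the serialized buffer before writing it to disk."""
    return zlib.compress(blob, level)


def load(compressed):
    """The upfront cost: the whole buffer is inflated before any field can be read."""
    return zlib.decompress(compressed)


if __name__ == "__main__":
    values = [i % 200 for i in range(10_000)]  # small values, so mostly zero bytes
    raw = fixed_width(values)
    packed = compress(raw)
    assert load(packed) == raw
    print(len(raw), "bytes raw,", len(packed), "bytes compressed")
```

The padding bytes compress very well, but the buffer can no longer be read in place: the decompression step reintroduces a loading cost that the zero-copy formats were chosen to avoid.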

Conclusion

Exporting as much data as possible from a binary is interesting not in itself but as a basis for other applications, such as feature extraction for machine learning algorithms, graph traversal algorithms, or fast access to functions / blocks / instructions based on user-defined criteria.

This blog post explored different options to export a disassembled program from a disassembler using available exporters. To the best of our knowledge, the most complete exporter available is BinExport as it exports a lot of information while remaining compact thanks to the serialization format used, Protobuf. Nonetheless, there is still room for improvement for binary exporters as none of the explored solutions answered all our scalability needs.

Changelog

09.25.19 : Update the results for radare using the last version (from 3.2.1 to 4.0.0)
10.30.19 : Add the results for JEB, a disassembler (and decompiler) by PNFSoftware.

[1]	If any mistake were to be found, do not hesitate to contact us.

[2]	A stub (a `SimProcedure`) in angr is a helper function written to emulate an external function (e.g. a library function).

[3]	We used a derivative of the script found here: https://github.com/cea-sec/miasm/blob/master/example/disasm/full.py

[4]	https://blogs.grammatech.com/open-source-tools-for-binary-analysis-and-rewriting

[5]	Although none of the programs used were chosen for any specific characteristic, the dataset is available upon request (e.g. to benchmark another disassembler/exporter).

[6]	The graphs presented are interactive: it is possible to zoom on parts of the graph, to change the scale factors or to hover points to have the precise values.

[7]	https://en.wikipedia.org/wiki/Comparison_of_data-serialization_formats

[8]	https://github.com/pythonprofilers/memory_profiler
