Structured binary data tools
Sean Bartell
Abstract
Exploring structured binary data is frequently necessary in HelenOS. Hex editors and scripting languages require too much work and don't handle the data in a fully structured way. I will create a portable library and tools to make exploring structured binary data faster and easier. Although some tools already exist, my approach is more general and powerful. I will focus on making my tools highly versatile and making it easy to handle new structures.
Additional Information
Exploring and working with structured binary data is necessary in many different situations in a project like HelenOS. For instance, when implementing a file format or filesystem, it is first necessary to explore preexisting files and disks and learn the low‐level details of the format. Debugging compiled programs and exploring network protocols also require some way of interpreting binary data.
The most basic tool for exploring binary data is the hex editor. Using a hex editor is inefficient and unpleasant because it requires manual calculation of lengths and offsets with constant reference to the data format. General‐purpose scripting languages can be used instead, so a structure can be defined once and decoded as often as necessary. However, even with useful tools like Python’s struct module, the programmer must specify how to read the input data, calculate lengths and offsets, and provide useful output, so there’s much more work involved than simply specifying the format of the data. This extra code will probably be rewritten every time a new script is made, due to slightly differing requirements.
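To illustrate the overhead described above, here is a minimal sketch that uses Python's `struct` module. The format is made up for this example (a 4-byte magic `DEMO`, a little-endian 16-bit record count, then records made of a length-prefixed name and a 32-bit value), and `decode` is a hypothetical helper. Only a few lines describe the format itself; the rest is offset bookkeeping and output handling.

```python
import struct


def decode(data: bytes):
    """Decode a made-up format: 4-byte magic, uint16 count, then records."""
    offset = 0

    # Header: 4-byte magic followed by a little-endian uint16 record count.
    header_fmt = "<4sH"
    magic, count = struct.unpack_from(header_fmt, data, offset)
    offset += struct.calcsize(header_fmt)
    if magic != b"DEMO":
        raise ValueError("bad magic")

    records = []
    for _ in range(count):
        # Each record: uint8 name length, the name bytes, then a uint32 value.
        (name_len,) = struct.unpack_from("<B", data, offset)
        offset += 1
        name = data[offset:offset + name_len].decode("ascii")
        offset += name_len
        (value,) = struct.unpack_from("<I", data, offset)
        offset += 4
        records.append((name, value))
    return magic, records


# Hand-built sample input with two records.
sample = (
    b"DEMO"
    + struct.pack("<H", 2)
    + b"\x03abc" + struct.pack("<I", 7)
    + b"\x02hi" + struct.pack("<I", 1000)
)
print(decode(sample))
```

Every `offset +=` line is a manual calculation that a format specification alone could imply, and a script for a slightly different task would likely repeat all of it.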
I propose to create a powerful library and tools that will make working with structured binary data faster and easier. My project will consist of:
- A core library that manages structured data and provides basic building blocks for interpreting binary data, from which complex format specifications can be built.
- Data providers to access various sources of raw binary data.
- Format providers, which can load and save complex format specifications. In particular, there will be a domain‐specific language for format specifications.
- Clients, programs which use the library to work with binary data. For instance, there will be an interactive browser.
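One way to picture "basic building blocks" that combine into complex format specifications is the sketch below. It is not the proposed library's API: the names `Transform`, `uint`, `record`, and `counted_array` are hypothetical, and the `polygon` format is invented for the example. Each block reads a value at an offset and returns the new offset, so larger formats are built purely by composition.

```python
import struct
from typing import Any, Callable, Tuple

# Hypothetical building-block type: (data, offset) -> (value, new_offset).
Transform = Callable[[bytes, int], Tuple[Any, int]]


def uint(fmt: str) -> Transform:
    """Primitive block: one integer described by a struct format string."""
    size = struct.calcsize(fmt)

    def read(data, offset):
        (value,) = struct.unpack_from(fmt, data, offset)
        return value, offset + size

    return read


def record(*fields: Tuple[str, Transform]) -> Transform:
    """Composite block: named fields read one after another."""

    def read(data, offset):
        result = {}
        for name, transform in fields:
            result[name], offset = transform(data, offset)
        return result, offset

    return read


def counted_array(count: Transform, item: Transform) -> Transform:
    """Composite block: a count followed by that many items."""

    def read(data, offset):
        n, offset = count(data, offset)
        items = []
        for _ in range(n):
            value, offset = item(data, offset)
            items.append(value)
        return items, offset

    return read


# An invented format, assembled only from the blocks above.
u8 = uint("<B")
u16 = uint("<H")
point = record(("x", u16), ("y", u16))
polygon = record(("color", u8), ("points", counted_array(u8, point)))

raw = bytes([5, 2]) + struct.pack("<4H", 1, 2, 3, 4)
value, end = polygon(raw, 0)
print(value, end)
```

In this sketch the format definition (`point`, `polygon`) contains no offset arithmetic at all; a data provider would supply `raw`, and a client such as an interactive browser would walk the resulting structure.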
Similar existing projects, such as QuickBMS and various hex editors, accomplish this in a primitive way by combining scripting and C‐like structures. My approach is more general and powerful; for example, it will be possible to interactively explore an entire filesystem. I will focus on making my tools highly versatile and on making new format specifications easier to create than they are with existing tools.
Periodic updates will be posted to the HelenOS-devel mailing list.
Code samples
| File name | Size | Date submitted |
|---|---|---|
| Sean_Bartell.tar.gz | 58.6 KB | August 25 2012 18:36 UTC |
