Parsing binary data is an indispensable skill. Inevitably, we are tasked with analyzing artifacts that are unfamiliar or undocumented, and the problem is compounded when the file of interest is binary. Rather than reading a plain-text file, we often need to open our favorite hex editor and begin reverse engineering the file's internal structure. Reverse engineering the underlying logic of binary files is out of scope for this chapter. Instead, we will work with a binary object whose structure is already well known. This allows us to highlight how Python can parse such structures automatically once their layout is understood. In this chapter, we will examine the UserAssist registry key from the NTUSER.DAT registry hive.
This chapter illustrates how to extract Python objects from binary data and automatically generate an Excel report. We will use three modules to accomplish this task: struct, yarp, and xlsxwriter. Although the struct module is part of Python's standard library, both yarp and xlsxwriter must be installed separately. We will cover how to install these modules in their respective sections.
The struct module is used to parse the binary object into Python objects. Once we have parsed the data from the binary object, we can write our findings into a report. In past chapters, we have reported results in CSV or HTML files. In this chapter, we will create an Excel report containing tables and summary charts of the data.
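As a preview of what parsing a binary object into Python objects looks like, here is a minimal sketch using struct. The 16-byte record layout, the field names, and the helper functions are hypothetical and chosen only for illustration; they are not the actual UserAssist structure, which is described later in the chapter. The one fixed fact used is that a Windows FILETIME counts 100-nanosecond intervals since January 1, 1601 (UTC).

```python
import struct
from datetime import datetime, timedelta, timezone

# Hypothetical 16-byte record (NOT the real UserAssist layout):
#   uint32 session ID, uint32 run count, uint64 FILETIME timestamp
# "<" means little-endian with no alignment padding.
RECORD_FORMAT = "<IIQ"


def filetime_to_datetime(filetime):
    """Convert a FILETIME (100 ns ticks since 1601-01-01 UTC) to a datetime."""
    epoch = datetime(1601, 1, 1, tzinfo=timezone.utc)
    return epoch + timedelta(microseconds=filetime // 10)


def parse_record(raw):
    """Unpack one hypothetical record into a dictionary of Python objects."""
    session_id, run_count, filetime = struct.unpack(RECORD_FORMAT, raw)
    return {
        "session": session_id,
        "count": run_count,
        "last_run": filetime_to_datetime(filetime),
    }


# Build a sample record so the sketch runs without a registry hive.
sample = struct.pack(RECORD_FORMAT, 1, 5, 116444736000000000)
record = parse_record(sample)
print(record)
```

The format string does most of the work: each character maps a fixed number of bytes to one Python value, so changing the layout only means changing the string and the names it unpacks into.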
In this chapter, we will cover the following topics:
- Understanding the UserAssist artifact and its binary structure
- An introduction to ROT-13 encoding and decoding
- Installing and manipulating registry files with the yarp module
- Using struct to extract Python objects from binary data
- Creating worksheets, tables, and charts using xlsxwriter
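To give a feel for the ROT-13 topic listed above, the following short sketch uses the rot_13 codec from Python's standard library. The rot13 helper name is our own, and this is only one possible approach, not necessarily the one developed later in the chapter. Because ROT-13 shifts each letter by 13 places in a 26-letter alphabet, applying it twice returns the original text.

```python
import codecs


def rot13(text):
    """Shift each ASCII letter by 13 places; other characters are unchanged."""
    return codecs.encode(text, "rot_13")


encoded = rot13("Hello, World!")
print(encoded)         # Uryyb, Jbeyq!
print(rot13(encoded))  # Hello, World!
```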