Our framework takes an input directory, recursively indexes all of its files, runs a series of plugins to identify forensic artifacts, and then writes a series of reports into a specified output directory. The idea is that the examiner could mount a .E01 or .dd file using a tool such as FTK Imager and then run the framework against the mounted directory.
The layout of a framework is an important first step in achieving a simple design. We recommend placing writers and plugins in appropriately labeled subdirectories under the framework controller. Our framework is laid out in the following manner:
|-- framework.py
|-- requirements.txt
|-- plugins
    |-- __init__.py
    |-- exif.py
    |-- id3.py
    |-- office.py
    |-- pst_indexer.py
    |-- setupapi.py
    |-- userassist.py
    |-- wal_crawler.py
    |-- helper
        |-- __init__.py
        |-- utility.py
        |-- usb_lookup.py
|-- writers
    |-- __init__.py
    |-- csv_writer.py
    |-- xlsx_writer.py
    |-- kml_writer.py
Our framework.py script contains the main logic of our framework, handling the input and output values for all of our plugins. The requirements.txt file lists the third-party modules used by the framework, one per line. In this format, we can use the file with pip to install all of the listed modules. pip attempts to install the latest version of a module unless a version is specified immediately after the module name and two equals signs (for example, colorama==0.4.1). We can install third-party modules from our requirements.txt file using the following command:
pip install -r requirements.txt
The plugins and writers are stored in their own respective directories, each with an __init__.py file so that Python treats the directory as an importable package. Within the plugins directory are the seven initial plugins our framework will support. The plugins we'll include are as follows:
- The EXIF, ID3, and Office embedded metadata parsers from Chapter 8, The Media Age
- The PST parser from Chapter 11, Parsing Outlook PST Containers
- The Setupapi parser from Chapter 3, Parsing Text Files
- The UserAssist parser from Chapter 6, Extracting Artifacts from Binary Files
- The WAL file parser from Chapter 12, Recovering Transient Database Records
There's also a helper directory containing some helper scripts that are required by some of the plugins. There are currently three supported output formats for our framework: CSV, XLSX, and KML. Only the exif plugin will make use of kml_writer to create a Google Earth map with plotted EXIF GPS data, as we saw in Chapter 8, The Media Age.
Now that we understand the how, why, and layout of our framework, let's dig into some code. On lines 2 through 11, we import the modules we plan to use. Note that this is only the list of modules that are required in this immediate script. It doesn't include the dependencies required by the various plugins. Plugin-specific imports are made in their respective scripts.
Most of these imports should look familiar from the previous chapters, with the exception of the new additions of colorama and pyfiglet. On lines 7 and 8, we import our plugins and writers subdirectories, which contain the scripts for our plugins and writers. The colorama.init() call on line 13 is a prerequisite that allows us to print colored text to the Windows Command Prompt:
002 from __future__ import print_function
003 import os
004 import sys
005 import logging
006 import argparse
007 import plugins
008 import writers
009 import colorama
010 from datetime import datetime
011 from pyfiglet import Figlet
012
013 colorama.init()
On line 49, we define our Framework class. This class will contain a variety of methods, all of which handle the initialization and execution of the framework. The run() method acts as our typical main function and calls the _list_files() and _run_plugins() methods. The _list_files() method walks through files in the user-supplied directory and, based upon the name or extension, adds the file to a plugin-specific processing list. Then, the _run_plugins() method takes these lists and executes each plugin, stores the results, and calls the appropriate writer:
049 class Framework(object):
...
051     def __init__():
...
061     def run():
...
074     def _list_files():
...
115     def _run_plugins():
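The file-sorting step described above can be sketched as follows. This is a minimal illustration, not the framework's actual _list_files() method: the function name list_files, the EXTENSION_MAP and NAME_MAP tables, and the specific extensions and filenames in them are assumptions made for the example.

```python
import os
from collections import defaultdict

# Hypothetical lookup tables: which plugin handles which extension or filename.
EXTENSION_MAP = {
    '.jpg': 'exif', '.jpeg': 'exif',
    '.mp3': 'id3',
    '.docx': 'office', '.xlsx': 'office', '.pptx': 'office',
    '.pst': 'pst',
}
NAME_MAP = {
    'setupapi.dev.log': 'setupapi',
    'ntuser.dat': 'userassist',
}


def list_files(input_dir):
    """Walk input_dir and group file paths by the plugin that should process them."""
    processing_lists = defaultdict(list)
    for root, _dirs, files in os.walk(input_dir):
        for name in files:
            path = os.path.join(root, name)
            lowered = name.lower()
            ext = os.path.splitext(lowered)[1]
            if lowered in NAME_MAP:            # match on the full filename first
                processing_lists[NAME_MAP[lowered]].append(path)
            elif ext in EXTENSION_MAP:         # then fall back to the extension
                processing_lists[EXTENSION_MAP[ext]].append(path)
            elif lowered.endswith('-wal'):     # SQLite write-ahead log files
                processing_lists['wal'].append(path)
    return dict(processing_lists)
```

Files that match none of the rules are skipped, so each plugin only receives candidates it has a chance of parsing.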
Within the Framework class are two nested classes: Plugin and Writer. The Plugin class is responsible for actually running the plugin, logging when it completes, and sending data to be written. Its run() method executes the plugin's function for every file in the plugin's processing list. It appends the returned data to a list stored under a key in a dictionary. This dictionary also stores the desired field names for the spreadsheet. The write() method creates the plugin-specific output directory and, based on the type of output specified, makes the appropriate calls to the Writer class:
207     class Plugin(object):
...
209         def __init__():
...
215         def run():
...
236         def write():
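The collection pattern that Plugin.run() follows can be sketched like this. The names run_plugin and fake_parser, the 'data' and 'headers' keys, and the convention that a plugin function returns a (rows, headers) tuple are all hypothetical and chosen only for illustration.

```python
import logging


def run_plugin(name, func, file_list):
    """Call func on every file and collect the rows it returns.

    Assumes (for this sketch only) that func returns a (rows, headers) tuple.
    """
    results = {'data': [], 'headers': None}
    for path in file_list:
        try:
            rows, headers = func(path)
        except Exception as exc:  # one bad file should not stop the plugin
            logging.error('%s failed on %s: %s', name, path, exc)
            continue
        results['data'].append(rows)
        results['headers'] = headers
    logging.info('Plugin %s processed %d file(s)', name, len(results['data']))
    return results


def fake_parser(path):
    """Stand-in for a real plugin entry point."""
    if path.endswith('.bad'):
        raise ValueError('unsupported file')
    return [{'File': path, 'Size': len(path)}], ['File', 'Size']
```

Calling run_plugin('demo', fake_parser, ['a.jpg', 'b.bad', 'cc.jpg']) would log one error for b.bad and return results for the other two files.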
The Writer class is the simplest of the three classes. Its run() method executes the desired writers with the correct input:
258     class Writer(object):
...
260         def __init__():
...
271         def run():
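One way to picture the dispatch that Writer.run() performs is a lookup from output type to writer function. The sketch below is Python 3 only and covers just CSV, using the standard library; the names write_csv, WRITERS, and run_writer are assumptions and do not come from the framework's writers package.

```python
import csv
import os


def write_csv(rows, headers, out_dir, name):
    """Write a list of dict rows to <out_dir>/<name>.csv and return the path."""
    path = os.path.join(out_dir, name + '.csv')
    with open(path, 'w', newline='') as handle:
        writer = csv.DictWriter(handle, fieldnames=headers,
                                extrasaction='ignore')  # drop unlisted keys
        writer.writeheader()
        writer.writerows(rows)
    return path


# Hypothetical registry; an XLSX or KML writer would be added the same way.
WRITERS = {'csv': write_csv}


def run_writer(output_type, rows, headers, out_dir, name):
    """Look up the requested writer and hand it the plugin's results."""
    try:
        writer = WRITERS[output_type]
    except KeyError:
        raise ValueError('Unsupported output type: ' + output_type)
    return writer(rows, headers, out_dir, name)
```

Keeping the writers behind a single lookup means a plugin never needs to know which output formats exist.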
As with all of our scripts, we use argparse to handle command-line switches. On lines 285 and 287, we create two positional arguments for our input and output directories. The two optional arguments on lines 288 and 290 specify XLSX output and the desired log path, respectively:
279 if __name__ == '__main__':
280
281     parser = argparse.ArgumentParser(description=__description__,
282                                      epilog='Developed by ' +
283                                      __author__ + ' on ' +
284                                      __date__)
285     parser.add_argument('INPUT_DIR',
286                         help='Base directory to process.')
287     parser.add_argument('OUTPUT_DIR', help='Output directory.')
288     parser.add_argument('-x', help='Excel output (Default CSV)',
289                         action='store_true')
290     parser.add_argument('-l',
291                         help='File path and name of log file.')
292     args = parser.parse_args()
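To see how these switches map onto attributes of args, we can pass an explicit list to parse_args() instead of relying on the real command line. In this sketch, build_parser is a hypothetical helper, the description string is a placeholder for the script's metadata variables, and the paths are made up.

```python
import argparse


def build_parser():
    """Recreate the framework's four arguments without the metadata strings."""
    parser = argparse.ArgumentParser(description='Forensic framework (sketch)')
    parser.add_argument('INPUT_DIR', help='Base directory to process.')
    parser.add_argument('OUTPUT_DIR', help='Output directory.')
    parser.add_argument('-x', help='Excel output (Default CSV)',
                        action='store_true')
    parser.add_argument('-l', help='File path and name of log file.')
    return parser


# Passing a list to parse_args() simulates a command line.
args = build_parser().parse_args(['/mnt/evidence', 'reports', '-x'])
print(args.INPUT_DIR, args.OUTPUT_DIR, args.x, args.l)
# prints: /mnt/evidence reports True None
```

Because -x uses the store_true action, args.x is False when the switch is omitted, and args.l is None when no log location is given.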
We can see our first use of the colorama library on line 297. If either the supplied input or output path is a file rather than a directory, we print a red error message to the console and exit. For the rest of our framework, we display error messages in red text and success messages in green:
294     if(os.path.isfile(args.INPUT_DIR) or
295             os.path.isfile(args.OUTPUT_DIR)):
296         msg = 'Input and Output arguments must be directories.'
297         print(colorama.Fore.RED + '[-]', msg)
298         sys.exit(1)
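The red-for-errors, green-for-success convention can be wrapped in a small helper. This is only a sketch: the status function is hypothetical, and the fallback to raw ANSI escape codes when colorama isn't installed is an assumption added so the example runs anywhere. Unlike line 297, the helper also appends a reset code so that later output isn't left colored; calling colorama.init(autoreset=True) is another way to get that behavior.

```python
try:
    import colorama
    colorama.init()
    RED, GREEN, RESET = (colorama.Fore.RED, colorama.Fore.GREEN,
                         colorama.Style.RESET_ALL)
except ImportError:
    # Assumption for this sketch: fall back to the raw ANSI codes that
    # colorama's constants hold.
    RED, GREEN, RESET = '\x1b[31m', '\x1b[32m', '\x1b[0m'


def status(msg, ok=True):
    """Return msg prefixed with a green [+] or a red [-] marker."""
    color, tag = (GREEN, '[+]') if ok else (RED, '[-]')
    return '{}{} {}{}'.format(color, tag, msg, RESET)


print(status('Plugin completed'))
print(status('Input and Output arguments must be directories.', ok=False))
```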
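On line 300, we check whether the optional directory path was supplied for the log file. If so, we create the directory and any missing parent directories (if they don't already exist) and store the full path to the log file in the log_path variable. Otherwise, the log is written to framework.log in the current working directory: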
300     if args.l:
301         if not os.path.exists(args.l):
302             os.makedirs(args.l) # create log directory path
303         log_path = os.path.join(args.l, 'framework.log')
304     else:
305         log_path = 'framework.log'
On lines 307 and 309, we create our Framework object and then call its run() method. We pass the following arguments into the Framework constructor to instantiate the object: INPUT_DIR, OUTPUT_DIR, log_path, and excel. In the next section, we inspect the Framework class in greater detail:
307     framework = Framework(args.INPUT_DIR, args.OUTPUT_DIR,
308                           log_path, excel=args.x)
309     framework.run()
The following flow chart highlights how the different methods in the framework.py script interact. Keep in mind that this flow chart only shows interactions within the immediate script and doesn't account for the various plugin, writer, and utility scripts:
![](assets/97d8fbb0-fb28-4318-9f48-7421e0049564.png)