Writing output with the csv_writer() function

The csv_writer() function is similar to most of our previous CSV writers, but the complexity of the data being written calls for a few special considerations. We also write only some of the parsed data to the file and discard the rest; dumping everything to a serialized format, such as JSON, is left to the reader as a challenge. As with our other CSV writers, we create a list of headers, open the output file as csvfile, create our writer object, and write the headers to the first row:

371 def csv_writer(data, output_dir):
372     """
373     The csv_writer function writes frame, cell, and data to a CSV
374     output file.
375     :param data: The dictionary containing the parsed WAL file.
376     :param output_dir: The directory to write the CSV report to.
377     :return: Nothing.
378     """
379     headers = ['Frame', 'Salt-1', 'Salt-2', 'Frame Offset',
380         'Cell', 'Cell Offset', 'ROWID', 'Data']
381 
382     out_file = os.path.join(output_dir, 'wal_crawler.csv')
383 
384     if sys.version_info[0] == 2:
385         csvfile = open(out_file, "wb")
386     elif sys.version_info[0] == 3:
387         csvfile = open(out_file, "w", newline='',
388             encoding='utf-8')
389 
390     with csvfile:
391         writer = csv.writer(csvfile)
392         writer.writerow(headers)
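
The version check on lines 384 to 388 exists because the csv module expects the file to be opened differently under each major Python version: binary mode in Python 2, and text mode with newline='' in Python 3, so that the writer controls the line endings itself. The following is a minimal sketch of that logic pulled into a helper. The names open_csv and write_demo are hypothetical and aren't part of the script; they only illustrate the pattern.

```python
import csv
import os
import sys
import tempfile


def open_csv(path):
    """Open path for CSV writing on Python 2 or 3 (hypothetical helper)."""
    if sys.version_info[0] == 2:
        # Python 2: the csv module expects a binary-mode file.
        return open(path, "wb")
    # Python 3: text mode; newline='' lets csv.writer manage line endings.
    return open(path, "w", newline="", encoding="utf-8")


def write_demo(path):
    """Write a header and one illustrative row, then return the raw bytes."""
    with open_csv(path) as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(["Frame", "Cell"])
        writer.writerow([0, 1])
    # Read the file back in binary mode to inspect the exact line endings.
    with open(path, "rb") as raw:
        return raw.read()


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "demo.csv")
    print(write_demo(demo_path))
```

Reading the file back in binary mode makes it easy to confirm that each row ends with a single line terminator and that no blank rows appear between records.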

Because of our nested structure, we need two for loops to iterate through it. On line 399, we check whether the cell actually contains any data. During development, we noticed that empty cells were sometimes generated, so we discard them from the output. However, it might be relevant in a particular investigation to include empty cells, in which case we'd remove the conditional statement:

394         for frame in data['frames']:
395 
396             for cell in data['frames'][frame]['cells']:
397 
398                 # Only write entries for cells that have data.
399                 if ('data' in data['frames'][frame]['cells'][cell].keys() and
400                         len(data['frames'][frame]['cells'][cell]['data']) > 0):

If there is data, we calculate the frame_offset and cell_offset relative to the beginning of the file. The offsets we parsed before were relative to the current position within the file. This relative value wouldn't be very helpful to an examiner who would have to backtrack to find where the relative offset position starts.

For our frame offset, we add the file header size (32 bytes), the total size of the preceding pages (frame * page size), and the total size of the preceding frame headers (frame * 24 bytes). The cell offset is a little simpler: it is the frame offset plus the frame header size plus the parsed cell offset from the wal_attributes dictionary:

401                     # Convert relative frame and cell offsets to
402                     # file offsets.
403                     frame_offset = 32 + (
404                         frame * data['header']['pagesize']) + (
405                         frame * 24)
406                     cell_offset = frame_offset + 24 + data['frames'][frame]['cells'][cell]['offset']
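
To make the arithmetic concrete, here is a small worked sketch of the same calculation. The function and constant names are hypothetical, and it assumes frame numbers are zero-indexed, as the arithmetic in the listing implies. The 4,096-byte page size and the cell offset of 100 are illustrative values only.

```python
WAL_HEADER_SIZE = 32    # bytes in the WAL file header
FRAME_HEADER_SIZE = 24  # bytes in each frame header


def absolute_frame_offset(frame, page_size):
    """Return the file offset at which a zero-indexed frame begins."""
    # Skip the file header, then every earlier frame (header + page).
    return WAL_HEADER_SIZE + frame * (FRAME_HEADER_SIZE + page_size)


def absolute_cell_offset(frame, page_size, relative_cell_offset):
    """Return the file offset of a cell given its parsed relative offset."""
    # The cell's page starts right after the frame's own 24-byte header.
    return (absolute_frame_offset(frame, page_size)
            + FRAME_HEADER_SIZE + relative_cell_offset)


if __name__ == "__main__":
    # Illustrative values: 4096-byte pages, frame 2, cell at offset 100.
    print(absolute_frame_offset(2, 4096))      # 32 + 2 * (24 + 4096) = 8272
    print(absolute_cell_offset(2, 4096, 100))  # 8272 + 24 + 100 = 8396
```

Grouping the frame header and page size into a single per-frame stride gives the same result as the two separate multiplications in the listing.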

Next, we create a list, cell_identifiers, on line 411, which will store the row data to write. This list contains the frame number, salt-1, salt-2, frame offset, cell number, cell offset, and the row ID:

408                     # Cell identifiers include the frame #,
409                     # salt-1, salt-2, frame offset,
410                     # cell #, cell offset, and cell rowID.
411                     cell_identifiers = [frame, data['frames'][frame]['header']['salt1'],
412                         data['frames'][frame]['header']['salt2'],
413                         frame_offset, cell, cell_offset,
414                         data['frames'][frame]['cells'][cell]['rowid']]

Finally, on line 418, we use the CSV writer to write the row, made up of the cell identifiers followed by the payload data:

416                     # Write the cell_identifiers and actual data
417                     # within the cell
418                     writer.writerow(
419                         cell_identifiers + data['frames'][frame]['cells'][cell]['data'])
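
The nested iteration and row assembly can be tried in isolation with a small stand-in for the parsed WAL data. The following is a sketch under stated assumptions, not the script's actual code: the sample dictionary and the build_rows name are hypothetical, the payload is assumed to be a list, and the example targets Python 3 because it writes to an in-memory io.StringIO buffer.

```python
import csv
import io

# Hypothetical parsed structure, shaped like the data dictionary above.
sample = {
    "header": {"pagesize": 4096},
    "frames": {
        0: {
            "header": {"salt1": 111, "salt2": 222},
            "cells": {
                0: {"offset": 100, "rowid": 1, "data": ["alice", 30]},
                1: {"offset": 200, "rowid": 2, "data": []},  # empty payload
                2: {"offset": 300, "rowid": 3},  # no 'data' key at all
            },
        },
    },
}


def build_rows(data):
    """Return one CSV row per cell that carries payload data."""
    rows = []
    for frame, frame_info in data["frames"].items():
        # File header (32) plus every earlier frame header and page.
        frame_offset = 32 + frame * (data["header"]["pagesize"] + 24)
        for cell, cell_info in frame_info["cells"].items():
            payload = cell_info.get("data")
            if not payload:
                continue  # skip cells with a missing or empty payload
            identifiers = [
                frame, frame_info["header"]["salt1"],
                frame_info["header"]["salt2"], frame_offset, cell,
                frame_offset + 24 + cell_info["offset"],
                cell_info["rowid"],
            ]
            rows.append(identifiers + payload)
    return rows


if __name__ == "__main__":
    buffer = io.StringIO()
    csv.writer(buffer).writerows(build_rows(sample))
    print(buffer.getvalue())
```

Binding frame_info and cell_info once per iteration avoids repeating the long data['frames'][frame]['cells'][cell] lookups, which may make the filtering logic easier to follow.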

If the cell has no payload, the else branch executes continue and we proceed to the next cell. Once the outer for loop finishes executing, that is, once all frames have been written to the CSV, we flush any remaining buffered content to the file. The flush has to happen inside the with block: the with statement closes the file handle on exit, so no explicit close() call is needed, and calling flush() on an already closed file raises a ValueError:

421                 else:
422                     continue
423 
424         csvfile.flush()

An example of the CSV output that might be generated from a WAL file is captured in the following screenshot: