The writer – xlsx_writer.py

The xlsx_writer function is a slightly modified version of xlsx_writer, which we created in Chapter 6, Extracting Artifacts from Binary Files. We use the same xlsxwriter third-party module to handle the excel output. On line 32, we use list comprehension to create a list of capitalized alphabetical characters from A to Z. We're going to use this list to designate the column letter based on the supplied headers length. This method works as long as there are less than 26 field names, which for the current set of plugins is true:

001 from __future__ import print_function
002 import xlsxwriter
...
032 ALPHABET = [chr(i) for i in range(ord('A'), ord('Z') + 1)]

On line 44, we create the xlsxwriter workbook and supply the output filename to save it as. Before going any further, we check whether the supplied headers are equal to none. This check is necessary, just as in csv_writer, to avoid writing invalid data from a bad call to the writer. On line 52, we set title_length equal to the letter that the right-most column will be, in case there are more than 26 columns. We've currently set the right-most value to be Z:

035 def writer(output, headers, output_data, **kwargs):
036 """
037 The writer function writes excel output for the framework
038 :param output: the output filename for the excel spreadsheet
039 :param headers: the name of the spreadsheet columns
040 :param output_data: the data to be written to the excel
041 spreadsheet
042 :return: Nothing
043 """
044 wb = xlsxwriter.Workbook(output)
045
046 if headers is None:
047 print('[-] Received empty headers... \n'
048 '[-] Skipping writing output.')
049 return
050
051 if len(headers) <= 26:
052 title_length = ALPHABET[len(headers) - 1]
053 else:
054 title_length = 'Z'

Next, on line 56, we create our worksheet. In a similar fashion to the csv_writer function, if recursion is specified, we loop through the list, adding a worksheet for each additional list to prevent them from writing over each other. We then use list comprehension to quickly order the dictionary values based on the order of the field names. In csv_writer, the writerow method from the DictWriter object orders the data automatically. For xlsx_writer, we need to use list comprehension to recreate that same effect:

056     ws = add_worksheet(wb, title_length)
057
058 if 'recursion' in kwargs.keys():
059 for i, data in enumerate(output_data):
060 if i > 0:
061 ws = add_worksheet(wb, title_length)
062 cell_length = len(data)
063 tmp = []
064 for dictionary in data:
065 tmp.append(
066 [str(dictionary[x]) if x in dictionary.keys() else '' for x in headers]
067 )

On line 69, we create a table from A3 to XY, where X is the alphabet character representing the length of the field names list and Y is the length of the output_data list. For example, if we have a dataset that has six field names and 10 entries, we want our table to span from A3 to F13. In addition, we pass along the ordered data and specify each column using list comprehension once again to specify a dictionary with one key-value pair for each header:

069             ws.add_table(
070 'A3:' + title_length + str(3 + cell_length),
071 {'data': tmp,
072 'columns': [{'header': x} for x in headers]})

On line 74, we handle the scenario where we don't supply the recursion keyword argument. In this case, we handle the same execution minus the additional for loop. Lastly, on line 84, we close the workbook:

074     else:
075 cell_length = len(output_data)
076 tmp = []
077 for data in output_data:
078 tmp.append([str(data[x]) if x in data.keys() else '' for x in headers])
079 ws.add_table(
080 'A3:' + title_length + str(3 + cell_length),
081 {'data': tmp,
082 'columns': [{'header': x} for x in headers]})
083
084 wb.close()

The add_worksheet() method is called on lines 56 and 61. This function is used to create the worksheet and writes the first two rows of the spreadsheet. On line 96, we create the title_format style, which contains the text properties we want for our two title rows. On lines 101 and 103, we create both of our title rows. Currently, the values of these title rows are hardcoded but could be programmed into the framework by adding them as optional switches in argparse:

087 def add_worksheet(wb, length, name=None):
088 """
089 The add_worksheet function creates a new formatted worksheet
090 in the workbook
091 :param wb: The workbook object
092 :param length: The range of rows to merge
093 :param name: The name of the worksheet
094 :return: ws, the worksheet
095 """
096 title_format = wb.add_format({'bold': True,
097 'font_color': 'black', 'bg_color': 'white', 'font_size': 30,
098 'font_name': 'Arial', 'align': 'center'})
099 ws = wb.add_worksheet(name)
100
101 ws.merge_range('A1:' + length + '1', 'XYZ Corp',
102 title_format)
103 ws.merge_range('A2:' + length + '2', 'Case ####',
104 title_format)
105 return ws