The write_csv() function uses a method from the peewee library that we have not seen before, dicts(), which lets us retrieve data from the database as dictionaries. We build the query with the familiar Files.select().where() statement and append dicts() so that each row comes back as a Python dictionary keyed by column name. This format is an excellent input for our reports because the built-in csv module has a class named DictWriter. As its name suggests, this class accepts a dictionary and writes it as one row of data in a CSV file. With the query staged, we log a message telling the user that we are starting to write the CSV report:
def write_csv(source, custodian_model):
    """
    The write_csv function generates a CSV report from the Files
    table
    :param source: The output filepath
    :param custodian_model: Peewee model instance for the
        custodian
    :return: None
    """
    query = Files.select().where(
        Files.custodian == custodian_model.id).dicts()
    logger.info('Writing CSV report')
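To make the shape of those dictionary rows concrete, here is a minimal sketch that does not use peewee at all. As a stand-in, it uses the standard library's sqlite3 module with a row factory to produce rows keyed by column name, which is the same shape that dicts() is described as returning above. The table name, the reduced set of columns, and the sample values are all assumptions made up for illustration:

```python
import sqlite3

# Hypothetical in-memory table with a few of the Files columns.
conn = sqlite3.connect(':memory:')
conn.execute(
    'CREATE TABLE files (id INTEGER PRIMARY KEY, custodian INTEGER, '
    'file_name TEXT, file_size INTEGER)')
conn.executemany(
    'INSERT INTO files (custodian, file_name, file_size) VALUES (?, ?, ?)',
    [(1, 'report.docx', 2048), (1, 'notes.txt', 512), (2, 'other.pdf', 4096)])


def dict_factory(cursor, row):
    # Pair each column name from the cursor description with its value.
    return {col[0]: value for col, value in zip(cursor.description, row)}


# Cursors created after this point return dictionaries instead of tuples.
conn.row_factory = dict_factory
rows = conn.execute(
    'SELECT * FROM files WHERE custodian = ? ORDER BY id', (1,)).fetchall()

for row in rows:
    print(row)
```

Each row is a plain dictionary whose keys match the column names, which is the property that lets us hand rows straight to a DictWriter.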
Next, we define the column names for our CSV writer and open the user-specified output file using the with...as... statement. To initialize the csv.DictWriter class, we pass the open file object and the column headers, which correspond to the table's column names (and therefore to the dictionary keys). After initialization, we call the writeheader() method to write the table's header at the top of the spreadsheet. Finally, to write the row content, we open a for loop on our query object to iterate over the rows and write each one to the file with the writerow() method. Using the built-in enumerate() function, we can give the user a status update every 10,000 rows so that they know our code is hard at work on larger file reports. After writing those status updates (and rows, of course), we add some final log messages for the user and exit the function. Although the code refers to csv, remember that this name is actually our unicodecsv import. This means that we will encounter fewer encoding errors while generating our output than we would with the standard csv library:
    cols = [u'id', u'custodian', u'file_name', u'file_path',
            u'extension', u'file_size', u'ctime', u'mtime',
            u'atime', u'mode', u'inode']

    with open(source, 'wb') as csv_file:
        csv_writer = csv.DictWriter(csv_file, cols)
        csv_writer.writeheader()
        # Start at 0 so the final log line works even for an empty query
        counter = 0
        # Count from 1 so each status line reports the rows written so far
        for counter, row in enumerate(query, 1):
            csv_writer.writerow(row)
            if counter % 10000 == 0:
                logger.debug('{:,} lines written'.format(counter))
        logger.debug('{:,} lines written'.format(counter))

    logger.info('CSV Report completed: ' + source)
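The DictWriter pattern described above can be tried in isolation. The following is a minimal sketch, assuming Python 3 and the standard csv module rather than unicodecsv, and writing to an in-memory buffer instead of a file on disk. The write_rows() helper, its step parameter, and the sample rows are hypothetical names and data introduced only for this illustration:

```python
import csv
import io


def write_rows(rows, out_file, cols, step=10000):
    """Write dictionaries as CSV rows and return the number written."""
    writer = csv.DictWriter(out_file, fieldnames=cols)
    writer.writeheader()
    written = 0  # Stays 0 if rows is empty
    # Count from 1 so the status message matches the rows written so far
    for written, row in enumerate(rows, 1):
        writer.writerow(row)
        if written % step == 0:
            print('{:,} lines written'.format(written))
    return written


# Made-up sample data standing in for the query results
cols = ['id', 'file_name', 'file_size']
rows = [{'id': i, 'file_name': 'file_{}.txt'.format(i), 'file_size': i * 100}
        for i in range(1, 6)]

buffer = io.StringIO()
total = write_rows(rows, buffer, cols, step=2)
lines = buffer.getvalue().splitlines()
print(lines[0])
print('{} rows written'.format(total))
```

The small step value is used here only so that the status messages appear with a handful of rows. Note that the standard csv module in Python 3 expects a text-mode file, whereas the unicodecsv writer in our script is given a file opened in binary mode.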