Creating our fuzzy hashes

Before we dive into the code for our fuzz_file() function, let's briefly review the moving parts.

The rolling hash works as it did in our earlier example: it identifies the boundaries of the blocks that we then summarize with our traditional hashes. In ssdeep and spamsum, the reset point that the rolling hash is compared against (set to 7 in our prior example) is calculated from the file's size. We'll show the exact function for determining this value in a bit, but one consequence is worth highlighting now: signatures are only comparable when their block sizes match (ssdeep and spamsum soften this by also recording a second hash at twice the block size). There is more to discuss conceptually, but let's start working through the code and apply these concepts as we go.
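To make the size-based reset point concrete, here is a minimal sketch of the doubling rule used by the reference spamsum code. The names MIN_BLOCKSIZE, SPAMSUM_LENGTH, and estimate_block_size() are assumptions chosen for illustration and may not match the constants or function used later in this chapter. The real implementations may also halve the value afterwards if the resulting signature turns out too short.

```python
# Illustrative values borrowed from the reference spamsum implementation;
# the script in this chapter may name or set them differently.
MIN_BLOCKSIZE = 3      # smallest block size considered
SPAMSUM_LENGTH = 64    # target maximum number of blocks in a signature


def estimate_block_size(file_size):
    """Return an initial block size (reset point) for a file of file_size bytes."""
    block_size = MIN_BLOCKSIZE
    # Double the block size until SPAMSUM_LENGTH blocks are enough to
    # cover the whole file.
    while block_size * SPAMSUM_LENGTH < file_size:
        block_size *= 2
    return block_size


if __name__ == "__main__":
    for size in (100, 1024, 1024 * 1024):
        print(size, estimate_block_size(size))
```

Because the value only ever doubles, files of broadly similar size tend to land on the same block size, which is what makes their signatures comparable.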

We now move to the core function: fuzz_file(). This function accepts a file path and uses the constants defined at the top of the script to calculate the signature:

def fuzz_file(file_path):
    """
    The fuzz_file function creates a fuzzy hash of a file
    :param file_path (str): file to read.
    :return (str): spamsum hash
    """