Iteration with the folder_traverse() function

This function recursively walks through folders to parse message items and indirectly generates summary reports on the folder. It is first given the root folder, but it is written generically so that it can handle any folder item passed to it, which lets us reuse it for each discovered subfolder. On line 97, we use a for loop to iterate over the sub_folders iterator generated from our pypff.folder object. On line 98, we check whether the folder object has any subfolders of its own and, if it does, call the folder_traverse() function again before checking the current folder for messages. On line 100, we check each folder for messages, but only after any subfolders it contains have been traversed:

089 def folder_traverse(base):
090     """
091     The folder_traverse function walks through the base of the
092     folder and scans for sub-folders and messages
093     :param base: Base folder to scan for new items within
094         the folder.
095     :return: None
096     """
097     for folder in base.sub_folders:
098         if folder.number_of_sub_folders:
099             folder_traverse(folder) # Call new folder to traverse
100         check_for_msgs(folder)

This is a recursive function because we call the same function within itself (a loop of sorts). This loop could potentially run indefinitely, so we must make sure the input data has an end to it. A PST should have a limited number of folders, so the function will eventually exit the recursive loop. This is essentially our PST-specific os.walk() function, which iteratively walks through filesystem directories. Since we're working with folders and messages within a file container, we have to create our own recursion. Recursion can be a tricky concept to understand; to guide you through it, please reference the following diagram when reading our explanation in the upcoming paragraphs:

In the preceding diagram, there are five levels in this PST hierarchy, each containing a mixture of blue folders and green messages. On level 1, we have Root Folder, which is passed to the first call of the folder_traverse() function. Since this folder has a single subfolder, Top of Personal Folders, as you can see on level 2, we rerun the function before exploring the message contents. When we rerun the function, we now evaluate the Top of Personal Folders folder and find that it also has subfolders.

Calling the folder_traverse() function again on each of the subfolders, we first process the Deleted Items folder on level 3. Its contents, shown on level 4, consist only of messages, so we call the check_for_msgs() function for the first time.

After the check_for_msgs() function returns, we go back to the previous call of the folder_traverse() function on level 3 and evaluate the Sent Items folder. Since the Sent Items folder also doesn't have any subfolders, we process its messages before returning to level 3.

We then reach the Inbox folder on level 3 and call the folder_traverse() function on the Completed Cases subfolder on level 4. Now that we're in level 5, we process the two messages inside the Completed Cases folder. With these two messages processed, we step back to level 4 and process the two messages within the Inbox folder. Once these messages are processed, we've completed all items in levels 3, 4, and 5 and can finally move back to level 2. Within Root Folder, we can process the three message items there before the function execution concludes. Our recursion, in this case, works from the bottom up.
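The order of calls described in this walk-through can be reproduced with a minimal sketch. The StubFolder class and the recording version of check_for_msgs() are hypothetical stand-ins introduced here for illustration; they are not part of pypff, and the hierarchy only uses the folder names mentioned in the text, without any messages:

```python
class StubFolder:
    """Hypothetical stand-in for a pypff folder object (illustration only)."""

    def __init__(self, name, sub_folders=()):
        self.name = name
        self.sub_folders = list(sub_folders)

    @property
    def number_of_sub_folders(self):
        return len(self.sub_folders)


visited = []  # order in which folders are handed to check_for_msgs()


def check_for_msgs(folder):
    # Stand-in for the real check_for_msgs(); it only records the visit.
    visited.append(folder.name)


def folder_traverse(base):
    for folder in base.sub_folders:
        if folder.number_of_sub_folders:
            folder_traverse(folder)  # descend into subfolders first
        check_for_msgs(folder)  # then handle this folder's own messages


root = StubFolder("Root Folder", [
    StubFolder("Top of Personal Folders", [
        StubFolder("Deleted Items"),
        StubFolder("Sent Items"),
        StubFolder("Inbox", [StubFolder("Completed Cases")]),
    ]),
])

folder_traverse(root)
for name in visited:
    print(name)
# Prints, one per line: Deleted Items, Sent Items, Completed Cases,
# Inbox, Top of Personal Folders
```

In this sketch, the folder passed in as base is never handed to check_for_msgs() by the function itself, so any messages stored directly in the starting folder would need a separate call from the caller.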

These four lines of code allow us to navigate through the entire PST and call additional processing on every message in every folder. Though this kind of traversal is usually provided to us through methods such as os.walk(), some libraries don't natively support recursion and require the developer to implement it using the existing functionality within the library.
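For comparison, the following sketch shows how os.walk() can give a similar bottom-up ordering on a real filesystem when called with topdown=False. The directory names are hypothetical and are created in a temporary directory purely for illustration:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as top:
    # Hypothetical directory names, loosely mirroring a mailbox layout.
    os.makedirs(os.path.join(top, "Inbox", "Completed Cases"))
    os.makedirs(os.path.join(top, "Sent Items"))

    # With topdown=False, each directory is yielded only after all of
    # its subdirectories have been yielded.
    order = [
        os.path.relpath(dirpath, top)
        for dirpath, dirnames, filenames in os.walk(top, topdown=False)
    ]

print(order)
```

The relative order of sibling directories is not guaranteed, but a subdirectory such as Inbox/Completed Cases always appears before its parent, and the starting directory (shown as ".") comes last.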