In practice

Let's explore two ways we can reuse existing code. After writing our code to replace strings in a ZIP file full of text files, we are later contracted to scale all the images in a ZIP file to 640 x 480. It looks like we could use a very similar paradigm to what we used in ZipReplace. Our first impulse might be to save a copy of that file and change the find_replace method to scale_image or something similar.

But, that's suboptimal. What if someday we want to change the unzip and zip methods to also open TAR files? Or maybe we'll want to use a guaranteed unique directory name for temporary files. In either case, we'd have to change it in two different places!

We'll start by demonstrating an inheritance-based solution to this problem. First, we'll modify our original ZipReplace class into a superclass for processing generic ZIP files:

import sys
import shutil
import zipfile
from pathlib import Path


class ZipProcessor:
def __init__(self, zipname):
self.zipname = zipname
self.temp_directory = Path(f"unzipped-{zipname[:-4]}")

def process_zip(self):
self.unzip_files()
self.process_files()
self.zip_files()

def unzip_files(self):
self.temp_directory.mkdir()
with zipfile.ZipFile(self.zipname) as zip:
zip.extractall(self.temp_directory)

def zip_files(self):
with zipfile.ZipFile(self.zipname, "w") as file:
for filename in self.temp_directory.iterdir():
file.write(filename, filename.name)
shutil.rmtree(self.temp_directory)

We changed the filename property to zipname to avoid confusion with the filename local variables inside the various methods. This helps make the code more readable, even though it isn't actually a change in design.

We also dropped the two parameters to __init__ (search_string and replace_string) that were specific to ZipReplace. Then, we renamed the zip_find_replace method to process_zip and made it call an (as yet undefined) process_files method instead of find_replace; these name changes help demonstrate the more generalized nature of our new class. Notice that we have removed the find_replace method altogether; that code is specific to ZipReplace and has no business here.

This new ZipProcessor class doesn't actually define a process_files method. If we ran it directly, it would raise an exception. Because it isn't meant to run directly, we removed the main call at the bottom of the original script. We could make this an abstract base class in order to communicate that this method needs to be defined in a subclass, but I've left it out for brevity.

Now, before we move on to our image processing application, let's fix up our original zipsearch class to make use of this parent class, as follows:

class ZipReplace(ZipProcessor):
def __init__(self, filename, search_string, replace_string):
super().__init__(filename)
self.search_string = search_string
self.replace_string = replace_string

def process_files(self):
"""perform a search and replace on all files in the
temporary directory"""
for filename in self.temp_directory.iterdir():
with filename.open() as file:
contents = file.read()
contents = contents.replace(self.search_string, self.replace_string)
with filename.open("w") as file:
file.write(contents)

This code is shorter than the original version, since it inherits its ZIP processing abilities from the parent class. We first import the base class we just wrote and make ZipReplace extend that class. Then, we use super() to initialize the parent class. The find_replace method is still here, but we renamed it process_files so the parent class can call it from its management interface. Because this name isn't as descriptive as the old one, we added a docstring to describe what it is doing.

Now, that was quite a bit of work, considering that all we have now is a program that is functionally not different from the one we started with! But having done that work, it is now much easier for us to write other classes that operate on files in a ZIP archive, such as the (hypothetically requested) photo scaler. Further, if we ever want to improve or bug fix the zip functionality, we can do it for all subclasses at once by changing only the one ZipProcessor base class. Therefore maintenance will be much more effective.

See how simple it is now to create a photo scaling class that takes advantage of the ZipProcessor functionality:

from PIL import Image 
 
class ScaleZip(ZipProcessor): 
  
    def process_files(self): 
        '''Scale each image in the directory to 640x480''' 
        for filename in self.temp_directory.iterdir(): 
            im = Image.open(str(filename)) 
            scaled = im.resize((640, 480)) 
            scaled.save(filename)
 
if __name__ == "__main__": 
    ScaleZip(*sys.argv[1:4]).process_zip() 

Look how simple this class is! All that work we did earlier paid off. All we do is open each file (assuming that it is an image; it will unceremoniously crash if a file cannot be opened or isn't an image), scale it, and save it back. The ZipProcessor class takes care of the zipping and unzipping without any extra work on our part.