Optimization is the process of making an application work more efficiently without changing its functionality or accuracy. In the previous chapter, we learned how to identify performance bottlenecks and observe resource usage in code. In this chapter, we will learn how to use that knowledge to make an application work faster and use resources with greater efficiency.
Optimization is not a magical process. It is done by following a simple algorithm synthesized by Stefan Schwarzer at EuroPython 2006. His original pseudocode is as follows:
    def optimize():
        """Recommended optimization"""
        assert got_architecture_right(), "fix architecture"
        assert made_code_work(bugs=None), "fix bugs"
        while code_is_too_slow():
            wbn = find_worst_bottleneck(just_guess=False, profile=True)
            is_faster = try_to_optimize(wbn, run_unit_tests=True, new_bugs=None)
            if not is_faster:
                undo_last_code_change()

    # By Stefan Schwarzer, EuroPython 2006
This example may not be the neatest or clearest, but the code captures almost all of the important aspects of an organized optimization procedure. The main things we can learn from it include the following:
- Optimization is an iterative process where not every iteration leads to better results
- The main prerequisite is that code is verified by tests
- Optimizing the current application bottleneck is key
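One iteration of this procedure can be sketched in runnable form. The example below is only an illustration under stated assumptions: `slow_unique`, `fast_unique`, `passes_tests`, and `best_time` are hypothetical names invented here, the two assertions in `passes_tests` stand in for a real unit test suite, and this `try_to_optimize` takes different arguments than the function of the same name in the pseudocode. A candidate is kept only if it still passes the tests and is measurably faster; otherwise the change is undone by keeping the current implementation.

```python
import timeit


def slow_unique(items):
    """Baseline: order-preserving de-duplication using a list lookup."""
    seen = []
    for item in items:
        if item not in seen:  # linear scan on every iteration
            seen.append(item)
    return seen


def fast_unique(items):
    """Candidate: same result, relying on insertion-ordered dict keys."""
    return list(dict.fromkeys(items))


def passes_tests(func):
    """Hypothetical stand-in for running the unit tests."""
    return func([3, 1, 3, 2, 1]) == [3, 1, 2] and func([]) == []


def best_time(func, data, repeat=3, number=5):
    """Best of several timing runs, to reduce measurement noise."""
    return min(timeit.repeat(lambda: func(data), repeat=repeat, number=number))


def try_to_optimize(current, candidate, data):
    """Return the candidate only if it is correct and measurably faster."""
    if not passes_tests(candidate):
        return current  # the change introduced a bug: undo it
    if best_time(candidate, data) >= best_time(current, data):
        return current  # the change did not help: undo it
    return candidate


data = list(range(500)) * 4
chosen = try_to_optimize(slow_unique, fast_unique, data)
print(chosen.__name__)
```

The decision is driven by a measurement and a test run rather than by a guess, which is the point the pseudocode makes with `just_guess=False` and `run_unit_tests=True`.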
Making your code work faster is not an easy task. In the case of abstract mathematical problems, the solution often lies in choosing the right algorithm and proper data structures. However, it is difficult to provide generic or universal tips and tricks that can be used to solve any algorithmic problem. There are, of course, some generic methodologies for designing a new algorithm, or even meta-heuristics that can be applied to a large variety of problems, but they are generally language-agnostic and are thus beyond the scope of this book.
There is also a wide range of performance issues that are caused by either code quality defects or application usage contexts. These kinds of problems can often be solved using common programming approaches, either with specific performance-oriented libraries and services or with proper software architecture design. Common non-algorithmic culprits of bad application performance in Python include the following:
- Incorrect usage of basic built-in types
- Too much complexity
- Hardware resource usage patterns that do not match the execution environment
- Long response times from third-party APIs or backing services
- Requiring too much work in time-critical parts of the application
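As an illustration of the first culprit in this list, the following sketch compares membership tests on a list and on a set holding the same values. The variable names and sizes are arbitrary choices made for this example, and the absolute timings will vary between machines. A list is scanned element by element, while a set uses a hash lookup, so picking the wrong built-in type for frequent membership checks can be costly.

```python
import timeit

haystack_list = list(range(10_000))
haystack_set = set(haystack_list)
needle = 9_999  # worst case for the list: the value is the last element

# Time 1,000 membership tests against each container.
list_time = timeit.timeit(lambda: needle in haystack_list, number=1_000)
set_time = timeit.timeit(lambda: needle in haystack_set, number=1_000)

print(f"list: {list_time:.4f}s, set: {set_time:.4f}s")
```

Both containers give the same answer to the membership question; only the cost of asking differs.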
More often than not, solving such performance issues does not require advanced academic knowledge, only good software craftsmanship—and a big part of craftsmanship is knowing when to use the proper tools. Fortunately, there are some well-known patterns and solutions for dealing with performance problems.
In this chapter, we will discuss some popular and reusable solutions that allow you to optimize your program non-algorithmically. We will cover the following topics:
- Defining complexity
- Reducing complexity
- Using architectural trade-offs
- Caching