Caching

The caching decorator is quite similar to argument checking, but focuses on functions whose output does not depend on internal state. For such functions, each set of arguments maps to a unique result. This style is characteristic of functional programming, and can be used when the set of input values is finite.

Therefore, a caching decorator can keep the output together with the arguments that were needed to compute it, and return it directly on subsequent calls.

This behavior is called memoizing, and is quite simple to implement as a decorator:

"""This module provides a simple memoization decorator
that is able to store cached return results of
decorated functions for a specified period of time.
"""
import time
import hashlib
import pickle

cache = {}


def is_obsolete(entry, duration):
    """Check if given cache entry is obsolete"""
    return time.time() - entry['time'] > duration


def compute_key(function, args, kw):
    """Compute caching key for given value"""
    key = pickle.dumps((function.__name__, args, kw))
    return hashlib.sha1(key).hexdigest()


def memoize(duration=10):
    """Keyword-aware memoization decorator

    It memoizes function results for a specified
    duration of time.
    """
    def _memoize(function):
        def __memoize(*args, **kw):
            key = compute_key(function, args, kw)

            # do we have it already in cache?
            if (
                key in cache and
                not is_obsolete(cache[key], duration)
            ):
                # return cached value if it exists
                # and isn't too old
                print('we got a winner')
                return cache[key]['value']

            # compute result if there is no valid
            # cache available
            result = function(*args, **kw)
            # store the result for later use
            cache[key] = {
                'value': result,
                'time': time.time()
            }
            return result
        return __memoize
    return _memoize

A SHA-1 hash key is built from the function name and the ordered argument values, and the result is stored in a global dictionary. The hash is computed over a pickle of these values, which is a bit of a shortcut to freeze the state of all objects that are passed as arguments and to check that they are good candidates. If an object that cannot be pickled, such as a thread or a socket, is used as an argument, pickling fails with an exception (a PicklingError or, in current Python 3 versions, typically a TypeError; refer to https://docs.python.org/3/library/pickle.html). The duration parameter is used to invalidate the cached value when too much time has passed since it was stored.
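The key scheme described above can be explored in isolation. The following is a minimal sketch, not part of the memoize module: compute_key repeats the same key computation, while compute_key_sorted and the add function are hypothetical names introduced only for illustration. It shows that identical calls yield identical keys, that the order of keyword arguments affects the pickle (and therefore the key), and that an unpicklable argument prevents a key from being computed.

```python
import hashlib
import pickle
import threading


def compute_key(function, args, kw):
    """Same key scheme as described in the text."""
    key = pickle.dumps((function.__name__, args, kw))
    return hashlib.sha1(key).hexdigest()


def compute_key_sorted(function, args, kw):
    """Hypothetical variant: sort keyword arguments before pickling."""
    key = pickle.dumps((function.__name__, args, sorted(kw.items())))
    return hashlib.sha1(key).hexdigest()


def add(a, b):
    return a + b


# Identical calls always produce the same SHA-1 key.
same = compute_key(add, (2, 2), {}) == compute_key(add, (2, 2), {})

# Keyword order changes the pickle, so it changes the key...
order_sensitive = (
    compute_key(add, (), {'a': 2, 'b': 2})
    != compute_key(add, (), {'b': 2, 'a': 2})
)
# ...but not the key produced by the sorted variant.
order_stable = (
    compute_key_sorted(add, (), {'a': 2, 'b': 2})
    == compute_key_sorted(add, (), {'b': 2, 'a': 2})
)

# A lock cannot be pickled, so no key can be computed for it.
try:
    compute_key(add, (threading.Lock(), 2), {})
    unpicklable = False
except (TypeError, pickle.PicklingError):
    unpicklable = True
```

Sorting the keyword arguments is only one possible normalization. It means that calls differing only in keyword order share a cache entry, at the cost of a slightly more expensive key computation.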

Here's an example of the memoize decorator usage (assuming that the previous snippet is stored in the memoize module):

>>> import time
>>> from memoize import memoize, cache
>>> @memoize()
... def very_very_very_complex_stuff(a, b):
...     # if your computer gets too hot on this calculation
...     # consider stopping it
...     return a + b
...
>>> very_very_very_complex_stuff(2, 2)
4
>>> very_very_very_complex_stuff(2, 2)
we got a winner
4
>>> @memoize(1)  # invalidates the cache after 1 second
... def very_very_very_complex_stuff(a, b):
...     return a + b
...
>>> very_very_very_complex_stuff(2, 2)
4
>>> very_very_very_complex_stuff(2, 2)
we got a winner
4
>>> cache
{'c2727f43c6e39b3694649ee0883234cf': {'value': 4, 'time': 1199734132.7102251}}
>>> time.sleep(2)
>>> very_very_very_complex_stuff(2, 2)
4

Caching expensive functions can dramatically increase the overall performance of a program, but it has to be used with care. The cache could also be tied to the function itself, instead of living in a centralized dictionary, to better manage its scope and life cycle. In any case, a more efficient decorator would use a specialized cache library and/or a dedicated caching service based on advanced caching algorithms. Memcached is a well-known example of such a caching service and can be easily used from Python.
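As an illustration of tying the cache to the function, here is a minimal sketch under a few assumptions: all arguments are hashable (so no pickling is needed), the cache attribute on the wrapper is a naming choice made for this example, and add is a hypothetical function used only to exercise the decorator. It is one possible arrangement, not a reference implementation.

```python
import functools
import time


def memoize(duration=10):
    """Sketch: memoize with a cache owned by each decorated function."""
    def decorator(function):
        cache = {}  # private to this decorated function

        @functools.wraps(function)
        def wrapper(*args, **kw):
            # assumption: every argument is hashable
            key = (args, tuple(sorted(kw.items())))
            entry = cache.get(key)
            if entry is not None and time.monotonic() - entry[1] <= duration:
                return entry[0]
            result = function(*args, **kw)
            cache[key] = (result, time.monotonic())
            return result

        # expose the cache so callers can inspect or clear it
        wrapper.cache = cache
        return wrapper
    return decorator


calls = []  # records every real execution of the function body


@memoize(duration=60)
def add(a, b):
    calls.append((a, b))
    return a + b


add(2, 2)
add(2, 2)          # served from add.cache; the body does not run again
add.cache.clear()  # the cache life cycle is managed per function
add(2, 2)          # recomputed after the cache was cleared
```

Because each decorated function owns its dictionary, two functions that happen to share a name can no longer collide, and the cache disappears together with the function. The standard library's functools.lru_cache provides a comparable per-function cache, with eviction based on size rather than on time.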

Chapter 14, Optimization – Some Powerful Techniques, provides detailed information and examples for various caching techniques.