Clearing the cache

The method to clear all the caches to get cold cache performance varies based on the operating system. Just stopping the database server isn't enough, because the operating system cache will still have plenty of information cached. In Linux, you can use the drop_caches feature to discard everything it has in its page cache. Here's a complete example of cleaning the data out of memory for this database on Linux:

$ pg_ctl stop
$ sudo su -
# sync
# echo 3 > /proc/sys/vm/drop_caches
# logout
$ pg_ctl start -l $PGLOG

The sync here is to try and flush all data to disk before we just blow away the caches. This drop_caches feature on Linux is not intended for regular production server use, it is more of a debugging feature that's potentially dangerous.

Rerunning the same benchmark shows a quite different performance:

Now, run the following:

Now you're seeing hard drive sequential read speeds-96MB per second from a laptop hard drive, and this data isn't necessarily even contiguous. It's hard to achieve full drive speed on something so small though.

Repeating the query now returns to the original speed we already know to expect from a hot cache run:

Tests against real datasets need to recognize whether their data is already in the cache or not. The usual technique is to run each query three times. If the first is much slower than the second and third, it probably started with a cold cache. If the times are all the same, the cache was likely hot before starting. And if all three vary in some other way, there's probably another variable involved besides the cache. It may take a larger number of queries to extract the pattern for why speeds vary.

Examples in this chapter are all run with a hot cache, so you're seeing processing time, but not disk access time. This makes them slightly unrealistic, but the Dell Store sample available for PostgreSQL is not big enough to be uncached for very long on modern hardware. The larger scale versions haven't been ported to the database yet.