Cost-based vacuuming

Running a regular VACUUM command is a pretty intensive operation. It's likely to become the most resource-intensive process running on your server, and if autovacuum ran in that fashion, it would be unusable in many environments.

Fortunately, there's another approach available. Cost-based vacuuming limits the amount of disk I/O any given vacuum or autovacuum process is expected to do per unit of time. It works by assigning an estimated cost to every I/O operation, accumulating a running total as work is performed, then pausing once an upper limit on cost per iteration is exceeded.

The cost estimation presumes three basic page-level operations that vacuum performs, each with its own presumed amount of work to handle:

- vacuum_cost_page_hit (default 1): processing a page found in the shared buffer cache
- vacuum_cost_page_miss (default 10): processing a page that has to be read in from disk
- vacuum_cost_page_dirty (default 20): dirtying a previously clean page, which will eventually have to be written back out

The preceding parameters are set globally, apply to all tables in the database, and affect both manual vacuum and autovacuum (which is described in more detail in the next section). The way these are described isn't quite right either: a page counted as "read from disk" may actually come from the operating system cache, which may already have it in regular memory. Partly because of that class of problem, adjusting these values is difficult, and only very experienced PostgreSQL administrators should attempt it. Measuring the real-world costs of these operations is hard, and there's not much evidence yet that doing so improves significantly on the theoretical model used here. It's much better to start vacuum adjustment for your workload by tweaking the higher-level settings--described next--instead of these.
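
Before deciding whether to leave these alone, it's easy to confirm what your server is actually using. pg_settings is a standard system view, so a query along these lines will list all of the cost parameters and their current values:

    SELECT name, setting, unit
      FROM pg_settings
     WHERE name LIKE 'vacuum_cost%';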

A manually executed VACUUM will run until it has accumulated vacuum_cost_limit units of estimated I/O, defaulting to 200 units of work. At that point, it will sleep for vacuum_cost_delay milliseconds. That delay defaults to 0, which disables the cost delay feature altogether for manually executed VACUUM statements.

Autovacuum workers have their own versions of these parameters that work the same way. autovacuum_vacuum_cost_limit defaults to -1, which is shorthand for saying that workers use the same cost limit structure (the ratios between individual costs) as a manual vacuum. The main way autovacuum diverges from a manual one is that it defaults to the following cost delay:

    autovacuum_vacuum_cost_delay = 20ms  

So where a regular VACUUM will just keep going each time it accumulates 200 or more cost units of operations, autovacuum will instead sleep for 20 ms each time it reaches that point.
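
You can turn those defaults into a rough ceiling on autovacuum's read rate. This back-of-the-envelope figure assumes every page processed is a miss and ignores the time the reads themselves take, so treat it as an upper bound rather than a prediction:

    200 cost units / 10 per page miss = 20 pages read per round
    1 round per 20 ms sleep           = 50 rounds per second
    20 pages * 50 rounds * 8 KB/page  = ~8 MB/s of reads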

Note that if you want a manual VACUUM to run with the cost logic, you don't need to adjust the server's postgresql.conf file--this is a user setting. You can tweak it before issuing the manual vacuum, and it will limit the impact for just that session:

    postgres=# SET vacuum_cost_delay='20';
    postgres=# show vacuum_cost_delay;
     vacuum_cost_delay
    -------------------
     20ms
    (1 row)
    postgres=# VACUUM;
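
When you're done, RESET puts the session back to the server's configured value, so later statements in the same session aren't throttled:

    postgres=# RESET vacuum_cost_delay;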

It's extremely difficult to turn all these estimated cost figures into a predicted real-world I/O figure. But if you combine adjusting this value with monitoring both the database and the I/O load at the operating system level, it does allow iterative tuning of vacuum's impact.
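
As a sketch of the database side of that monitoring: on PostgreSQL 9.2 and later, pg_stat_activity has pid and query columns, and autovacuum workers report a query beginning with "autovacuum:", so something like this shows which vacuums are running and for how long:

    SELECT pid, now() - query_start AS runtime, query
      FROM pg_stat_activity
     WHERE query LIKE 'VACUUM%'
        OR query LIKE 'autovacuum:%';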