Transaction ID wraparound

The implementation of MVCC in PostgreSQL uses a transaction ID that is 32 bits in size. It's impractical to make it any longer because, as you've just seen, visibility information is stored in each row as xmin and xmax values. Having a larger ID would therefore increase the size of each row by a significant amount. A signed 32-bit number can only handle a range of about 2 billion transactions before rolling over to zero. When it exceeds its range, transactions that used to appear in the past will now appear to be from the future, which, as you might imagine, will wreak havoc. If this transaction wraparound ever happens to your database, it will fail to operate sanely, and the database will go far out of its way to keep that from happening.

The way that the 32-bit XID is mapped to handle many billions of transactions is that each table and database has a reference XID, and every other XID is relative to it. This gives an effective range of 2 billion transactions before and after that value. You can see how old these reference XID numbers are relative to current activity, starting with the oldest active entries, like this:

    SELECT relname,age(relfrozenxid) FROM pg_class WHERE relkind='r' 
      ORDER BY age(relfrozenxid) DESC;
    SELECT datname,age(datfrozenxid) FROM pg_database ORDER BY
      age(datfrozenxid) DESC;

One of the things VACUUM does is push forward the frozen value once a threshold of transactions have passed, set by the vacuum_freeze_max_age parameter, and autovacuum has its own setting as autovacuum_freeze_max_age. This maintenance is also critical to cleaning up the commit log information stored in the pg_clog directory. Some transactions will fall off the back here, if they have a transaction ID so old that it can't be represented relative to the new reference values. These will have their XID replaced by a special magic value called the FrozenXID. Once that happens, those transactions will appear in the past relative to all active transactions.

The values for these parameters are set very conservatively by default--things start to be frozen after only 200 million transactions, even though wraparound isn't a concern until 2 billion. One reason for this is to keep the commit log disk space from growing excessively. At the default value, it should never take up more than 50 MB, while increasing the free age to its maximum (2 billion) will instead use up to 500 MB of space. If you have large tables where that disk usage is trivial and you don't need to run vacuum regularly in order to reclaim space, increasing the maximum free age parameters can be helpful to keep autovacuum from doing more work than it has to in freezing your tables.

Adjusting the minimum values for when freezing happens is only really a good idea in one situation: just after a form of bulk loading where there's no way you will be modifying the transactions you just added. The problem with using a low vacuum_freeze_min_age during normal use is that the freezing process will replace transaction information with the special FrozenXID, which loses potentially useful data about when those transactions committed. If you ever end up in the unfortunate position where you have to dig into transaction ordering forensics to track down how a problem happened, discovering that the XID data that would have helped sort that out has been frozen is not good news.

Therefore, the basic trade-off that you need to decide on is how much of this detailed transaction number diagnostic information you want to keep around, knowing that if you keep too much of it around, it will take up more disk space and make pg_clog less efficient. At the same time, deciding to delay vacuum for as long as possible also means that when it does happen, you will need to do a lot of work. More frequent VACUUM work means more regular small disruptions, though, so some prefer to just schedule that cleanup rather than risk it popping up at a bad time. It's tricky to provide general advice that works for everyone.

Starting in PostgreSQL 8.3, you can use the system functions described at http://www.postgresql.org/docs/current/static/functions-info.html to determine the latest transaction ID in use on the server. txid_current() is the simplest value to inspect, as shown in some of the early examples in this chapter. Watching that number grow over time lets you estimate how many transactions are occurring on your system. In order to compensate for transaction wraparound, the value returned is actually 64 bits wide, with the top 32 bits holding a counter of how many times the XID has crossed back to zero. This makes the value returned by txid_current() always move forward, even as the actual XID rolls over its upper limit.