Servers are filled with write caches you need to be aware of:
- Operating system write cache: This cache can easily be gigabytes in size. Typically, you can flush data out of this cache by forcing a sync operation on the block that needs to be stored on disk. On POSIX systems (which include all UNIX-like ones), this is done with the fsync or fdatasync calls. In some cases, it's possible to write directly in a sync mode, which is effectively a write followed by fsync. The postgresql.conf setting wal_sync_method controls which method is used, and the separate fsync setting makes it possible to disable syncing altogether to optimize for speed instead of safety.
- Disk controller write cache: You'll find a write cache on most RAID controller cards, as well as inside external storage such as a SAN. Common sizes right now are 128 MB to 512 MB for cards, but gigabytes are common on a SAN. Typically, controllers can be changed to operate completely in write-through mode, albeit slowly. But by default, you'll normally find them in write-back mode. Writes that can fit in the controller's cache are stored there; the operating system is told the write is complete, and the card writes the data out at some future time. To keep this write from being lost if power is interrupted, the card must be configured with a battery. That combination is referred to as a battery-backed write cache (BBWC).
- Disk drive write cache: All SATA and SAS disks have a write cache on them that on current hardware is 8 MB to 32 MB in size. This cache is always volatile--if power is lost, any data stored there is lost. When enabled, it always operates as a write-back cache.
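To make the operating system part of this concrete, here is a minimal Python sketch of the two approaches described above: a write followed by fsync, and a write to a file opened in a sync mode. The function names are hypothetical and chosen for illustration, and the sketch assumes a POSIX-like system. It only asks the operating system to flush its own cache; whether the data then sits in a controller or drive cache is outside its control.

```python
import os
import tempfile


def _write_all(fd: int, data: bytes) -> None:
    """Keep calling os.write until every byte has been handed to the OS."""
    remaining = memoryview(data)
    while len(remaining) > 0:
        written = os.write(fd, remaining)
        remaining = remaining[written:]


def write_then_fsync(path: str, data: bytes) -> None:
    """Write data, then force it out of the operating system write cache."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        _write_all(fd, data)
        os.fsync(fd)  # returns only after the OS reports the flush as done
    finally:
        os.close(fd)


def write_in_sync_mode(path: str, data: bytes) -> None:
    """Open the file in a sync mode so that each write is itself synchronous."""
    # Assumption: O_SYNC is available; on platforms without it the flag is 0
    # and this degrades to an ordinary cached write.
    sync_flag = getattr(os, "O_SYNC", 0)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | sync_flag, 0o600)
    try:
        _write_all(fd, data)
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        write_then_fsync(os.path.join(tmp, "a.dat"), b"committed")
        write_in_sync_mode(os.path.join(tmp, "b.dat"), b"committed")
```

Both functions leave the same bytes on disk; the difference is only in when the operating system is asked to flush them.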
How can you make sure you're safe given all these write-back caches that might lose your data? There are a few basic precautions to take:
- Make sure whatever filesystem you're using fully and correctly implements fsync, or whichever similar mechanism is in use. More details on this topic can be found in the wal_sync_method documentation and in the information about filesystem tuning covered in later chapters.
- Monitor your disk controller battery--some controller cards will monitor their battery health, and automatically switch from write-back to write-through mode when there is no battery or it's not working properly. That's a helpful safety measure, but performance is going to drop hard when this happens.
- Disable any drive write caches--most hardware RAID controllers will do this for you, preferring their own battery-backed caches instead.
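One rough way to sanity-check these precautions is to time how many fsync calls per second a disk appears to complete. The following Python sketch is an illustration under stated assumptions rather than a rigorous test: the function name, block size, and iteration count are hypothetical choices, and the target directory must live on the disk you actually want to examine. A 7200 RPM drive rotates 120 times per second, so a single process rewriting the same block on such a drive, with no write-back cache in the path, should not manage much more than roughly that many synced writes per second. A rate in the thousands suggests that some cache is absorbing the writes, which is safe only if that cache is battery-backed.

```python
import os
import tempfile
import time


def measure_fsync_rate(directory: str, count: int = 200,
                       block_size: int = 8192) -> float:
    """Return the observed number of write+fsync cycles per second."""
    block = b"\0" * block_size
    # Create a scratch file on the filesystem under test.
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.lseek(fd, 0, os.SEEK_SET)  # rewrite the same block each time
            os.write(fd, block)
            os.fsync(fd)                  # wait for the reported flush
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)
    return count / elapsed


if __name__ == "__main__":
    rate = measure_fsync_rate(os.getcwd())
    print(f"{rate:.0f} fsync calls per second")
```

Treat the result as a hint only: the default temporary directory on some systems is memory-backed, and SSDs legitimately sustain much higher rates than rotating disks.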