pgFouine

The most popular query log file analyzer for PostgreSQL is pgFouine, available at http://pgfouine.projects.postgresql.org/.

The program is available in packaged form for some operating systems that bundle large amounts of open source software with their distribution. pgFouine is available for RHEL and CentOS Linux systems via the EPEL repository, and is available as one of the standard packages on Debian/Ubuntu systems.

The program is a PHP script, and it relies on a moderate list of libraries that are not bundled with PHP itself. As such, getting pgFouine working properly on a server that doesn't have a rich software library available can be difficult; this includes Windows and most older UNIX systems. Note that you don't necessarily have to run the program on the database server itself. It is common practice, for performance reasons, to copy the query log files daily from the database server onto another system, such as a web reporting host. You might, for example, have a Solaris database server (where installing the program could be difficult) and upload its log files to a Linux host, both to offload log file processing from the database hardware and to publish the resulting reports through a web application.
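A nightly job implementing that log-shipping workflow might look like the following sketch. The host name, directories, and `scp` destination here are hypothetical placeholders, not anything pgFouine requires:

```shell
# Hypothetical log-shipping job: copy yesterday's query log from the
# database server to a reporting host that has pgFouine installed.
# REPORT_HOST and all paths below are placeholders for illustration.
LOGDIR="${PGDATA:-/var/lib/pgsql/data}/pg_log"
REPORT_HOST=reports.example.com
YESTERDAY=$(date -d yesterday +%Y-%m-%d)
# Echo the transfer command here; in a real cron job, remove the echo
# so scp actually copies the files.
echo scp "$LOGDIR/postgresql-$YESTERDAY"*.log "$REPORT_HOST:/srv/pgfouine/incoming/"
```

Scheduling this from cron just after the daily log rotation keeps the reporting host one day behind the server, which is usually acceptable for trend analysis.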

A basic test of pgFouine can use the default stderr log format in postgresql.conf, collected into a file by the logging collector, with an appropriate line prefix and all statements logged:

    logging_collector = on
    log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d,remote=%r '
    log_min_duration_statement = 0

Once you've generated some query traffic, you run the utility against the resulting log file:

    $ pgbench -T 60 -c 4 pgbench
    $ pgfouine -logtype stderr -file $PGDATA/pg_log/postgresql-2010-03-28_210540.log -top 10 -format text

Two of the report sections omitted here--the most frequent queries and the slowest queries--are not very useful for this pgbench test, but might be useful for your data set. Note that even though individual queries were logged, and prepared statements were not used, the program figured out the common query parameters anyway. They show up in the last section with their values replaced by 0, which is exactly what you want a normalizing query analysis tool to do.
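The idea behind that normalization can be shown with a crude shell approximation. This sed one-liner is only a sketch of the concept, not pgFouine's actual parsing logic:

```shell
# Crude illustration of query normalization: collapse string and numeric
# literals so different executions of the same statement group together.
# This is a concept sketch, not pgFouine's real implementation.
normalize() {
    sed -E "s/'[^']*'/''/g; s/\b[0-9]+\b/0/g"
}
echo "UPDATE pgbench_accounts SET abalance = abalance + 2742 WHERE aid = 104277;" | normalize
# prints: UPDATE pgbench_accounts SET abalance = abalance + 0 WHERE aid = 0;
```

Once every execution of a statement collapses to the same normalized text, counting and timing them by group is straightforward, which is the core of what any query log analyzer does.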

You can easily create HTML output instead:

    $ pgfouine -file $PGDATA/pg_log/postgresql-2010-03-28_210540.log -logtype stderr -format html > fouine.htm

Many other types of reports are also available. You can just run pgFouine or read its documentation for an idea of the options. pgFouine also includes a VACUUM logging analyzer useful for monitoring that area of your database, as documented at http://pgfouine.projects.postgresql.org/vacuum.html.
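Putting the pieces together, a daily HTML report could be generated with something like the following sketch. The web directory and report naming are assumptions for illustration; the pgfouine invocation matches the one shown earlier:

```shell
# Hypothetical daily HTML report job. WEBDIR and the report file name
# are placeholders; adjust them to wherever your web server looks.
LOGDIR="${PGDATA:-/var/lib/pgsql/data}/pg_log"
WEBDIR=/var/www/html/pgfouine
REPORT="$WEBDIR/report-$(date +%Y-%m-%d).html"
# Pick the most recently modified log file to analyze.
LATEST=$(ls -t "$LOGDIR"/postgresql-*.log 2>/dev/null | head -n 1)
if command -v pgfouine >/dev/null; then
    pgfouine -logtype stderr -file "$LATEST" -format html > "$REPORT"
else
    echo "pgfouine not found; would have written $REPORT"
fi
```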

Because pgFouine needs to handle multi-line queries intelligently, a simple stderr configuration will end up with missing data whenever such a query appears. The program's authors have traditionally recommended using the syslog format for all logs to avoid this. Recent pgFouine releases (starting with 1.1) can instead import CSV format logs by passing the -logtype CSV parameter to the program.
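On the server side, the CSV route means switching the log destination in postgresql.conf. A minimal fragment looks like this (the logging collector must be on for csvlog output to work, and log_line_prefix is ignored because the CSV columns carry that information already):

    log_destination = 'csvlog'
    logging_collector = on
    log_min_duration_statement = 0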

pgFouine is powerful, but it has some limitations. The heavy installation dependencies can be overwhelming for some environments. As a PHP script, it's not particularly efficient when dealing with really large query log files. And it's not very flexible about things such as setting log_line_prefix; you're compelled to adopt its standard there. But if you can work within those limitations, pgFouine is a very capable tool for finding the slowest queries in your application.