PostgreSQL uses a multi-version approach to transactions within the database. A transaction is a formal term for a SQL statement’s effects being synchronized with the “current” data in the database. This doesn’t necessarily mean that the data is written to disk, but it becomes part of the “current” set of information stored in the database. When a statement’s results have effectively been processed in the current state of the database, the transaction is considered to be committed.
The issue of two users attempting to commit changes to the same database object is obviously a potential concern, as their modifications may be exclusive to one another. Some database systems rely on locking to prevent such conflicts.
Locking is a mechanism that disallows selecting from a database object while it is being modified, and vice versa. Locking presents several obvious performance concerns. For example, data which is being updated will not be selectable until the update transaction has completed.
PostgreSQL’s Multi-Version Concurrency Control (MVCC), however, allows for SQL statements to be performed within transaction-deferred blocks. This means that each connection to PostgreSQL essentially maintains a temporary snapshot of the database for objects modified within a transaction block, before the modifications are committed.
Without explicitly opening a transaction block, all SQL statements issued to PostgreSQL are auto-committed, meaning that the database is synchronized with the results of the statement immediately upon execution. When a transaction block is used, however, changes made to the database will not be visible to other users until the block is committed. This allows for several changes to various objects within a database to be made tentatively. They can then be either committed all at once, or rolled back.
Rolling back a transaction returns the state of any affected objects to the condition they were in before the transaction block began. This can be useful when recovering from a partially failed operation, in that any modifications made part-way into a process can be undone. Rolled back transactions are never actually committed; while the process appears to undo modifications to the user who performed the rollback, other users connected to the same database never know the difference.
PostgreSQL also supports cursors, which are flexible references to fully executed SQL queries. A cursor is able to traverse up and down a result set, and only retrieve those rows which are explicitly requested. Used properly, a cursor can aid an application in efficiently using a static result set. A cursor may only be executed within a transaction block.
The following sections cover the basic use of transactions and cursors. They show how to begin, commit, and roll back transactions, and also how to declare, move, and fetch data from a cursor.
Transaction blocks are explicitly started with the BEGIN
SQL command.
This keyword may optionally be followed by either of the noise terms WORK
or TRANSACTION
, though they have no effect on the statement, or the
transaction block.
Example 7-38 begins a transaction block within the booktown
database.
Any SQL statement made after the BEGIN
SQL command will appear to take
effect as normal to the user making the modifications. As stated earlier, however, other users
connected to the database will be oblivious to the modifications that appear to have been made
from within the transaction block until it is committed.
Transaction blocks are closed with the COMMIT
SQL command, which may
be followed by either of the optional noise terms WORK
or TRANSACTION
. Example 7-39 uses the COMMIT
SQL command to synchronize the database system with the result of an
UPDATE
statement.
Example 7-39. Committing a transaction
booktown=# BEGIN; BEGIN booktown=# UPDATE subjects SET location = NULL booktown-# WHERE id = 12; UPDATE 1 booktown=# SELECT location FROM subjects WHERE id = 12; location ---------- (1 row) booktown=# COMMIT; COMMIT
Again, even though the SELECT
statement immediately reflects the
result of the UPDATE
statement in Example 7-39, other users connected to the same database will not be aware of that modification until
after the COMMIT
statement is executed.
To roll back a transaction, the ROLLBACK
SQL command is used. Again,
either of the optional noise terms WORK
or TRANSACTION
may follow the ROLLBACK
command.
Example 7-40 begins a transaction block, makes a
modification to the subjects
table, and verifies the modification within
the block. The transaction is then rolled back, returning the table to the state it was in
before the transaction began.
Example 7-40. Rolling back a transaction
booktown=# BEGIN; BEGIN booktown=# SELECT * FROM subjects WHERE id = 12; id | subject | location ----+----------+---------- 12 | Religion | (1 row) booktown=# UPDATE subjects SET location = 'Sunset Dr' booktown-# WHERE id = 12; UPDATE 1 booktown=# SELECT * FROM subjects WHERE id = 12; id | subject | location ----+----------+----------- 12 | Religion | Sunset Dr (1 row) booktown=# ROLLBACK; ROLLBACK booktown=# SELECT * FROM subjects WHERE id = 12; id | subject | location ----+----------+---------- 12 | Religion | (1 row)
PostgreSQL is very strict about errors in SQL statements inside of transaction blocks.
Even an innocuous parse error, such as that shown in Example 7-41, will cause the transaction to enter into the
ABORT STATE
. This means that no further statements may be executed until
either the COMMIT
or ROLLBACK
command is used to end the
transaction block.
A SQL cursor in PostgreSQL is a read-only pointer to a fully executed SELECT
statement’s result set. Cursors are typically used within applications that
maintain a persistent connection to the PostgreSQL backend. By executing a cursor, and
maintaining a reference to its returned result set, an application can more efficiently manage
which rows to retrieve from a result set at different times, without having to re-execute the
query with different LIMIT
and OFFSET
clauses.
Used within a programming Application Programming Interface (API), cursors are often used to allow multiple queries to be executed to a single database backend, which are then tracked and managed separately by the application through references to the cursor. This prevents having to store all of the results in memory within the application.
Cursors are often abstracted within a programming API (such as
libpq++’s PgCursor
class), though they can also be
directly created and manipulated through standard SQL commands. For the sake of generality,
this section uses psql to demonstrate the fundamental concepts of cursors
with SQL. The four SQL commands involved with PostgreSQL cursors are DECLARE, FETCH,
MOVE
and CLOSE
.
The DECLARE
command both defines and opens a cursor, in effect
defining the cursor in memory, and then populating the cursor with information about the
result set returned from the executed query. The FETCH
command lets you
pull rows from an open cursor. The MOVE
command moves the “current”
location of the cursor within the result set, and the CLOSE
command closes
the cursor.
If you are interested in learning how to use cursors within a particular API, consult that API’s documentation.
A cursor is both created and executed with the DECLARE
SQL command.
This process is also referred to as “opening” a cursor. A cursor may be declared only within
an existing transaction block, so you must execute a BEGIN
command prior
to declaring a cursor. Here is the syntax for DECLARE:
DECLARE cursorname [ BINARY ] [ INSENSITIVE ] [ SCROLL ] CURSOR FOR query [ FOR { READ ONLY | UPDATE [ OF column [, ...] ] } ]
DECLARE
cursorname
cursorname
is the name of the cursor to
create.
[ BINARY ]
The optional BINARY
keyword causes output to be retrieved in
binary format instead of standard ASCII; this can be more efficient, though it is only
relevant to custom applications, as clients such as psql are not
built to handle anything but text output.
[ INSENSITIVE ] [ SCROLL ]
The INSENSITIVE
and SCROLL
keywords exist for
compliance with the SQL standard, though they each define PostgreSQL’s default behavior
and are never necessary. The INSENSITIVE
SQL keyword exists to ensure
that all data retrieved from the cursor remains unchanged from other cursors or
connections. Since PostgreSQL requires that cursors be defined within transaction blocks,
this behavior is already implied. The SCROLL
SQL keyword exists to
specify that multiple rows can be selected at a time from the cursor. This is the default
in PostgreSQL, even if unspecified.
CURSOR FOR
query
query
is the complete query whose result set
will be accessible by the cursor, when executed.
[ FOR { READ ONLY | UPDATE [ OF
column
[, ...] ] } ]
As of PostgreSQL 7.1.x, cursors may only be defined as READ ONLY
,
and the FOR
clause is therefore superfluous.
Example 7-42 begins a transaction block with the
BEGIN
keyword, and opens a cursor named all_books
with
SELECT * FROM books
as its executed SQL statement.
Example 7-42. Declaring a cursor
booktown=# BEGIN; BEGIN booktown=# DECLARE all_books CURSOR booktown-# FOR SELECT * FROM books; SELECT
The SELECT
message returned from Example 7-42 indicates that the statement was executed
successfully. This means that the rows retrieved by the query are now accessible from the
all_books
cursor.
You may retrieve rows from a cursor with the FETCH
SQL command. Here
is the syntax for the FETCH
SQL command:
FETCH [ FORWARD | BACKWARD | RELATIVE ]
[ # | ALL | NEXT | PRIOR ]
{ IN | FROM } cursor
In this syntax diagram, cursor
is the name of
the cursor from which to retrieve row data. A cursor always points to a “current” position in
the executed statement’s result set, and rows can be retrieved either ahead or behind of the
current location. The FORWARD
and BACKWARD
keywords may
be used to specify the direction, though the default is forward. The RELATIVE
keyword is a noise term made available for SQL92 compliance.
The ABSOLUTE
keyword can be used, but absolute cursor positioning
and fetching are not supported as of PostgreSQL 7.1.x; the cursor will still use relative
positioning and provide a notice regarding the state of absolute positioning being
unsupported.
Following the direction you may optionally specify a quantity. This quantity may either
be a literal number of rows to be returned (in the form of an integer constant) or one of
several keywords. The ALL
keyword causes returns all rows from the current
cursor position. The NEXT
keyword (the default) returns the next single
row from the current cursor position. The PRIOR
keyword causes the single
row preceding the current cursor position to be returned.
There is no functional difference between the IN
and FROM
keywords, but one of these must be specified.
Example 7-43 fetches the first four rows stored in the
result set pointed to by the all_books
cursor. As a direction is not
specified, FORWARD
is implied. It then uses a FETCH
statement with the NEXT
keyword to select the fifth row, and then another
FETCH
statement with the PRIOR
keyword to again select
the fourth retrieved row.
Example 7-43. Fetching rows from a cursor
booktown=# FETCH 4 FROM all_books; id | title | author_id | subject_id ------+-----------------------+-----------+------------ 7808 | The Shining | 4156 | 9 4513 | Dune | 1866 | 15 4267 | 2001: A Space Odyssey | 2001 | 15 1608 | The Cat in the Hat | 1809 | 2 (4 rows) booktown=# FETCH NEXT FROM all_books; id | title | author_id | subject_id ------+-----------------------------+-----------+------------ 1590 | Bartholomew and the Oobleck | 1809 | 2 (1 row) booktown=# FETCH PRIOR FROM all_books; id | title | author_id | subject_id ------+--------------------+-----------+------------ 1608 | The Cat in the Hat | 1809 | 2 (1 row)
A cursor maintains a position in the result set of its referenced SELECT
statement. You can use the MOVE
command to move the
cursor to a specified row position in that result set. Here is the syntax for the MOVE
command:
MOVE [ FORWARD | BACKWARD | RELATIVE ]
[ # | ALL | NEXT | PRIOR ]
{ IN | FROM } cursor
As you can see, the syntax is very similar to FETCH
. However, the
MOVE
command does not retrieve any rows and only moves the current
position of the specified cursor
. The amount is
specified by either an integer constant, the ALL
keyword (to move as far
as can be moved in the specified direction), NEXT
, or PRIOR
. Example 7-44 moves the cursor forward 10 rows
from its current position in the result set.
Use the CLOSE
command to explicitly close an open cursor. A cursor is
also implicitly closed if the transaction block that it resides within is committed with the
COMMIT
command, or rolled back with the ROLLBACK
command.
Here is the syntax for CLOSE
, where cursorname
is the name of the cursor intended to be closed:
CLOSE cursorname
Example 7-45 closes the
all_books
cursor, freeing the associated memory, and rendering the
cursor’s results inaccessible.