Automating Common Routines

As an object-relational DBMS, PostgreSQL has helped pioneer several non-standard SQL extensions. Several of these are designed to aid in the automation of commonly executed database routines.

This section covers two such extensions: sequences and triggers.

Sequences

A sequence in PostgreSQL is a database object that is essentially an automatically incrementing numeric value. For this reason, sequences are commonly known in other database products as auto-increment values. Sequences can be extremely useful in assigning non-random, unique identification numbers to tables that require such values. A sequence consists of a current numeric value, and a set of characteristics that determine how to automatically increment (or alternatively, decrement) that value upon use.

Along with its current value, a sequence also includes a minimum value, a maximum value, a starting value, and the amount to increment the sequence by. This increment is usually 1, but may be any whole integer.

In practice, sequences are not meant to be accessed directly. Instead, they are used through a set of functions built into PostgreSQL which either set, increment, or return the current value of the sequence.

Creating a sequence

Sequences are created with the CREATE SEQUENCE SQL command. The sequence can be specified to increment or decrement. The syntax for CREATE SEQUENCE is:

CREATE SEQUENCE sequencename
       [ INCREMENT increment ]
       [ MINVALUE minvalue ]
       [ MAXVALUE maxvalue ]
       [ START start ]
       [ CACHE cache ]
       [ CYCLE ]

In this syntax, sequencename is the name of the sequence to be created. This is the only required parameter. A sequence uses the integer data type, and it therefore shares its maximum and minimum limitations of 2147483647 and −2147483647, respectively.

The optional CREATE SEQUENCE clauses are as follows:

INCREMENT increment_val

Sets the numeric quantity with which to modify the sequence’s value to increment_val. This is used when the nextval( ) function is called on the sequence. Setting increment_val to a negative number results in a descending sequence. The default value is 1.

MINVALUE minvalue

Sets the fixed minimum value for the sequence to minvalue. Any attempt to lower a sequence below this value will result in an error, or in the value cycling to its maximum value (if the CYCLE keyword was used when the sequence was created).

The default value is 1 for ascending sequences, and –2147483647 for descending sequences.

MAXVALUE maxvalue

Sets the fixed maximum value for the sequence to maxvalue. Any attempt to raise a sequence above this value will result in an error, or in the value cycling to its minimum value.

The default value is 2147483647 for ascending sequences, and –1 for descending sequences.

START start_val

Sets the value that the sequence begins at. It may be any integer between the minimum and maximum values. The sequence defaults to start at its minimum value for ascending sequences, and its maximum value for descending sequences.

CACHE cache

Provides the ability for sequence values to be pre-allocated and stored in memory. This can result in faster access times to highly used sequences. The minimum and default value is 1; a higher value of cache results in more values being cached.

CYCLE

Enables the sequence to continue generating new values after it has reached its maximum or minimum value. When the limit is reached, the sequence starts over at the minimum value (for ascending sequences), or at the maximum value (descending sequences).

Example 7-28 creates a simple ascending sequence named shipments_ship_id_seq that starts at a value of 0, and will be incremented by the default increment of 1 until it reaches the default maximum limit of 2147483647. By not using the CYCLE keyword, the sequence is guaranteed to always return a unique value.

Example 7-28. Creating a sequence

booktown=# CREATE SEQUENCE shipments_ship_id_seq
booktown-#                 MINVALUE 0;
CREATE

Viewing a sequence

The output from the \d command within psql shows whether or not a database object is a sequence, table, view or index. More specifically, the \ds command can be used to view all sequences in the currently connected database. For example:

booktown=# \ds
             List of relations
          Name         |   Type   |  Owner
-----------------------+----------+---------
 book_ids              | sequence | manager
 shipments_ship_id_seq | sequence | manager
 subject_ids           | sequence | manager
(3 rows)

While not often necessary, sequences can be directly queried with SELECT statements, as if they were a table or view. When you query a sequence, you use the attributes of that sequence as columns in your select list. The attributes of a sequence are shown in Table 7-1.

Table 7-1. Sequence attributes

Attribute	Type
`sequence_name`	`name`
`last_value`	`integer`
`increment_by`	`integer`
`max_value`	`integer`
`min_value`	`integer`
`cache_value`	`integer`
`log_cnt`	`integer`
`is_cycled`	`"char"`
`is_called`	`"char"`

Example 7-29 illustrates a query to the shipments_ship_id_seq sequence. This query selects the last_value attribute, which is the most currently selected value from the sequence, and the increment_by attribute, which is the amount the sequence is to be incremented each time the nextval( ) function is called.

Example 7-29. Viewing a sequence

booktown=# SELECT last_value, increment_by
booktown-#        FROM shipments_ship_id_seq;
 last_value | increment_by
------------+--------------
          0 |            1
(1 row)

Since the sequence in question has just been created, its last_value is still set to 0.

Using a sequence

Sequences are typically not queried directly, but are instead used through functions. There are three functions in PostgreSQL which apply exclusively to sequences:

nextval('sequence_name'): Increments the value of the specified sequence named sequence_name, and returns the new value, which is of type integer.
currval('sequence_name'): Returns the most recently returned value from nextval('sequence_name'). This value is associated with a PostgreSQL session, and if the nextval( ) function has not yet been called in the connected session on the specified sequence sequence_name, there will be no “current” value returned.
setval('sequence_name', n): Sets the current value of the specified sequence to the numeric value n. The value returned by the next call to nextval( ) will return n + increment, where increment is the amount that the sequence increments by each iteration.
setval('sequence_name', n, b): Also sets the current value of the specified sequence to the numeric value n. However, if b (a value of type boolean) is false, the value returned by the next call to nextval( ) will be just n. If b is true, the next call to nextval( ) will return n + increment, as it would without specifying the Boolean argument.

The most commonly used sequence function is nextval( ). This is the function that actually pushes the increment of the value. It requires the name of the sequence as the argument (bound by single quotes), and returns a value of type integer.

Example 7-30 selects a couple of incremented values from the sequence named shipments_ship_id_seq.

Example 7-30. Incrementing a sequence

booktown=# SELECT nextval('shipments_ship_id_seq');
 nextval
---------
       1
(1 row)

booktown=# SELECT nextval('shipments_ship_id_seq');
 nextval
---------
       2
(1 row)

Note

The first call to nextval( ) will return the sequence’s initial value (set by the START keyword), since it has not yet been called to increment the starting value. All subsequent calls increment the last_value column.

Sequences are commonly used as default values for tables which require unique integer identifiers. The shipments table within the booktown database, shown in Table 7-2, exemplifies this.

Table 7-2. The shipments table

Column	Type	Modifier
`id`	`integer`	`NOT NULL DEFAULT` `nextval('shipments_ship_id_seq')`
`customer_id`	`integer`
`isbn`	`text`
`ship_date`	`timestamp with time zone`

The syntax to create the table in Table 7-2, with the auto-incrementing DEFAULT and PRIMARY KEY constraint, is:

CREATE TABLE shipments
       (id integer DEFAULT nextval('shipments_ship_id_seq') PRIMARY KEY,
        customer_id integer, isbn text, ship_date timestamp)

The default value for the id column in Table 7-2 is set to the nextval( )’s result on the shipments_ship_id_seq sequence. Insertion of row data that does not specify a value for id will therefore choose its value from the result of this function call.

Note

Merely placing a DEFAULT constraint on the id column does not enforce the use of that default. A user could still manually insert a value, potentially causing a conflict with future sequence values. This can be disallowed with the use of a trigger. See the section titled Triggers later in this chapter for more information.

After the nextval( ) function has been called on a sequence in a given session (a connection to PostgreSQL), the currval( ) function may be used on that same sequence to return the most recently returned value from the sequence. Note that this function may only be called on a sequence that has been called through nextval( ) in the active session.

Note

Sequences’ “current” values are associated with sessions in order to prevent multiple users from running into mistakes by accessing the same sequence at the same time. Two users may access the same sequence from separate sessions, but the currval( ) function will return only the most recently incremented value of the sequence from within the same session that calls currval( ).

Example 7-31 inserts a new row into the shipments column, without specifying the value for the id column. This causes the default value to be used, which (as noted in Table 7-2) is the result of the shipments_ship_id_seq being incremented by the nextval( ) function. The currval( ) function is then used to access the row that was just inserted.

Example 7-31. Using currval( )

booktown=# INSERT INTO shipments (customer_id, isbn, ship_date)
booktown-#             VALUES (221, '0394800753', 'now');
INSERT 3628625 1
booktown=# SELECT * FROM shipments
booktown-#          WHERE id = currval('shipments_ship_id_seq');
  id  | customer_id |    isbn    |       ship_date
------+-------------+------------+------------------------
 1002 |         107 | 0394800753 | 2001-09-22 11:23:28-07
(1 row)

Finally, a sequence may also have its last_value attribute reset to an arbitrary numeric value (within its maximum and minimum value range) by using the setval( ) function. This requires the name of the sequence as a single-quote bound character string for the first argument and an integer constant representing the new value for last_value for the second argument.

There are two ways to go about this. By default, setval( ) assumes that the new setting is for an initialized sequence; this means that the next value returned by nextval( ) will actually be incremented once past the value set by setval( ).

Alternatively, an optional false value of type boolean may be added as the last argument to setval( ), de-initializing the sequence. This modifies the sequence so that the next value returned by nextval( ) will be the same numeric value passed to setval( ) (though the sequence will of course be incremented on the next call to nextval( )).

Example 7-32 sets the shipments_ship_id_seq’s last_value to 1010 through each method, and selects the nextval( ) on the same sequence to illustrate the effective result.

Example 7-32. Setting a sequence value

booktown=# SELECT setval('shipments_ship_id_seq', 1010);
 setval
--------
   1010
(1 row)

booktown=# SELECT nextval('shipments_ship_id_seq');
 nextval
---------
    1011
(1 row)

booktown=# SELECT setval('shipments_ship_id_seq', 1010, false);
 setval
--------
   1010
(1 row)

booktown=# SELECT nextval('shipments_ship_id_seq');
 nextval
---------
    1010
(1 row)

Note

Sequences are commonly used to ensure unique values in a column. Be sure that you understand the application of a sequence before you reset its last_value attribute.

Destroying a sequence

To destroy a sequence, or several sequences simultaneously, use the DROP SEQUENCE SQL command. Here is the syntax for DROP SEQUENCE:

DROP SEQUENCE sequencename [, ...]

In this syntax, sequencename is the name of the sequence that you wish to remove. Multiple sequence names may be specified, separated by commas.

Example 7-33 removes the shipments_ship_id_seq sequence.

Example 7-33. Removing a sequence

booktown=# DROP SEQUENCE shipments_ship_id_seq;
DROP

Before destroying a sequence, make sure that the sequence is not used by another table, function, or any other object in the database. If this check is not performed, then other operations that rely on the sequence will fail. The following query will return the name of any relation which relies on a default sequence value, where sequence_name is the name of the sequence you are interesting in finding dependencies for:

SELECT p.relname, a.adsrc FROM pg_class p
       JOIN pg_attrdef a ON (p.relfilenode = a.adrelid)
       WHERE a.adsrc ~ '"sequence_name"';

Example 7-34 uses this query to look up the name of any table with a default value involving the shipments_ship_id_seq sequence.

Example 7-34. Checking sequence dependencies

booktown=# SELECT p.relname, a.adsrc FROM pg_class p JOIN pg_attrdef a
booktown-#                           ON (p.relfilenode = a.adrelid)
booktown-#        WHERE a.adsrc ~ '"shipments_ship_id_seq"';
  relname  |                  adsrc
-----------+------------------------------------------
 shipments | nextval('"shipments_ship_id_seq"'::text)
(1 row)

Triggers

Often, anticipated SQL events should precede or follow a particular action. This action might be a consistency check on a set of values to be inserted, the formatting of supplied data before it is inserted, or a modification to a separate table following the removal or modification of a set of rows. Traditionally, such actions are handled at the programmatic level within an application connected to the database, rather than by the database software itself.

To ease the responsibility of the application’s database interaction, PostgreSQL supports a non-standard programmatic extension known as a trigger. A trigger defines a function which occurs before, or after, another action on a table. A trigger is implemented through C, Pl/pgSQL or any other functional language (with the exception of SQL) that PostgreSQL can use to define a function (see the section titled Extending PostgreSQL later in this chapter for more on creating functions, or Chapter 11 for more on PL/pgSQL).

Note

As triggers are a PostgreSQL-specific extension, be sure not to implement a trigger-based solution when a high degree of portability to other RDBMS systems is important.

Triggers may affect any of the following SQL events on a table:

INSERT
UPDATE
DELETE

Creating a trigger

In order to create a trigger, a function must first exist for it to execute. PostgreSQL supports many types of functions, including those defined by SQL, PL/pgSQL, and C. As of PostgreSQL 7.1.x, a trigger may use a function defined in any language, with the exception that the function cannot be defined as a purely SQL function.

Once a function is defined, a trigger may be defined to call that function either before or after an event on a specified table. Here is the syntax to create a trigger, followed by a description of its syntax:

CREATE TRIGGER name { BEFORE | AFTER } { event [ OR event ... ] }
               ON tablename
               FOR EACH { ROW | STATEMENT }
               EXECUTE PROCEDURE functionname ( arguments )

CREATE TRIGGER name: name is any arbitrary name for the new trigger. A trigger may have the same name as an existing trigger in a database provided that it is defined to operate on a different table. Also, like most other non-system database objects, triggers must only have a unique name (and table to operate on) within the database they are created in.
{ BEFORE | AFTER }: The BEFORE keyword instructs the defined function to be executed before the event is attempted, which also precedes any built-in constraint checking on the values involved in the case of an INSERT or DELETE event. Alternatively, the AFTER keyword causes the function to be called only after the attempted action has finished.
{ event [ OR event ... ] }: event is any one of the supported SQL events; multiple events may be listed, separated by the OR keyword.
ON tablename: tablename is the name of the table which, when modified by event, initiates this trigger.
FOR EACH { ROW | STATEMENT }: The keyword following the FOR EACH clause determines how many times the function should be called when the defined event is triggered. Use the ROW keyword to specify that the function is to be executed once for each affected row. Conversely, if the function should be executed only once for the calling statement, the STATEMENT keyword is used.
EXECUTE PROCEDURE functionname ( arguments ): functionname is the name of the existing function to be executed, with passed arguments.

Note

Only the database object’s owner, or a super user, can create a trigger on a database object.

While PostgreSQL tables support constraints to perform simple checks against static criteria, sometimes more involved procedures may be needed to validate input values. This is a typical example of where a trigger might be useful.

A trigger may be used to validate input values by preparing a validation function to be executed before values are inserted into a table, or before values in a table are updated. The function can then be made responsible for verifying that the values meet a complex set of restrictions, and even return an appropriate error through PostgreSQL’s error logging system.

Suppose that you have written a function in a procedural language that validates attempted INSERT or UPDATE values on the shipments table, and that then performs an update on the stock table to decrement the inventory for the shipment. This function could be written in any language that PostgreSQL supports (with the noted exception of pure SQL).

Specifically, suppose that this function verifies that both the provided customer_id and isbn exist in their respective customers and editions tables. If at least one is missing, a meaningful error is returned. Otherwise, the SQL statement is allowed to execute, and on a successful INSERT statement, the stock table is automatically decremented to reflect the drop in stock from the shipment.

Example 7-35 creates a trigger to be “fired” immediately before an INSERT or UPDATE statement is processed on the shipments table. The trigger invokes the check_shipment_addition( ) function once per each modified row.

Example 7-35. Creating the check_shipment trigger

booktown=# CREATE TRIGGER check_shipment
booktown-#                BEFORE INSERT OR UPDATE
booktown-#                ON shipments FOR EACH ROW
booktown-#                EXECUTE PROCEDURE check_shipment_addition( );
CREATE

Since the check_shipment trigger is configured to execute the check_shipment_addition( ) function for both INSERT and UPDATE statements, the integrity of the customer_id and isbn columns are fairly robustly maintained. Its use of the ROW keyword ensures that each added or modified row will be processed by the check_shipment_addition( ) validation function.

No arguments are passed to the check_shipment_addition( ) function, as it uses internal PL/pgSQL variables to check incoming rows. See Example 7-53, in Chapter 11, for the implementation of the check_shipment_addition( ) function, written in PL/pgSQL.

Viewing a trigger

Triggers are stored in the pg_trigger PostgreSQL system table, and can have their characteristics queried after creation. The structure of the pg_trigger table is shown in Table 7-3.

Table 7-3. The pg_trigger table

Column	Type
`tgrelid`	`oid`
`tgname`	`name`
`tgfoid`	`oid`
`tgtype`	`smallint`
`tgenabled`	`boolean`
`tgisconstraint`	`boolean`
`tgconstrname`	`name`
`tgconstrrelid`	`oid`
`tgdeferrable`	`boolean`
`tginitdeferred`	`boolean`
`tgnargs`	`smallint`
`tgattr`	`int2vector`
`tgargs`	`bytea`

Most of the columns in the Table 7-3 column are unlikely to be useful in a direct query. The most immediately relevant attributes of the pg_trigger system table are tgrelid and tgname.

The tgrelid value is the trigger’s relation identifier number. This value is of type oid, and corresponds to the relfilenode column in the pg_class system table. The tgname is the identifier which represents the name of the trigger, as specified in the CREATE TRIGGER command when the trigger was created.

Removing a trigger

The DROP TRIGGER command removes a trigger permanently from the database. Similar to the CREATE TRIGGER command, using this command requires you to be either the owner of the trigger, or a superuser.

Here is the syntax to remove an existing trigger:

DROP TRIGGER name ON table

Example 7-36 drops the check_shipment trigger placed on the shipments table.

Example 7-36. Dropping a trigger

booktown=# DROP TRIGGER check_shipment ON shipments;
DROP

The DROP statement indicates that the trigger was successfully dropped. Notice that you must specify not only the name of the trigger that you wish to remove, but also the table on which it is placed.

If you are unsure which table a particular trigger is placed on, you can derive this information from PostgreSQL’s system tables. For example, you can perform a join between the pg_trigger system table’s tgrelid column and the pg_class system table’s relfilenode column, comparing the name of the trigger against the tgname column. Example 7-37 demonstrates such a query to check the assigned relation (relname) associated with the trigger named check_shipment.

Example 7-37. Selecting a trigger’s assigned table

booktown=# SELECT relname FROM pg_class
booktown-#                INNER JOIN pg_trigger
booktown-#                ON (tgrelid = relfilenode)
booktown-#                WHERE tgname = 'check_shipment';
  relname
-----------
 shipments
(1 row)

Note

If you drop a function that a trigger is defined to use, the trigger will fail, and redefining the function with the same name will not correct the problem. Such a trigger must be recreated after its function is recreated.