Operators

In Chapter 3, operators are defined syntactically as tokens that are used to perform operations on values (e.g., constants, or identifiers), and return the results of those operations. In addition to these syntactic character symbols, there are some SQL keywords that are considered operators due to their effect on values in a SQL statement. Throughout this section, both these symbols and keywords will be referred to as operators.

The function of each operator is highly dependent on its context. Applications of operators range from performing mathematical operations and concatenating character strings, to performing a wide variety of comparisons yielding Boolean results. This section describes the general usage of operators in SQL, with successive sections on the following families of operators:

Note

Operators

For an up-to-date and complete list of PostgreSQL supported operators, you can use psql’s \do slash command to view a list of available operators. Understand that many of the listed operators are PostgreSQL-specific, and therefore may not exist in other SQL-capable databases implementations.

Following the discussions of the various types of operators, you’ll find information on dealing with NULL values in expressions, and on the order in which operators are evaluated.

Operators operate on either a single value or a pair of values. The majority of operators operate on two values, with the operator placed between the values it is to operate upon (e.g., a - b). Operators that affect only one value are called unary operators, and either precede or follow the value they affect (e.g., the @ operator preceding a value is a unary operator indicating the absolute value).

Many operators, while invoked with the same keyword or character symbol, will have different effects depending on the data types to which they are applied. Further, operators will not always have a relevant use to every data type (see Chapter 3 for more information about what data types are available to PostgreSQL).

For example, you can use the addition operator (+) to add two integer values together, but you cannot use it to add an integer to a text type. This is an undefined (and therefore ambiguous and disallowed) use of the operator. The operator character itself (+, in this case) will still be recognized, but you will receive an error such as the one shown in Example 5-2 if you try to misuse an operator:

Consider the Book Town authors table, which correlates author’s names with numeric identifiers.

         Table "authors"
 Attribute  |  Type   | Modifier
------------+---------+----------
 id         | integer | not null
 last_name  | text    |
 first_name | text    |
Index: authors_pkey

Two identifiers in this table are the columns id, and last_name, which are types integer (a 4-byte integer) and text, respectively. Since the id column is type integer, it may be used with a mathematical operator along with another numeric value.

Example 5-1 demonstrates correct usage of the addition (+) operator.

Notice the result of trying to add incompatible types in Example 5-2.

Fortunately, as you can see in Example 5-2, PostgreSQL’s operator-misuse error messages supply a reason for failure, rather than blindly failing. These can be helpful in determining the next step in developing your statement, in order to make it a valid query.

PostgreSQL contains a comprehensive set of character string operators, from simple text concatenation and string comparison, to a strong set of regular expression matching. Character string operators are valid upon values of types char, varchar, and PostgreSQL’s own text type.

The following sections describe the basic comparison and concatenation operators, as well as the implementation of case-sensitive and case-insensitive regular expression operators.

Each of the basic character string comparison and concatenation operators supported by PostgreSQL are listed in Table 5-1.

Each of the string comparison operators returns a Boolean result of either true or false. The alphabetical sorting referred to by Table 5-1 compares each sequential character in a string, determining if one character is considered ‘greater than’ or ‘less than’ the other. If the leading characters in two strings are at all identical, each character is checked from left to right until two different characters are found for comparison. In this sorting scheme, characters are determined to be higher than one another based on their ASCII value, as demonstrated in the following example:

booktown=# SELECT letter, ascii(letter)
booktown-#        FROM text_sorting
booktown-#        ORDER BY letter ASC;
 letter | ascii
--------+-------
 0      |    48
 1      |    49
 2      |    50
 3      |    51
 A      |    65
 B      |    66
 C      |    67
 D      |    68
 a      |    97
 b      |    98
 c      |    99
 d      |   100
(12 rows)

If you are unsure of how a character will be sorted, you can use the ascii( ) function to determine the ASCII value of the character. This function is described further in the section titled Functions. Example 5-3 illustrates a comparative check on the books table, and returns all titles whose first letter would be sorted before the letter D.

For times when normal equivalence comparisons are inadequate, PostgreSQL has several operators designed to perform pattern matching against regular expressions. A regular expression is similar to any other string to be matched against, with the exception that some characters (such as the square braces, pipe, and backslash) have special meaning in a comparison. If you have used Unix programs such as sed, grep, or perl, you may already be familiar with this kind of syntax.

When a value is compared against a regular expression, the expression itself (or regex) may match both literal character sequences, as well as several variable character sequences. Both literal and variable sequences may be specified throughout the expression. Example 5-5 illustrates an example of such a sequence. It searches the Book Town authors table for names beginning with eitherA or T.

The ~ symbol is the regular expression operator, within the WHERE clause, and the regular expression sequence itself in Example 5-5 is ^A|^T. The special characters in this sequence are the caret (^), and the pipe (|), while the literal characters are A and T. The special characters used in regular expressions are explained in detail later in this section.

The most important syntactic difference between the use of the like( ) function and regular expression operators is that like( ) uses wild-card symbols (e.g., %) at the beginning and end of its expression in order to match a substring. In contrast, (with the beginning and end-line symbols found in Table 5-3) regular expression operators will implicitly look for the regular expression sequence anywhere in the compared character string unless otherwise instructed.

Table 5-2 lists the regular expression operators. These operators compare a text value (either an identifier or a constant) to a regular expression. Each operator provides a Boolean result, depending on the nature of the operator.

The special characters available to a regular expression are listed in Table 5-3. These are the characters which may be used in a regular expression string to represent special meaning.

Note

Regular expression symbols

Note that in order to use a literal version of any of the characters in Table 5-3, they must be prefixed with two backslashes (e.g., \\$ represents a literal dollar sign).

A common use of regular expressions is to search for a literal substring within a larger string. This can be achieved either with the ~ operator, if case is important, or with the ~* operator if the comparison should be case-insensitive. These operators are each demonstrated in Example 5-6.

As you can see in Example 5-6, two more rows are returned when using the ~* operator, as it matches not just “the” sequence, but modification of case on the same sequence (including the, tHe, ThE, and so on).

The same regular expression sequence can be modified to use the ^ symbol, to match only the character string The when it is at the beginning of the comparison string, as shown in Example 5-7. Additionally, the .* sequence is then appended, to indicate any number of characters may match until the next following grouped expression. In this case, the .* sequence is followed by a parenthetically grouped pair of strings (rabbit and heart), which are separated by the | symbol, indicating that either of the strings will be considered a match.

In Example 5-7, the results should fairly clearly indicate the effect of the regular expression comparison. Translated into English, the expression ^The.*(rabbit|heart) states that a match will be found only if the compared string begins with the character sequence The and, any amount of any characters thereafter, contain either the character sequence rabbit, or heart. The use of the ~* operator (rather than just the ~ operator) makes the comparison case-insensitive.

Example 5-8 executes an even more complicated regular expression comparison.

The regular expression used in Example 5-8 is a good example of how regular expressions can be intimidating! Breaking it down an element at a time, you can see that there are two parenthetically grouped expressions, separated by a | symbol. This means that if either of these expressions are found to match the title, the comparison will be considered a match.

Breaking it down further, you can see that the expression to the left of the | symbol consists of, from left to right: a caret (^) followed by the character t, a period (.) followed by an asterisk (*), and a pair of square brackets ([]) enclosing the characters r and i, followed by the character t. Translated into English, this sub-expression essentially says that in order to match, the compared string must begin with the letter t, and be followed by a sequence of zero or more characters until either the letter r, or i is found, which must be followed immediately by the letter t. If any of these conditions is not found, the comparison will not be considered a match.

The expression to the right of the | symbol is a bit simpler, consisting of two character string sequences (ing and une), each followed by the $ character, and separated by another | symbol. This sub-expression, translated into English, describes a match as a relationship in which either ends with the value ing, or une. If either of these are found, the expression is considered a match, because of the | symbol.

PostgreSQL’s numeric operator support can be divided into three general groups:

Mathematical operators can be used in the target list, in the WHERE clause of a SELECT statement, or anywhere else a numeric result may be appropriate. This sometimes will include the ORDER BY clause, a JOIN qualifier, or a GROUP BY clause.

Table 5-4 describes each of the mathematical operators available in PostgreSQL, along with example usage.

As an example of mathematical operators in the target list, the statement in Example 5-9 takes the retail price for each book and divides the cost with the / operator in order to determine the profit margin. This value is then typecast to a truncated numeric value with only two digits of precision. Finally, the integer constant 1 is subtracted from the division result, to yield only the percentage points over 100.

Notice that the column name is temporarily aliased to margin by using the AS keyword. Remember that the column name created by the AS keyword is a temporary name, and used only for the duration of the query.

Comparison operators are used to compare values of types such as integer or text to one another, but they will always return a value of type boolean. These operators are most commonly used in the WHERE clause, but may be used anywhere in a SQL statement where a value of type boolean would be valid.

Table 5-5 shows the available comparison operators.

For an example of mathematical comparison operator usage, observe Example 5-10. The query involved uses the <= operator first, to check if the retail value is less-than or equal-to 25. Subsequently, the != operator is employed with the AND keyword to ensure that only books which are in stock (whose stock value are not equal to 0) are returned.

Binary operators perform bitwise operations on the literal bits of a bit string or integer. These operators may affect integer values, or bit string values. Each of PostgreSQL’s binary operators are described in Table 5-6.

Example 5-13 demonstrates shifting a numeric value, and its equivalent bit string, two bits to the right with the >> operator. It also demonstrates the use of the binary to integer bittoint4( ) function, which is described later in the section titled Functions.

The AND, OR, and NOT keywords are PostgreSQL’s Boolean operators. They are commonly used to join or invert conditions in a SQL statement, particularly in the WHERE clause and the HAVING clause.

Table 5-7 illustrates the Boolean values returned for the AND, OR, and NOT keywords, with each possible value for a Boolean field (true, false, or NULL).

Example 5-14 sequentially uses the OR and AND keywords in two queries to combine a pair of conditions by which rows should be retrieved. In the first query, if a book has either a cost of greater than thirty dollars, or is out of stock, its information will be returned. As you can see from the result set, matching one or both of these conditions causes a row to be returned.

The second query in Example 5-14 uses the same conditions, but combines them with the AND keyword. This results in a stricter condition, as both criteria must be met. As such, only one row is returned, since only one book is found which both has a cost of greater than thirty dollars, and is out of stock.

If a table has NULL values in it, a special pair of comparison operators that can be used to include or omit NULL valued fields. You can check for fields set to NULL using the IS NULL keyword phrase. In order to check for a non-NULL value, use the IS NOT NULL keyword phrase.

Example 5-15 uses the IS NULL keyword to check for authors whose first_name column value are set to NULL.

Examining Example 5-15 and Example 5-16, you might think that the syntax in the two statements provided are identical. There is, however, a key difference.

PostgreSQL provides a translation from = NULL to IS NULL, and likewise for the != NULL operation with IS NOT NULL. This is provided only for compatibility with existing client applications (such as Microsoft Access).

When comparing a value to NULL in an expression, be in the habit of using the IS NULL and IS NOT NULL keyword operators rather than the = or != math-style operators. While this translation is provided for the sake of compatibility with other systems, it may be discontinued in the future, as it is not a standard SQL procedure (and it is certainly not guaranteed to be a portable procedure to other SQL-based database systems for the same reason).

Any as-yet undiscussed comparison operator used on a NULL value will return a NULL value, as NULL will never be larger, smaller, or otherwise related to any non-NULL value. (See Example 5-17.) A direct query on the result of a comparison against a NULL value will therefore return NULL. You can think of a NULL value as being a sort of SQL black hole, from which no comparison (outside of IS NULL, and its special = translation) may return true, and to which no values may be added, or concatenated.

When utilizing several operators in large expressions, it can be helpful to know in what order PostgreSQL processes operators. It is not, as you might think, strictly from left to right. If not properly understood, the order of execution can introduce potential for accidental side-effects, such as those shown in Example 5-18.

As you can see by the two column values returned in Example 5-18, the use of several operators without parentheses to enforce precedence can return very different results, despite the same numbers being manipulated in only a slightly different order. In this example, the multiplication is actually executed first (regardless of the fact that the addition sign (+) precedes it sequentially, from left to right).

Table 5-8 lists, in order of PostgreSQL’s execution from the top down, the precedence of each group of operators.

Note

Operator precedence

The operator precedence listed in Table 5-8 applies to user-defined operators that have the same character sequence as built-in operators. For example, if you define the plus symbol (+) operator for your own user-defined data type, it has the same precedence as the built in plus (+) operator, regardless of its function.