In Chapter 3, operators are defined syntactically as tokens that are used to perform operations on values (e.g., constants, or identifiers), and return the results of those operations. In addition to these syntactic character symbols, there are some SQL keywords that are considered operators due to their effect on values in a SQL statement. Throughout this section, both these symbols and keywords will be referred to as operators.
The function of each operator is highly dependent on its context. Applications of operators range from performing mathematical operations and concatenating character strings, to performing a wide variety of comparisons yielding Boolean results. This section describes the general usage of operators in SQL, with successive sections on the following families of operators:
Character string
Numeric
Logical
Many operators, while invoked with the same keyword or character symbol, will have different effects depending on the data types to which they are applied. Further, operators will not always have a relevant use to every data type (see Chapter 3 for more information about what data types are available to PostgreSQL).
For example, you can use the addition operator (+
) to add two integer
values together, but you cannot use it to add an integer to a text type. This is an undefined
(and therefore ambiguous and disallowed) use of the operator. The operator character itself
(+
, in this case) will still be recognized, but you will receive an error
such as the one shown in Example 5-2 if you try to misuse an
operator:
Consider the Book Town authors
table, which correlates author’s names
with numeric identifiers.
Table "authors" Attribute | Type | Modifier ------------+---------+---------- id | integer | not null last_name | text | first_name | text | Index: authors_pkey
Two identifiers in this table are the columns id
, and last_name
, which are types integer
(a 4-byte integer) and
text
, respectively. Since the id
column is type
integer
, it may be used with a mathematical operator along with another
numeric value.
Example 5-1 demonstrates correct usage of the addition
(+
) operator.
Notice the result of trying to add incompatible types in Example 5-2.
Fortunately, as you can see in Example 5-2, PostgreSQL’s operator-misuse error messages supply a reason for failure, rather than blindly failing. These can be helpful in determining the next step in developing your statement, in order to make it a valid query.
Each of the basic character string comparison and concatenation operators supported by PostgreSQL are listed in Table 5-1.
Note that the LIKE
and ILIKE
keywords, which call
to the like( )
function, are sometimes referred to as string comparison
operators. These keywords are covered in the section titled Functions.
Each of the string comparison operators returns a Boolean result of either true or false. The alphabetical sorting referred to by Table 5-1 compares each sequential character in a string, determining if one character is considered ‘greater than’ or ‘less than’ the other. If the leading characters in two strings are at all identical, each character is checked from left to right until two different characters are found for comparison. In this sorting scheme, characters are determined to be higher than one another based on their ASCII value, as demonstrated in the following example:
booktown=# SELECT letter, ascii(letter) booktown-# FROM text_sorting booktown-# ORDER BY letter ASC; letter | ascii --------+------- 0 | 48 1 | 49 2 | 50 3 | 51 A | 65 B | 66 C | 67 D | 68 a | 97 b | 98 c | 99 d | 100 (12 rows)
If you are unsure of how a character will be sorted, you can use the ascii(
)
function to determine the ASCII value of the character. This function is
described further in the section titled Functions. Example 5-3 illustrates a comparative check on the books
table, and returns all titles whose first letter would be sorted before the
letter D.
As an example, it might be used in the WHERE
clause in order to
constrain rows by comparing against a character string. Example 5-4 demonstrates how to use this operator.
When a value is compared against a regular expression, the expression itself (or
regex) may match both literal character sequences, as well as several
variable character sequences. Both literal and variable sequences may be specified throughout
the expression. Example 5-5 illustrates an example of such a
sequence. It searches the Book Town authors
table for names beginning with
eitherA or T.
The ~
symbol is the regular expression operator, within the WHERE
clause, and the regular expression sequence itself in Example 5-5 is ^A|^T. The special characters
in this sequence are the caret (^
), and the pipe (|
),
while the literal characters are A and T. The
special characters used in regular expressions are explained in detail later in this
section.
The most important syntactic difference between the use of the like(
)
function and regular expression operators is that like( )
uses wild-card symbols (e.g., %
) at the beginning and end of its
expression in order to match a substring. In contrast, (with the beginning and end-line
symbols found in Table 5-3) regular expression operators
will implicitly look for the regular expression sequence anywhere in the compared
character string unless otherwise instructed.
Table 5-2 lists the regular expression operators. These operators compare a text value (either an identifier or a constant) to a regular expression. Each operator provides a Boolean result, depending on the nature of the operator.
The special characters available to a regular expression are listed in Table 5-3. These are the characters which may be used in a regular expression string to represent special meaning.
Note that in order to use a literal version of any of the characters in Table 5-3, they must be prefixed with two
backslashes (e.g., \\$
represents a literal dollar sign).
A common use of regular expressions is to search for a literal substring within a larger string. This can be achieved either with the ~ operator, if case is important, or with the ~* operator if the comparison should be case-insensitive. These operators are each demonstrated in Example 5-6.
As you can see in Example 5-6, two more rows are returned when using the ~* operator, as it matches not just “the” sequence, but modification of case on the same sequence (including the, tHe, ThE, and so on).
The same regular expression sequence can be modified to use the ^ symbol, to match only the character string The when it is at the beginning of the comparison string, as shown in Example 5-7. Additionally, the .* sequence is then appended, to indicate any number of characters may match until the next following grouped expression. In this case, the .* sequence is followed by a parenthetically grouped pair of strings (rabbit and heart), which are separated by the | symbol, indicating that either of the strings will be considered a match.
In Example 5-7, the results should fairly
clearly indicate the effect of the regular expression comparison. Translated into English,
the expression ^The.*(rabbit|heart) states that a match will be found
only if the compared string begins with the character sequence The and,
any amount of any characters thereafter, contain either the character sequence
rabbit, or heart. The use of the
~* operator (rather than just the ~
operator) makes
the comparison case-insensitive.
Example 5-8 executes an even more complicated regular expression comparison.
The regular expression used in Example 5-8 is a good example of how regular expressions can be intimidating! Breaking it down an element at a time, you can see that there are two parenthetically grouped expressions, separated by a | symbol. This means that if either of these expressions are found to match the title, the comparison will be considered a match.
Breaking it down further, you can see that the expression to the left of the | symbol consists of, from left to right: a caret (^) followed by the character t, a period (.) followed by an asterisk (*), and a pair of square brackets ([]) enclosing the characters r and i, followed by the character t. Translated into English, this sub-expression essentially says that in order to match, the compared string must begin with the letter t, and be followed by a sequence of zero or more characters until either the letter r, or i is found, which must be followed immediately by the letter t. If any of these conditions is not found, the comparison will not be considered a match.
PostgreSQL’s numeric operator support can be divided into three general groups:
Table 5-4 describes each of the mathematical operators available in PostgreSQL, along with example usage.
As
an example of mathematical operators in the target list, the statement in Example 5-9 takes the retail price for each book and divides
the cost with the /
operator in order to determine the profit margin. This
value is then typecast to a truncated numeric value with only two digits of precision.
Finally, the integer constant 1 is subtracted from the division result, to yield only the
percentage points over 100.
Table 5-5 shows the available comparison operators.
For an example of mathematical comparison operator usage, observe Example 5-10. The query involved uses the <=
operator first, to check if the retail
value is less-than or equal-to 25.
Subsequently, the !=
operator is employed with the AND
keyword to ensure that only books which are in stock (whose stock
value
are not equal to 0) are returned.
The
BETWEEN
keyword (sometimes called an operator) allows you to check a
value for existence within a range of values. For instance, Example 5-11
shows a SELECT
statement that looks for books with cost between 10 and 17
dollars.
You can achieve the same output using the less-than-or-equal-to operator (<=
) in conjunction with the greater-than-or-equal-to (>=
) operator. See Example 5-12.
Binary operators perform bitwise operations on the literal bits of a bit string or integer. These operators may affect integer values, or bit string values. Each of PostgreSQL’s binary operators are described in Table 5-6.
Example 5-13 demonstrates shifting a numeric value, and its equivalent
bit string, two bits to the right with the >>
operator. It also
demonstrates the use of the binary to integer bittoint4( )
function, which
is described later in the section titled Functions.
Table 5-7 illustrates the Boolean values
returned for the AND, OR
, and NOT
keywords, with each
possible value for a Boolean field (true, false, or NULL
).
Example 5-14 sequentially uses the OR
and AND
keywords in two queries to combine a pair of
conditions by which rows should be retrieved. In the first query, if a book has either a cost
of greater than thirty dollars, or is out of stock, its information will be returned. As you
can see from the result set, matching one or both of these conditions causes a row to be
returned.
The second query in Example 5-14 uses the
same conditions, but combines them with the AND
keyword. This results in a
stricter condition, as both criteria must be met. As such, only one row is returned, since
only one book is found which both has a cost of greater than thirty dollars, and is out of
stock.
Example 5-15 uses the IS NULL
keyword
to check for authors whose first_name
column value are set to NULL
.
Examining Example 5-15 and Example 5-16, you might think that the syntax in the two statements provided are identical. There is, however, a key difference.
Any as-yet undiscussed comparison operator used on a NULL
value will
return a NULL
value, as NULL
will never be larger,
smaller, or otherwise related to any non-NULL
value. (See Example 5-17.) A direct query on the result of a comparison
against a NULL
value will therefore return NULL
. You can
think of a NULL
value as being a sort of SQL black hole, from which no
comparison (outside of IS NULL
, and its special =
translation) may return true
, and to which no values may be added, or
concatenated.
When utilizing several operators in large expressions, it can be helpful to know in what order PostgreSQL processes operators. It is not, as you might think, strictly from left to right. If not properly understood, the order of execution can introduce potential for accidental side-effects, such as those shown in Example 5-18.
As you can see by the two column values returned in Example 5-18, the use of several operators without parentheses to
enforce precedence can return very different results, despite the same numbers being
manipulated in only a slightly different order. In this example, the multiplication is
actually executed first (regardless of the fact that the addition sign (+
)
precedes it sequentially, from left to right).
Table 5-8 lists, in order of PostgreSQL’s execution from the top down, the precedence of each group of operators.
The operator precedence listed in Table 5-8 applies
to user-defined operators that have the same character sequence as built-in operators. For
example, if you define the plus symbol (+
) operator for your own
user-defined data type, it has the same precedence as the built in plus (+
) operator, regardless of its function.