5
Getting More Than Simple Columns

“Facts are stubborn things.”

—TOBIAS SMOLLETT GIL BLAS DE SANTILLANE

Topics Covered in This Chapter

What Is an Expression?

What Type of Data Are You Trying to Express?

Changing Data Types: The CAST Function

Specifying Explicit Values

Types of Expressions

Using Expressions in a SELECT Clause

That “Nothing” Value: Null

Sample Statements

Summary

Problems for You to Solve

In Chapter 4, “Creating a Simple Query,” you learned how to use a SELECT statement to retrieve information from one or more columns in a table. This technique is useful if you’re posing only simple requests to the database for some basic facts. However, you’ll need to expand your SQL vocabulary when you begin working with complex requests. In this chapter, I’ll introduce you to the concept of an expression as a way to manipulate the data in your tables to calculate or generate new columns of information. Next, I’ll discuss how the type of data stored in a column can have an important impact on your queries and the expressions you create. I’ll take a brief detour to the CAST function, which you can use to actually change the type of data you include in your expressions. You’ll learn to create a constant (or literal) value that you can use in creative ways in your queries. You’ll learn to use literals and values from columns in your table to create expressions. You’ll learn how to adjust the scope of information you retrieve with a SELECT statement by using expressions to manipulate the data from which the information is drawn. Finally, you’ll explore the special Null value and learn how it can impact how you work with expressions that use columns from your tables.

What Is an Expression?

To get more than simple columns, you need to create an expression. An expression is some form of operation involving numbers, character strings, or dates and times. It can use values drawn from specific columns in a table, constant (literal) values, or a combination of both. I’ll show you how to generate literal values later in this chapter. After your database completes the operation defined by the expression, the expression returns a value to the SQL statement for further processing. You can use expressions to broaden or narrow the scope of the information you retrieve from the database. Expressions are especially useful when you are asking “what if” questions. Here’s a sample of the types of requests you can answer using expressions:

“What is the total amount for each line item?”

“Give me a mailing list of employees, last name first.”

“Show me the start time, end time, and duration for each class.”

“Show the difference between the handicap score and the raw score for each bowler.”

“What is the estimated per-hour rate for each engagement?”

“What if we raised the prices of our products by 5 percent?”

As you’ll learn as you work through this chapter, expressions are a very valuable technique to add to your knowledge of SQL. You can use expressions to “slice and dice” the plain-vanilla data in your columns to create more meaningful results in your queries. You’ll also find that expressions are very useful when you move on to Chapter 6, “Filtering Your Data,” and beyond. You’ll use expressions to filter your data or to link data from related tables.

What Type of Data Are You Trying to Express?

The type of data used in an expression impacts the value the expression returns, so let’s first look at some of the data types the SQL Standard provides. Every column in the database has an assigned data type that determines the kind of values the column can store. The data type also determines the operations that can be performed on the column’s values. You need to understand the basic data types before you can begin to create literal values or combine columns and literals in an expression that is meaningful and that returns a proper value.

The SQL Standard defines seven general categories of types of data—character, national character, binary, numeric, Boolean, datetime, and interval. In turn, each category contains one or more uniquely named data types. Here’s a brief look at each of these categories and their data types. (In the following list, I’ve broken the numeric category into two subcategories: exact numeric and approximate numeric.)

CHARACTER

The character data type stores a fixed- or varying- length character string of one or more printable characters. The characters it accepts are usually based upon the American Standard Code for Information Interchange (ASCII) or the Extended Binary Coded Decimal Interchange Code (EBCDIC) character sets. A fixed-length character data type is known as CHARACTER or CHAR, and a varying-length character data type is known as CHARACTER VARYING, CHAR VARYING, or VARCHAR. You can define the length of data that you want to store in a character data type, but the maximum length that you can specify is defined by your database system. (This rule applies to the national character data types as well.) When the length of a character string exceeds a system-defined maximum (usually 255 or 1,024 characters), you must use a CHARACTER LARGE OBJECT, CHAR LARGE OBJECT, or CLOB data type. In many systems, the alias for CLOB is TEXT or MEMO.

NATIONAL CHARACTER

The national character data type is the same as the character data type except that it draws its characters from ISO-defined foreign language character sets. NATIONAL CHARACTER, NATIONAL CHAR, and NCHAR are names used to refer to a fixed-length national character, and NATIONAL CHARACTER VARYING, NATIONAL CHAR VARYING, and NCHAR VARYING are names used to refer to a varying-length national character. When the length of a character string exceeds a system-defined maximum (usually 255 or 1,024 characters), you must use a NATIONAL CHARACTER LARGE OBJECT, NCHAR LARGE OBJECT, or NCLOB data type. In many systems, the alias for NCLOB is NTEXT.

BINARY

Use the BINARY LARGE OBJECT (or BLOB) data type to store binary data such as images, sounds, videos, or complex embedded documents such as word processing files or spreadsheets. In many systems, the names used for this data type include BINARY, BIT, and BIT VARYING.

EXACT NUMERIC

This data type stores whole numbers and numbers with decimal places. The precision (the number of significant digits) and the scale (the number of digits to the right of the decimal place) of an exact numeric can be user-defined and can only be equal to or less than the maximum limits allowed by the database system. NUMERIC, DECIMAL, DEC, SMALLINT, INTEGER, INT, and BIGINT are all names used to refer to this data type. One point you must remember is that the SQL Standard—as well as most database systems—defines a BIGINT as having a greater range of values than INTEGER, and INTEGER as having a greater range of values than a SMALLINT. Check your database system’s documentation for the applicable ranges. Some systems also support a TINYINT data type that can hold a smaller range of values than SMALLINT.

APPROXIMATE NUMERIC

This data type stores numbers with decimal places and exponential numbers. Names used to refer to this data type include FLOAT, REAL, and DOUBLE PRECISION. The approximate numeric data types don’t have a precision and scale per se, but the SQL Standard does allow a user-defined precision only for a FLOAT data type. Any scale associated with these data types is always defined by the database system. Note that the SQL Standard and most database systems define the range of values for a DOUBLE PRECISION data type to be greater than those of a REAL or FLOAT data type. Check your documentation for these ranges as well.

BOOLEAN

This data type stores true and false values, usually in a single binary bit. Some systems use BIT, INT, or TINYINT to store this data type.

DATETIME

Dates, times, and combinations of both are stored in this data type. The SQL Standard defines the date format as year-month-day and specifies time values as being based on a 24-hour clock. Although most database systems allow you to use the more common month/day/year or day/month/year date format and time values based on an A.M./P.M. clock, I use the date and time formats specified by the SQL Standard throughout the book. The three names used to refer to this data type are DATE, TIME, and TIMESTAMP. You can use the TIMESTAMP data type to store a combination of a date and time. Note that the names and usages for these data types vary depending on the database system you are using. Some systems store both date and time in the DATE data type, while others use TIMESTAMP or a data type called DATETIME. Consult your system documentation for details.

INTERVAL

This data type stores the quantity of time between two datetime values, expressed as either year, month; year/month; day, time; or day/time. Not all major database systems support the INTERVAL data type, so consult your system documentation for details.

Many database systems provide additional data types known as extended data types beyond those specified by the SQL Standard. (I listed a few of them in the previous list of data type categories.) Examples of extended data types include MONEY/CURRENCY and SERIAL/ROWID/AUTOINCREMENT/IDENTITY (for unique row identifiers).

Because our primary focus is on the data manipulation portion of SQL, you need be concerned only with the appropriate range of values for each data type your database system supports. This knowledge will help ensure that the expressions you define will execute properly, so be sure to familiarize yourself with the data types provided by your RDBMS program.

Changing Data Types: The CAST Function

You must be careful when you create an expression to make sure that the data types of the columns and literals are compatible with the operation you are requesting. For example, it doesn’t make sense to try to add character data to a number. But if the character column or literal contains a number, you can use the CAST function to convert the value before trying to add another number. Figure 5-1 shows you the CAST function, which is supported in nearly all commercial database systems.

Image

Figure 5-1The syntax diagram for the CAST function

The CAST function converts a literal value or the value of a column into a specific data type. This helps to ensure that the data types of the values in the expression are compatible. By compatible I mean that all columns or literals in an expression are either characters, numbers, or datetime values. (As with any rule, there are exceptions that I’ll mention later.) All the values you use in an expression must generally be compatible in order for the operation defined within the expression to work properly. Otherwise, your database system might raise an error message.

Note: Although most commercial database systems support the CAST function, some do not. Those systems that do not support CAST do have available a set of custom functions to achieve the same result. Consult your system documentation for details.

Converting a value in a column or a literal from one data type to another is a relatively intuitive and straightforward task. However, you’ll have to keep the following restrictions in mind when you convert a value from its original data type to a different data type:

• Let’s call this the “don’t put a ten-pound sack in a five-pound box” rule. As mentioned earlier, you can define the maximum length of the data you want to store in a character data type. If you try to convert from one type of character field (for example, VARCHAR) to another character type (such as CHARACTER) and the data stored in the original column or literal is larger than the maximum length specified in the receiving data type, your database system will truncate the original character string. Your database system should also give you a warning that the truncation is about to occur.

• Let’s call this the “don’t put a square peg in a round hole” rule. You can convert a character column or literal to any other data type, but the character data in the source column or literal must be convertible to the target data type. For example, you can convert a five-character ZIP Code to a number, but you will encounter an error if your ZIP Code column contains Canadian or European postal codes that have letters. Note that the database system ignores any leading and/or trailing spaces when it converts a character column value to a numeric or datetime value. Also, most commercial systems support a wide range of character strings that are recognizable as date or time values. Consult your system documentation for details.

• This is the “ten-pound sack” rule, version 2. When you convert a numeric column’s value to another numeric data type, the current contents of the convert-from column or literal had better fit in the target data type. For example, you will likely get an error if you attempt to convert a REAL value greater than 32,767 to a SMALLINT. Additionally, numbers to the right of the decimal place will be truncated or rounded as appropriate when you convert a number that has a decimal fraction to an INTEGER or SMALLINT. The amount of truncation or rounding is determined by the database system.

• But you can put “a square peg in a round hole” with certain limitations. When you convert the value of a numeric column to a character data type, one of three possible results will occur:

1. It will convert successfully.

2. Your system will pad it with blanks if its length is shorter than the defined length of the character column.

3. The database system will raise an error if the character representation of the numeric value is longer than the defined length of the character column.

Note: Although the SQL Standard defines these restrictions, your database system might allow you some leeway when you convert a value from one data type to another. Some database systems provide automatic conversion for you without requiring you to use the CAST function. For example, some systems allow you to concatenate a number with text or to add text containing a number to another number without an explicit conversion. Refer to your database system’s documentation for details.

It’s important to note that this list does not constitute the entire set of restrictions defined by the SQL Standard. I listed only those restrictions that apply to the data types I use in this book. For a more in-depth discussion on data types and data conversion issues, please refer to any of the books listed in Appendix D, “Suggested Reading.”

Keep the CAST function in mind as you work through the rest of this book. You’ll see me use it whenever appropriate to make sure I’m working with compatible data types.

Specifying Explicit Values

The SQL Standard provides flexibility for enhancing the information returned from a SELECT statement by allowing the use of constant values such as character strings, numbers, dates, times, or a suitable combination of these items, in any valid expression used within a SELECT statement. The SQL Standard categorizes these types of values as literal values and specifies the manner in which they are defined.

Character String Literals

A character string literal is a sequence of individual characters enclosed in single quotes. Yes, I know that you are probably used to using double quotes to enclose character strings, but I’m presenting these concepts as the SQL Standard defines them. Figure 5-2 shows the diagram for a character string literal.

Image

Figure 5-2The syntax diagram of a character string literal

Here are a few examples of the types of character string literals you can define:

'This is a sample character string literal.'
'Here's yet another! '
'B-28' 'Seattle'

You probably noticed what seemed to be a double quote in both the diagram and the second line of the previous example. Actually, it’s not a double quote but two consecutive single quotes with no space between them. The SQL Standard states that a single quote embedded within a character string is represented by two consecutive single quotes. The SQL Standard defines it this way so that your database system can distinguish between a single quote that defines the beginning or end of a character string literal and a quote that you want to be included within the literal. The following two lines illustrate how this works:

SQL

Displayed as

'The Vendor's name is:'

The Vendor's name is:

As I mentioned earlier, you can use character string literals to enhance the information returned by a SELECT statement. Although the information you see in a result set is usually easy to understand, it’s very likely that the information can be made clearer. For example, if you execute the following SELECT statement, the result set displays only the vendor’s Web site address and the vendor’s name:

SQL

SELECT VendWebPage, VendName

FROM Vendors

In some instances, you can enhance the clarity of the information by defining a character string that provides supplementary descriptive text and then adding it to the SELECT clause. Use this technique judiciously because the character string literal will appear in each row of the result set. Here’s how you might modify the previous example with a character string literal:

SQL

SELECT VendWebPage, 'is the Web site for',

VendName

FROM Vendors

A row in the result set generated by this SELECT statement looks like this:

www.viescas.com

is the Web site for

John Viescas Consulting

This somewhat clarifies the information displayed by the result set by identifying the actual purpose of the web address. Although this is a simple example, it illustrates what you can do with character string literals. Later in this chapter, you’ll see how you can use them in expressions.

Note: You’ll find this technique especially useful when working with legacy databases that contain cryptic column names. However, you won’t have to use this technique very often with your own databases if you follow the recommendations in Chapter 2, “Ensuring Your Database Structure Is Sound.”

Numeric Literals

A numeric literal is another type of literal you can use within a SELECT statement. As the name implies, it consists of an optional sign and a number and can include a decimal place, the exponent symbol, and an exponential number. Figure 5-3 shows the diagram for a numeric literal.

Image

Figure 5-3The syntax diagram of a numeric literal

Examples of numeric literals include the following:

427
–11.253
.554 0.3E–3

Numeric literals are most useful in expressions (for example, to multiply by or to add a fixed number value), so I’ll postpone further discussion until later in this chapter.

Datetime Literals

You can supply specific dates and times for use within a SELECT statement by using date literals, time literals, and timestamp literals. The SQL Standard refers to these literals collectively as datetime literals. Defining these literals is a simple task, as Figure 5-4 shows.

Image

Figure 5-4The syntax diagram of date and time literals

Bear in mind a few points, however, when using datetime and interval literals:

DATE

The format for a date literal is year-month-day, which is the format I follow throughout the book. However, many SQL databases allow the more common month/day/year format (United States) or day/month/year format (most non-U.S. countries). The SQL Standard also specifies that you include the DATE keyword before the literal, but nearly all commercial implementations allow you to simply specify the literal value surrounded by delimiter characters—usually single quotes. I found one case, the MySQL system, that requires you to specify a date literal in quotes and then to use the CAST function to convert the string to the DATE data type before you can use it in date calculations. Microsoft Office Access requires you to use a hashtag (#) character as the delimiter for DATE literals.

TIME

The hour format is based on a 24-hour clock. For example, 07:00 P.M. is represented as 19:00. The SQL Standard also specifies that you include the TIME keyword before the literal, but nearly all commercial implementations allow you to simply specify the literal value surrounded by delimiter characters—usually single quotes. I found one case, the MySQL system, that requires you to specify a time literal in quotes and then to use the CAST function to convert the string to the TIME data type before you can use it in time calculations. Microsoft Office Access requires you to use a hashtag (#) character as the delimiter for TIME literals.

TIMESTAMP

A timestamp literal is simply the combination of a date and a time separated by a single space. The rules for formatting the date and the time within a timestamp follow the individual rules for date and time. The SQL Standard also specifies that you include the TIMESTAMP keyword before the literal, but all commercial implementations that support the TIMESTAMP data type allow you to simply specify the literal value surrounded by delimiter characters—usually single quotes.

Note: In some systems, you can also define an interval literal to use in calculated expressions with datetime literals, but I won’t be covering that type of literal in this book. See your system documentation for details.

You can find the diagrams for DATE, TIME, TIMESTAMP, and INTERVAL as defined by the SQL Standard in Appendix A, “SQL Standard Diagrams.”

Here are some examples of datetime literals:

'2007-05-16'
'2016-11-22'
'21:00'
'03:30:25' '2008-09-29 14:25:00'

Note that when using MySQL, you must explicitly convert any character literal containing a date or a time or a date and a time by using the CAST function. Here are some examples:

CAST('2016-11-22' AS DATE)
CAST('03:30:25' AS TIME)
CAST('2008-09-29 14:25:00' AS DATETIME)

As I noted previously, in order to follow the SQL Standard, you must precede each literal with a keyword indicating the desired value. Although the DATE and TIME keywords are defined in the SQL Standard as required components of date and time literals, respectively, most database systems rarely support these keywords in this particular context and require only the character string portion of the literal. Therefore, I’ll refrain from using the keywords and instead use single quotes to delimit a date or time literal that appears in any example throughout the remainder of the book. I show you how to use dates and times in expressions later in this chapter. See Appendix A for more details on forming datetime literals that follow the SQL Standard.

Types of Expressions

You will generally use the following three types of expressions when working with SQL statements:

CONCATENATION

Combining two or more character columns or literals into a single character string

MATHEMATICAL

Adding, subtracting, multiplying, and dividing numeric columns or literals

DATE AND TIME ARITHMETIC

Applying addition or subtraction to dates and times

Concatenation

The SQL Standard defines two sequential vertical bars as the concatenation operator. You can concatenate two character items by placing a single item on either side of the concatenation operator. The result is a single string of characters that is a combination of both items. Figure 5-5 shows the syntax diagram for the concatenation expression.

Image

Figure 5-5The syntax diagram for the concatenation expression

Note: Of the major database systems, I found that only IBM’s DB2 and Informix, Oracle’s Oracle, and PostgreSQL support the SQL Standard operator for concatenation. Microsoft Office Access supports & and + as concatenation operators, Microsoft SQL Server and Ingres support +, and in MySQL you must use the CONCAT function. In all the examples in the book, I use the SQL Standard || operator. In the sample databases on the website for the book, I use the appropriate operator for each database type (Microsoft Access, Microsoft SQL Server, MySQL, and PostgreSQL).

Here’s a general idea of how the concatenation operation works:

Expression

ColumnOne || ColumnTwo

Result

ContentsOfColumnOneContentsOfColumnTwo

Let’s start with the easiest example in the world: concatenating two character string literals, such as a first name and a last name:

Expression

'Mike' || 'Hernandez'

Result

MikeHernandez

There are two points to consider in this example: First, single quotes are required around each name because they are character string literals. Second, the first and last names are right next to each other. Although the operation combined them correctly, it might not be what you expected. The solution is to add a space between the names by inserting another character literal that contains a single space.

Expression

'Mike' || ' ' || 'Hernandez'

Result

Mike Hernandez

The previous example shows that you can concatenate additional character values by using more concatenation operators. There is no limit to the number of character values you can concatenate, but there is a limit to the length of the character string the concatenation operation returns. In general, the length of the character string returned by a concatenation operation can be no greater than the maximum length allowed for a varying-length character data type. Your database system might handle this issue slightly differently, so check your documentation for further details.

Concatenating two or more character strings makes perfect sense, but you can also concatenate the values of two or more character columns in the same fashion. For example, suppose you have two columns called CompanyName and City. You can create an expression that concatenates the value of each column by using the column names within the expression. Here’s an example that concatenates the values of both columns with a character string:

Expression

CompanyName || ' is based in ' || City

Result

DataTex Consulting Group is based in Seattle

You don’t need to surround CompanyName or City with single quotes because they are column references. (Remember column references from the previous chapter?) You can use a column reference in any type of expression, as you’ll see in the examples throughout the remainder of the book.

Notice that all the concatenation examples so far concatenate characters with characters. I suppose you might be wondering if you need to do anything special to concatenate a number or a date. Most database systems give you some leeway in this matter. When the system sees you trying to concatenate a character column or literal with a number or a date, the system automatically casts the data type of the number or date for you so that the concatenation works with compatible data types.

But you shouldn’t always depend on your database system to quietly do the conversion for you. To concatenate a character string literal or the value of a character column with a date literal or the value of a numeric or date column, use the CAST function to convert the numeric or date value to a character string. Here’s an example of using CAST to convert the value of a date column called DateEntered:

Expression

EntStageName || ' was signed with our agency
on ' || CAST(DateEntered as CHARACTER(10))

Result

Modern Dance was signed with our agency on 1995-05-16

Note: I specified an explicit length for the CHARACTER data type because the SQL Standard specifies that the absence of a length specification defaults to a length of 1. I found that most major implementations give you some leeway in this regard and generate a character string long enough to contain what you’re converting. You can check your database documentation for details, but if you’re in doubt, always specify an explicit length.

You should also use the CAST function to concatenate a numeric literal or the value of a numeric column to a character data type. In the next example, I use CAST to convert the value of a numeric column called RetailPrice:

Expression

ProductName || 'sells for' ||
CAST(RetailPrice AS CHARACTER(8))

Result

Trek 9000 Mountain Bike sells for 1200.00

A concatenation expression can use character strings, datetime values, and numeric values simultaneously. The following example illustrates how you can use all three data types within the same expression:

Expression

'Order Number ' || CAST(OrderNumber AS
CHARACTER(2)) ||
' was placed on ' ||
CAST(OrderDate AS CHARACTER(10))

Result

Order Number 1 was placed on 2017-09-02

Note: The SQL Standard defines a variety of functions that you can use to extract information from a column or calculate a value across a range of rows. I’ll cover some of these in more detail in Chapter 12, “Simple Totals.” Most commercial database systems also provide various functions to manipulate parts of strings or to format date, time, or currency values. Check your system documentation for details.

Now that I’ve shown how to concatenate data from various sources into a single character string, let’s look at the different types of expressions you can create using numeric data.

Mathematical Expressions

The SQL Standard defines addition, subtraction, multiplication, and division as the operations you can perform on numeric data. The standard also defines common mathematical functions to calculate values such as absolute value, modulus, exponentiation, and logarithms. Here are the mathematical functions defined by the standard:

Function

Purpose

ABS(<numeric expression>)

Returns the absolute value of the expression

MOD(<dividend>, <divisor>)

Returns the remainder produced by dividing the dividend by the divisor

LN(<numeric expression>)

Returns the natural logarithm of the expression

EXP(<numeric expression>)

Returns the value of the natural logarithm raised to the power of the expression

POWER(<numeric base>, <numeric exponent>)

Returns the value of the base raised to the power of the exponent

SQRT(<numeric expression>)

Returns the square root of the expression

FLOOR(<numeric expression>)

Returns the largest integer less than or equal to the expression

CEIL(<numeric expression>) CEILING(<numeric expression>)

Returns the smallest integer greater than or equal to the expression

WIDTH_BUCKET(<numeric value>, <numeric lower bound>, <numeric upper bound>, <numeric bucket count>)

Divides the range between the lower bound and the upper bound into the number of equal “buckets” specified by the count and returns a number between 0 and the bucket count + 1 indicating where in the range the first argument resides

Most RDBMS programs provide these operations, as well as a wide array of scientific, trigonometric, statistical, and mathematical functions. In this book, however, I focus only on the four basic operations defined by the SQL Standard—addition, subtraction, multiplication, and division.

The order in which the four basic mathematical operations are performed—known as the order of precedence—is an important issue when you create mathematical expressions. The SQL Standard gives equal precedence to multiplication and division and specifies that they should be performed before any addition or subtraction. This is slightly contrary to the order of precedence you probably learned back in school, where multiplication is done before division, division before addition, and addition before subtraction, but it matches the order of precedence used in most modern programming languages. Mathematical expressions are evaluated from left to right. This could lead to some interesting results, depending on how you construct the expression! So, I strongly recommend that you make extensive use of parentheses in complex mathematical expressions to ensure that they evaluate properly.

If you remember how you created mathematical expressions back in school, then you already know how to create them in SQL. In essence, you use an optionally signed numeric value, a mathematical operator, and another optionally signed numeric value to create the expression. Figure 5-6 shows a diagram of this process.

Image

Figure 5-6The syntax diagram for a mathematical expression

Here are some examples of mathematical expressions using numeric literal values, column references, and combinations of both:

25 + 35
–12 * 22
RetailPrice * QuantityOnHand
TotalScore / GamesBowled
RetailPrice – 2.50
TotalScore / 12

As mentioned earlier, you need to use parentheses to ensure that a complex mathematical expression evaluates properly. Here’s a simple example of how you might use parentheses in such an expression:

Expression

(114) + (12 * 3)

Result

43

Pay close attention to the placement of parentheses in your expression because it affects the expression’s resulting value. The two expressions in the following example illustrate this quite clearly. Although both expressions have the same numbers and operators, the placement of the parentheses is entirely different and causes the expressions to return completely different values.

Expression

(23 * 11) + 12

Result

265

Expression

23 * (11 + 12)

Result

529

It’s easy to see why you need to be careful with parentheses, but don’t let this stop you from using them. They are invaluable when working with complex expressions.

You can also use parentheses as a way to nest operations within an expression. When you use nested parenthetical operations, your database system evaluates them left to right and then in an “innermost to outermost” fashion. Here’s an example of an expression that contains nested parenthetical operations:

Expression

(12 * (3 + 4))(24 / (10 + (64)))

Result

82

Executing the operations within the expression is not really as difficult as it seems. Here’s the order in which your database system evaluates the expression:

(3 + 4) = 7

(12 * 7) = 84

12 times the result of the first operation

(6 − 4) = 2

(10 + 2) = 12

10 plus the result of the third operation

(24 / 12) = 2

24 divided by the result of the fourth operation

84 − 2 = 82

84 minus the result of the second operation

As you can see, the system proceeds left to right but must evaluate inner expressions when encountering an expression surrounded by parentheses. Effectively, (12 * (3 + 4)) and (24 / (10 + (6 − 4))) are on an equal level, so your system will completely evaluate the leftmost expression first, innermost to outermost. It then encounters the second expression surrounded by parentheses and evaluates that one innermost to outermost. The final operation subtracts from the result of the left expression the result of evaluating the right expression. (Does your head hurt yet? Mine does!)

Although I used numeric literals in the previous example, I could just as easily have used column references or a combination of numeric literals and column references as well. The key point to remember here is that you should plan and define your mathematical expressions carefully so that they return the results you seek. Use parentheses to clearly define the sequence in which you want operations to occur, and you’ll get the result you expect.

When working with a mathematical expression, be sure that the values used in the expression are compatible. This is especially true of an expression that contains column references. You can use the CAST function for this purpose exactly as you did within a concatenation expression. For example, say you have a column called TotalLength based on an INTEGER data type that contains the whole number value 345, and a column called Distance based on a REAL data type that contains the value 138.65. To add the value of the Distance column to the value of the TotalLength column, you should use the CAST function to convert the Distance column’s value into an INTEGER data type or the TotalLength column’s value into a REAL data type, depending on whether you want the final result to be an INTEGER or a REAL data type. Assuming you’re interested in adding only the integer values, you would accomplish this with the following expression:

Expression

TotalLength + CAST(Distance AS INTEGER)

Result

483

Not the answer you expected? Maybe you thought converting 138.65 to an integer would round the value up? Although the SQL Standard states that rounding during conversion using the CAST function depends on your database system, most systems truncate a value with decimal places when converting to an integer. So, I’m assuming my system also does that and thus added 345 to 138, not the rounded value 139.

If you forget to ensure the compatibility of the column values within an expression, your database system might raise an error message. If it does, it will probably cancel the execution of the operations within the expression as well. Most RDBMS systems handle such conversions automatically without warning you, but they usually convert all numbers to the most complex data type before evaluating the expression. In the previous example, your RDBMS would most likely convert TotalLength to REAL (the more complex of the two data types). Your system will use REAL because all INTEGER values can be contained within the REAL data type. However, this might not be what you wanted. Those RDBMS programs that do not perform this sort of automatic conversion are usually good about letting you know that it’s a data type mismatch problem, so you’ll know what you need to do to fix your expression.

As you just learned, creating mathematical expressions is a relatively easy task as long as you do a little planning and know how to use the CAST function to your advantage. In our last discussion for this section, I’ll show you how to create expressions that add and subtract dates and times.

Date and Time Arithmetic

The SQL Standard defines addition and subtraction as the operations you can perform on dates and times. Contrary to what you might expect, many RDBMS programs differ in the way they implement these operations. Some database systems allow you to define these operations as you would in a mathematical expression, while others require you to use special built-in functions for these tasks. Refer to your database system’s documentation for details on how your particular RDBMS handles these operations. In this book, I discuss date and time expressions only in general terms so that I can give you an idea of how these operations should work.

Date Expressions

Figure 5-7 shows the syntax for a date expression as defined by the SQL Standard. As you can see, creating the expression is simple enough—take one value and add it to or subtract it from a second value.

Image

Figure 5-7The syntax diagram for a date expression

The SQL Standard further defines the valid operations and their results as follows:

DATE plus or minus INTERVAL yields DATE

DATE minus DATE yields INTERVAL

INTERVAL plus DATE yields DATE

INTERVAL plus or minus INTERVAL yields INTERVAL

INTERVAL times or divided by NUMBER yields INTERVAL

Note that in the SQL Standard you can subtract only a DATE from a DATE or add only a DATE to an INTERVAL.

When you use a column reference, make certain it is based on a DATE or INTERVAL data type, as appropriate. If the column is not an acceptable data type, you might have to use the CAST function to convert the value you are adding or subtracting. The SQL Standard explicitly specifies that you can perform these operations only using the indicated data types, but many database systems convert the column’s data type for you automatically. Your RDBMS will ultimately determine whether the conversion is required, so check your documentation.

Although only a few commercial systems support the INTERVAL data type, many of them allow you to use an integer value (such as SMALLINT or INT) to add to or subtract from a date value. You can think of this as adding and subtracting days. This allows you to answer questions such as “What is the date nine days from now?” and “What was the date five days ago?” Note also that some database systems allow you to add to or subtract from a datetime value using a fraction. For example, adding 3.5 to a datetime value in Microsoft Access adds three days and 12 hours.

When you subtract a date from another date, you are calculating the interval between the two dates. For example, you might need to subtract a hire date from the current date to determine how long an employee has been with the company. Although the SQL Standard indicates that you can add only an interval to a date, many database systems (especially those that do not support the INTERVAL data type) allow you to add either a number or a date anyway. You can use this sort of calculation to answer questions such as “When is the employee’s next review date?”

Note: The SQL Standard defines a variety of functions that you can use to extract information from a column or calculate a value across a range of rows. I’ll cover some of these in more detail in Chapter 12, “Simple Totals.” Most commercial database systems also provide various functions to manipulate parts of strings or to format date, time, or currency values. Check your system documentation for details.

In this book, I’ll show you simple calculations using dates and assume that you can at least add an integer number of days to a date value. I’ll also assume that subtracting one date from another yields an integer number of days between the two dates. If you apply these simple concepts, you can create most of the date expressions that you’ll need. Here are some examples of the types of date expressions you can define:

'2017-05-16' – 5
'2017-11-14' + 12
ReviewDate + 90
EstimateDate – DaysRequired
'2017-07-22' – '2017-06-13'
ShipDate – OrderDate

Time Expressions

You can create expressions using time values as well, and Figure 5-8 shows the syntax. Date and time expressions are very similar, and the same rules and restrictions that apply to a date expression also apply to a time expression.

Image

Figure 5-8The syntax diagram for a time expression

The SQL Standard further defines the valid operations and their results as follows:

TIME plus or minus INTERVAL yields TIME

TIME minus TIME yields INTERVAL

INTERVAL plus or minus INTERVAL yields INTERVAL

INTERVAL times or divided by NUMBER yields INTERVAL

Note that in the SQL Standard you can subtract only a TIME from a TIME or add only a TIME to an INTERVAL.

All the same “gotchas” I noted for date expressions apply to time expressions. Also, for systems that support a combination DATETIME data type, the time portion of the value is stored as a fraction of a day accurate at least to seconds. When using systems that support datetime, you can also usually add or subtract a decimal fraction value to a datetime value. For example, 0.25 is 6 hours (one-fourth of a day). In this book, I’ll assume that your system supports both adding and subtracting time literals or columns. I make no assumption about adding or subtracting decimal fractions. Again, check your documentation to find out what your system actually supports.

Given our assumptions, here are some general examples of time expressions:

'14:00' + '00:22'
'19:00' – '16:30'
StartTime + '00:19'
StopTime – StartTime

I said earlier that I would present date and time expressions only in general terms. My goal was to make sure that you understood date and time expressions conceptually and that you had a general idea of the types of expressions you should be able to create. Unfortunately, most database systems do not implement the SQL Standard’s specification for time expressions exactly, and many only partially support the specification for the date expression. As I noted, however, all database systems provide one or more functions that allow you to work with dates and times. You can find a summary of these functions for six major implementations in Appendix C, “Date and Time Types, Operations, and Functions.” I strongly recommend that you study your database system’s documentation to learn what types of functions your system provides.

Now that you know how to create the various types of expressions, the next step is to learn how to use them.

See Also Take a look at Appendix C for an overview of how six of the most popular database systems deal with dates and times. I list the data types and arithmetic operations supported along with a comprehensive list of date and time functions for each.

Using Expressions in a SELECT Clause

Knowing how to use expressions is arguably one of the most important concepts you’ll learn in this book. You’ll use expressions for a variety of purposes when working with SQL. For example, you would use an expression to

• Create a calculated column in a query

• Search for a specific column value

• Filter the rows in a result set

• Connect two tables in a JOIN operation

I’ll show you how to do this (and more) as you work through the rest of the book. I begin by showing you how to use basic expressions in a SELECT clause.

Note: Throughout this chapter, I use the “Request/Translation/Clean Up/SQL” technique introduced in Chapter 4.

You can use basic expressions in a SELECT clause to clarify information in a result set and to expand the result set’s scope of information. For example, you can create expressions to concatenate first and last names, calculate the total price of a product, determine how long it took to complete a project, or specify a date for a patient’s next appointment. Let’s look at how you might use a concatenation expression, a mathematical expression, and a date expression in a SELECT clause. First, I’ll work with the concatenation expression.

Working with a Concatenation Expression

Unlike mathematical and date expressions, you use concatenation expressions only to enhance the readability of the information contained in the result set of a SELECT statement. Suppose you are posing the following request:

“Show me a current list of our employees and their phone numbers.”

When translating this request into a SELECT statement, you can improve the output of the result set somewhat by concatenating the first and last names into a single column. Here’s one way you can translate this request:

Translation

Select the first name, last name, and phone number of all our employees from the employees table

Clean Up

Select the first name, last name, and phone number of all our employees from the employees table

SQL

SELECT EmpFirstName || ' ' || EmpLastName,

'Phone Number: ' || EmpPhoneNumber

FROM Employees

The result for one of the rows will look something like this:

Mary Thompson

Phone Number: 555-2516

You probably noticed that in addition to concatenating the first name column, a space, and the last name column, I also concatenated the character literal string “Phone Number:” with the phone number column. This example clearly shows that you can easily use more than one concatenation expression in a SELECT clause to enhance the readability of the information in the result set. Remember that you can also concatenate values with different data types by using the CAST function. For instance, I concatenate a character column value with a numeric column value in the next example:

“Show me a list of all our vendors and their identification numbers.”

Translation

Select the vendor name and vendor ID from the vendors table

Clean Up

Select the vendor name and vendor ID from the vendors table

SQL

SELECT 'The ID Number for ' || VendName ||

 ' is ' || CAST(VendorID AS CHARACTER)

FROM Vendors

Although the concatenation expression is a useful tool in a SELECT statement, it is one that you should use judiciously. When you use concatenation expressions containing long character string literals, keep in mind that the literals will appear in every row of the result set. You might end up cluttering the final result with repetitive information instead of enhancing it. Carefully consider your use of literals in concatenation expressions so that they work to your advantage.

Naming the Expression

When you use an expression in a SELECT clause, the result set includes a new column that displays the result of the operation defined in the expression. This new column is known as a calculated (or derived) column. For example, the result set for the following SELECT statement will contain three columns—two “real” columns and one calculated column:

SQL

SELECT EmpFirstName || ' ' || EmpLastName,

EmpPhoneNumber, EmpCity

FROM Employees

The two real columns are, of course, EmpPhoneNumber and EmpCity, and the calculated column is derived from the concatenation expression at the beginning of the SELECT clause.

According to the SQL Standard, you can optionally provide a name for the new column by using the AS keyword. (In fact, you can assign a new name to any column using the AS clause.) Almost every database system, however, requires a name for a calculated column. Some database systems require you to provide the name explicitly, while others actually provide a generated name for you. Determine how your database system handles this before you work with the examples. If you plan to reference the result of the expression in your query, you should provide a name.

Figure 5-9 shows the syntax for naming an expression. You can use any valid character string literal (enclosed in single quotes) for the name. Some database systems relax this requirement when you’re naming an expression and require quotes only when your column name includes embedded spaces. However, I strongly recommend that you not use spaces in your names because the spaces can be difficult to deal with in some database programming languages.

Image

Figure 5-9The syntax diagram for naming an expression

Now I’ll modify the SELECT statement in the previous example and supply a name for the concatenation expression:

SQL

SELECT EmpFirstName || ' ' || EmpLastName AS

EmployeeName, EmpPhoneNumber, EmpCity

FROM Employees

The result set for this SELECT statement will now contain three columns called EmployeeName, EmpPhoneNumber, and EmpCity.

In addition to supplying a name for expressions, you can use the AS keyword to supply an alias for a real column name. Suppose you have a column called DOB and are concerned that some of your users might not be familiar with this abbreviation. You can eliminate any possible misinterpretation of the name by using an alias, as shown here:

SQL

SELECT EmpFirstName || ' ' || EmpLastName AS

 EmployeeName, DOB AS DateOfBirth

FROM Employees

This SELECT statement produces a result set with two columns called EmployeeName and DateOfBirth. You’ve now effectively eliminated any possible confusion of the information displayed in the result set.

Providing names for your calculated columns has a minor effect on the translation process. For example, here’s one possible version of the translation process for the previous example:

“Give me a list of employee names and their dates of birth.”

Translation

Select first name and last name as employee name and DOB as date of birth from the employees table

Clean Up

Select first name and || ‘ ’ || last name as EmployeeName and DOB as DateOfBirth from the employees table

SQL

SELECT EmpFirstName || ' ' || EmpLastName

 AS EmployeeName, DOB AS DateOfBirth

FROM Employees

After you become accustomed to using expressions, you won’t need to state them quite as explicitly in your translation statements as I did here. You’ll eventually be able to easily identify and define the expressions you need as you construct the SELECT statement itself.

Note: Throughout the remainder of the book, I provide names for all calculated columns within an SQL statement, as appropriate.

Working with a Mathematical Expression

Mathematical expressions are possibly the most versatile of the three types of expressions, and you’ll probably use them quite often. For example, you can use a mathematical expression to calculate a line item total, determine the average score from a given set of tests, calculate the difference between two lab results, and estimate the total seating capacity of a building. The real trick is to make certain your expression works, and that is just a function of doing a little careful planning.

Here’s an example of how you might use a mathematical expression in a SELECT statement:

“Display for each agent the agent name and projected income (salary plus commission), assuming each agent will sell $50,000 worth of bookings.”

Translation

Select first name and last name as agent name and salary plus 50000 times commission rate as projected income from the agents table

Clean Up

Select first name and || ‘ ‘ || last name as AgentName, and salary plus + 50000 times * commission rate as ProjectedIncome from the agents table

SQL

SELECT AgtFirstName || ' ' || AgtLastName

AS AgentName,

Salary + (50000 * CommissionRate)

AS ProjectedIncome

FROM Agents

Notice that I added parentheses to make it crystal clear that I expect the commission rate to be multiplied by 50,000 and then add the salary, not add 50,000 to the salary and then multiply by the commission rate. As the example shows, you’re not limited to using a single type of expression in a SELECT statement. Rather, you can use a variety of expressions to retrieve the information you need in the result set. Here’s another way you can write the previous SQL statement:

SQL

SELECT AgtFirstName || ' ' || AgtLastName

|| ' has a projected income of ' ||

CAST(Salary + (50000 * CommissionRate)

AS CHARACTER) AS ProjectedIncome

FROM Agents

The information you can provide by using mathematical expressions is virtually limitless, but you must properly plan your expressions and use the CAST function as appropriate.

Working with a Date Expression

Using a date expression is similar to using a mathematical expression in that you’re simply adding or subtracting values. You can use date expressions for all sorts of tasks. For example, you can calculate an estimated ship date, project the number of days it will take to finish a project, or determine a follow-up appointment date for a patient. Here’s an example of how you might use a date expression in a SELECT clause:

“How many days did it take to ship each order?”

Translation

Select the order number and ship date minus order date as days to ship from the orders table

Clean Up

Select the order number and ship date minus – order date as DaysToShip from the orders table

SQL

SELECT OrderNumber,

 CAST(ShipDate – OrderDate AS INTEGER)

 AS DaysToShip

FROM Orders

You can use time expressions in the same manner.

“What would be the start time for each class if we began each class ten minutes later than the current start time?”

Translation

Select the start time and start time plus 10 minutes as new start time from the classes table

Clean Up

Select the start time and start time plus + ‘00:10’ minutes as NewStartTime from the classes table

SQL

SELECT StartTime, StartTime + '00:10'

AS NewStartTime

FROM Classes

As I mentioned earlier, all database systems provide a function or set of functions for working with date values. I did want to give you an idea of how you might use dates and times in your SELECT statements, however, and I again recommend that you refer to your database system’s documentation for details on the date and time functions your database system provides.

A Brief Digression: Value Expressions

You now know how to use column references, literal values, and expressions in a SELECT clause. You also know how to assign a name to a column reference or an expression. Now I’ll show you how this all fits into the larger scheme of things.

The SQL Standard refers to a column reference, literal value, and expression collectively as a value expression. Figure 5-10 shows how to define a value expression.

Image

Figure 5-10The syntax diagram for a value expression

Let’s take a closer look at the components of a value expression:

• The syntax begins with an optional plus or minus sign. You use either of these signs when you want the value expression to return a signed numeric value. The value itself can be a numeric literal, the value of a numeric column, a call to a function that returns a numeric value (see our discussion of the CAST function earlier in this chapter), or the return value of a mathematical expression. You cannot use the plus or minus sign before an expression that returns a character or datetime data type.

• You can see that the first list in the figure also includes (Value Expression). This means that you can use a complex value expression comprised of other value expressions that include concatenation or mathematical operators of their own. The parentheses force the database system to evaluate this value expression first. (Don’t worry about (SELECT Expression) and CASE Expression just yet—I cover those in detail in Chapter 11, “Subqueries,” and Chapter 19, “Condition Testing,” respectively.)

• The next item in the syntax is a list of operators. As you can see in the inset box, the type of expression you use at the beginning of the syntax determines which operators you can select from this list.

• No, you’re not seeing things: Value Expression does appear after the list of operators as well. The fact that you can use other value expressions within a value expression allows you to create very complex expressions.

By its very definition, a value expression returns a value that is used by some component of an SQL statement. The SQL Standard specifies the use of a value expression in a variety of statements and defined terms. No matter where you use it, you’ll always define a value expression in the same manner as you’ve learned here.

I’ll put this all into some perspective by showing you how a value expression is used in a SELECT statement. Figure 5-11 shows a modified version of the SELECT statement syntax diagram presented in Figure 4-9 in Chapter 4. This new syntax gives you the flexibility to use literals, column references, expressions, or any combination of these within a single SELECT statement. You can optionally name your value expressions with the AS keyword.

Image

Figure 5-11The syntax diagram for the SELECT statement that includes a value expression

Throughout the remainder of the book, I use the term value expression to refer to a column reference, a literal value, or an expression, as appropriate. In later chapters, I discuss how to use a value expression in other statements and show you a couple of other items that a value expression represents.

Now, back to our regularly scheduled program.

That “Nothing” Value: Null

As you know, a table consists of columns and rows. Each column represents a characteristic of the subject of the table, and each row represents a unique instance of the table’s subject. You can also think of a row as one complete set of column values—each row contains exactly one value from each column in the table. Figure 5-12 shows an example of a typical table.

Image

Figure 5-12A typical Customers table

So far I’ve shown how to retrieve information from the data in a table with a SELECT statement and how to manipulate that data by using value expressions. All of this works just fine because I’ve continually made the assumption that each column in the table contains data. But as Figure 5-12 clearly illustrates, a column sometimes might not contain a value for a particular row in the table. Depending on how you use the data, the absence of a value might adversely affect your SELECT statements and value expressions. Before I discuss any implications, let’s first examine how SQL regards missing values.

Introducing Null

In SQL, a Null represents a missing or an unknown value. You must understand from the outset that a Null does not represent a zero, a character string of one or more blank spaces, or a “zero-length” character string. The reasons are quite simple:

• A zero can have a very wide variety of meanings. It can represent the state of an account balance, the current number of available first-class ticket upgrades, or the current stock level of a particular product.

• Although a character string of one or more blank spaces is guaranteed to be meaningless to most of us, it is something that is definitely meaningful to SQL. A blank space is a valid character as far as SQL is concerned, and a character string composed of three blank spaces (' ') is just as legitimate as a character string composed of several letters (‘a character string’).

• A zero-length string—two consecutive single quotes with no space in between ('')—can be meaningful under certain circumstances. In an employee table, for example, a zero-length string value in a column called MiddleInitial might represent the fact that a particular employee does not have a middle initial in her name. Note, however, that some implementations (notably Oracle) treat a zero-length string in a VARCHAR as Null.

A Null is quite useful when used for its stated purpose, and the Customers table in Figure 5-12 shows a clear example of this. In the CustCounty column, each blank cell represents a missing or unknown county name for the row in which it appears—a Null. In order to use Nulls correctly, you must understand why they occur in the first place.

Missing values are commonly the result of human error. Consider the row for Robert Brown, for example. If you’re entering the data for Mr. Brown and you fail to ask him for the name of the county he lives in, that data is considered missing and is represented in the row as a Null. After you recognize the error, however, you can correct it by calling Mr. Brown and asking him for the county name.

Unknown values appear in a table for a variety of reasons. One reason might be that a specific value you need for a column is as yet undefined. For example, you might have a Categories table in a School Scheduling database that doesn’t have a category for a new set of classes that you want to offer beginning in the fall session. Another reason a table might contain unknown values is that the values are truly unknown. Let’s use the Customers table in Figure 5-12 once again and consider the row for Dean McCrae. Say that you’re entering the data for Mr. McCrae, and you ask him for the name of the county he lives in. If neither of you knows the county that includes the city in which he lives, then the value for the county column in his row is truly unknown. This is represented in his row as a Null. Obviously, you can correct the problem after either of you determines the correct county name.

A column value might also be Null if none of its values apply to a particular row. Let’s assume for a moment that you’re working with an employee table that contains a Salary column and an HourlyRate column. The value for one of these two columns is always going to be Null because an employee cannot be paid both a fixed salary and an hourly rate.

It’s important to note that there is a very slim difference between “does not apply” and “is not applicable.” In the previous example, the value of one of the two columns literally does not apply. But let’s assume you’re working with a patient table that contains a column called HairColor and you’re currently updating a row for an existing male patient. If that patient is bald, then the value for that column is definitely not applicable. Although you could just use a Null to represent a value that is not applicable, I recommend that you use a true value such as “N/A” or “Not Applicable.” This will make the information clearer in the long run.

As you can see, whether you allow Nulls in a table depends on the manner in which you’re using the data. Now that I’ve shown you the positive side of using Nulls, let’s take a look at the negative implication of using Nulls.

The Problem with Nulls

The major drawback of Nulls is their adverse effect on mathematical operations. Any operation involving a Null evaluates to Null. This is logically reasonable—if a number is unknown, then the result of the operation is necessarily unknown. Note how a Null alters the outcome of the operation in the next example:

(25 * 3) + 4 = 79
(Null * 3) + 4 = Null
(25 * Null) + 4 = Null
(25 * 3) + Null = Null

The same result occurs when an operation involves columns containing Null values. For example, suppose you execute the following SELECT statement (the statement is just an example—it won’t work as coded in the sample database) and it returns the result set shown in Figure 5-13.

SQL

SELECT ProductID, ProductDescription, Category,

 Price, QuantityOnHand, Price *

QuantityOnHand AS TotalValue

FROM Products

The operation represented by the TotalValue column is completed successfully as long as both the Price and QuantityOnHand columns have valid numeric values. Otherwise, TotalValue will contain a Null if either Price or QuantityOnHand contains a Null. The good news is that TotalValue will contain an appropriate value after you replace the Nulls in Price and QuantityOnHand with valid numeric values. You can avoid this problem completely by ensuring that the columns you use in a mathematical expression do not contain Null values.

Image

Figure 5-13Nulls involved in a mathematical expression

This is not the only time I’ll be concerned with Nulls. In Chapter 12, we’ll see how Nulls impact SELECT statements that summarize information.

Sample Statements

Now that you know how to use various types of value expressions in the SELECT clause of a SELECT statement, let’s take a look, on the next few pages, at some examples using the tables from four of the sample databases. These examples illustrate the use of expressions to generate an output column.

I’ve also included sample result sets that would be returned by these operations and placed them immediately after the SQL syntax line. The name that appears immediately above a result set is the name I gave each query in the sample data on the companion website for the book, www.informit.com/title/9780134858333. I stored each query in the appropriate sample database (as indicated within the example) and prefixed the names of the queries relevant to this chapter with “CH05.” You can follow the instructions in the Introduction of this book to load the samples onto your computer and try them.

Note: I’ve combined the Translation and Clean Up steps in the following examples so that you can begin to learn how to consolidate the process. Although you’ll still work with all three steps during the body of any given chapter, you’ll get a chance to work with the consolidated process in each “Sample Statements” section.

Sales Orders Database

“What is the inventory value of each product?”

Translation/Clean Up

Select the product name, retail price times * quantity on hand as InventoryValue from the products table

SQL

SELECT ProductName,

 RetailPrice * QuantityOnHand AS

 InventoryValue

FROM Products

CH05_Product_Inventory_Value (40 Rows)

ProductName

InventoryValue

Trek 9000 Mountain Bike

  $7,200.00

Eagle FS-3 Mountain Bike

$14,400.00

Dog Ear Cyclecomputer

  $1,500.00

Victoria Pro All Weather Tires

  $1,099.00

Dog Ear Helmet Mount Mirrors

       $89.40

Viscount Mountain Bike

  $3,175.00

Viscount C-500 Wireless Bike Computer

  $1,470.00

Kryptonite Advanced 2000 U-Lock

  $1,000.00

<< more rows here >>

“How many days elapsed between the order date and the ship date for each order?”

Translation/Clean Up

Select the order number, order date, ship date, ship date minus – order date as DaysElapsed from the orders table

SQL

SELECT OrderNumber, OrderDate, ShipDate,

CAST(ShipDate – OrderDate AS INTEGER)

AS DaysElapsed

FROM Orders

CH05_Shipping_Days_Analysis (944 Rows)

OrderNumber

OrderDate

ShipDate

DaysElapsed

1

2017-09-02

2017-09-05

3

2

2017-09-02

2017-09-04

2

3

2017-09-02

2017-09-05

3

4

2017-09-02

2017-09-04

2

5

2017-09-02

2017-09-02

0

6

2017-09-02

2017-09-06

4

7

2017-09-02

2017-09-05

3

8

2017-09-02

2017-09-02

0

9

2017-09-02

2017-09-05

3

10

2017-09-02

2017-09-05

3

<< more rows here >>

Entertainment Agency Database

“How long is each engagement due to run?”

Translation/Clean Up

Select the engagement number, end date minus – start date plus one + 1 as DueToRun from the engagements table

SQL

SELECT EngagementNumber,

 CAST(CAST(EndDate – StartDate

  AS INTEGER) + 1 AS CHARACTER)

 || ' day(s)' AS DueToRun

FROM Engagements

CH05_Engagement_Lengths (111 Rows)

EngagementNumber

DueToRun

2

  5 day(s)

3

  6 day(s)

4

  7 day(s)

5

  4 day(s)

6

  5 day(s)

7

  8 day(s)

8

  8 day(s)

9

11 day(s)

10

10 day(s)

11

  2 day(s)

<< more rows here >>

Note: You have to add “1” to the date expression in order to account for each date in the engagement. Otherwise, you’ll get “0 day(s)” for an engagement that starts and ends on the same date. You can also see that I CAST the result of subtracting the two dates first as INTEGER (in MySQL, Signed Integer) so that I could add the value 1, then CAST the result of that to CHARACTER to ensure the concatenation works as expected.

“What is the net amount for each of our contracts?”

Translation/Clean Up

Select the engagement number, contract price, contract price times * 0.12 as OurFee, contract price minus – (contract price times * 0.12) as NetAmount from the engagements table

SQL

SELECT EngagementNumber, ContractPrice,

ContractPrice * 0.12 AS OurFee, ContractPrice

−(ContractPrice * 0.12)

AS NetAmount

FROM Engagements

CH05_Net_Amount_Per_Contract (111 Rows)

EngagementNumber

ContractPrice

OurFee

NetAmount

2

$200.00

$24.00

$176.00

3

$590.00

$70.80

$519.20

4

$470.00

$56.40

$413.60

5

$1,130.00

$135.60

$994.40

6

$2,300.00

$276.00

$2,024.00

7

$770.00

$92.40

$677.60

8

$1,850.00

$222.00

$1,628.00

9

$1,370.00

$164.40

$1,205.60

10

$3,650.00

$438.00

$3,212.00

11

$950.00

$114.00

$836.00

<< more rows here >>

School Scheduling Database

“List how many complete years each staff member has been with the school as of October 1, 2017, and sort the result by last name and first name.”

Translation/Clean Up

Select last name || ‘, ’ || and first name concatenated with a comma as Staff, date hired, and ((‘2017-10-01’ minus – date hired) divided by / 365) as YearsWithSchool from the staff table and sort order by last name and first name

SQL

SELECT StfLastName || ', ' || StfFirstName

   AS Staff,

DateHired,

CAST(CAST('2017-10-01' − DateHired

   AS INTEGER) / 365 AS INTEGER)

 AS YearsWithSchool

FROM Staff

ORDER BY StfLastName, StfFirstName

CH05_Length_Of_Service (27 Rows)

Staff

DateHired

YearsWithSchool

Alborous, Sam

1990-11-20

26

Black, Alastair

1996-12-11

20

Bonnicksen, Joyce

1994-03-02

23

Brehm, Peter

1994-07-16

23

Brown, Robert

1997-02-09

20

Coie, Caroline

1991-01-28

26

DeGrasse, Kirk

1996-03-02

21

Ehrlich, Katherine

1993-03-08

24

Glynn, Jim

1993-08-02

23

Hallmark, Alaina

1992-01-07

24

<< more rows here >>

Note: The objective is to calculate the number of complete years of service as of October 1, 2017. (Note that I used the CAST function to ensure that the string literal is treated as a date.) For example, if a staff member was hired on October 10, 2015, the answer should be 1, not 2. The expression in this SELECT statement is technically correct and works as expected, but it returns the wrong answer when there are any leap years between the hire date and October 1, 2017. Strangely enough, the SQL Standard does not define any functions for performing specialized date and time calculations. The Standard defines only basic subtraction of two dates/times, addition of a date/time and an interval, and multiplication or division by a number to yield an interval.

You can correct this problem by using the appropriate date arithmetic function provided by your database system. As mentioned earlier, most database systems provide their own methods of working with dates and times, and you can find a summary of date and time functions supported by six of the major database systems in Appendix C. But be careful! For example, both Microsoft SQL Server and Microsoft Office Access have a DateDiff function that lets you calculate the difference in years, but the answer returned is simply the difference between the year portion of the two dates. The number of years between December 31, 2016 and January 1, 2017 is 1! I’ll show you a more precise way to answer this problem in Chapter 19 using CASE.

“Show me a list of staff members, their salaries, and a proposed 7 percent bonus for each staff member.”

Translation/Clean Up

Select the last name || ‘, ’ || and first name as StaffMember, salary, and salary times * 0.07 as Bonus from the staff table

SQL

SELECT StfLastName || ', ' || StfFirstName

 AS Staff, Salary, Salary * 0.07 AS Bonus

FROM Staff

CH05_Proposed_Bonuses (27 Rows)

Staff

Salary

Bonus

Alborous, Sam

$60,000.00

$4,200.00

Black, Alastair

$60,000.00

$4,200.00

Bonnicksen, Joyce

$60,000.00

$4,200.00

Brehm, Peter

$60,000.00

$4,200.00

Brown, Robert

$49,000.00

$3,430.00

Coie, Caroline

$52,000.00

$3,640.00

DeGrasse, Kirk

$45,000.00

$3,150.00

Ehrlich, Katherine

$45,000.00

$3,150.00

Glynn, Jim

$45,000.00

$3,150.00

Hallmark, Alaina

$57,000.00

$3,900.00

<< more rows here >>

Bowling League Database

“Display a list of all bowlers and addresses formatted suitably for a mailing list, sorted by ZIP Code.”

Translation/Clean Up

Select first name || ‘ ’ || and last name as FullName, BowlerAddress, city || ‘, ’ || state || ‘ ’ || and ZIP Code as CityStateZip, BowlerZip from the bowlers table and order by ZIP Code

SQL

SELECT BowlerFirstName || ' ' || BowlerLastName AS

FullName,

Bowlers.BowlerAddress,

BowlerCity || ', ' || BowlerState || ' ' ||

BowlerZip AS CityStateZip, BowlerZip

FROM Bowlers

ORDER BY BowlerZip

CH05_Names_Address_For_Mailing (32 Rows)

FullName

BowlerAddress

CityStateZip

BowlerZip

Kathryn Patterson

16 Maple Lane

Auburn, WA 98002

98002

Rachel Patterson

16 Maple Lane

Auburn, WA 98002

98002

Ann Patterson

16 Maple Lane

Auburn, WA 98002

98002

Neil Patterson

16 Maple Lane

Auburn, WA 98002

98002

Megan Patterson

16 Maple Lane

Auburn, WA 98002

98002

Carol Viescas

16345 NE 32nd Street

Bellevue, WA 98004

98004

Sara Sheskey

17950 N 59th

Seattle, WA 98011

98011

Richard Sheskey

17950 N 59th

Seattle, WA 98011

98011

William Thompson

122 Spring Valley Drive

Duvall, WA 98019

98019

Mary Thompson

122 Spring Valley Drive

Duvall, WA 98019

98019

<< more rows here >>

Note: Notice that I included the BowlerZip column not only in the CityStateZip expression but also as a separate column. Remember that the SQL Standard enables you to sort only on columns that are included in the SELECT clause. Even though you don’t need the BowlerZip again to create your mailing list, you should include the column so that you can use it in the ORDER BY clause. Some database systems, notably Microsoft Office Access, do not impose this requirement, but remember that I’m strictly following the standard in every query I use as an example.

“What was the point spread between a bowler’s handicap and raw score for each match and game played?”

Translation/Clean Up

Select bowler ID, match ID, game number, handicap score, raw score, handicap score minus – raw score as PointDifference from the bowler scores table and order by bowler ID, match ID, game number

SQL

SELECT BowlerID, MatchID, GameNumber,

HandiCapScore, RawScore, HandiCapScore

 − RawScore AS PointDifference

FROM Bowler_Scores

ORDER BY BowlerID, MatchID, GameNumber

CH05_Handicap_vs_RawScore (1344 Rows)

BowlerID

MatchID

GameNumber

HandiCapScore

RawScore

PointDifference

1

1

1

192

146

46

1

1

2

192

146

46

1

1

3

199

153

46

1

5

1

192

145

47

1

5

2

184

137

47

1

5

3

199

152

47

1

10

1

189

140

49

1

10

2

186

137

49

1

10

3

210

161

49

<< more rows here >>

Summary

I began the chapter with a brief overview of expressions. I then explained that you need to understand data types before you can build expressions, and I went on to discuss each of the major data types in some detail. I next showed you the CAST function and explained that you’ll often use it to change the data type of a column or literal so that it’s compatible with the type of expression you’re trying to build. I then covered all the ways that you can introduce a constant value—a literal—into your expressions. I then introduced you to the concept of using an expression to broaden or narrow the scope of information you retrieve from the database. I also explained that an expression is some form of operation involving numbers, character strings, or dates and times.

I continued our discussion of expressions and provided a concise overview of each type of expression. I showed you how to concatenate strings of characters and how to concatenate strings with other types of data by using the CAST function. I then showed you how to create mathematical expressions, and I explained how the order of precedence affects a given mathematical operation. I closed this discussion with a look at date and time expressions. After showing you how the SQL Standard handles dates and times, I revealed that most database systems provide their own methods of working with dates and times.

I then proceeded to the subject of using expressions in a SELECT statement, and I showed you how to incorporate expressions in the SELECT clause. I then showed you how to use both literal values and columns within an expression, as well as how to name the column that holds the result value of the expression. Before ending this discussion, I took a brief digression and introduced you to the value expression. I revealed that the SQL Standard uses this term to refer to a column reference, literal value, and expression collectively and that you can use a value expression in various clauses of an SQL statement. (More on this in later chapters, of course!)

I closed this chapter with a discussion on Nulls. You learned that a Null represents a missing or an unknown value. I showed you how to use a Null properly and explained that it can be quite useful under the right circumstances. But I also discussed how Nulls adversely affect mathematical operations. You now know that a mathematical operation involving a Null value returns a Null value. I also showed you how Nulls can make the information in a result set inaccurate.

In the next chapter, I’ll discuss the idea of retrieving a very specific set of information. I’ll then show you how to use a WHERE clause to filter the information retrieved by a SELECT statement.

The following section presents a number of requests that you can work out on your own.

Problems for You to Solve

Below, I show you the request statement and the name of the solution query in the sample databases. If you want some practice, you can work out the SQL for each request and then check your answer with the query I saved in the samples. Don’t worry if your syntax doesn’t exactly match the syntax of the queries I saved—as long as your result set is the same.

Sales Orders Database

1. “What if we adjusted each product price by reducing it 5 percent?”

You can find the solution in CH05_Adjusted_Wholesale_Prices (90 rows).

2. “Show me a list of orders made by each customer in descending date order.”

(Hint: You might need to order by more than one column for the information to display properly.)

You can find the solution in CH05_Orders_By_Customer_And_Date (944 rows).

3. “Compile a complete list of vendor names and addresses in vendor name order.”

You can find the solution in CH05_Vendor_Addresses (10 rows).

Entertainment Agency Database

1. “Give me the names of all our customers by city.”

(Hint: You’ll have to use an ORDER BY clause on one of the columns.)

You can find the solution in CH05_Customers_By_City (15 rows).

2. “List all entertainers and their Web sites.”

You can find the solution in CH05_Entertainer_Web_Sites (13 rows).

3. “Show the date of each agent’s first six-month performance review.”

(Hint: You’ll need to use date arithmetic to answer this request. Be sure to refer to Appendix C.)

You can find the solution in CH05_First_Performance_Review (9 rows).

School Scheduling Database

1. “Give me a list of staff members, and show them in descending order of salary.”

You can find the solution in CH05_Staff_List_By_Salary (27 rows).

2. “Can you give me a staff member phone list?”

You can find the solution in CH05_Staff_Member_Phone_List (27 rows).

3. “List the names of all our students, and order them by the cities they live in.”

You can find the solution in CH05_Students_By_City (18 rows).

Bowling League Database

1. “Show next year’s tournament date for each tournament location.”

(Hint: Add 364 days to get the same day of the week, and be sure to refer to Appendix C.)

You can find the solution in CH05_Next_Years_Tourney_Dates (20 rows).

2. “List the name and phone number for each member of the league.”

You can find the solution in CH05_Phone_List (32 rows).

3. “Give me a listing of each team’s lineup.”

(Hint: Base this query on the Bowlers table.)

You can find the solution in CH05_Team_Lineups (32 rows).