20
Using Unlinked Data and “Driver” Tables

“If you only have a hammer, you tend to see every problem as a nail.”

—ABRAHAM MASLOW

Topics Covered in This Chapter

What Is Unlinked Data?

Solving Problems with Unlinked Data

Solving Problems Using “Driver” Tables

Sample Statements

Summary

Problems for You to Solve

Before you start this chapter, make sure you get a good night’s sleep! And while I’m doling out warnings, perhaps you should also make sure your seatbelt is securely fastened. I promised that I would introduce you to concepts that make you think “outside the box.” In this chapter, I am going to tackle problems that can be solved using unlinked data—problems where you will use more than one table in your FROM clause, but you won’t specify any linking criteria using an ON clause. Let’s get started.

Caution: I am going to use the CASE expression extensively in this chapter. If you are not familiar with using CASE, I strongly recommend you work through Chapter 19, “Condition Testing,” before tackling this chapter.

What Is Unlinked Data?

As you learned beginning in Chapter 7, “Thinking in Sets,” most problems you’ll solve using SQL involve gathering data from more than one table. In Chapter 8, “INNER JOINs,” I showed you how to fetch information from multiple tables by linking them on matching data in the Primary and Foreign keys where all the values match. In Chapter 9, “OUTER JOINs,” I showed you how to fetch all the rows from one table and any matching information from a related table again using matching data in the Primary and Foreign keys. In this chapter, I’ll use multiple tables, but I will purposefully not match on key values—I will be using “unlinked” tables.

Let’s take a look at the SQL syntax to create unlinked tables. First, Figure 20-1 shows you the syntax for the SELECT Statement.

Image

Figure 20-1The SELECT Statement

You need to study the Table Reference to understand how to put unlinked tables in a FROM clause. Figure 20-2 shows you the full diagram for Table Reference.

Image

Figure 20-2The structure of a Table Reference

And finally, you need to study the diagram for Joined Table. Even though you really aren’t going to “join” unlinked tables, the SQL Standard does show you how to do it in the Joined Table definition, as shown in Figure 20-3.

Image

Figure 20-3The diagram for Joined Table

To get unlinked tables, you need to do what the SQL Standard calls a CROSS JOIN. So what do you get when you put two or more tables in the FROM clause of your SQL using a CROSS JOIN? The result is something called a Cartesian Product. You’ll get all rows from the first table matched with all rows from the second table, and the total number of rows you will get will be the product of the number of rows in the first table times the number of rows in the second table. Let’s take a look at a simple example:

SELECT Customers.CustLastName,
  Products.ProductName
FROM Customers CROSS JOIN Products;

In the Sales Orders sample database, you can find 28 customers and 40 products, so you’ll get 28 times 40 rows or 1,120 rows! The result looks like this:

CustLastName

ProductName

Viescas

Trek 9000 Mountain Bike

Thompson

Trek 9000 Mountain Bike

Hallmark

Trek 9000 Mountain Bike

Brown

Trek 9000 Mountain Bike

McCrae

Trek 9000 Mountain Bike

Viescas

Trek 9000 Mountain Bike

Sergienko

Trek 9000 Mountain Bike

Patterson

Trek 9000 Mountain Bike

Cencini

Trek 9000 Mountain Bike

Kennedy

Trek 9000 Mountain Bike

<< more rows here >>

You might be asking: Why is this useful? Let’s say you need to produce a catalog of all products that is customized for each customer. Your sales department has asked you to create the information to be able to say “Dear Mr. Thompson” or “Dear Mrs. Brown,” print a mailing label on the outside cover, and then list all the products available. You could certainly include the Orders and Order_Details tables to fully link Customers with Products, but then you would get only the products that each customer had ever purchased. To solve your problem, you need to use unlinked tables that result in a Cartesian Product to get the information you need. (By the way, I saved the query to produce the list of all customers and products as CH20_Customer_Catalog in the Sales Orders sample database.)

Note: The SQL Standard allows you to simply list tables separated by commas when you want to use unlinked tables (see Figure 20-1 of the SELECT Statement shown previously), and nearly all database systems accept this syntax. However, as you have learned, the SQL Standard also defines the keywords CROSS JOIN to explicitly indicate that you intend to get the Cartesian Product of the table reference on the left with the table reference on the right.

When you save a View in Microsoft SQL Server using only commas to separate the table names, you’ll find the view saved with the commas replaced with CROSS JOIN. When you save a View in MySQL using only commas, you’ll find the view saved with the commas replaced with JOIN. (CROSS is the default if you don’t specify INNER or OUTER and do not include an ON clause.) PostgreSQL leaves the commas but replaces INNER JOIN with just JOIN—the default being INNER when not specified. Go figure! Microsoft Office Access doesn’t support CROSS JOIN, so I created all the sample queries using only the lowest common denominator—the comma syntax. I will, however, continue to use CROSS JOIN in the text and in the Clean Up steps to make it clear that’s what I am doing. In the SQL statements in the sample databases, I’ll use only commas.

Deciding When to Use a CROSS JOIN

Deciding to use a CROSS JOIN isn’t easy. You can think of these types of queries in two categories:

• Using data from two or more of the main data tables in your database—the tables that you built to store all the subjects and actions described by your application.

I mentioned Customers and all Products previously in this chapter. The same might apply to all Agents and Entertainers, Students and Courses, or even Teams unlinked with a second copy of the Teams table to list all potential matches.

• Using data from one or more of your main data tables and a “helper” or “driver” table that contains rows, for example, for all dates across a relevant time period.

You certainly have date information in your database, such as the OrderDate in the Orders table. But when you want to look at all dates across a range regardless of whether an order was placed on that date, you need a driver table to supply all the values. You can also use driver tables to supply “lookup” values such as a translation from Gender code to the relevant word or conversion of a grade point to a letter grade defined by a range of grade points.

Solving Problems with Unlinked Data

Normally when you set about solving problems using data in your main data tables, you figure out where the data you want is stored and then gather all the tables required to link that data in some meaningful way. When the data you want is in two or more tables, you think about using a JOIN to link the tables, including any intervening tables necessary to logically link all the tables even if you don’t actually need data columns from some of those tables.

Solving problems with unlinked data involves breaking this mold and “thinking outside of the box” to get the answer you want. Let’s take a look again at the Customers and Products “catalog” problem, but let’s complicate it by flagging any products the customers have already ordered.

Note: Throughout this chapter, I use the “Request/Translation/Clean Up/SQL” technique introduced in Chapter 4, “Creating a Simple Query.” Because this process should now be very familiar to you, I have combined the Translation/Clean Up steps for all the following examples to simplify the process.

“Produce a list of all customer names and address and all products that we sell and indicate the products the customer has already purchased.”

From what you learned in Part III, “Working with Multiple Tables,” you would look at your table relationships to start to figure out how to proceed. Figure 20-4 shows you the standard way you would link Customers and Products, using the Orders and Order_Details tables as intermediaries.

Image

Figure 20-4The normal way to connect Customers to Products

Remember that I want all customers (including those who haven’t ordered anything) and all Products (including products never ordered). If you have your thinking cap on, you might come up with using a FULL OUTER JOIN (see Chapter 9), and you are correct—that would be one way to do it. Keep in mind that not all database systems support FULL OUTER JOINs, so that might not be a solution for you. You could also create one query (view) that LEFT JOINs Customers with Orders and Order_Details, and then use that query in another query to RIGHT JOIN with the Products table. When (remember CASE?) a key field in the Order_Details table is not Null, then indicate that the customer has previously ordered the product.

But this chapter is about solving problems with unlinked tables, so let’s tackle the problem head-on by using a CROSS JOIN of Customers and Products and a subquery in the SELECT clause to do a lookup to see if the customer ever ordered the product. Just for fun, let’s also look up the category description for each product. Here’s how to do it:

Translation/Clean Up

Select customer first name, customer last name, customer street address, customer city, customer state, customer zip code, category description, product number, product name, retail price, and (CASE when the customer ID is in the (selection of customer ID from the orders table inner joined with the order details table on Orders.order number in the orders table equals = Order_Details.order number in the order details table where the Products.product number in the products table equals the = Order_Details.product number in the order details table then display ‘You purchased this!’, else ‘ ‘ END) display a blank from the customers table and the CROSS JOIN categories table inner joined with the products table on Categories.category ID in the categories table equals = Products.category ID in the products table sorted ORDER BY customer ID, category description, and product number

SQL

SELECT Customers.CustomerID, Customers.

CustFirstName,

  Customers.CustLastName,

  Customers.CustStreetAddress,

  Customers.CustCity, Customers.CustState,

  Customers.CustZipCode,

  Categories.CategoryDescription,

  Products.ProductNumber, Products.ProductName,

  Products.RetailPrice,

 (CASE WHEN Customers.CustomerID IN

  (SELECT Orders.CustomerID

   FROM ORDERS INNER JOIN Order_Details

   ON Orders.OrderNumber =

      Order_Details.OrderNumber

   WHERE Order_Details.ProductNumber =

     Products.ProductNumber)

 THEN 'You purchased this! '

 ELSE ' ' END) AS ProductOrdered

FROM Customers, Categories INNER JOIN Products

  ON Categories.CategoryID = Products.CategoryID

ORDER BY Customers.CustomerID,

 Categories.CategoryDescription,

 Products.ProductNumber;

Yes, there is an INNER JOIN to link Categories with Products, but the key part of the FROM clause is the CROSS JOIN with the Customers table. You can find this query saved as CH20_Customer_All_Products_PurchasedStatus in the Sales Orders sample database. As expected, the query returns 1,120 rows.

Note: Recall from Chapter 19 that Microsoft Office Access does not support the CASE expression. In the samples I created in the Access databases, you’ll find that I used a built-in function called Immediate If (IIf) that serves a similar purpose.

Solving Problems Using “Driver” Tables

Let’s move on now to solving problems that require you to set up one or more tables containing a list of values that you’ll CROSS JOIN with other tables in your database to get your answer. I call this sort of table a “driver” table because the contents of the table “drive” the result you get. (If you also own Effective SQL that I wrote with my good friends, Doug Steele and Ben Clothier, we decided to call them “tally” tables in that book, but they’re the same thing.) Arguably, the most common type of driver table contains a list of dates or weeks or months that you can CROSS JOIN with your data to list all days or weeks or months and any matching events that occur on those dates.

Another use of a driver table is to define a categorization of values across a set of defined ranges. Examples include assigning a letter grade to a grade point score, rating instructors based on their proficiency rating, evaluating bowlers based on their average score, categorizing product prices, or categorizing the amount spent by a customer.

A really creative use of a driver table lets you “pivot” your data to display a result that looks more like a spreadsheet. A common example would be to display sales or purchases by month, with the months listed across by product or customer.

Setting Up a Driver Table

The SQL Standard defines WITH RECURSIVE that allows you to execute a stated SQL query multiple times in a loop. This can be useful to load a driver table with consecutive dates across a date range. Unfortunately, only a few database systems support this. To load my large driver tables, I resorted to using Visual Basic in Microsoft Office Access to perform the recursion necessary to load hundreds of rows into a date range table. (You can actually find some of the code I used if you dig around in the sample databases that are in Microsoft Office Access format.)

When your driver table is a simple set of ranges to translate to a value, it’s easy enough to load the data by hand. For example, here’s the list of values I entered into the ztblLetterGrades table you can find in the School Scheduling sample database:

LetterGrade

LowGradePoint

HighGradePoint

A

93

96.99

A-

90

92.99

A+

97

120

B

83

86.99

B-

80

82.99

B+

87

89.99

C

73

76.99

C-

70

72.99

C+

77

79.99

D

63

66.99

D-

60

62.99

D+

67

69.99

F

0

59.99

This should look familiar because it’s the same list of ranges that I used in the CH19_Students_Classes_Letter_Grades query in the previous chapter. (By the way, I named all the driver tables using a “ztbl” prefix to clearly separate them from the main data tables in each database.) One clear advantage to setting up a table like this is that you can easily change the range values should the need arise. You don’t have to go digging in the CASE clauses in each query that depends on the ranges to obtain the answer.

As I noted previously, a really creative use of a driver table lets you pivot your result to look like a spreadsheet. Quite a few database systems provide nonstandard ways to pivot data, but I’ll show you how to create a pivot using standard SQL and a driver table. You can find one such table I created for this purpose in the Sales Orders sample database called ztblMonths. Here is what part of the table looks like:

MonthYear

YearNumber

MonthNumber

MonthStart

MonthEnd

January 2017

2017

1

1/1/2017

1/31/2017

February 2017

2017

2

2/1/2017

2/29/2017

March 2017

2017

3

3/1/2017

3/31/2017

April 2017

2017

4

4/1/2017

4/30/2017

May 2017

2017

5

5/1/2017

5/31/2017

June 2017

2017

6

6/1/2017

6/30/2017

July 2017

2017

7

7/1/2017

7/31/2017

August 2017

2017

8

8/1/2017

8/31/2017

<< more rows here >>

Additional columns…

January

February

March

April

May

June

1

0

0

0

0

0

0

1

0

0

0

0

0

0

1

0

0

0

0

0

0

1

0

0

0

0

0

0

1

0

0

0

0

0

0

1

0

0

0

0

0

0

0

0

0

0

0

0

<< more rows here >>

Additional columns…

July

August

September

October

November

December

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

1

0

0

0

0

0

0

1

0

0

0

0

<< more rows here >>

Looks a bit strange, doesn’t it? The little secret is you’ll use a WHERE clause to match the rows in this driver table with the date of the order, and then you will build columns by multiplying the total sales times the value found in a particular column to get a total for that month. When the order occurs in January 2017, only the January column on the matching row contains a 1 to result in 1 times quantity times the price. The value won’t be added to the columns for the other months because zero times any value is always zero. Another way to think of it is the ones and zeros define the horizontal “buckets” for each value encountered in your query that calculates the values you want to display. When a date matches the range defined by the row in the driver table, the 1 indicates the correct horizontal bucket in which to place the value. So, when a value is in January 2017, that value ends up in the January column by multiplying the column value times the expression that calculates the total.

Using a Driver Table

Let’s use the two driver tables described in the previous section to solve problems. First, I want to display a grade letter based on each student’s numeric grade received for a class. I solved this problem using CASE in the previous chapter. Now I’ll solve it using the driver table.

“List all students, the classes for which they enrolled, the grade they received, and a conversion of the grade number to a letter.”

Translation/Clean UP

Select Students.student ID from the students table, Students.student first name from the students table, Students.student last name from the students table, Classes.class ID from the classes table, Classes.start date from the classes table, Subjects.subject code from the subjects table, Subjects.subject name from the subjects table, Student_Schedules.grade from the student_schedules table, and ztblLetterGrades.letter grade from the letter grades driver table from ztblLetterGrades the letter grades driver table and CROSS JOIN the students table inner joined with the student schedules table on Students.student ID in the students table equals = Student_Schedules.student ID in the student schedules table, then inner joined with the classes table on Student_Schedules.class ID in the student schedules table equals = Classes.class ID in the classes table, then inner joined with the subjects table on Classes.subject ID in the classes table equals = Subjects.subject ID in the subjects table, then inner joined with the student class status table on Student_Schedules.class status in the student schedules table equals = Student_Class_Status.class status in the student class status table where Student_Class_Status.class status description in the student class status table equals = ‘Completed’ and Student_Schedules.grade in the student schedules table is between ztblLetterGrades.low grade point in the letter grades driver table and ztblLetterGrades.high grade point in the letter grades driver table

SQL

SELECT Students.StudentID, Students.StudFirstName,

  Students.StudLastName, Classes.ClassID,

  Classes.StartDate,

  Subjects.SubjectCode, Subjects.SubjectName,

  Student_Schedules.Grade,

  ztblLetterGrades.LetterGrade

FROM ztblLetterGrades, (((Students

INNER JOIN Student_Schedules

  ON Students.StudentID =

  Student_Schedules.StudentID)

INNER JOIN Classes

  ON Student_Schedules.ClassID = Classes.ClassID)

INNER JOIN Subjects

  ON Classes.SubjectID = Subjects.SubjectID)

INNER JOIN Student_Class_Status

  ON Student_Schedules.ClassStatus =

  Student_Class_Status.ClassStatus

WHERE

  (Student_Class_Status.ClassStatusDescription =

  'Completed')

  AND (Student_Schedules.Grade Between

  ztblLetterGrades.LowGradePoint

  AND ztblLetterGrades.HighGradePoint);

You can find this query saved as CH20_Students_Classes_Letter_Grades in the School Scheduling sample database. You’ll find that it returns the same 68 rows as the CH19_Student_Classes_Letter_Grades that I showed you in Chapter 19.

Now let’s take a look at using the second driver table.

“Show product sales for each product for all months, listing the months as columns.”

Translation/Clean Up

Select Products.product name from the products table, the sum of (Order_Details.quoted price from the order details table times * Order_Details.quantity ordered from the order details table times * ztblMonths.January from the months driver table) as January, the sum of (Order_Details.quoted price from the order details table times * Order_Details.quantity ordered from the order details table times * ztblMonths.February from the months driver table) as February, the sum of (Order_Details.quoted price from the order details table times * Order_Details.quantity ordered from the order details table times * ztblMonths.March from the months driver table) as March, the sum of (Order_Details.quoted price from the order details table times * Order_Details.quantity ordered from the order details table times * ztblMonths.April from the months driver table) as April, the sum of (Order_Details.quoted price from the order details table times * Order_Details.quantity ordered from the order details table times * ztblMonths.May from the months driver table) as May, the sum of (Order_Details.quoted price from the order details table times * Order_Details.quantity ordered from the order details table times * ztblMonths.June from the months driver table) as June, the sum of (Order_Details.quoted price from the order details table times * Order_Details.quantity ordered from the order details table times * ztblMonths.July from the months driver table) as July, the sum of (Order_Details.quoted price from the order details table times * Order_Details.quantity ordered from the order details table times * ztblMonths.August from the months driver table) as August, the sum of (Order_Details.quoted price from the order details table times * Order_Details.quantity ordered from the order details table times * ztblMonths.September from the months driver table) as September, the sum of (Order_Details.quoted price from the order details table times * Order_Details.quantity ordered from the order details table times * ztblMonths.October from the months driver table) as October, the sum of (Order_Details.quoted price from the order details table times * Order_Details.quantity ordered from the order details table times * ztblMonths.November from the months driver table) as November, and the sum of (Order_Details.quoted price from the order details table times * Order_Details.quantity ordered from the order details table times * ztblMonths.December from the months driver table) as December from ztblMonths the months driver table CROSS JOIN and the products table then inner joined with the order details table on Products.product number in the products table equals = Order_Details.product number in the order details table then inner joined with the orders table on Orders.order number in the orders table equals = Order_Details.order number in the order details table where Orders.order date in the orders table is between ztblMonths.month start in the months driver table and ztblMonths.month end in the months driver table grouped by Products.product name in the products table

SQL

SELECT Products.ProductName,

  SUM(Order_Details.QuotedPrice *

 Order_Details.QuantityOrdered *

 ztblMonths.January)

 AS January,

  SUM(Order_Details.QuotedPrice *

 Order_Details.QuantityOrdered *

 ztblMonths.February)

 AS February,

  SUM(Order_Details.QuotedPrice *

 Order_Details.QuantityOrdered *

 ztblMonths.March)

 AS March,

  SUM(Order_Details.QuotedPrice *

 Order_Details.QuantityOrdered *

 ztblMonths.April)

 AS April,

  SUM(Order_Details.QuotedPrice *

 Order_Details.QuantityOrdered *

 ztblMonths.May)

 AS May,

  SUM(Order_Details.QuotedPrice *

 Order_Details.QuantityOrdered *

 ztblMonths.June)

 AS June,

  SUM(Order_Details.QuotedPrice *

 Order_Details.QuantityOrdered *

 ztblMonths.July)

 AS July,

  SUM(Order_Details.QuotedPrice *

 Order_Details.QuantityOrdered *

 ztblMonths.August)

 AS August,

  SUM(Order_Details.QuotedPrice *

 Order_Details.QuantityOrdered *

 ztblMonths.September)

 AS September,

  SUM(Order_Details.QuotedPrice *

 Order_Details.QuantityOrdered *

 ztblMonths.October)

 AS October,

  SUM(Order_Details.QuotedPrice *

 Order_Details.QuantityOrdered *

 ztblMonths.November)

 AS November,

  SUM(Order_Details.QuotedPrice *

 Order_Details.QuantityOrdered *

 ztblMonths.December)

 AS December

FROM ztblMonths, (Products

INNER JOIN Order_Details

  ON Products.ProductNumber =

  Order_Details.ProductNumber)

INNER JOIN Orders

  ON Orders.OrderNumber = Order_Details.OrderNumber

WHERE Orders.OrderDate BETWEEN ztblMonths.MonthStart

AND ztblMonths.MonthEnd

GROUP BY Products.ProductName;

The “magic” happens when you restrict the row returned by the driver table to the month that matches the date in the orders table. When the date falls in January, only the January column has the value 1. That places the amount for that row in January in the correct “bucket” to be finally summed to get your pivoted result. You can find this query saved in the Sales Orders sample database as CH20_Product_Sales_Month_Pivot. The query returns 38 rows, but there are 40 products in the Products table. The two missing rows occur because two of the products have never been sold. (See CH09_Products_Never_Ordered to discover those two products.)

Sample Statements

You now know the mechanics of constructing queries using CROSS JOIN and driver tables and have seen some of the types of requests you can answer with these techniques. Let’s take a look at a fairly robust set of samples, all of which use CROSS JOIN between two data tables or with a driver table. These examples come from each of the sample databases, and they illustrate the use of these techniques to solve “thinking outside the box” problems.

Note: Remember in the Introduction that I warned you that results from each database system won’t necessarily match the sort order you see in examples in this book unless you include an ORDER BY clause. Even when you include that specification, the system might return results in columns not included in the ORDER BY clause in a different sequence because of different optimization techniques.

If you’re running the examples in Microsoft SQL Server, simply selecting the rows from the view does not honor any ORDER BY clause specified in the view. You must open the design of the view and execute it from there to see the ORDER BY clause honored.

Also, when you use GROUP BY, you’ll often see the results returned by your database system as though the rows are sorted by the columns you specified. This happens because some optimizers first sort the data internally to make it faster to process your GROUP BY. Keep in mind that if you want a specific sort order, you must also include an ORDER BY clause.

I’ve also included sample result sets that would be returned by these operations and placed them immediately after the SQL syntax line. The name that appears immediately above a result set is the name I gave each query in the sample data that you’ll find on the book’s website, www.informit.com/title/9780134858333. I stored each query in the appropriate sample database (as indicated within the example), using “CH20” as the leading part of the query or view name. You can follow the instructions in the Introduction to this book to load the samples onto your computer and try them out.

Note: Remember that all of the field names and table names used in these examples are drawn from the sample database structures shown in Appendix B, “Schema for the Sample Databases.”

Because many of these examples use complex joins, the optimizer for your database system may choose a different way to solve these queries. For this reason, the first few rows I show you may not exactly match the result you obtain, but the total number of rows should be the same. Keep in mind that for any SQL Server View that contains an ORDER BY clause, you must open the view in Design mode first and then execute it to see the specified order. If you SELECT * from the View, SQL Server does not honor the ORDER BY clause.

Examples Using Unlinked Tables

This first set of sample statements shows you problems you can solve using unlinked tables. All of them use a CROSS JOIN between two data tables.

Sales Order Database

“List all employees and customers who live in the same state and indicate whether the customer has ever placed an order with the employee.”

Translation/Clean Up

Select employee first name, employee last name, customer first name, customer last name, customer area code, customer phone number, and (CASE when the customer ID in the customers table is in the (selection of Orders.customer ID from the orders table where Orders.employee ID in the orders table equals = Employees.employee ID) in the employees table then display ‘Ordered from you.’ else display ‘ ‘ END) from employees and CROSS JOIN customers where the Employees.employee state in the employees table equals = Customer.customer state in the customers table

SQL

SELECT Employees.EmpFirstName,

  Employees.EmpLastName,

  Customers.CustFirstName,

  Customers.CustLastName,

  Customers.CustAreaCode,

  Customers.CustPhoneNumber,

  (CASE WHEN Customers.CustomerID IN

     (SELECT Orders.CustomerID

     FROM Orders

     WHERE Orders.EmployeeID =

        Employees.EmployeeID)

  THEN 'Ordered from you. '

  ELSE ' ' END) AS CustStatus

FROM Employees, Customers

WHERE Employees.EmpState = Customers.CustState;

CH20_Employees_Same_State_Customers (83 rows)

Emp FirstName

Emp LastName

Cust FirstName

Cust LastName

Cust AreaCode

Cust PhoneNumber

Cust Status

Ann

Patterson

William

Thompson

425

555-2681

Ordered from you.

Ann

Patterson

Gary

Hallmark

253

555-2676

Ordered from you.

Ann

Patterson

Dean

McCrae

425

555-2506

Ordered from you.

Ann

Patterson

John

Viescas

425

555-2511

Ordered from you.

Ann

Patterson

Andrew

Cencini

206

555-2601

Ordered from you.

Ann

Patterson

Liz

Keyser

425

555-2556

Ordered from you.

Ann

Patterson

Julia

Schnebly

206

555-9936

Ordered from you.

Ann

Patterson

Suzanne

Viescas

425

555-2686

Ordered from you.

Ann

Patterson

Jeffrey

Tirekicker

425

555-9999

Ann

Patterson

Joyce

Bonnicksen

425

555-2726

Ordered from you.

<< more rows here >>

Note: If you’re really sharp, you probably figured out that I could have solved the problem using an INNER JOIN between Employees and Customers ON EmpState = CustState. As I have stated many times before, there’s almost always more than one way to solve a problem. Now you know how to solve it using a CROSS JOIN.

Entertainment Database

“List all customer preferences and the count of first, second, and third preferences.”

Note: This is a bit tricky because you first need to “pivot” each customer’s first, second, and third preferences (as indicated by the PreferenceSeq column), and then count them. You could use a driver table to help perform the pivot, but with only three unique values to pivot into columns, it’s just as easy to do it with CASE.

Translation/Clean Up

Select Musical_Styles.style ID from the musical styles table, Musical_Styles.style name from the musical styles table, the count of (RankedPeferences.first style) from the ranked preferences query as first preference, the count of (RankedPreferences.second style) from the ranked preferences query as second preference, and the count of (RankedPreferences.third style) from the ranked preferences query as third preference from the musical styles table and CROSS JOIN the (selection of (CASE when Musical_Preferences.preference sequence in the musical preferences table is = 1 then return the Musical_Preferences.style ID from the musical preferences table else return Null END) as first style, (CASE when Musical_Preferences.preference sequence in the musical preferences table is = 2 then return the Musical_Preferences.style ID from the musical preferences table else return Null END) as second style, (CASE when Musical_Preferences.preference sequence in the musical preferences table is = 3 then return the Musical_Preferences.style ID from the musical preferences table else return Null END) as third style from the musical preferences table) as ranked preferences where Musical_Styles.style ID in the musical styles table equals = RankedPreferences.first style in the ranked preferences query or Musical_Styles.style ID in the musical styles table equals = RankedPreferences.second style in the ranked preferences query or Musical_Styles.style ID in the musical styles table equals = RankedPreferences.third style in the ranked preferences query grouped by style ID, and style name having the count of (first style) > greater than 0 or the count of (second style) > greater than 0 or the count of (third style) > greater than 0 ordered by first preference descending, second preference descending, third preference descending, and style ID

SQL

SELECT Musical_Styles.StyleID,

  Musical_Styles.StyleName,

  COUNT(RankedPreferences.FirstStyle)

 AS FirstPreference,

  COUNT(RankedPreferences.SecondStyle)

 AS SecondPreference,

  COUNT(RankedPreferences.ThirdStyle)

 AS ThirdPreference

FROM Musical_Styles,

 (SELECT (CASE WHEN

 Musical_Preferences.PreferenceSeq = 1

        THEN Musical_Preferences.StyleID

        ELSE Null END) As FirstStyle,

      (CASE WHEN

 Musical_Preferences.PreferenceSeq = 2

        THEN Musical_Preferences.StyleID

        ELSE Null END) As SecondStyle,

      (CASE WHEN

 Musical_Preferences.PreferenceSeq = 3

        THEN Musical_Preferences.StyleID

        ELSE Null END) AS ThirdStyle

  FROM Musical_Preferences) AS RankedPreferences

WHERE Musical_Styles.StyleID =

       RankedPreferences.FirstStyle

OR Musical_Styles.StyleID =

       RankedPreferences.SecondStyle

OR Musical_Styles.StyleID =

       RankedPreferences.ThirdStyle

GROUP BY StyleID, StyleName

HAVING COUNT(FirstStyle) > 0

   OR     COUNT(SecondStyle) > 0

   OR     COUNT(ThirdStyle) > 0

ORDER BY FirstPreference DESC,

      SecondPreference DESC,

      ThirdPreference DESC, StyleID;

CH20_Customer_Style_Preference_Rankings (20 rows)

StyleID

StyleName

FirstPreference

SecondPreference

ThirdPreference

21

Standards

2

2

0

15

Jazz

2

1

0

19

Rhythm and Blues

2

0

1

22

Top 40 Hits

2

0

0

10

Contemporary

1

2

0

8

Classic Rock & Roll

1

1

0

20

Show Tunes

1

1

0

3

60’s Music

1

0

0

11

Country Rock

1

0

0

14

Chamber Music

1

0

0

23

Variety

1

0

0

<< more rows here >>

Notice that although there are 25 distinct musical styles defined in the database, the query returns only 20 rows. The styles that are missing aren’t ranked first, second, or third by any customer.

School Scheduling Database

“List all students who have completed English courses and rank them by Quintile on the grades they received.”

A Quintile divides a group into five equal ranges. When applied to student rankings, a quintile will be 20% of students—those in the top 20 percent are in the first quintile, those in the next 20% are in the second quintile, and so on. To solve this, you need to CROSS JOIN two queries:

1. A query that assigns a ranking number to each student who completed an English course. You can calculate the rank by counting the number of students who have a grade greater than or equal to the current student’s grade. The student with the highest grade will be ranked #1, the student with the second highest grade #2, and so on.

2. A query that counts all students who completed an English course. You can use this count times 0.2, 0.4, 0.6, and 0.8 to figure out the quintile. The students whose rank (as calculated by the first query) is less than or equal to 0.2 times the total number of students is in the first quintile.

Translation/Clean Up

Select S1.subject ID from the first query, S1.student first name from the first query, S1.student last name from the first query, S1.class status from the first query, S1.grade from the first query, S1.category ID from the first query, S1.subject name from the first query, S1.rank in category from the first query, StudCount.number of students from the student count query, and (CASE when the rank in category <= is less than or equal to 0.2 * times the number of students then return ‘First’ when the rank in category <= is less than or equal to 0.4 * times the number of students then return ‘Second’ when the rank in category <= is less than or equal to 0.6 * times the number of students then return ‘Third’ when the rank in category <= is less than or equal to 0.8 * times the number of students then return ‘Fourth’ else return ‘Fifth’ END) as the quintile from the (selection of Subjects.subject ID in the subjects table, Students.student first name in the students table, Students.student last name in the students table, Student_Schedules.class status in the student schedules table, Student_Schedules.grade in the student schedules table, Subjects.category ID in the subjects table, Subjects.subject name in the subjects table, and the (selection of the count(*) of all rows from the classes table inner joined with the student schedules table AS SS2 on Classes.class ID in the classes table = equals SS2.class ID in the student schedules table, then inner joined with the subjects table AS S3 on S3.subject ID in the subjects table = equals Classes.subject ID in the classes table where S3.category ID in the subjects table = equals ‘ENG’ and SS2.grade in the student schedules table >= is greater than or equal to Student_Schedules.grade in the student schedules table) as rank in category) from the subjects table inner joined with the classes table on Subjects.subject ID in the subjects table = equals Classes.subject ID in the classes table, then inner joined with the student schedules table on Student_Schedules.class ID in the student schedules table = equals Classes.class ID in the classes table, then inner joined with the students table on Students.student ID in the students table = equals Student_Schedules.student ID in the student schedules table where Student_Schedules.class status in the student schedules table = equals 2 and Subjects.category ID in the subjects table = equals ‘ENG’) AS S1 CROSS JOIN and the (selection of the count(*) of all rows as number of students from the classes table AS C2 inner joined with the student schedules table AS SS3 on C2.class id in the classes table = equals SS3.class ID in the student schedules table, then inner joined with the subjects table AS S2 on S2.subject ID in the subjects table = equals C2.subject ID in the classes table where SS3.class status in the student schedules table = equals 2 and S2.category ID in the subjects table = equals ‘ENG’) As student count ordered by S1.grade in the first query descending

SQL

SELECT S1.SubjectID, S1.StudFirstName,

  S1.StudLastName,

  S1.ClassStatus, S1.Grade, S1.CategoryID,

  S1.SubjectName,

  S1.RankInCategory, StudCount.NumStudents,

  (CASE WHEN RankInCategory <= 0.2 * NumStudents

 THEN 'First'

 WHEN RankInCategory <= 0.4 * NumStudents

 THEN 'Second'

 WHEN RankInCategory <= 0.6 * NumStudents

 THEN 'Third'

 WHEN RankInCategory <= 0.8 * NumStudents

 THEN 'Fourth'

 ELSE 'Fifth' END) AS Quintile

FROM

  (SELECT Subjects.SubjectID,

  Students.StudFirstName,

  Students.StudLastName,

  Student_Schedules.ClassStatus,

  Student_Schedules.Grade, Subjects.CategoryID,

  Subjects.SubjectName,

 (SELECT Count(*)

  FROM (Classes

  INNER JOIN Student_Schedules AS SS2

       ON Classes.ClassID = SS2.ClassID)

  INNER JOIN Subjects As S3

       ON S3.SubjectID = Classes.SubjectID

  WHERE S3.CategoryID = 'ENG'

       AND SS2.Grade >=

     Student_Schedules.Grade)

       AS RankInCategory

FROM ((Subjects INNER JOIN Classes

  ON Subjects.SubjectID = Classes.SubjectID)

INNER JOIN Student_Schedules

  ON Student_Schedules.ClassID =

     Classes.ClassID)

INNER JOIN Students

  ON Students.StudentID =

  Student_Schedules.StudentID

WHERE Student_Schedules.ClassStatus = 2 AND

  Subjects.CategoryID = 'ENG') AS S1,

  (SELECT Count(*) AS NumStudents

FROM (Classes AS C2 INNER JOIN

   Student_Schedules AS SS3

  ON C2.ClassID = SS3.ClassID)

 INNER JOIN Subjects AS S2

  ON S2.SubjectID = C2.SubjectID

WHERE SS3.ClassStatus = 2 And S2.CategoryID = 'ENG')

  AS StudCount

ORDER BY S1.Grade DESC;

CH20_English_Student_Quintiles (18 rows)

Subject ID

Stud First Name

Stud Last Name

Class Status

Grade

Category

Subject Name

RankIn Category

Num Students

Quintile

37

Scott

Bishop

2

98.07

ENG

Composition - Fundamentals

1

18

First

37

Sara

Sheskey

2

97.59

ENG

Composition - Fundamentals

2

18

First

37

John

Kennedy

2

93.01

ENG

Composition - Fundamentals

3

18

First

37

Brannon

Jones

2

91.66

ENG

Composition - Fundamentals

4

18

Second

37

Janice

Galvin

2

91.44

ENG

Composition - Fundamentals

5

18

Second

38

Kendra

Bonnicksen

2

88.91

ENG

Composition - Intermediate

6

18

Second

37

George

Chavez

2

88.54

ENG

Composition - Fundamentals

7

18

Second

37

Marianne

Wier

2

87.4

ENG

Composition - Fundamentals

8

18

Third

<< more rows here >>

Note: This query uses the grade from each English class to rank the students, so it is possible to see a student listed more than once if the student has completed more than one English class. To rank students for all English classes taken, you would first have to calculate the average of credits times grade divided by credits for each student and then rank those results.

Bowling League Database

“List all potential matches between teams without duplicating any team pairing.”

To solve this problem, you need two copies of the Teams table in your FROM clause. That will give you all combinations of two teams, but you obviously don’t want to list a team bowling itself. Think about dealing with each team one at a time. When looking at Team 1, you need to match it with any team that has a higher TeamID value. Looking at Team 2, you’ve already matched it with Team 1 on the first pass, but any higher value in TeamID will work. So as long as the TeamID in the second copy of the table has a higher value than the TeamID in the first copy of the table, you’re good to go!

Translation/Clean Up

Select Teams.team ID from the 1st copy of the teams table as team1 ID, Teams.TeamName from the 1st copy of the teams table as team1 name, Teams_1.team ID from the 2nd copy of the teams table as team1 name, and Teams_1.team name from the 2nd copy of the teams table as team2 name from the teams table CROSS JOIN and a 2nd copy of the teams table AS Team_1 where Teams_1.team ID in the 2nd copy of the teams table is greater than > Teams.team ID in the 1st copy of the teams table ordered by Teams.team ID in the 1st copy of the teams table, and Teams_1.team ID in the 2nd copy of the teams table

SQL

SELECT Teams.TeamID AS Team1ID,

  Teams.TeamName AS Team1Name,

  Teams_1.TeamID AS Team2ID,

  Teams_1.TeamName AS Team2Name

FROM Teams, Teams AS Teams_1

WHERE Teams_1.TeamID > Teams.TeamID

ORDER BY Teams.TeamID, Teams_1.TeamID;

CH20_Team_Pairings (45 rows)

Team1ID

Team1Name

Team2ID

Team2Name

1

Marlins

2

Sharks

1

Marlins

3

Terrapins

1

Marlins

4

Barracudas

1

Marlins

5

Dolphins

1

Marlins

6

Orcas

1

Marlins

7

Manatees

1

Marlins

8

Swordfish

1

Marlins

9

Huckleberrys

1

Marlins

10

MintJuleps

2

Sharks

3

Terrapins

2

Sharks

4

Barracudas

<< more rows here >>

Note: You might look at this query and ask: “Couldn’t I also solve this with an INNER JOIN moving the WHERE clause to an ON clause?” You would be absolutely correct for most database systems that support something other than an equijoin in the ON clause. As usual, there’s always more than one way to solve a particular problem using SQL.

Examples Using Driver Tables

Let’s move on to solving some problems using driver tables. All the following solutions use driver tables that I’ve already built for you in the sample databases.

Sales Order Database

“The warehouse manager has asked you to print an identification label for each item in stock.”

You can look up the quantity on hand in the Products table. The trick here is to use a driver table that has one column of integers, and each row has a successive value from 1 to the maximum number you might have in stock. You can use ztblSeqNumbers in the sample database for this purpose.

Translation/Clean Up

Select ztblSeqNumbers.Sequence from the sequence driver table, Products.product number from the products table, and Products.product name from the products table from ztblSeqNumbers the sequence driver table CROSS JOIN and the products table where ztblSeqNumbers.sequence in the sequence driver table is less than or equal to <= Products.quantity on hand in the products table ordered by Products.product number from the products table, and ztblSequenceNumbers.sequence from the sequence driver table

SQL

SELECT ztblSeqNumbers.Sequence,

  Products.ProductNumber,

  Products.ProductName

FROM ztblSeqNumbers, Products

WHERE ztblSeqNumbers.Sequence <=

  Products.QuantityOnHand

ORDER BY Products.ProductNumber,

  ztblSeqNumbers.Sequence;

CH20_Product_Stock_Labels (813 rows)

Sequence

ProductNumber

ProductName

1

1

Trek 9000 Mountain Bike

2

1

Trek 9000 Mountain Bike

3

1

Trek 9000 Mountain Bike

4

1

Trek 9000 Mountain Bike

5

1

Trek 9000 Mountain Bike

6

1

Trek 9000 Mountain Bike

1

2

Eagle FS-3 Mountain Bike

2

2

Eagle FS-3 Mountain Bike

<< more rows here >>

Entertainment Database

“Produce a booking calendar that lists for all weeks in January 2018 any engagement during that week.”

Note: Remember to find an engagement that occurs in any part of a date span you need to find engagements that begin before or on the end date of the span and end after or on the start date of the span. You need to do a similar thing to find weeks in which any part of the week falls in January 2018.

Translation/Clean Up

Select ztblWeeks.week start from the weeks driver table, ztblWeeks.week end from the weeks driver table, Entertainers.entertainer ID from the entertainers table, Entertainers.entertainer stage name from the entertainers table, Customers.customer first name from the customers table, Customers.customer last name from the customers table, Engagements.start date from the engagements table, and Engagements.end date from the engagements table from ztblWeeks the weeks driver table CROSS JOIN and the customers table inner joined with the engagements table on Customers.customer ID in the customers table equals = Engagements.customer ID in the engagements table, then inner joined with the entertainers table on Entertainers.entertainer ID in the entertainers table equals = Engagements.entertainer ID in the engagements table where ztblWeeks.week start in the weeks driver table is less than or equal to <= ‘2018-01-31’ and ztblWeeks.week end in the weeks driver table is greater than or equal to >= ‘2018-01-01’ and Engagements.start date in the engagements table is less than or equal to <= ztblWeeks.week end in the weeks driver table and Engagements.end date in the engagements table is greater than or equal to >= ztblWeeks.week start in the weeks driver table

SQL

SELECT ztblWeeks.WeekStart, ztblWeeks.WeekEnd,

  Entertainers.EntertainerID,

  Entertainers.EntStageName,

  Customers.CustFirstName, Customers. CustLastName,

  Engagements.StartDate, Engagements.EndDate

FROM ztblWeeks, (Customers INNER JOIN Engagements

  ON Customers.CustomerID =

  Engagements.CustomerID)

INNER JOIN Entertainers

  ON Entertainers.EntertainerID =

  Engagements.EntertainerID

WHERE ztblWeeks.WeekStart <= '2018-01-31' AND

ztblWeeks.WeekEnd >= '2018-01-01' AND

Engagements.StartDate <= ztblWeeks.WeekEnd AND

Engagements.EndDate >= ztblWeeks.WeekStart;

CH20_All_Weeks_Jan2018_All_Engagements (50 rows)

Week Start

WeekEnd

Entertainer ID

EntStage Name

CustFirst Name

CustLast Name

Start Date

End Date

2017-12-31

2018-01-05

1001

Carol Peacock Trio

Mark

Rosales

2017-12-30

2018-01-08

2018-01-07

2018-01-12

1001

Carol Peacock Trio

Mark

Rosales

2017-12-30

2018-01-08

2018-01-07

2018-01-12

1001

Carol Peacock Trio

Matt

Berg

2017-01-09

2018-01-09

2018-01-21

2018-01-26

1001

Carol Peacock Trio

Dean

McCrae

2017-01-23

2018-01-31

2018-01-28

2018-02-02

1001

Carol Peacock Trio

Dean

McCrae

2018-01-23

2018-01-31

2017-12-31

2018-01-05

1002

Topazz

Estella

Pundt

2018-01-02

2018-01-10

2018-01-07

2018-01-12

1002

Topazz

Estella

Pundt

2018-01-02

2018-01-10

2017-12-31

2018-01-05

1003

JV & the Deep Six

Carol

Viescas

2017-12-31

2018-01-05

2018-01-07

2018-01-12

1003

JV & the Deep Six

Mark

Rosales

2018-01-09

2018-01-10

2018-01-28

2018-02-02

1003

JV & the Deep Six

Zachary

Ehrlich

2018-01-29

2018-02-02

<< more rows here >>

School Scheduling Database

“Display a list of classes by semester, date, and subject.”

Note: This is a bit tricky because the Classes table has an unnormalized list of days listed as columns. When a class is scheduled for a given day, the value is 1 or “true” for that column. I need to use the ztblSemesterDays driver table and include a class when the semester matches and the day of the week in the driver table has the “flag” turned on in the appropriate day column. Because some database systems use -1 (Microsoft Office Access, for example) and others use 1 for “true,” I will test for not equal to 0 to determine whether the column has a “true” value.

Translation/Clean Up

Select ztblSemesterDays.semester no from the semester driver table, ztblSemesterDays.semester date from the semester driver table, Classes.start time from the classes table, ztblSemesterDays.semester day name from the semester driver table, Subjects.subject code from the subjects table, Subjects.subject name from the subjects table, Class_Rooms.building code from the class rooms table, Class_Rooms.class room ID from the class rooms table from ztblSemesterDays the semester driver table CROSS JOIN and the subjects table then inner joined with the classes table on Subjects.subject ID in the subjects table equals = Classes.subject ID in the classes table then inner joined with the class rooms table on Class_Rooms.class room ID in the class rooms table equals = Classes.class room ID in the classes table where Classes.semester number in the classes table equals = ztblSemesterDays.semester no in the semester driver table and 1 equals = (CASE when ztblSemesterDays.semester day name in the semester driver table equals = ‘Monday’ and Classes. Monday schedule in the classes table does not equal <> 0 then return 1 when ztblSemesterDays.semester day name in the semester driver table equals = ‘Tuesday’ and Classes.Tuesday schedule in the classes table does not equal <> 0 then return 1 when ztblSemesterDays.semester day name in the semester driver table equals = ‘Wednesday’ and Classes.Wednesday schedule in the classes table does not equal <> 0 then return 1 when ztblSemesterDays.semester day name in the semester driver table equals = ‘Thursday’ and Classes.Thursday schedule in the classes table does not equal <> 0 then return 1 when ztblSemesterDays.semester day name in the semester driver table equals = ‘Friday’ and Classes.Friday schedule in the classes table does not equal <> 0 then return 1 when ztblSemesterDays.semester day name in the semester driver table equals = ‘Saturday’ and Classes.Saturday schedule in the classes table does not equal <> 0 then return 1 else return 0 END) ordered by ztblSemesterDays.semester no in the semester driver table, ztblSemesterDays.semester date in the semester driver table, Subjects.subject code in the subjects table, Class_Rooms.building code in the class rooms table, Class_Rooms.class room ID in the class rooms table, and Classes.start time in the classes table

SQL

SELECT ztblSemesterDays.SemesterNo,

  ztblSemesterDays.SemDate, Classes.StartTime,

  ztblSemesterDays.SemDayName, Subjects.SubjectCode,

  Subjects.SubjectName, Class_Rooms.BuildingCode,

  Class_Rooms.ClassRoomID

FROM ztblSemesterDays, (Subjects

INNER JOIN Classes

ON Subjects.SubjectID = Classes.SubjectID)

INNER JOIN Class_Rooms

ON Class_Rooms.ClassRoomID =

 Classes.ClassRoomID

WHERE Classes.SemesterNumber =

 ztblSemesterDays.SemesterNo

AND Classes.StartDate <= ztblSemesterDays.SemDate

AND 1 =

(CASE WHEN ztblSemesterDays.SemDayName = 'Monday'

      AND Classes.MondaySchedule <> 0 THEN 1

 WHEN ztblSemesterDays.SemDayName = 'Tuesday'

      AND Classes.TuesdaySchedule <> 0 THEN 1

 WHEN ztblSemesterDays.SemDayName = 'Wednesday'

      AND Classes.WednesdaySchedule <> 0 THEN 1

 WHEN ztblSemesterDays.SemDayName = 'Thursday'

      AND Classes.ThursdaySchedule <> 0 THEN 1

 WHEN ztblSemesterDays.SemDayName = 'Friday'

      AND Classes.FridaySchedule <> 0 THEN 1

 WHEN ztblSemesterDays.SemDayName = 'Saturday'

      AND Classes.SaturdaySchedule <> 0 THEN 1

 ELSE 0 END)

ORDER BY ztblSemesterDays.SemesterNo,

ztblSemesterDays.SemDate, Subjects.SubjectCode,

Class_Rooms.BuildingCode,

Class_Rooms.ClassRoomID,

Classes.StartTime;

CH20_Class_Schedule_Calendar (7,221 rows)

Semester No

SemDate

Start Time

SemDay Name

Subject Code

Subject Name

Building

Class RoomID

1

2018-09-11

16:00

Monday

ACC 210

Financial Accounting Fundamentals I

IB

3305

1

2018-09-11

15:30

Monday

ART 101

Design

AS

1619

1

2018-09-11

8:00

Monday

ART 111

Drawing

AS

1627

1

2018-09-11

9:00

Monday

ART 111

Drawing

AS

1627

1

2018-09-11

11:00

Monday

ART 251

Art History

LB

1231

1

2018-09-11

14:00

Monday

ART 251

Art History

LB

1231

1

2018-09-11

10:00

Monday

BIO 100

Biological Principles

AS

1532

1

2018-09-11

12:00

Monday

BIO 101

General Biology

AS

1532

1

2018-09-11

13:30

Monday

BIO 280

Microbiology

AS

1530

1

2018-09-11

7:30

Monday

CHE 101

Chemistry

AS

1519

<< more rows here >>

Bowling League Database

“Print a bowler mailing list, but skip the first three labels on the first page that have already been used.”

Note: What you want to do is produce three blank name and address lines to bypass the used labels, and then list all the bowlers and their addresses. You can use ztblSkipLabels in a SELECT statement that substitutes blanks for all the fields and stops when the number in the driver table becomes greater than the number of labels you want to skip. Follow that with a UNION ALL of a SELECT statement to produce the names and addresses for all bowlers. You must use a UNION ALL because a simple UNION would eliminate all the duplicate blank lines you produced in the first query.

Translation / Clean Up

Select blanks ‘ ‘ as bowler last name, blanks ‘ ‘ as bowler first name, blanks ‘ ‘ as bowler address, blanks ‘ ‘ as bowler city, blanks ‘ ‘ as bowler state, blanks ‘ ‘ as bowler zip from ztblSkipLabels the skip labels driver table where the ztblSkipLabels.label count in the skip labels driver table is less than or equal to <= 3 unioned with all rows in select bowler last name, bowler first name, bowler address, bowler city, bowler state, and bowler zip from the bowlers table ordered by bowler zip, and bowler last name

SQL

SELECT ' ' As BowlerLastName, ' ' As

BowlerFirstName,

  ' ' As BowlerAddress, ' ' As BowlerCity,

  ' ' As BowlerState, ' ' As BowlerZip

FROM ztblSkipLabels

WHERE ztblSkipLabels.LabelCount <= 3

UNION ALL

SELECT BowlerLastName, BowlerFirstName,

  BowlerAddress, BowlerCity, BowlerState,

BowlerZip

FROM Bowlers

ORDER BY BowlerZip, BowlerLastName;

CH20_Bowler_Mailing_Skip_3 (35 rows)

BowlerLast Name

BowlerFirst Name

Bowler Address

Bowler City

Bowler State

Bowler Zip

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Patterson

Kathryn

16 Maple Lane

Auburn

WA

98002

Patterson

Rachel

16 Maple Lane

Auburn

WA

98002

Patterson

Ann

16 Maple Lane

Auburn

WA

98002

Patterson

Neil

16 Maple Lane

Auburn

WA

98002

Patterson

Megan

16 Maple Lane

Auburn

WA

98002

Viescas

Carol

16345 NE 32nd Street

Bellevue

WA

98004

Sheskey

Sara

17950 N 59th

Seattle

WA

98011

<< more rows here >>

Summary

I started this chapter with a definition of unlinked data and a discussion of how to use CROSS JOIN to handle unlinked data in your queries. At the end of the first section, I outlined the two cases where a CROSS JOIN can be useful—linking main data tables with each other and linking main data tables with a “driver” table.

I next covered solving problems using main data tables linked to each other and explained a complex example. After that, I covered using a driver table, showing you how to set one up and discussing another two examples. Finally, I showed you and explained examples from four of the sample databases using both main tables linked to each other and a driver table linked to one or more main tables. I encourage you to try to work out the problems presented in the final section that follows.

Problems for You to Solve

The following problems show you the request statement and the name of the solution query in the sample databases. If you want some practice, you can work out the SQL you need for each request and then check your answer with the query I saved in the samples. Don’t worry if your syntax doesn’t exactly match the syntax of the queries I saved—as long as your Result Set is the same.

Sales Order Database

1. “List months and the total sales by products for each month.”

(Hint: Use the ztblMonths driver table I provided.)

You can find the solution in CH20_Product_Sales_ByMonth (253 rows).

2. “Produce a customer mailing list, but skip the five labels already used on the first page of the labels.”

(Hint: Use the ztblSeqNumbers driver table I provided.)

You can find the solution in CH20_Customer_Mailing_Skip_5 (33 rows).

3. “The sales manager wants to send out 10% discount coupons for customers who made large purchases in December 2017. Use the ztblPurchaseCoupons table to determine how many coupons each customer gets based on the total purchases for the month.”

(Hint: You need to CROSS JOIN the driver table with the Customers table joined with a subquery that calculates the total spend for each customer.)

You can find the solution in CH20_Customer_Dec_2017_Order_Coupons (27 rows).

4. “Using the solution to #3 above, print out one 10% off coupon based on the number of coupons each customer earned.”

(Hint: Use the ztblSeqNumbers driver table that I provided with the query in the above problem.)

You can find the solution in CH20_Customer_Discount_Coupons_Print (309 rows).

5. “Display all months in 2017 and 2018, all products, and the total sales (if any) registered for the product in the month.”

(Hint: Use a CROSS JOIN between the ztblMonths driver table and the Products table and use a subquery to fetch the product sales for each product and month.)

You can find the solution in CH20_Product_Sales_All_Months_2017_2018 (960 rows).

6. “Display all products and categorize them from Affordable to Expensive.”

(Hint: Use a CROSS JOIN with ztblPriceRanges.) You can find the solution in CH20_Product_Price_Ranges (40 rows).

Entertainment Database

1. “List all agents and any entertainers who haven’t had a booking since February 1, 2018.”

(Hint: Use a CROSS JOIN between Agents and Entertainers and use NOT IN on a subquery in the WHERE clause to find entertainers not booked since February 1, 2018.)

You can find the solution in CH20_Agents_Entertainers_Unbooked_Feb1_2018 (162 rows).

2. “Show all entertainer styles and the count of the first, second, and third strengths.”

(Hint: This is similar to the CH20_Customer_Style_Preference_Rankings query I showed you earlier. Use a CROSS JOIN of the Musical_Styles table with a subquery that “pivots” the strengths into three columns, then count the columns.)

You can find the solution in CH20_Entertainer_Style_Strength_Rankings (17 rows).

3. “Display customers and their first, second, and third-ranked preferences along with entertainers and their first, second, and third-ranked strengths, then match customers to entertainers when the first or second preference matches the first or second strength.”

(Hint: Create a query on musical styles and customers and pivot the first, second, and third strengths using a CASE expression. You will need to use MAX and GROUP BY because the pivot will return Null values for some of the positions. Do the same with entertainers and musical styles, then CROSS JOIN the two subqueries and return the rows where the preferences and strengths match in the first two positions.)

You can find the solution in CH20_Customers_Match_Entertainers_FirstSecond_PrefStrength (6 rows).

4. “List all months across and calculate each entertainer’s income per month.”

(Hint: Use the ztblMonths driver table to pivot the amounts per month and use SUM to total the amounts per entertainer.)

You can find the solution in CH20_Entertainer_BookingAmount_ByMonth (12 rows).

5. “Display all dates in December 2017 and any entertainers booked on those days.”

(Hint: Build a subquery using a CROSS JOIN between the ztblDays driver table and a JOIN on entertainers and engagements, then LEFT JOIN that with ztblDays again to get all dates.)

You can find the solution in CH20_All_December_Days_Any_Bookings (79 rows).

6. “Produce a customer mailing list, but skip the four labels already used on the first page of labels.”

You can find the solution in CH20_Customer_Mailing_Skip_4 (19 rows).

School Scheduling Database

1. “List all students and the classes they could take, excluding the subjects enrolled or already completed. Be sure to list any subject prerequisite.”

(Hint: Do a CROSS JOIN between students and subjects joined with classes, and use a subquery to eliminate classes found in the student schedules table for the current student where the class status in the student schedules table is not 1 (enrolled) or 2 (completed).)

You can find the solution in CH20_Students_Additional_Courses (1,894 rows).

2. “Display a count of students by gender and marital status by state of residence in columns across.”

(Hint: Use the ztblGenderMatrix and ztblMaritalStatusMatrix driver tables to pivot your values.)

You can find the solution in CH20_Student_Crosstab_Gender_MaritalStatus (4 rows).

3. “Calculate an average proficiency rating for all teaching staff across the subjects they teach and show an overall rating based on the values found in the ztblProfRatings driver table.”

You can find the solution in CH20_Staff_Proficiency_Ratings (24 rows).

4. “Create a mailing list for students, but skip the first two labels already used on the first page.”

You can find the solution in CH20_Student_Mailing_Skip_2 (20 rows).

Bowling League Database

1. “Show bowlers and a rating of their raw score averages based on the values found in the ztblBowlerRatings driver table.”

You can find the solution in CH20_Bowler_Ratings (32 rows).

2. “List all weeks from September through December 2017 and the location of any tournament scheduled for those weeks.”

You can find the solution in CH20_Tournament_Week_Schedule_2017 (19 rows).