CERTIFICATION OBJECTIVES
The three pillars of relational theory are selection, projection, and joining. This chapter focuses on the practical implementation of joining. Rows from different tables are associated with each other using joins. Support for joining has implications for the way data is stored in database tables. Many data models such as third normal form or star schemas have emerged to exploit this feature.
Tables may be joined in several ways. The most common technique is called an equijoin. A row is associated with one or more rows in another table based on the equality of column values or expressions. Tables may also be joined using a nonequijoin. In this case, a row is associated with one or more rows in another table if its column values fall into a range determined by inequality operators.
A less common technique is to associate rows with other rows in the same table. This association is based on columns with logical and usually hierarchical relationships with each other. This is called a self-join. Rows with null or differing entries in common join columns are excluded when equijoins and nonequijoins, collectively known as inner joins, are performed. An outer join is available to fetch these one-legged or orphaned rows if necessary.
A cross join or Cartesian product is formed when every row from one table is joined to all rows in another. This join is often the result of missing or inadequate join conditions but is occasionally intentional.
CERTIFICATION OBJECTIVE 7.01
Write SELECT Statements to Access Data from More Than One Table Using Equijoins and Nonequijoins
This certification objective receives extensive coverage in this chapter. It is crucial to learning the concepts and language for performing joins. Different types of joins are introduced in their primitive forms outlining the broad categories that are available. An in-depth discussion of the various join clauses is then conducted. The modern ANSI-compliant and traditional Oracle syntaxes are discussed, but emphasis is placed on the modern syntax. This section concludes with a discussion of nonequijoins and additional join conditions. Joining is described by focusing on the following eight areas:
Types of joins
Joining tables using ANSI SQL syntax
Qualifying ambiguous column names
The NATURAL JOIN clause
The JOIN USING clause
The JOIN ON clause
N-way joins and additional join conditions
Nonequijoins
Types of Joins
Two basic joins are the equijoin and the nonequijoin. Equijoins are probably more frequently used. Joins may be performed between multiple tables, but much of the following discussion will use two hypothetical tables to illustrate the concepts and language of joins. The first table is called the source and the second is called the target. Rows in the source and target tables comprise one or more columns. As an example, assume that the source and target are the COUNTRIES and REGIONS tables from the HR schema, respectively.
The COUNTRIES table contains three columns named COUNTRY_ID, COUNTRY_NAME, and REGION_ID. The REGIONS table is comprised of two columns named REGION_ID and REGION_NAME. The data in these two tables is related to each other based on the common REGION_ID column. Consider the following queries:
The name of the region to which a country belongs may be determined by obtaining its REGION_ID value. This value is used to join it with the row in the REGIONS table with the same REGION_ID. Query 1 retrieves the column values associated with the row from the COUNTRIES table where the COUNTRY_ID=’CA’. The REGION_ID value of this row is 2. Query 2 fetches the Americas REGION_NAME from the REGIONS table for the row with REGION_ID=2. Equijoining facilitates the retrieval of column values from multiple tables using a single query.
The source and target tables can be swapped, so the REGIONS table could be the source and the COUNTRIES table could be the target. Consider the following two queries:
Query 1 fetches one row with a REGION_ID value of 2. Joining in this reversed manner allows the following question to be asked: What countries belong to the Americas region? The answers from Query 2 are five COUNTRY_NAME values: Argentina, Brazil, Canada, Mexico, and the United States of America. These results may be obtained from a single query that joins the tables together. The language to perform equijoins, nonequijoins, outer joins, and cross joins is introduced next, along with a discussion of the traditional Oracle join syntax.
Inner Joins
The inner join is implemented using three possible join clauses that use the following keywords in different combinations: NATURAL JOIN, USING, and ON.
When the source and target tables share identically named columns, it is possible to perform a natural join between them without specifying a join column. In this scenario, columns with the same names in the source and target tables are automatically associated with each other. Rows with matching column values in both tables are retrieved. The REGIONS and COUNTRIES table both share the REGION_ID column. They may be naturally joined without specifying join columns, as shown in the first two queries in Figure 7-1.
FIGURE 7-1 Natural joins and other inner joins
The NATURAL JOIN keywords instruct Oracle to identify columns with identical names between the source and target tables. Thereafter, a join is implicitly performed between them. In the first query, the REGION_ID column is identified as the only commonly named column in both tables. REGIONS is the source table and appears after the FROM clause. The target table is therefore COUNTRIES. For each row in the REGIONS table, a match for the REGION_ID value is sought from all the rows in the COUNTRIES table. An interim result set is constructed containing rows matching the join condition. This set is then restricted by the WHERE clause. In this case, because the COUNTRY_NAME value must be “Canada”, a REGION_NAME of “Americas” is returned.
The second query shows a natural join where COUNTRIES is the source table. The REGION_ID value for each row in the COUNTRIES table is identified and a search for a matching row in the REGIONS table is initiated. If matches are found, the interim results are limited by any WHERE conditions. The COUNTRY_NAME from rows with “Americas” as their REGION_NAME are returned.
Sometimes more control must be exercised regarding which columns to use for joins. When there are identical column names in the source and target tables you want to exclude as join columns, the JOIN…USING format may be used. Remember that Oracle does not impose any rules stating that columns with the same name in two discrete tables must necessarily have any relationship with each other. The third query explicitly specifies that the REGIONS table be joined to the COUNTRIES table based on common values in their REGION_ID columns. This syntax allows inner joins to be formed on specific columns instead of on all commonly named columns.
The fourth query demonstrates the JOIN…ON format of the inner join, which allows join columns to be explicitly stated. This format does not depend on the columns in the source and target tables having identical names. This form is more general and is the most widely used inner join format.
Be wary when using natural joins since database designers may assign the same name to key or unique columns. These columns may have names like ID or SEQ_NO. If a natural join is attempted between such tables, ambiguous and unexpected results may be returned.
Outer Joins
Not all tables share a perfect relationship, where every record in the source table can be matched to at least one row in the target table. It is occasionally required that rows with nonmatching join column values also be retrieved by a query. This may seem to defeat the purpose of joins but has some practical benefits.
Suppose the EMPLOYEES and DEPARTMENTS tables are joined with common DEPARTMENT_ID values. EMPLOYEES records with null DEPARTMENT_ID values are excluded along with values absent from the DEPARTMENTS table. An outer join fetches these rows.
Cross Joins
A cross join or Cartesian product derives its names from mathematics, where it is also referred to as a cross product between two sets or matrices. This join creates one row of output for every combination of source and target table rows.
If the source and target tables have three and four rows, respectively, a cross join between them results in (3 × 4 = 12) rows being returned. Consider the row counts retrieved from the queries in Figure 7-2.
The first two row counts are performed on the COUNTRIES and REGIONS tables yielding 25 and 4 rows, respectively. Query 3 counts the number of rows returned from a cross join of these tables and yields 100. Query 4 would return 100 records if the WHERE clause was absent. Each of the four rows in the REGIONS table is joined to the one row from the COUNTRIES table. Each row returned contains every column from both tables.
Oracle Join Syntax
A proprietary Oracle join syntax has evolved that is stable and understood by millions of users. This traditional syntax is supported by Oracle and is present in software systems across the world. You will no doubt encounter the traditional Oracle join syntax that is now making way for the standardized ANSI-compliant syntax discussed in this chapter.
The traditional Oracle join syntax supports inner joins, outer joins, and Cartesian joins, as shown in the following queries:
Query 1 performs an inner join by specifying the join as a condition in the WHERE clause. This is the most significant difference between the traditional and ANSI SQL join syntaxes. Take note of the column aliasing using the TABLE.COLUMN_NAME notation to disambiguate the identical column names. This notation is discussed in detail later in this chapter. Query 2 specifies the join between the source and target tables as a WHERE condition. There is a plus symbol enclosed in brackets (+) to the left of the equal sign that indicates to Oracle that a right outer join must be performed. This query returns employees’ LAST_NAME and their matching DEPARTMENT_NAME values. In addition, the outer join retrieves DEPARTMENT_NAME from the rows with DEPARTMENT_ID values not currently assigned to any employee records. Query 3 performs a Cartesian or cross join by excluding the join condition.
The traditional Oracle join syntax is widely used. However, the exam assesses your understanding of joins and the ANSI SQL forms of its syntax. Be prepared, though: some questions may tap your knowledge of the traditional syntax. This knowledge is useful since traditional Oracle syntax is deeply embedded across software systems worldwide.
Joining Tables Using ANSI SQL Syntax
Prior to Oracle 9i, the traditional join syntax was the only language available to join tables. Since then, Oracle has introduced a new language that is compliant with the latest ANSI standards. It offers no performance benefits over the traditional syntax. Inner, outer, and cross joins may be written using both ANSI SQL and traditional Oracle SQL.
The general form of the SELECT statement using ANSI SQL syntax is as follows:
SELECT table1.column, table2.column
FROM table1
[NATURAL JOIN table2] |
[JOIN table2 USING (column_name)] |
[JOIN table2 ON (table1.column_name = table2.column_name)] |
[LEFT | RIGHT | FULL OUTER JOIN table2
ON (table1.column_name = table2.column_name)] |
[CROSS JOIN table2];
This is dissected and examples are explained in the following sections. The general form of the traditional Oracle-proprietary syntax relevant to joins is as follows:
SELECT table1.column, table2.column
FROM table1, table2
[WHERE (table1.column_name = table2.column_name)] |
[WHERE (table1.column_name(+)= table2.column_name)] |
[WHERE (table1.column_name)= table2.column_name) (+)] ;
If no joins or fewer than N-1 joins are specified in the WHERE clause conditions, where N refers to the number of tables in the query, then a Cartesian or cross join is performed. If an adequate number of join conditions is specified, then the first optional conditional clause specifies an inner join, while the second two optional clauses specify the syntax for right and left outer joins.
Qualifying Ambiguous Column Names
Columns with the same names may occur in tables involved in a join. The columns named DEPARTMENT_ID and MANAGER_ID are found in both the EMPLOYEES and DEPARTMENTS tables. The REGION_ID column is present in both the REGIONS and COUNTRIES tables. Listing such columns in a query becomes problematic when Oracle cannot resolve their origin. Columns with unique names across the tables involved in a join cause no ambiguity as Oracle can easily resolve their source table.
The problem of ambiguous column names is addressed with dot notation. A column may be prefixed by its table name and a dot or period symbol to designate its origin. This differentiates it from a column with the same name in another table. Dot notation may be used in queries involving any number of tables. Referencing some columns using dot notation does not imply that all columns must be referenced in this way.
Dot notation is enhanced with table aliases. A table alias provides an alternate, usually shorter name for a table. A column may be referenced as TABLE_NAME.COLUMN_NAME or TABLE_ALIAS.COLUMN_NAME. Consider the query shown in Figure 7-3.
The EMPLOYEES table is aliased with the short name EMP while the DEPARTMENTS table is not. The SELECT clause references the EMPLOYEE_ID and MANAGER_ID columns as EMP.EMPLOYEE_ID and EMP.MANAGER_ID. The MANAGER_ID column from the DEPARTMENTS table is referred to as DEPARTMENTS.MANAGER_ID. Qualifying the EMPLOYEE_ID column using dot notation is unnecessary because there is only one column with this name between the two tables. Therefore, there is no ambiguity.
The MANAGER_ID column must be qualified to avoid ambiguity because it is featured in both tables. Since the JOIN…USING format is applied, only DEPARTMENT_ID is used as the join column. If a NATURAL JOIN was employed, both the DEPARTMENT_ID and MANAGER_ID columns would be used. If the MANAGER_ID column was not qualified, an “ORA-00918:column ambiguously defined” error would be returned. If DEPARTMENT_ID was aliased, an “ORA-25154:column part of USING clause cannot have qualifier” error would be raised.
SQL Developer provides the heading MANAGER_ID to the first reference made in the SELECT clause. The string "_1"
is automatically appended to the second reference, creating the heading MANAGER_ID_1.
Qualifying column references with dot notation to indicate a column’s table of origin has a performance benefit. Time is saved because Oracle is directed instantaneously to the appropriate table and does not have to resolve the table name.
The NATURAL JOIN Clause
The general syntax for the NATURAL JOIN clause is as follows:
SELECT table1.column, table2.column
FROM table1
NATURAL JOIN table2;
The natural join identifies the columns with common names in table1 and table2 and implicitly joins the tables using all these columns. The columns in the SELECT clause may be qualified using dot notation unless they are one of the join columns. Consider the following queries:
The natural join identifies columns with common names between the two tables. In Query 1, COUNTRY_ID occurs in both tables and becomes the join column. Query 2 is written using traditional Oracle syntax and retrieves the same rows as Query 1. Unless you are familiar with the columns in the source and target tables, natural joins must be used with caution, as join conditions are automatically formed between all columns with shared names.
Query 3 performs a natural join between the JOBS and COUNTRIES tables. There are no columns with identical names, resulting in a Cartesian product. Query 4 is equivalent to Query 3, and a Cartesian join is performed using traditional Oracle syntax.
The natural join is simple but prone to a fundamental weakness. It suffers the risk that two columns with the same name might have no relationship and may not even have compatible data types. In Figure 7-4, the COUNTRIES, REGIONS, and SALE_REGIONS tables are described. The SALES_REGIONS table was constructed to illustrate the following important point: Although it has REGION_ID in common with the COUNTRIES table, it cannot be naturally joined to it because their data types are incompatible. The data types of the COUNTRIES.REGION_ID and SALES_REGIONS.REGION_ID columns are NUMBER and VARCHAR2, respectively. The character data cannot be implicitly converted into numeric data and an “ORA-01722: invalid number” error is raised. The REGIONS.REGION_ID column is of type NUMBER and its data is related to the data in the COUNTRIES table. Therefore, the natural join between the REGIONS and COUNTRIES tables works perfectly.
FIGURE 7-4 The natural join
EXERCISE 7-1
Using the NATURAL JOIN
The JOB_HISTORY table shares three identically named columns with the EMPLOYEES table: EMPLOYEE_ID, JOB_ID, and DEPARTMENT_ID. You are required to describe the tables and fetch the EMPLOYEE_ID, JOB_ID, DEPARTMENT_ID, LAST_NAME, HIRE_DATE, and END_DATE values for all rows retrieved using a natural join. Alias the EMPLOYEES table as EMP and the JOB_HISTORY table as JH and use dot notation where possible.
1. Start SQL*Plus and connect to the HR schema.
2. The tables are described using the commands DESC EMPLOYEES and DESC JOB_HISTORY, and the columns with identical names and their data types may be examined.
3. The FROM clause is
4. The JOIN clause is
5. The SELECT clause is
6. Executing this statement returns a single row with the same EMPLOYEE_ID, JOB_ID, and DEPARTMENT_ID values in both tables and is shown in the following illustration:
The JOIN USING Clause
The format of the syntax for the JOIN USING clause is as follows:
SELECT table1.column, table2.column
FROM table1
JOIN table2 USING (join_column1, join_column2…);
While the natural join contains the NATURAL keyword in its syntax, the JOIN…USING syntax does not. An error is raised if the keywords NATURAL and USING occur in the same join clause. The JOIN…USING clause allows one or more equijoin columns to be explicitly specified in brackets after the USING keyword. This avoids the shortcomings associated with the natural join. Many situations demand that tables be joined only on certain columns and this format caters for this requirement. Consider the following queries:
Query 1 specifies that the LOCATIONS and COUNTRIES tables must be joined on common COUNTRY_ID column values. All columns from these tables are retrieved for the rows with matching join column values. Query 2 shows a traditionally specified query that retrieves the same rows as Query 1. The join columns specified with the JOIN…USING syntax cannot be qualified using table names or aliases when they are referenced in the SELECT and JOIN clauses. Since this join syntax potentially excludes some columns with identical names from the join clause, these must be qualified if they are referenced to avoid ambiguity.
As Figure 7-5 shows, the JOB_HISTORY and EMPLOYEES tables were joined based on the presence of equal values in their JOB_ID and EMPLOYEE_ID columns. Rows conforming to this join condition are retrieved. These tables share three identically named columns. In this example, only two of these are specified as join columns. Notice that although the third identically named column is DEPARTMENT_ID, it is qualified with a table alias to avoid ambiguity while the join columns specified in the SELECT clause cannot be qualified with table aliases.
FIGURE 7-5 Natural join using the JOIN…USING clause
The JOIN ON Clause
The format of the syntax for the JOIN ON clause is as follows:
SELECT table1.column, table2.column
FROM table1
JOIN table2 ON (table1.column_name = table2.column_name);
The natural join and the JOIN…USING clauses depend on join columns with identical column names. The JOIN…ON clause allows the explicit specification of join columns, regardless of their column names. This is the most flexible and widely used form of the join clauses. The ON and NATURAL keywords cannot appear together in a join clause. The equijoin columns are fully qualified as table1.column1 = table2.column2 and are optionally specified in brackets after the ON keyword. The following queries illustrate the JOIN…ON clause:
Query 1 retrieves all column values from both the DEPARTMENTS and EMPLOYEES tables for the rows that meet an equijoin condition. This condition is fulfilled by EMPLOYEE_ID values matching DEPARTMENT_ID values in the DEPARTMENTS table. The traditional Oracle syntax in Query 2 returns the same results as Query 1. Notice the similarities between the traditional join condition specified in the WHERE clause and the join condition specified after the ON keyword.
The START_DATE column in the JOB_HISTORY table is joined to the HIRE_DATE column in the EMPLOYEES table in Figure 7-6. This equijoin retrieves the details of employees who worked for the organization and changed jobs.
FIGURE 7-6 Inner join using the JOIN…ON clause
SCENARIO & SOLUTION |
You are required to retrieve information from multiple tables, group the results, and apply an aggregate function to them. Can a group function be used against data from multiple table sources? |
Yes. Joining multiple tables ultimately yields a set of data comprising one or more rows and columns. Once the dataset is created, aggregate functions treat it as if the data originated from one source. |
When joining two tables, there is a risk that between them they contain common column names. Does Oracle know which tables to fetch data from if such columns are present in the SELECT list? |
No. Oracle does not know from which tables such columns originate, and an error is raised. Ambiguous column references can be avoided using qualifiers. Qualifiers employ dot notation to clarify a column’s table of origin. |
The NATURAL JOIN clause is used to join rows from two tables based on columns with common names sharing identical values. Is it possible to join two tables based on some of the shared columns and not all of them? |
Yes. The clause recommended to join two tables based on one or more of the columns with identical names is JOIN…USING. A pair of brackets follows the USING clause in which the unqualified join columns are specified. |
EXERCISE 7-2
Using the NATURAL JOIN…ON Clause
Each record in the DEPARTMENTS table has a MANAGER_ID column matching an EMPLOYEE_ID value in the EMPLOYEES table. You are required to produce a report with one column aliased as Managers. Each row must contain a sentence of the format FIRST_NAME LAST_NAME is manager of the DEPARTMENT_NAME department. Alias the EMPLOYEES table as E and the DEPARTMENTS table as D and use dot notation where possible.
1. Start SQL Developer and connect to the HR schema.
2. The Managers column may be constructed by concatenating the required items and separating them with spaces.
3. The SELECT clause is
4. The FROM clause is
5. The JOIN…ON clause is
6. Executing this statement returns 11 rows describing the managers of each department as shown in the following illustration:
N-Way Joins and Additional Join Conditions
The joins just discussed were demonstrated using two tables. There is no restriction on the number of tables that may be joined. Third normal form consists of a set of tables connected through a series of primary- and foreign-key relationships. Traversing these relationships using joins enables consistent and reliable retrieval of data. However, there are instances when primary- and foreign-key relationships are not defined between tables. These tables may also be joined, but the results do not benefit from referential integrity being enforced by the database. When multiple joins exist in a statement, they are evaluated from left to right. Consider the following query using a mixture of natural joins and Oracle joins:
The natural join between DEPARTMENTS and LOCATIONS creates an interim result set consisting of 27 rows since they are implicitly joined on the LOCATION_ID column. This set is then Cartesian-joined to the COUNTRIES table since a join condition is not implicitly or explicitly specified. The 27 interim rows are joined to the 25 rows in the COUNTRIES table, yielding a new interim results set with 675 (27 × 25) rows and three columns: DEPARTMENT_NAME, CITY, and COUNTRY_NAME. This set is then joined to the REGIONS table. Once again, a Cartesian join occurs because the REGION_ID column is absent from any join condition. The final result set contains 2700 (675 × 4) rows and four columns. Using natural joins mixed with Oracle joins is error prone and not recommended since join conditions may sometimes be erroneously omitted.
The JOIN…USING and JOIN…ON syntaxes are better suited for joining multiple tables. The following query joins four tables using the natural join syntax:
This query correctly yields 27 rows in the final results set since the required join columns are listed in the SELECT clause. The following query demonstrates how the JOIN…ON clause is used to fetch the same 27 rows. A join condition can reference only columns in its scope. In the following example, the join from DEPARTMENTS to LOCATIONS may not reference columns in the COUNTRIES or REGIONS tables, but the join between COUNTRIES and REGIONS may reference any column from the four tables involved in the query.
The JOIN…USING clause can also be used to join these four tables as follows:
The WHERE clause is used to specify conditions that restrict the results set of a query whether it contains joins or not. The JOIN…ON clause is also used to specify conditions that limit the results set created by the join. Consider the following two queries:
Query 1 uses a WHERE clause to restrict the 27 rows created by equijoining the DEPARTMENTS and LOCATIONS tables based on their LOCATION_ID values to the three that contain DEPARTMENT_ID values beginning with the letter “P”. Query 2 implements the condition within the brackets of the ON subclause and returns the same three rows.
Five tables are joined in Figure 7-7, resulting in a list describing the top-earning employees and geographical information about their departments.
FIGURE 7-7 N-way joins and additional join conditions
There are three equijoin or inner join formats. The natural join uses the NATURAL JOIN clause and joins two tables based on all columns with shared names. The other two formats use the JOIN…USING and JOIN…ON clauses. Pay attention to the syntax, since a join clause such as SELECT * FROM TABLE1 NATURAL JOIN TABLE2 USING (COLUMN) may appear correct, but is, in fact, syntactically incorrect. Remember that the USING, ON, and NATURAL keywords are mutually exclusive in the context of the same join clause.
Nonequijoins
Nonequijoins match column values from different tables based on an inequality expression. The value of the join column in each row in the source table is compared to the corresponding values in the target table. A match is found if the expression used in the join, based on an inequality operator, evaluates to true. When such a join is constructed, a nonequijoin is performed.
A nonequijoin is specified using the JOIN…ON syntax, but the join condition contains an inequality operator instead of an equal sign.
The format of the syntax for a nonequijoin clause is as follows:
SELECT table1.column, table2.column
FROM table1
[JOIN table2 ON (table1.column_name < table2.column_name)]|
[JOIN table2 ON (table1.column_name > table2.column_name)]|
[JOIN table2 ON (table1.column_name <= table2.column_name)]|
[JOIN table2 ON (table1.column_name >= table2.column_name)]|
[JOIN table2 ON (table1.column BETWEEN table2.col1 AND table2.col2)]
Consider the 16 rows returned by the query in Figure 7-8. The EMPLOYEES table is nonequijoined to the JOBS table based on the inequality join condition (2*E.SALARY < J.MAX_SALARY)
. The JOBS table stores the salary range for different jobs in the organization. The SALARY value for each employee record is doubled and compared with all MAX_SALARY values in the JOBS table. If the join condition evaluates to true, the row is returned.
Nonequijoins are not as commonly used as equijoins. The BETWEEN range operator often appears with nonequijoin conditions. It is simpler to use one BETWEEN operator in a condition than two nonequijoin conditions based on the less than or equal to (<=) and the greater than or equal to (>=) operators.
CERTIFICATION OBJECTIVE 7.02
Join a Table to Itself Using a Self-Join
Storing hierarchical data in a single relational table may be accomplished by allocating at least two columns per row. One column stores an identifier of the row’s parent record and the second stores the row’s identifier. Associating rows with each other based on a hierarchical relationship requires Oracle to join a table to itself. This self-join technique is discussed in the next section.
Joining a Table to Itself Using the JOIN…ON Clause
Suppose there is a need to store a family tree in a relational table. There are several approaches one could take. One option is to use a table called FAMILY with columns named ID, NAME, MOTHER_ID, and FATHER_ID, where each row stores a person’s name, unique ID number, and the ID values for their parents.
When two tables are joined, each row from the source table is subjected to the join condition with rows from the target table. If the condition evaluates to true, then the joined row, consisting of columns from both tables, is returned.
When the join columns originate from the same table, a self-join is required. Conceptually, the source table is duplicated to create the target table. The self-join works like a regular join between these tables. Note that, internally, Oracle does not duplicate the table and this description is merely provided to explain the concept of self-joining. Consider the following three queries:
To identify a person’s father in the FAMILY table, you could use Query 1 to get that person’s ID, NAME, and FATHER_ID value. In Query 2, the FATHER_ID value obtained from the first query can be substituted to obtain the father’s NAME value. Notice that both Queries 1 and 2 source information from the FAMILY table.
Query 3 performs a self-join with the JOIN…ON clause by aliasing the FAMILY table as f1 and f2. Oracle treats these as different tables even though they point to the same physical table. The first occurrence of the FAMILY table, aliased as f1, is designated as the source table, while the second occurrence, aliased as f2, is assigned as the target table. The join condition in the ON clause is of the format source.child_id=target.parent_id. Figure 7-9 shows a sample of FAMILY data and demonstrates a three-way self-join to the same table.
EXERCISE 7-3
Performing a Self-Join
There is a hierarchical relationship between employees and their managers. For each row in the EMPLOYEES table, the MANAGER_ID column stores the EMPLOYEE_ID of every employee’s manager. Using a self-join on the EMPLOYEES table, you are required to retrieve the employee’s LAST_NAME, EMPLOYEE_ID, MANAGER_ID, manager’s LAST_NAME, and employee’s DEPARTMENT_ID for the rows with DEPARMENT_ID values of 10, 20, or 30. Alias the EMPLOYEES table as E and the second instance of the EMPLOYEES table as M. Sort the results based on the DEPARTMENT_ID column.
1. Start SQL Developer and connect to the HR schema.
2. The SELECT clause is
3. The FROM clause with source table and alias is
4. The JOIN…ON clause with aliased target table is
5. The WHERE clause is
6. The ORDER BY clause is
7. Executing this statement returns nine rows describing the managers of each employee in these departments as shown in the following illustration:
CERTIFICATION OBJECTIVE 7.03
View Data That Does Not Meet a Join Condition by Using Outer Joins
Equijoins match rows between two tables based on the equality of the column data stored in each table. Nonequijoins rely on matching rows between tables based on a join condition containing an inequality expression. Target table rows with no matching join column in the source table are usually not required. When they are required, however, an outer join is used to fetch them. Several variations of outer joins may be used depending on whether join column data is missing from the source or target tables or both. These outer join techniques are described in the following topics:
Inner versus outer joins
Left outer joins
Right outer joins
Full outer joins
Inner Versus Outer Joins
When equijoins and nonequijoins are performed, rows from the source and target tables are matched using a join condition formulated with equality and inequality operators, respectively. These are referred to as inner joins. An outer join is performed when rows, which are not retrieved by an inner join, are returned.
Two tables sometimes share a master-detail or parent-child relationship. In the sample HR schema there are several pairs of tables with such a relationship. One pair is the DEPARTMENTS and EMPLOYEES tables. The DEPARTMENTS table stores a master list of DEPARTMENT_NAME and DEPARTMENT_ID values. Each EMPLOYEES record has a DEPARTMENT_ID column constrained to be either a value that exists in the DEPARTMENTS table or null. This leads to one of the following three scenarios. The fourth scenario could occur if the constraint between the tables was removed.
1. An employee row has a DEPARTMENT_ID value that matches a row in the DEPARTMENTS table.
2. An employee row has a null value in its DEPARTMENT_ID column.
3. There are rows in the DEPARTMENTS table with DEPARTMENT_ID values that are not stored in any employee records.
4. An employee row has a DEPARTMENT_ID value that is not featured in the DEPARTMENTS table.
The first scenario describes an inner join between the two tables. The second and third scenarios cause many problems. Joining the EMPLOYEES and DEPARTMENTS tables on the DEPARTMENT_ID column may result in rows with null DEPARTMENT_ID values being excluded. An outer join can be used to include these orphaned rows in the results set. The fourth scenario should rarely occur in a well-designed database, because foreign-key constraints would prevent the insertion of child records with no parent values. Since this row will be excluded by an inner join, it may be retrieved using an outer join.
A left outer join between the source and target tables returns the results of an inner join as well as rows from the source table excluded by that inner join. A right outer join between the source and target tables returns the results of an inner join as well as rows from the target table excluded by that inner join. If a join returns the results of an inner join as well as rows from both the source and target tables excluded by that inner join, then a full outer join has been performed.
Left Outer Joins
The format of the syntax for the LEFT OUTER JOIN clause is as follows:
SELECT table1.column, table2.column
FROM table1
LEFT OUTER JOIN table2
ON (table1.column = table2.column);
A left outer join performs an inner join of table1 and table2 based on the condition specified after the ON keyword. Any rows from the table on the left of the JOIN keyword excluded for not fulfilling the join condition are also returned. Consider the following two queries:
Queries 1 and 2 are identical except for the join clauses, which have the keywords LEFT OUTER JOIN and JOIN, respectively. Query 2 performs an inner join and seven rows are returned. These rows share identical DEPARTMENT_ID values in both tables. Query 1 returns the same seven rows and one additional row. This extra row is obtained from the table to the left of the JOIN keyword, which is the DEPARTMENTS table. It is the row containing details of the Payroll department. The inner join does not include this row since no employees are currently assigned to the department.
A left outer join is shown in Figure 7-10. The inner join produces 27 rows with matching LOCATION_ID values in both tables. There are 43 rows in total, which implies that 16 rows were retrieved from the LOCATIONS table, which is on the left of the JOIN keyword. None of the rows from the DEPARTMENTS table contain any of these 16 LOCATION_ID values.
FIGURE 7-10 Left outer join
Right Outer Joins
The format of the syntax for the RIGHT OUTER JOIN clause is as follows:
SELECT table1.column, table2.column
FROM table1
RIGHT OUTER JOIN table2
ON (table1.column = table2.column);
A right outer join performs an inner join of table1 and table2 based on the join condition specified after the ON keyword. Rows from the table to the right of the JOIN keyword, excluded by the join condition, are also returned. Consider the following query:
The inner join produces seven rows containing details for the employees with LAST_NAME values that begin with the letter “G”. The EMPLOYEES table is to the right of the JOIN keyword. Any employee records which do not conform to the join condition are included, provided they conform to the WHERE clause condition. In addition, the right outer join fetches one EMPLOYEE record with a LAST_NAME of Grant. This record currently has a null DEPARTMENT_ID value. The inner join excludes the record since no DEPARTMENT_ID is assigned to this employee.
A right outer join between the JOB_HISTORY and EMPLOYEES tables is shown in Figure 7-11. The EMPLOYEES table is on the right of the JOIN keyword. The DISTINCT keyword eliminates duplicate combinations of JOB_ID values from the tables. The results show the jobs that employees have historically left. The jobs that no employees have left are also returned. These have a null value in the “Jobs in JOB_HISTORY” column.
FIGURE 7-11 Right outer join
There are three types of outer join formats. Each of them performs an inner join before including rows the join condition excluded. If a left outer join is performed, then rows excluded by the inner join, to the left of the JOIN keyword, are also returned. If a right outer join is performed, then rows excluded by the inner join, to the right of the JOIN keyword, are returned as well. The full outer join performs an inner join as well as a left and right outer join.
Full Outer Joins
The format of the syntax for the FULL OUTER JOIN clause is as follows:
SELECT table1.column, table2.column
FROM table1
FULL OUTER JOIN table2
ON (table1.column = table2.column);
A full outer join returns the combined results of a left and right outer join. An inner join of table1 and table2 is performed before rows excluded by the join condition from both tables are merged into the results set.
The traditional Oracle join syntax does not support a full outer join, which is typically performed by combining the results from a left and right outer join using the UNION set operator described in Chapter 9. Consider the full outer join shown in Figure 7-12. The WHERE clause restricting the results to rows with NULL DEPARTMENT_ID values shows the orphan rows in both tables. There is one record in the EMPLOYEES table that has no DEPARTMENT_ID values, and there are 16 departments to which no employees belong.
FIGURE 7-12 Full outer join
EXERCISE 7-4
Performing an Outer Join
The DEPARTMENTS table contains details of all departments in the organization. You are required to retrieve the DEPARTMENT_NAME and DEPARTMENT_ID values for those departments to which no employees are currently assigned.
1. Start SQL*Plus and connect to the HR schema.
2. The SELECT clause is
3. The FROM clause with source table and alias is
4. The LEFT OUTER JOIN clause with aliased target table is
5. The WHERE clause is
6. Executing this statement returns 16 rows describing the departments to which no employees are currently assigned as shown in the following illustration:
SCENARIO & SOLUTION |
The data in two tables you wish to join is related but does not share any identically named columns. Is it possible to join tables using columns that do not share the same name? |
Yes. The JOIN…ON clause is provided for this purpose. It provides a flexible and generic solution to joining tables based on nonidentical column names. |
You wish to divide staff into four groups named after the four regions in the REGIONS table. Is it possible to obtain a list of EMPLOYEE_ID, LAST_NAME, and REGION_NAME values for each employee by joining the EMPLOYEE_ID and REGION_ID columns in a round-robin manner? |
Yes. The REGION_ID value ranges from 1 to 4. Adding 1 to the remainder of EMPLOYEE_ID divided by 4 creates a value in the range 1 to 4. The round-robin assignment of employees may be done as follows:
SELECT LAST_NAME, EMPLOYEE_ID, REGION_NAME FROM EMPLOYEES JOIN REGIONS ON (MOD(EMPLOYEE_ID,4)+1= REGION_ID)
|
You are required to retrieve a list of DEPARTMENT_NAME and LAST_NAME values for all departments, including those that currently have no employees assigned to them. In such cases the string 'No Employees' should be displayed as the LAST_NAME column value. Can this be done using joins? |
Yes. Depending on which side of the JOIN keyword the DEPARTMENTS table is placed, a left or right outer join may be used, since this is the table where the orphan rows originate. The following query satisfies the request:
SELECT DEPARTMENT_NAME, NVL(LAST_NAME,'No Employees') FROM EMPLOYEES RIGHT OUTER JOIN DEPARTMENTS USING (DEPARTMENT_ID)
|
CERTIFICATION OBJECTIVE 7.04
Generate a Cartesian Product of Two or More Tables
A Cartesian product of two tables may be conceptualized as joining each row of the source table with every row in the target table. The number of rows in the result set created by a Cartesian product is equal to the number of rows in the source table multiplied by the number of rows in the target table. Cartesian products may be formed intentionally using the ANSI SQL cross join syntax. This technique is described in the next section.
Creating Cartesian Products Using Cross Joins
Cartesian product is a mathematical term. It refers to the set of data created by merging the rows from two or more tables together. Cross join is the syntax used to create a Cartesian product by joining multiple tables. Both terms are often used synonymously. The format of the syntax for the CROSS JOIN clause is as follows:
SELECT table1.column, table2.column
FROM table1
CROSS JOIN table2;
It is important to observe that no join condition is specified using the ON or USING keywords. A Cartesian product freely associates the rows from table1 with every row in table2. Conditions that limit the results are permitted in the form of WHERE clause restrictions. If table1 and table2 contain x and y number of rows, respectively, the Cartesian product will contain x times y number of rows. The results from a cross join may be used to identify orphan rows or generate a large data set for use in application testing. Consider the following queries:
Query 1 takes the 19 rows and 4 columns from the JOBS table and the 10 rows and 5 columns from the JOB_HISTORY table and generates one large set of 190 records with 9 columns. SQL*Plus presents any identically named columns as headings. SQL Developer appends an underscore and number to each shared column name and uses it as the heading. The JOB_ID column is common to both the JOBS and JOB_HISTORY tables. The headings in SQL Developer are labeled JOB_ID and JOB_ID_1, respectively. Query 2 generates the same Cartesian product as the first, but the 190 rows are constrained by the WHERE clause condition and only 10 rows are returned.
When using the cross join syntax, a Cartesian product is intentionally generated. Inadvertent Cartesian products are created when there are insufficient join conditions in a statement. Joins that specify fewer than N-1 join conditions when joining N tables or that specify invalid join conditions may inadvertently create Cartesian products. A natural join between two tables sharing no identically named columns results in a Cartesian join since two tables are joined but less than one condition is available.
Figure 7-13 shows a cross join between the REGIONS and COUNTRIES tables. There are 4 rows in REGIONS and 25 rows in COUNTRIES. Since the WHERE clause limits the REGIONS table to 2 of 4 rows, the Cartesian product produces 50 (25 × 2) records. The results are sorted alphabetically, first on the REGION_NAME and then on the COUNTRY_NAME. The first record has the pair of values, Asia and Argentina. When the REGION_NAME changes, the first record has the pair of values, Middle East and Africa and Argentina. Notice that the COUNTRY_NAME values are repeated for every REGION_NAME.
FIGURE 7-13 The cross join
EXERCISE 7-5
Performing a Cross Join
You are required to obtain the number of rows in the EMPLOYEES and DEPARTMENTS tables as well as the number of records that would be created by a Cartesian product of these two tables. Confirm your results by explicitly counting and multiplying the number of rows present in each of these tables.
1. Start SQL*Plus and connect to the HR schema.
2. The SELECT clause to find the number of rows in the Cartesian product is
3. The FROM clause is
4. The Cartesian product is performed using
5. Explicit counts of the rows present in the source tables are performed using
6. Explicit multiplication of the values resulting from the previous queries may be performed by querying the DUAL table.
7. Executing these statements reveals that there are 107 records in the EMPLOYEES table, 27 records in the DEPARTMENTS table, and 2,889 records in the Cartesian product of these two data sets as shown in the following illustration:
INSIDE THE EXAM
Joining is a fundamental relational principle. The certification objectives in this chapter are examined using practical scenarios in which two tables are joined. You are required to predict the number of rows returned by a join query or to assess whether it is syntactically correct or not. The inner join clauses include NATURAL JOIN, JOIN…USING, and JOIN…ON.
Remember the following simple rules. The keywords NATURAL, USING, and ON are mutually exclusive. They may not be used together in the same join clause. The NATURAL join takes no join conditions. The JOIN…USING clause requires unqualified column references in join conditions, which must appear in brackets after the USING keyword.
Self-joins are often used for searching through hierarchical data stored in separate columns in the same table. It is an uncommon join, and little emphasis is placed on testing your knowledge of self-joins in the exam. Outer joins, however, form a significant part of the exam content. Ensure that you have a solid understanding of LEFT, RIGHT, and FULL OUTER joins.
Cartesian products may be created inadvertently or intentionally using the CROSS JOIN clause. A mistake frequently made in the early stages of learning about joins is to specify fewer join conditions than are necessary when joining multiple tables. This leads to accidental Cartesian joins and is sometimes tested in the exams. Remember that when joining N tables, at least N-1 join conditions are required to avoid a Cartesian join.
CERTIFICATION SUMMARY
Data stored in separate tables may be associated with each other using various types of joins. Joins allow data to be stored in a relational manner. This prevents the need for multiple copies of the same data across multiple tables.
Equijoins and nonequijoins are referred to as inner joins. They associate rows from multiple tables that conform to join conditions and are specified using either equality or inequality operators. Rows that do not conform to these join conditions, which are ordinarily excluded by inner joins, may be retrieved with outer joins. Left, right, and full outer joins facilitate the retrieval of orphan rows.
The ANSI SQL join syntax is discussed in detail, and three forms of the inner join are explored. Each form has a purpose, and the advantages and risks associated with them are considered.
Joins associate columns from multiple tables that may share the same name. Dot notation uses a method of qualifying columns to disambiguate them. It is accompanied by table aliasing, which is not strictly essential but helps a great deal when formulating joins between tables with lengthy names.
The retrieval of hierarchical data stored in a single table using self-joins is considered. N-way joins allow more than two tables to be joined, and this generalized option is discussed. Finally, cross joins and the unique challenges associated with them are examined.
Joining is one of the fundamental pillars of relational theory and is critical to your successful exploitation of the full potential that SQL offers.
TWO-MINUTE DRILL
Write SELECT Statements to Access Data from More Than One Table Using Equijoins and Nonequijoins
Equijoining occurs when one query fetches column values from multiple tables in which the rows fulfill an equality-based join condition.
A natural join is performed using the NATURAL JOIN syntax when the source and target tables are implicitly equijoined using all identically named columns.
The JOIN…USING syntax allows an inner join to be formed on specific columns with shared names.
Dot notation refers to qualifying a column by prefixing it with its table name and a dot or period symbol. This designates the table a column originates from and differentiates it from identically named columns from other tables.
The JOIN…ON clause allows the explicit specification of join columns regardless of their column names. This provides a flexible joining format.
The ON, USING, and NATURAL keywords are mutually exclusive and therefore cannot appear together in a join clause.
A nonequijoin is performed when the values in the join columns fulfill the join condition based on an inequality expression.
Join a Table to Itself Using a Self-Join
A self-join is required when the join columns originate from the same table. Conceptually, the source table is duplicated and a target table is created. The self-join then works as a regular join between two discrete tables.
Storing hierarchical data in a relational table requires a minimum of two columns per row. One column stores an identifier of the row’s parent record and the second stores the row’s identifier.
View Data That Does Not Meet a Join Condition Using Outer Joins
When equijoins and nonequijoins are performed, rows from the source and target tables are matched. These are referred to as inner joins.
An outer join is performed when rows, which are not retrieved by an inner join, are included for retrieval in addition to the rows retrieved by the inner join.
A left outer join between the source and target tables returns the results of an inner join and the missing rows it excluded from the source table.
A right outer join between the source and target tables returns the results of an inner join and the missing rows it excluded from the target table.
A full outer join returns the combined results of a left outer join and right outer join.
Generate a Cartesian Product of Two or More Tables
A Cartesian product is sometimes called a cross join. It is a mathematical term that refers to the set of data created by merging the rows from two or more tables.
The count of the rows returned from a Cartesian product is equal to the number of rows in the source table multiplied by the number of rows in the target table.
Joins that specify fewer than N-1 join conditions when joining N tables, or that specify invalid join conditions, inadvertently create Cartesian products.
SELF TEST
The following questions will help you measure your understanding of the material presented in this chapter. Read all the choices carefully because there may be more than one correct answer. Choose all the correct answers for each question.
Write SELECT Statements to Access Data from More Than One Table Using Equijoins and Nonequijoins
1. The EMPLOYEES and DEPARTMENTS tables have two identically named columns: DEPARTMENT_ID and MANAGER_ID. Which of these statements joins these tables based only on common DEPARTMENT_ID values? (Choose all that apply.)
A. SELECT * FROM EMPLOYEES NATURAL JOIN DEPARTMENTS;
B. SELECT * FROM EMPLOYEES E NATURAL JOIN DEPARTMENTS D ON E.DEPARTMENT_ID=D.DEPARTMENT_ID;
C. SELECT * FROM EMPLOYEES NATURAL JOIN DEPARTMENTS USING (DEPARTMENT_ID);
D. None of the above
2. The EMPLOYEES and DEPARTMENTS tables have two identically named columns: DEPARTMENT_ID and MANAGER_ID. Which statements join these tables based on both column values? (Choose all that apply.)
A. SELECT * FROM EMPLOYEES NATURAL JOIN DEPARTMENTS;
B. SELECT * FROM EMPLOYEES JOIN DEPARTMENTS USING (DEPARTMENT_ID,MANAGER_ID);
C. SELECT * FROM EMPLOYEES E JOIN DEPARTMENTS D ON E.DEPARTMENT_ID=D.DEPARTMENT_ID AND E.MANAGER_ID=D.MANAGER_ID;
D. None of the above
3. Which join is performed by the following query? (Choose the best answer.)
A. Equijoin
B. Nonequijoin
C. Cross join
D. Outer join
4. Which of the following statements are syntactically correct? (Choose all that apply.)
A. SELECT * FROM EMPLOYEES E JOIN DEPARTMENTS D USING (DEPARTMENT_ID);
B. SELECT * FROM EMPLOYEES JOIN DEPARTMENTS D USING (D.DEPARTMENT_ID);
C. SELECT D.DEPARTMENT_ID FROM EMPLOYEES JOIN DEPARTMENTS D USING (DEPARTMENT_ID);
D. None of the above
5. Which of the following statements are syntactically correct? (Choose all that apply.)
A. SELECT E.EMPLOYEE_ID, J.JOB_ID PREVIOUS_JOB, E.JOB_ID CURRENT_JOB FROM JOB_HISTORY J CROSS JOIN EMPLOYEES E ON (J.START_DATE=E.HIRE_DATE);
B. SELECT E.EMPLOYEE_ID, J.JOB_ID PREVIOUS_JOB, E.JOB_ID CURRENT_JOB FROM JOB_HISTORY J JOIN EMPLOYEES E ON (J.START_DATE=E.HIRE_DATE);
C. SELECT E.EMPLOYEE_ID, J.JOB_ID PREVIOUS_JOB, E.JOB_ID CURRENT_JOB FROM JOB_HISTORY J OUTER JOIN EMPLOYEES E ON (J.START_DATE=E.HIRE_DATE);
D. None of the above
6. Choose one correct statement regarding the following query:
A. Joining three tables is not permitted.
B. A Cartesian product is generated.
C. The JOIN…ON clause may be used for joins between multiple tables.
D. None of the above
Join a Table to Itself Using a Self-Join
7. How many rows are returned after executing the following statement? (Choose the best answer.)
The REGIONS table contains the following row data:
REGION_ID |
REGION_NAME |
1 |
Europe |
2 |
Americas |
3 |
Asia |
4 |
Middle East and Africa |
A. 2
B. 3
C. 4
D. None of the above
View Data that Does Not Meet a Join Condition Using Outer Joins
8. Choose one correct statement regarding the following query.
A. No rows in the LOCATIONS table have the COUNTRY_ID values returned.
B. No rows in the COUNTRIES table have the COUNTRY_ID values returned.
C. The rows returned represent the COUNTRY_ID values for all the rows in the LOCATIONS table.
D. None of the above
9. Which of the following statements are syntactically correct? (Choose all that apply.)
A. SELECT JH.JOB_ID FROM JOB_HISTORY JH RIGHT OUTER JOIN JOBS J ON JH.JOB_ID=J.JOB_ID;
B. SELECT JOB_ID FROM JOB_HISTORY JH RIGHT OUTER JOIN JOBS J ON (JH.JOB_ID=J.JOB_ID);
C. SELECT JOB_HISTORY.JOB_ID FROM JOB_HISTORY OUTER JOIN JOBS ON JOB_HISTORY.JOB_ID=JOBS.JOB_ID;
D. None of the above
Generate a Cartesian Product of Two or More Tables
10. If the REGIONS table, which contains 4 rows, is cross joined to the COUNTRIES table, which contains 25 rows, how many rows appear in the final results set? (Choose the best answer.)
A. 100 rows
B. 4 rows
C. 25 rows
D. None of the above
LAB QUESTION
Using SQL Developer or SQL*Plus, connect to the OE schema and complete the following tasks.
You are required to produce a report of customers who purchased products with list prices of more than $1,000. The report must contain customer first and last names, as well as the product names and their list prices. Customer information is stored in the CUSTOMERS table, which has the CUSTOMER_ID column as its primary key. The product name and list price details are stored in the PRODUCT_INFORMATION table with the PRODUCT_ID column as its primary key. Two other related tables may assist in generating the required report: the ORDERS table, which stores the CUSTOMER_ID and ORDER_ID information, and the ORDER_ITEMS table, which stores the PRODUCT_ID values associated with each ORDER_ID.
There are several approaches to solving this question. Your approach may differ from the solution listed.
SELF TEST ANSWERS
Write SELECT Statements to Access Data from More Than One Table Using Equijoins and Nonequijoins
1. D. The queries in B and C incorrectly contain the NATURAL keyword. If this is removed, they will join the DEPARTMENTS and EMPLOYEES tables based on the DEPARTMENT_ID column.
A, B, and C are incorrect. A performs a natural join that implicitly joins the two tables on all columns with identical names which, in this case, are DEPARTMENT_ID and MANAGER_ID.
2. A, B, and C. These clauses demonstrate different techniques to join the tables on both the DEPARTMENT_ID and MANAGER_ID columns.
D is incorrect.
3. B. The join condition is an expression based on the less than inequality operator. Therefore, this join is a nonequijoin.
A, C, and D are incorrect. A would be correct if the operator in the join condition expression was an equality operator. The CROSS JOIN keywords or the absence of a join condition would result in C being true. D would be true if one of the OUTER JOIN clauses was used instead of the JOIN…ON clause.
4. A. This statement demonstrates the correct usage of the JOIN…USING clause.
B, C, and D are incorrect. B is incorrect since only nonqualified column names are allowed in the brackets after the USING keyword. C is incorrect because the column in brackets after the USING keyword cannot be referenced with a qualifier in the SELECT clause.
5. B demonstrates the correct usage of the JOIN…ON clause.
A, C, and D are incorrect. A is incorrect since the CROSS JOIN clause cannot contain the ON keyword. C is incorrect since the OUTER JOIN keywords must be preceded by the LEFT, RIGHT, or FULL keyword.
6. C. The JOIN…ON clause and the other join clauses may all be used for joins between multiple tables. The JOIN…ON and JOIN…USING clauses are better suited for N-way table joins.
A, B, and D are incorrect. A is false since you may join as many tables as you wish. A Cartesian product is not created since there are two join conditions and three tables.
Join a Table to Itself Using a Self-Join
7. B. Three rows are returned. For the row with a REGION_ID value of 2, the REGION_NAME is Asia and half the length of the REGION_NAME is also 2. Therefore, this row is returned. The same logic results in the rows with REGION_ID values of three and four and REGION_NAME values of Europe and Americas being returned.
A, C, and D are incorrect.
View Data That Does Not Meet a Join Condition Using Outer Joins
8. A. The right outer join fetches the COUNTRIES rows that the inner join between the LOCATIONS and COUNTRIES tables have excluded in addition to the inner join results. The WHERE clause then restricts the results by eliminating these inner join results. This leaves the rows from the COUNTRIES table with which no records from the LOCATIONS table records are associated.
B, C, and D are incorrect.
9. A. This statement demonstrates the correct use of the RIGHT OUTER JOIN…ON clause.
B, C, and D are incorrect. The JOB_ID column in the SELECT clause in B is not qualified and is therefore ambiguous since the table from which this column comes is not specified. C uses an OUTER JOIN without the keywords LEFT, RIGHT, or FULL.
Generate a Cartesian Product of Two or More Tables
10. A. The cross join associates every four rows from the REGIONS table 25 times with the rows from the COUNTRIES table yielding a result set that contains 100 rows.
B, C, and D are incorrect.
LAB ANSWER
Using SQL Developer or SQL*Plus, connect to the OE schema, and complete the following tasks. There are several approaches to solving this question. Your approach may differ from the following solution listed.
1. Start SQL Developer and connect to the OE schema.
2. The SELECT list consists of four columns from two tables, which will be associated with each other using several joins. The SELECT clause is
3. The FROM clause is
4. The WHERE clause is
5. The JOIN clauses are interesting since the PRODUCT_INFORMATION and CUSTOMERS tables not directly related. They are related through two other tables.
6. The ORDERS table must first be joined to the CUSTOMERS table based on common CUSTOMER_ID values. The first join clause following the FROM CUSTOMERS clause is
7. This set must then be joined to the ORDER_ITEMS table based on common ORDER_ID values since the ORDER_ITEMS table can ultimately link to the PRODUCT_INFORMATION table. The second join clause is
8. The missing link to join to the PRODUCT_INFORMATION table based on common PRODUCT_ID column values is now available. The third join clause is
9. Executing this statement returns the report required as shown in the following illustration: