What You’ll Learn in This Hour:
▶ Defining SQL statement tuning
▶ Comparing database tuning and SQL statement tuning
▶ Properly joining tables
▶ Understanding the problems of full table scans
▶ Invoking the use of indexes
▶ Avoiding the use of OR and HAVING
▶ Avoiding large sort operations
In this hour, you learn how to tune your SQL statement for maximum performance using some simple methods. Although so far this book has focused on how to write SQL, it is just as important to learn how to write efficient SQL that can help keep the database running optimally. This hour focuses on simple steps you can take when working with various queries to ensure that your SQL performs optimally.
SQL statement tuning is the process of optimally building SQL statements to achieve results in the most effective and efficient manner. SQL tuning begins with the basic arrangement of the elements in a query. Simple formatting can play a large role in optimizing a statement.
SQL statement tuning mainly involves tweaking a statement’s FROM and WHERE clauses. The database server decides how to evaluate a query using mostly these two clauses. To this point, you have learned the basics of the FROM and WHERE clauses; now it is time to fine-tune them for better results and happier users.
Before learning about SQL statement tuning, you need to understand the difference between tuning a database and tuning the SQL statements that access the database.
Database tuning is the process of tuning the actual database, which encompasses the allocated memory, disk usage, CPU, I/O, and underlying database processes. Tuning a database also involves the management and manipulation of the database structure, such as the design and layout of tables and indexes. In addition, database tuning often involves modification of the database architecture to optimize the use of the hardware resources available. You need to take many other considerations into account when tuning a database, but the database administrator (DBA) and system administrator normally accomplish these tasks. The objective of database tuning is to ensure that the database has been designed in a way that best accommodates the expected activity within the database.
SQL tuning is the process of tuning the SQL statements that access the database. These SQL statements include database queries and transactional operations, such as inserts, updates, and deletes. The objective of SQL statement tuning is to formulate statements that most effectively access the database in its current state, taking advantage of database and system resources and indexes. The objective is to reduce the operational overhead of executing the query on the database.
Note
Tuning Is Not One-Dimensional
You must perform both database tuning and SQL statement tuning to achieve optimal results when accessing the database. A poorly tuned database might render your efforts in SQL tuning useless, and vice versa. Ideally, you should first tune the database, ensure that indexes exist where needed, and then tune the SQL code.
Formatting your SQL statement sounds like an obvious task, but it is worth mentioning. A newcomer to SQL will probably neglect to address several considerations when building a SQL statement. The upcoming sections discuss the following points—some are common sense and others are not so obvious:
▶ The format of SQL statements for readability
▶ The order of tables in the FROM clause
▶ The placement of the most restrictive conditions in the WHERE clause
▶ The placement of join conditions in the WHERE clause
Formatting a SQL statement for readability is fairly obvious, but many SQL statements are not written neatly. The neatness of a statement does not affect the actual performance (the database does not care how neat the statement appears), but careful formatting is the first step in tuning a statement. When you look at a SQL statement with tuning intentions, making the statement readable is always the first priority. How can you determine whether the statement is well written if it is difficult to read?
Consider some basic rules for making a statement readable:
▶ Always begin a new line with each clause in the statement. For example, place the FROM clause on a separate line from the SELECT clause. Then place the WHERE clause on a separate line from the FROM clause, and so on.
▶ Use tabs or spaces for indentation when arguments of a clause in the statement exceed one line.
▶ Use tabs and spaces consistently.
▶ Use table aliases when multiple tables are used in the statement. Using the full table name to qualify each column in the statement quickly clutters the statement and makes reading it difficult.
▶ Use remarks sparingly in SQL statements (if they are available within your specific implementation). Remarks are great for documentation, but too many of them clutter a statement.
▶ Begin a new line with each column name in the SELECT clause if many columns are selected.
▶ Begin a new line with each table name in the FROM clause if many tables are used.
▶ Begin a new line with each condition of the WHERE clause. You can then easily see all conditions of the statement and the order in which they are used.
Following is an example of a statement that is hard to decipher:
SQL> select birds.bird_name,
  2  birds.incubation + birds.fledging "PARENTING", food.food_name,
  3  migration.migration_location
  4  from birds, food, birds_food, migration, birds_migration
  5  where birds.bird_id = birds_food.bird_id
  6  and food.food_id = birds_food.food_id
  7  and birds.bird_id = birds_migration.bird_id
  8  and migration.migration_id = birds_migration.migration_id
  9  and birds.wingspan > 48
 10  and birds.incubation + birds.fledging > 60
 11  and migration.migration_location not in ('Mexico', 'Central America')
 12  and food.food_name = 'Fish'
 13  order by birds.bird_name;

BIRD_NAME                  PARENTING  FOOD_NAME  MIGRATION_LOCATION
-------------------------  ---------- ---------- ------------------------
Bald Eagle                        126 Fish       Southern United States
Brown Pelican                     107 Fish       No Significant Migration
Common Loon                       111 Fish       Southern United States
Double-crested Cormorant           71 Fish       Southern United States
Golden Eagle                      125 Fish       No Significant Migration
Great Blue Heron                   88 Fish       Southern United States
Great Blue Heron                   88 Fish       South America
Great Egret                        75 Fish       Southern United States
Osprey                            100 Fish       Southern United States
Osprey                            100 Fish       South America
Ring-billed Gull                   61 Fish       Southern United States

11 rows selected.
Here the statement has been reformatted for improved readability:
SQL> select b.bird_name,
  2  b.incubation + b.fledging "PARENTING",
  3  f.food_name,
  4  m.migration_location
  5  from birds b,
  6       food f,
  7       birds_food bf,
  8       migration m,
  9       birds_migration bm
 10  where b.bird_id = bf.bird_id
 11  and f.food_id = bf.food_id
 12  and b.bird_id = bm.bird_id
 13  and m.migration_id = bm.migration_id
 14  and b.wingspan > 48
 15  and b.incubation + b.fledging > 60
 16  and m.migration_location not in ('Mexico', 'Central America')
 17  and f.food_name = 'Fish'
 18  order by bird_name;
Both statements have the same content, but the second statement is much more readable. It has been greatly simplified through the use of table aliases, which are defined in the query’s FROM clause. In addition, the second statement aligns the elements of each clause, making each clause stand out.
Again, making a statement more readable does not directly improve its performance, but it assists you in making modifications and debugging a lengthy and otherwise complex statement. Now you can easily identify the columns selected, the tables used, the table joins performed, and the conditions placed on the query.
Note
Always Establish Standards
It is especially important to establish coding standards in a multiuser programming environment. If all code is consistently formatted, shared code and modifications to code are much easier to manage.
FROM Clause

The arrangement or order of tables in the FROM clause can make a difference, depending on how the optimizer reads the SQL statement. For example, experienced users have found that listing the smaller tables first and the larger tables last is more efficient.
Take a look at an example of the FROM clause:
FROM SMALLEST TABLE, LARGEST TABLE
Note
Check for Performance When Using Multiple Tables
Check your particular implementation for performance tips, if any, when listing multiple tables in the FROM clause.
As you learned in Hour 14, “Joining Tables in Queries,” most joins use a base table to link tables that have one or more common columns on which to join. The base table is the main table that most or all tables are joined to in a query. The column from the base table is normally placed on the right side of a join operation in the WHERE clause. The tables joined to the base table are normally listed in order from smallest to largest, similar to the tables listed in the FROM clause.
If a base table does not exist, the tables should be listed from smallest to largest, with the largest tables on the right side of the join operation in the WHERE clause. The join conditions should be in the first position(s) of the WHERE clause, followed by the filter condition(s), as shown here:
FROM TABLE1,                             Smallest table
     TABLE2,                             to
     TABLE3                              Largest table, also base table
WHERE TABLE1.COLUMN = TABLE3.COLUMN      Join condition
  AND TABLE2.COLUMN = TABLE3.COLUMN      Join condition
[ AND CONDITION1 ]                       Filter condition
[ AND CONDITION2 ]                       Filter condition
Caution
Be Restrictive with Your Joins
Because joins typically return a high percentage of rows of data from the table(s), you should evaluate join conditions after more restrictive conditions.
In this example, TABLE3 is used as the base table. TABLE1 and TABLE2 are joined to TABLE3 for both simplicity and proven efficiency.
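As a concrete sketch of this layout, the following uses SQLite through Python with tiny, hypothetical versions of the BIRDS tables (the rows are invented for illustration); BIRDS_FOOD acts as the base table, with both join conditions listed before the filter condition:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE food (food_id INTEGER, food_name TEXT);        -- smaller table
    CREATE TABLE birds (bird_id INTEGER, bird_name TEXT);
    CREATE TABLE birds_food (bird_id INTEGER, food_id INTEGER); -- base table
    INSERT INTO food VALUES (1, 'Fish');
    INSERT INTO birds VALUES (10, 'Osprey');
    INSERT INTO birds_food VALUES (10, 1);
""")

# Base-table columns on the right side of each join; join conditions
# first in the WHERE clause, filter condition last
rows = conn.execute("""
    SELECT b.bird_name, f.food_name
    FROM food f,
         birds b,
         birds_food bf
    WHERE f.food_id = bf.food_id
      AND b.bird_id = bf.bird_id
      AND f.food_name = 'Fish'
""").fetchall()
print(rows)  # [('Osprey', 'Fish')]
```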
The most restrictive condition is typically the driving factor in achieving optimal performance for a SQL query. The most restrictive condition is the condition in the WHERE clause of a statement that returns the fewest rows of data. Conversely, the least restrictive condition is the condition in a statement that returns the most rows of data. This hour is concerned with the most restrictive condition simply because it does the most filtering of the data that is to be returned by the query.

Your goal is for the SQL optimizer to evaluate the most restrictive condition first because the condition returns a smaller subset of data, thus reducing the query’s overhead. Effectively placing the most restrictive condition in the query requires knowing how the optimizer operates. In some cases, optimizers seem to read from the bottom of the WHERE clause up. Therefore, you want to place the most restrictive condition last in the WHERE clause, which is the condition that the optimizer reads first. The following example shows how to structure the WHERE clause based on the restrictiveness of the conditions and the FROM clause based on the size of the tables:
FROM TABLE1,                             Smallest table
     TABLE2,                             to
     TABLE3                              Largest table, also base table
WHERE TABLE1.COLUMN = TABLE3.COLUMN      Join condition
  AND TABLE2.COLUMN = TABLE3.COLUMN      Join condition
[ AND CONDITION1 ]                       Least restrictive
[ AND CONDITION2 ]                       Most restrictive
Caution
Always Test Your WHERE Clauses
If you do not know how your particular implementation’s SQL optimizer works, the DBA also does not know, or you do not have sufficient documentation, you can execute a large query that takes a while to run and then rearrange conditions in the WHERE
clause. Be sure to record the time it takes the query to complete each time you make changes. You should have to run only a couple tests to figure out whether the optimizer reads the WHERE
clause from the top to bottom or the bottom to top. Turn off database caching during the testing for more accurate results.
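A minimal sketch of this timing test, using SQLite through Python as a stand-in for your real database (the table and its contents are hypothetical, and SQLite will not reproduce a rule-based optimizer's ordering behavior; the point is the record-and-compare routine):

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (lastname TEXT, state TEXT)")
# Hypothetical data: 1,024 matching rows plus filler rows
conn.executemany("INSERT INTO test VALUES (?, ?)",
                 [("SMITH", "IN")] * 1024 + [("JONES", "OH")] * 5000)

def timed_count(where_clause):
    # Run the query once and record its elapsed wall-clock time
    start = time.perf_counter()
    (n,) = conn.execute(
        "SELECT COUNT(*) FROM test WHERE " + where_clause).fetchone()
    return n, time.perf_counter() - start

# Same conditions, opposite order in the WHERE clause
n1, t1 = timed_count("lastname = 'SMITH' AND state = 'IN'")
n2, t2 = timed_count("state = 'IN' AND lastname = 'SMITH'")
print(n1, n2)  # both orderings count the same 1,024 rows
```

Run each ordering several times and compare the recorded durations; the consistently faster ordering tells you which end of the WHERE clause your optimizer evaluates first.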
Following is an example using a phony table:
Table: TEST
Conditions: STATE = 'IN'
            LASTNAME = 'SMITH'
Most Restrictive Condition: LASTNAME = 'SMITH'
Following is the first query:
SELECT COUNT(*)
FROM TEST
WHERE LASTNAME = 'SMITH'
  AND STATE = 'IN';

  COUNT(*)
----------
     1,024
This is the second query:
SELECT COUNT(*)
FROM TEST
WHERE STATE = 'IN'
  AND LASTNAME = 'SMITH';

  COUNT(*)
----------
     1,024
Suppose that the first query completed in 20 seconds, whereas the second query completed in 10 seconds. Because the second query returned faster results and the most restrictive condition was listed last in the WHERE clause, you can safely assume that the optimizer reads the WHERE clause from the bottom up.
Note
Try to Use Indexed Columns
Using an indexed column as the most restrictive condition in a query is a good practice. Indexes generally improve a query’s performance.
A full table scan occurs when an index is not used by the query engine or no index is present for the table(s) being used. Full table scans usually return data much more slowly than when an index is used. The larger the table, the slower that data is returned when a full table scan is performed. The query optimizer decides whether to use an index when executing the SQL statement. In most cases, the index is used if it exists.
Some implementations have sophisticated query optimizers that can decide whether to use an index. Decisions such as this are based on statistics that are gathered on database objects, such as the size of an object and the estimated number of rows that are returned by a condition with an indexed column. Refer to your implementation documentation for specifics on the decision-making capabilities of your relational database’s optimizer.
Avoid full table scans when reading large tables. For example, a full table scan is performed when a table that does not have an index is read, which usually takes considerably more time to return the data. For most larger tables, consider using an index. For small tables, as previously mentioned, the optimizer might choose the full table scan instead of the index if the table is indexed. For a small table with an index, consider dropping the index and reserving that space for other needy objects in the database.
Tip
Simple Ways to Avoid Table Scans
The easiest and most obvious way to avoid a full table scan, aside from ensuring that indexes exist on the table, is to use conditions in a query’s WHERE clause to filter the data to be returned.
This is a good time for a reminder of the data that should be indexed. Consider, for example, a book written about the BIRDS database: You would never index words or data such as and, the, bird, a specific weight, wingspan, and so forth. Indexing columns that do not contain many unique values is not beneficial. Instead, you want to index columns that tend to have many unique values, especially those used to search for data:
▶ Columns used as primary keys
▶ Columns used as foreign keys
▶ Columns frequently used to join tables
▶ Columns frequently used as conditions in a query
▶ Columns that have a high percentage of unique values
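After creating an index on such a column, you can confirm that the optimizer actually picks it up by asking for the query plan. The sketch below uses SQLite through Python as a stand-in for your implementation's plan tool (the table and index names are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees "
             "(employeeid TEXT PRIMARY KEY, lastname TEXT, state TEXT)")
conn.execute("CREATE INDEX emp_lastname_idx ON employees (lastname)")

# Ask the planner how it would run a query against the indexed column
plan = " / ".join(row[3] for row in conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT * FROM employees WHERE lastname = 'STEVENS'"))
print(plan)  # e.g. SEARCH employees USING INDEX emp_lastname_idx (lastname=?)
```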
Tip
Table Scans Are Not Always Bad
Sometimes full table scans are beneficial. You want to perform them on queries against small tables or queries whose conditions return a high percentage of rows. The easiest way to force a full table scan is to avoid creating an index on the table.
Other performance considerations come into play as well when tuning SQL statements. The next sections discuss the following concepts:
▶ Using the LIKE operator and wildcards
▶ Avoiding the OR operator
▶ Avoiding the HAVING clause
▶ Avoiding large sort operations
▶ Using stored procedures
▶ Disabling indexes during batch loads
LIKE Operator and Wildcards

The LIKE operator is a useful tool for placing conditions on a query in a flexible manner. Wildcards let a query match patterns rather than exact values, which can widen or narrow the set of data that is retrieved. Wildcards are flexible for queries that search for similar data (data that is not equivalent to an exact value specified).
Suppose you want to write a query against the fictitious EMPLOYEES table, selecting the EMPLOYEEID, LASTNAME, FIRSTNAME, and STATE columns. You need to know the employee identification, name, and state for all the employees with the last name Stevens. Three SQL statements with different wildcard placements serve as examples.
Query 1 follows:
SELECT EMPLOYEEID, LASTNAME, FIRSTNAME, STATE
FROM EMPLOYEES
WHERE LASTNAME LIKE 'STEVENS';
Next is Query 2:
SELECT EMPLOYEEID, LASTNAME, FIRSTNAME, STATE
FROM EMPLOYEES
WHERE LASTNAME LIKE '%EVENS%';
The final query is Query 3:
SELECT EMPLOYEEID, LASTNAME, FIRSTNAME, STATE
FROM EMPLOYEES
WHERE LASTNAME LIKE 'ST%';
The SQL statements do not necessarily return the same results. More than likely, Query 1 returns fewer rows than the other two queries and takes advantage of indexing. Query 2 and Query 3 are less specific about the desired data, which makes them slower than Query 1. In addition, Query 3 is probably faster than Query 2 because it specifies the first letters of the string for which you are searching, and the LASTNAME column is likely to be indexed. Query 3 could thus potentially take advantage of an index.
With Query 1, you might retrieve all individuals with the last name Stevens, but Stevens can be spelled in different ways. Query 2 picks up all individuals whose last name contains EVENS, which covers some alternative spellings. Query 3 picks up any last name that starts with ST; this is the only way to ensure that you receive all the variations of Stevens (or Stephens).
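You can see the effect of wildcard placement directly in an optimizer's plan. The sketch below uses SQLite through Python as a stand-in (hypothetical EMPLOYEES table; the NOCASE collation is needed for SQLite's LIKE optimization to apply): a trailing wildcard allows an index range search, while a leading wildcard forces a scan:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (lastname TEXT COLLATE NOCASE)")
conn.execute("CREATE INDEX emp_ln_idx ON employees (lastname)")

def plan(pattern):
    # Return the planner's description for a LIKE query with this pattern
    return " / ".join(row[3] for row in conn.execute(
        f"EXPLAIN QUERY PLAN SELECT * FROM employees "
        f"WHERE lastname LIKE '{pattern}'"))

p1 = plan("ST%")      # leading characters known: a range SEARCH of the index
p2 = plan("%EVENS%")  # leading wildcard: every row must be scanned
print(p1)
print(p2)
```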
OR Operator

Rewriting a SQL statement to use the IN predicate instead of the OR operator often measurably improves data retrieval speed. Your implementation’s documentation describes tools you can use to time or compare the performance of the OR operator and the IN predicate. This section shows how to rewrite a SQL statement by removing the OR operator and replacing it with the IN predicate. Refer to Hour 13, “Using Operators to Categorize Data,” for more on using the OR operator and the IN predicate.
The following query uses the OR operator:

SELECT EMPLOYEEID, LASTNAME, FIRSTNAME
FROM EMPLOYEES
WHERE CITY = 'INDIANAPOLIS IN'
   OR CITY = 'KOKOMO'
   OR CITY = 'TERRE HAUTE';
This is the same query using the IN predicate:

SELECT EMPLOYEEID, LASTNAME, FIRSTNAME
FROM EMPLOYEES
WHERE CITY IN ('INDIANAPOLIS IN', 'KOKOMO', 'TERRE HAUTE');
The SQL statements retrieve the same data. However, through testing and experience, you will often find that data retrieval is measurably faster after replacing OR conditions with the IN predicate, as in the second query.
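A quick equivalence check, sketched with SQLite through Python (the EMPLOYEES rows are hypothetical), confirms that the two forms return identical data, so the rewrite is safe:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (lastname TEXT, city TEXT)")
conn.executemany("INSERT INTO employees VALUES (?, ?)",
                 [("SMITH", "KOKOMO"), ("JONES", "TERRE HAUTE"),
                  ("DOE", "CHICAGO")])

with_or = conn.execute(
    "SELECT lastname FROM employees "
    "WHERE city = 'INDIANAPOLIS IN' OR city = 'KOKOMO' "
    "   OR city = 'TERRE HAUTE' ORDER BY lastname").fetchall()

with_in = conn.execute(
    "SELECT lastname FROM employees "
    "WHERE city IN ('INDIANAPOLIS IN', 'KOKOMO', 'TERRE HAUTE') "
    "ORDER BY lastname").fetchall()

print(with_or == with_in)  # True: both forms return the same rows
```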
HAVING Clause

The HAVING clause is useful for paring down the result of a GROUP BY clause; however, you cannot use it without cost. Using the HAVING clause gives the SQL optimizer extra work, which results in extra time. The query must be concerned not only with grouping result sets, but also with paring those result sets down through the restrictions of the HAVING clause. The queries performed in the BIRDS database are fairly simple and do not touch many rows of data. However, adding the HAVING clause in a larger database can introduce some overhead, especially when the HAVING clause has more complex logic and a higher number of groupings to apply. If possible, write SQL statements without the HAVING clause, or design the HAVING clause restrictions so that they are as simple as possible.
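When the restriction does not involve an aggregate, you can often move it out of HAVING into WHERE so rows are filtered before grouping rather than after. A sketch with SQLite through Python (the BIRDS rows are hypothetical) shows the two forms returning the same result:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE birds (bird_name TEXT, wingspan REAL)")
conn.executemany("INSERT INTO birds VALUES (?, ?)",
                 [("Bald Eagle", 84), ("Bald Eagle", 90),
                  ("Ring-billed Gull", 48)])

# HAVING: every group is built first, then filtered
having = conn.execute(
    "SELECT bird_name, AVG(wingspan) FROM birds "
    "GROUP BY bird_name HAVING bird_name = 'Bald Eagle'").fetchall()

# WHERE: rows are filtered before grouping, so there is less grouping work
where = conn.execute(
    "SELECT bird_name, AVG(wingspan) FROM birds "
    "WHERE bird_name = 'Bald Eagle' GROUP BY bird_name").fetchall()

print(having == where)  # True
```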
Large sort operations result from using the ORDER BY, GROUP BY, and HAVING clauses. Subsets of data must be stored in memory or on disk (if not enough space is available in allotted memory) whenever sort operations are performed. Sorting data is often unavoidable. The main point is that these sort operations affect a SQL statement’s response time. Because you cannot always avoid large sort operations, it is best to schedule queries with large sorts as periodic batch processes during off-peak database usage; this ensures that the performance of most user processes is not affected.
You should create stored procedures for SQL statements that are executed on a regular basis—particularly large transactions or queries. Stored procedures are SQL statements that are compiled and permanently stored in the database in an executable format.
Normally, when a SQL statement is issued in the database, the database must check the syntax and convert the statement into an executable format within the database (called parsing). After it is parsed, the statement is stored in memory; however, it is not permanent. When other operations need memory, the statement might be ejected from memory. For stored procedures, the SQL statement is always available in an executable format and remains in the database until it is dropped like any other database object.
Not everyone agrees on whether using views improves or degrades query performance. Mostly this depends on the situation, so consider views as a candidate for improving performance. It is easy to test a query with and without a view, and quite often you might find a large performance opportunity. Under the right circumstances, a view defines a subset of data from the base tables that can be read using the indexes already defined on those tables, and the database may be able to serve repeated queries against that subset from memory rather than from disk; memory is much faster to access than physical disk. With a view, you also typically query a small percentage of the data that exists in the base tables, which often improves query performance.
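The subset idea is easy to try. A sketch with SQLite through Python (the EMPLOYEES rows are hypothetical): the view captures the restricted subset once, so queries against it stay short and consistently filtered:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (lastname TEXT, state TEXT)")
conn.executemany("INSERT INTO employees VALUES (?, ?)",
                 [("SMITH", "IN"), ("JONES", "OH"), ("DOE", "IN")])

# The view predefines the subset of interest
conn.execute("CREATE VIEW indiana_employees AS "
             "SELECT * FROM employees WHERE state = 'IN'")

rows = conn.execute(
    "SELECT lastname FROM indiana_employees ORDER BY lastname").fetchall()
print(rows)  # [('DOE',), ('SMITH',)]
```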
When a user submits a transaction to the database (INSERT, UPDATE, or DELETE), an entry is made both to the database table and to any indexes associated with the table being modified. This means that if the EMPLOYEES table has an index and a user updates the EMPLOYEES table, an update also occurs to the index associated with the EMPLOYEES table. In a transactional environment, having a write to an index occur every time a write to the table occurs is usually not an issue.
During batch loads, however, an index can cause serious performance degradation. A batch load might consist of hundreds, thousands, or millions of manipulation statements or transactions. Because of their volume, batch loads take a long time to complete and are normally scheduled during off-peak hours—usually during weekends or evenings. Sometimes, simply dropping the indexes on tables associated with batch loads can dramatically decrease the time it takes a batch load to complete. Of course, you would need to re-create the indexes on the tables after the batch load. When you drop the indexes, changes are written to the tables much faster, so the job completes more quickly. When the batch load is complete, you should rebuild the indexes. During the rebuild, the indexes are populated with all the appropriate data from the tables. Although it might take a while for an index to be created on a large table, the overall time expended is less if you drop the index and rebuild it.
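The drop-load-rebuild pattern can be sketched as follows, here with SQLite through Python (the table, index, and row counts are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (employeeid INTEGER, lastname TEXT)")
conn.execute("CREATE INDEX emp_ln_idx ON employees (lastname)")

# Drop the index first so each inserted row skips index maintenance
conn.execute("DROP INDEX emp_ln_idx")
conn.executemany("INSERT INTO employees VALUES (?, ?)",
                 ((i, "NAME%d" % i) for i in range(100000)))
conn.commit()

# Rebuild once after the load: a single bulk build of the index
# instead of 100,000 incremental index updates
conn.execute("CREATE INDEX emp_ln_idx ON employees (lastname)")
(count,) = conn.execute("SELECT COUNT(*) FROM employees").fetchone()
print(count)  # 100000
```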
Another advantage to rebuilding an index after a batch load completes is the reduction of fragmentation that is found in the index. When a database grows, records are added, removed, and updated. Fragmentation can then occur. For any database that experiences a lot of growth, it is a good idea to periodically drop and rebuild large indexes. When you rebuild an index, the number of physical extents that comprise the index decreases, less disk I/O is involved to read the index, the user gets results more quickly, and everyone is happy.
Often you inherit a database that is in need of SQL statement tuning. These existing systems might have thousands of SQL statements executing at any given time. To optimize the amount of time spent on performance tuning, you need a way to determine what queries are most beneficial. This is where cost-based optimization comes into play. Cost-based optimization attempts to determine which queries are most costly in relation to the overall system resources spent. For instance, say you measure cost by execution duration and you have the following two queries with their corresponding run times:
SELECT * FROM EMPLOYEES WHERE FIRSTNAME LIKE '%LE%';    2 sec
SELECT * FROM EMPLOYEES WHERE FIRSTNAME LIKE 'G%';      1 sec
Initially, it might appear that the first statement is the one you need to concentrate your efforts on. However, what if the second statement is executed 1,000 times an hour but the first is performed only 10 times in the same hour? This makes a huge difference in how you allocate your time.
Cost-based optimization ranks SQL statements in order of total computational cost. Computational cost is easily determined based on some measure of query execution (duration, number of reads, and so on) multiplied by the number of executions over a given period:
Total Computational Cost = Execution Measure × (Number of Executions)
You get the most overall benefit by first tuning the queries with the most total computational cost. Looking at the previous example, if you cut each statement execution time in half, you can easily figure out the total computational savings:
Statement #1: 1 second × 10 executions = 10 seconds of computational savings
Statement #2: 0.5 seconds × 1,000 executions = 500 seconds of computational savings
Now you can more easily understand why you should spend your valuable time on the second statement instead of the first. Not only have you worked to optimize your database, but you’ve optimized your time as well.
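The arithmetic above fits in a small helper (a Python sketch; the run times and execution counts are the hypothetical figures from the example):

```python
def total_cost(measure_per_execution, executions):
    # Total computational cost = execution measure x number of executions
    return measure_per_execution * executions

stmt1 = total_cost(2, 10)         # 20 seconds of load per hour
stmt2 = total_cost(1, 1000)       # 1,000 seconds of load per hour

# Savings if each statement's run time is cut in half
savings1 = total_cost(1, 10)      # statement #1: 10 seconds saved
savings2 = total_cost(0.5, 1000)  # statement #2: 500 seconds saved
print(stmt2 > stmt1)  # True: the cheaper-looking query dominates total cost
```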
Tip
Performance Tools
Many relational databases have built-in tools that assist in SQL statement and database performance tuning. For example, Oracle has a tool called EXPLAIN PLAN that shows the user the execution plan of a SQL statement. Another Oracle tool, TKPROF, measures the actual elapsed time of a SQL statement. In SQL Server, the Query Analyzer has several options to give you an estimated execution plan or statistics from the executed query. Check with your DBA and your implementation documentation for more information on available tools.
In this hour, you learned the meaning of tuning SQL statements in a relational database. You learned about two basic types of tuning: database tuning and SQL statement tuning, both of which are vital to the efficient operation of the database and the SQL statements within it. The two are equally important, and the overall performance of the database cannot be optimized without attending to both.
You read about methods for tuning a SQL statement, starting with a statement’s actual readability, which does not directly improve performance but does aid the programmer in the development and management of statements. One of the main issues that arises in SQL statement performance involves the use of indexes. Sometimes you should use indexes; other times you should avoid them. For all measures taken to improve SQL statement performance, you need to understand the data itself, the database design and relationships, and the users’ needs in accessing the database.
Q. By following what I have learned about performance, what realistic performance gains can I expect to see in data retrieval time?
A. Realistically, you can see performance gains from fractions of a second to minutes, hours, or even days.
Q. How can I test my SQL statements for performance?
A. Each implementation should have a tool or system to check performance. Oracle7 was used to test the SQL statements in this book. Oracle offers several tools for checking performance, including the EXPLAIN PLAN, TKPROF, and SET commands. Check your particular implementation for tools that are similar to Oracle’s.
The following workshop consists of a series of quiz questions and practical exercises. The quiz questions are designed to test your overall understanding of the current material. The practical exercises give you the opportunity to apply the concepts discussed during the current hour, as well as build on the knowledge you acquired in previous hours of study. Be sure to complete the quiz questions and exercises before continuing to the next hour. Refer to Appendix C, “Answers to Quizzes and Exercises,” for answers.
1. Is using a unique index on a small table beneficial?
2. What happens when the optimizer chooses not to use an index on a table when a query has been executed?
3. Should the most restrictive clause(s) be placed before or after the join condition(s) in the WHERE clause?
4. When is the LIKE operator considered bad in terms of performance?
5. How can you optimize batch load operations in terms of indexes?
6. Which three clauses are the cause of sort operations that degrade performance?
1. Rewrite the following SQL statements to improve their performance. Use the fictitious EMPLOYEE_TBL and EMPLOYEE_PAY_TBL tables, as described here:
EMPLOYEE_TBL
EMP_ID            VARCHAR(9)    NOT NULL  Primary key,
LAST_NAME         VARCHAR(15)   NOT NULL,
FIRST_NAME        VARCHAR(15)   NOT NULL,
MIDDLE_NAME       VARCHAR(15),
ADDRESS           VARCHAR(30)   NOT NULL,
CITY              VARCHAR(15)   NOT NULL,
STATE             VARCHAR(2)    NOT NULL,
ZIP               INTEGER(5)    NOT NULL,
PHONE             VARCHAR(10),
PAGER             VARCHAR(10),
CONSTRAINT EMP_PK PRIMARY KEY (EMP_ID)

EMPLOYEE_PAY_TBL
EMP_ID            VARCHAR(9)    NOT NULL  Primary key,
POSITION          VARCHAR(15)   NOT NULL,
DATE_HIRE         DATETIME,
PAY_RATE          DECIMAL(4,2)  NOT NULL,
DATE_LAST_RAISE   DATETIME,
SALARY            DECIMAL(8,2),
BONUS             DECIMAL(8,2),
CONSTRAINT EMP_FK FOREIGN KEY (EMP_ID) REFERENCES EMPLOYEE_TBL (EMP_ID)
SELECT EMP_ID, LAST_NAME, FIRST_NAME, PHONE
FROM EMPLOYEE_TBL
WHERE SUBSTRING(PHONE, 1, 3) = '317'
   OR SUBSTRING(PHONE, 1, 3) = '812'
   OR SUBSTRING(PHONE, 1, 3) = '765';
SELECT LAST_NAME, FIRST_NAME
FROM EMPLOYEE_TBL
WHERE LAST_NAME LIKE '%ALL%';
SELECT E.EMP_ID, E.LAST_NAME, E.FIRST_NAME, EP.SALARY
FROM EMPLOYEE_TBL E,
     EMPLOYEE_PAY_TBL EP
WHERE LAST_NAME LIKE 'S%'
  AND E.EMP_ID = EP.EMP_ID;
2. Add another table called EMPLOYEE_PAYHIST_TBL that contains a large amount of pay history data. Use the following table to write a series of SQL statements to address the following problems. Be sure that the queries you write perform well.
EMPLOYEE_PAYHIST_TBL
PAYHIST_ID        VARCHAR(9)    NOT NULL  Primary key,
EMP_ID            VARCHAR(9)    NOT NULL,
START_DATE        DATETIME      NOT NULL,
END_DATE          DATETIME,
PAY_RATE          DECIMAL(4,2)  NOT NULL,
SALARY            DECIMAL(8,2)  NOT NULL,
BONUS             DECIMAL(8,2)  NOT NULL,
CONSTRAINT EMP_FK FOREIGN KEY (EMP_ID) REFERENCES EMPLOYEE_TBL (EMP_ID)
Find the SUM of the salaried versus nonsalaried employees, by the year in which their pay started.

Find the difference in the yearly pay of salaried employees versus nonsalaried employees, by the year in which their pay started. Consider the nonsalaried employees to be working full time during the year (PAY_RATE × 52 × 40).
Find the difference in what employees make now versus what they made when they started with the company. Again, consider the nonsalaried employees to be full time. Also consider that the employees’ current pay is reflected in EMPLOYEE_PAY_TBL and also in EMPLOYEE_PAYHIST_TBL. In the pay history table, the current pay is reflected as a row with the END_DATE for pay equal to NULL.