Chapter 25
IN THIS CHAPTER
Verifying the structure of your database
Using test databases
Scrutinizing any queries containing joins
Examining queries containing subselects
Using GROUP BY with the SET functions
Being aware of restrictions on the GROUP BY clause
Using parentheses in expressions
Protecting your database by controlling privileges
Backing up your database regularly
Anticipating and handling errors
A database can be a virtual treasure trove of information, but like the treasure of the Caribbean pirates of long ago, the stuff that you really want is probably buried and hidden from view. The SQL SELECT
statement is your tool for digging up this hidden information. Even if you have a clear idea of what you want to retrieve, translating that idea into SQL can be a challenge. If your formulation is just a little off, you may end up with the wrong results — but results that are so close to what you expected that they mislead you. To reduce your chances of being misled, use the following ten principles.
If you retrieve data from a database and your results don't seem reasonable, check the database design. Many poorly designed databases are in use, and if you’re working with one, fix the design before you try any other remedy. Remember — good design is a prerequisite of data integrity.
Create a test database that has the same structure as your production database, but with only a few representative rows in the tables. Choose the data so that you know in advance what the results of your queries should be. Run each test query on the test data and see whether the results match your expectations. If they don’t, you may need to reformulate your queries. If a query is properly formulated but you end up with bad results all the same, you may need to restructure your database.
Build several sets of test data and be sure to include odd cases, such as empty tables and extreme values at the very limit of allowable ranges. Try to think of unlikely scenarios and check for proper behavior when they occur. In the course of checking for unlikely cases, you may gain insight into problems that are more likely to happen.
Joins are notoriously counterintuitive. If your query contains one, make sure that it’s doing what you expect before you add WHERE
clauses or other complicating factors.
Queries with subselects take data from one table and, based on what is retrieved, take some data from another table. Therefore, by definition, such queries can really be hard to get right. Make sure the data that the inner SELECT
retrieves is the data that the outer SELECT
needs to produce the desired result. If you have two or more levels of subselects, you need to be even more careful.
Say that you have a table (NATIONAL) that contains the name (Player
), team (Team
), and number of home runs hit (Homers
) by every baseball player in the National League. You can retrieve the team homer total for all teams with a query like this:
SELECT Team, SUM (Homers)
FROM NATIONAL
GROUP BY Team ;
This query lists each team, followed by the total number of home runs hit by all that team's players.
Suppose that you want a list of National League power hitters. Consider the following query:
SELECT Player, Team, Homers
FROM NATIONAL
WHERE Homers >= 20
GROUP BY Team ;
In most implementations, this query returns an error. Generally, only columns used for grouping or columns used in a set function may appear in the select list. However, if you want to view this data, the following formulation works:
SELECT Player, Team, Homers
FROM NATIONAL
WHERE Homers >= 20
GROUP BY Team, Player, Homers ;
Because all the columns you want to display appear in the GROUP BY
clause, the query succeeds and delivers the desired results. This formulation sorts the resulting list first by Team
, then by Player
, and finally by Homers
.
Sometimes when you mix AND
and OR
, SQL doesn't process the expression in the order that you expect. Use parentheses in complex expressions to make sure that you get the desired results. Typing a few extra keystrokes is a small price to pay for better results.
Many people don’t use the security features available in their DBMS. They don’t want to bother with them because they think misuse and misappropriation of data are things that only happen to other people. Don’t wait to get burned. Establish and maintain security for all databases that have any value.
Understatement alert: Data is hard to retrieve after a power surge, a fire, an earthquake, or some other disaster destroys your hard drive. (Remember, sometimes computers just die for no good reason.) Make frequent backups and put the backup media in a safe place.
Whether you’re making ad hoc queries from a workstation or embedding queries in an application, occasionally SQL returns an error message rather than the desired results. At a workstation, you can decide what to do next, based on the message returned. In an application, the situation is different. The application user probably doesn’t know what action is appropriate. Put extensive error handling into your applications to cover every conceivable error that may occur. Creating error-handling code takes a great deal of effort, but it’s better than having the user stare quizzically at a frozen screen.