Hour 10

Manipulating Data

What You’ll Learn in This Hour:

In this hour, you learn how to use the piece of SQL known as Data Manipulation Language (DML) to change data and tables in a relational database.

Getting an Overview of Data Manipulation

DML is the part of SQL that enables a database user to actually propagate changes among data in a relational database. With DML, the user can populate tables with new data, update existing data in tables, and delete data from tables. Simple database queries can also be performed within a DML command.

SQL uses three basic DML commands:

  •    INSERT

  •    UPDATE

  •    DELETE

The SELECT command, which can be used with DML commands, is discussed in more detail in Hour 12, “Introduction to Database Queries.” The SELECT command is the basic query command that you can use after you enter data into the database with the INSERT command. In this hour, you concentrate on getting the data into your tables so that you have something interesting to use the SELECT command on.

Populating Tables with New Data

Populating a table with data is simply the process of entering new data into a table, whether manually using individual commands or through batch processes using programs or other related software. Manual population of data refers to data entry using a keyboard. Automated population normally deals with obtaining data from an external data source (such as another database or possibly a flat file) and loading the obtained data into the database.

Many factors can affect what data and how much data can be put into a table when populating tables with data. Major factors include existing table constraints, the physical table size, column data types, the length of columns, and other integrity constraints, such as primary and foreign keys. The following sections help you learn the basics of inserting new data into a table.

Inserting Data into a Table

You use the INSERT statement to insert new data into a table. This statement has a few options; look at the basic syntax to begin:

INSERT INTO TABLE_NAME
VALUES ('value1', 'value2', [ NULL ] );

Caution

Data Is Case Sensitive

SQL statements can be in upper or lower case. However, data is sometimes case sensitive. For example, if you enter data into the database as upper case, depending on your database, you might have to reference that data in upper case. These examples use both lower- and upper-case statements, just to show that case does not affect the outcome.

Using this INSERT statement syntax, you must include every column in the specified table in the VALUES list. Notice that each value in this list is separated by a comma. Enclose the values inserted into the table in single quotation marks for character and date/time data types. Single quotation marks are not required for numeric data types or NULL values that use the NULL keyword. A value should be present for each column in the table, and those values must be in the same order as the columns are listed in the table. In later sections, you learn how to specify the column ordering; for now just know that the SQL engine you are working with assumes that you want to enter the data in the same order in which the columns were created.

In the following example, you insert a new record into the BIRDS table.

This is the table structure:

 Name               Null?    Type
 ------------------ -------- --------------------------
 BIRD_ID            NOT NULL NUMBER(3)
 BIRD_NAME          NOT NULL VARCHAR2(30)
 HEIGHT             NOT NULL NUMBER(4,2)
 WEIGHT             NOT NULL NUMBER(4,2)
 WINGSPAN                    NUMBER(4,2)
 EGGS               NOT NULL NUMBER(2)
 BROODS             NOT NULL NUMBER(1)
 INCUBATION         NOT NULL NUMBER(2)
 FLEDGING           NOT NULL NUMBER(3)
 NEST_BUILDER       NOT NULL CHAR(1)

This is the sample INSERT statement:

insert into birds (bird_id, bird_name, height, weight, wingspan, eggs, broods,
incubation, fledging, nest_builder)
values (1, 'Great Blue Heron', 52, 5.5, 78, 5, 1, 28, 60, 'B');

1 row created.

In this example, 10 values were inserted into a table with 10 columns. The inserted values are in the same order as the columns listed in the table. Character values are inserted using single quotation marks, and numeric values are inserted without single quotation marks, although this is optional with numeric values.

The following example is the exact same insert statement, but notice that the list of column names is not included. If you do not include a list of columns into which the data is being inserted, the assumption is that you are inserting into all columns in the same order they are listed in the table.

insert into birds
values (1, 'Great Blue Heron', 52, 5.5, 78, 5, 1, 28, 60, 'B');

1 row created.

Note

When to Use Quotation Marks

Although single quotation marks are not required around numeric data that is inserted, they can be used with any data type. In other words, single quotation marks are optional when referring to numeric data values in the database, but they are required for all other data values (data types). Including them is usually a matter of preference, but most SQL users choose not to use quotation marks with numeric values because their queries are more readable without them.

Inserting Data into Limited Columns of a Table

You can insert data into specified columns as well. For instance, suppose that you want to insert data into the BIRDS table for every column except the WINGSPAN column. Remember that if a column is defined as mandatory or not null, data must be inserted into that particular column for every row of data in the table. The WINGSPAN column in the BIRDS table is optional or defined as null, so leaving that column empty is acceptable when inserting a row of data into the BIRDS table.

The syntax for inserting values into a limited number of columns in a table follows:

INSERT INTO TABLE_NAME ('COLUMN1', 'COLUMN2')
VALUES ('VALUE1', 'VALUE2');

This is the sample INSERT statement:

insert into birds (bird_id, bird_name, height, weight, eggs, broods, incubation,
fledging, nest_builder)
values (1, 'Great Blue Heron', 52, 5.5, 5, 1, 28, 60, 'B');

1 row created.

You have specified a column list enclosed by parentheses after the table name in the INSERT statement. You have listed all columns into which you want to insert data. WINGSPAN is the only excluded column. If you look at the table definition, you can see that WINGSPAN does not require data for every record in the table. You know that WINGSPAN does not require data because NULL is specified in the table definition. NULL tells us that NULL values are allowed in the column. Furthermore, the list of values must appear in the same order as the column list.

Tip

Column List Ordering Can Differ

The column list in the INSERT statement does not have to reflect the same order of columns as in the definition of the associated table. However, the list of values must be in the order of the associated columns in the column list. In addition, you can leave off the NULL syntax for a column because the defaults for most RDBMSs specify that columns allow NULL values.

Inserting Data from Another Table

You can insert data into a table based on the results of a query from another table using a combination of the INSERT statement and the SELECT statement. Briefly, a query is an inquiry to the database that either expects or does not expect data to be returned. (See Hour 12, “Introduction to Database Queries,” for more information on queries.) A query is a question that the user asks the database, and the data returned is the answer. When combining the INSERT statement with the SELECT statement, you can insert the data retrieved from a query into a table.

The syntax for inserting data from another table follows:

insert into table_name [('column1', 'column2')]
select [*|('column1', 'column2')]
from table_name
[where condition(s)];

You see three new keywords in this syntax: SELECT, FROM, and WHERE. SELECT is the main command used to initiate a query in SQL. FROM is a clause in the query that specifies the names of tables where the target data should be found. The WHERE clause, also part of the query, places conditions on the query. A condition is a way of placing criteria on data affected by a SQL statement. A sample condition might state this: WHERE LASTNAME = 'SMITH'.

In the following example, let’s say that you want to create a table called SHORT_BIRDS. This table will have some of the information from the original BIRDS table but will store only information about birds that are short, or less than a certain height. Here you see a simple CREATE TABLE statement with three columns that will have data that is to be pulled from the BIRDS table.

SQL> create table short_birds
  2  (bird_id          number(3)        not null,
  3   bird_name        varchar2(30)     not null,
  4   height           number(3));

Table created.

Next, you use the INSERT statement to insert data into the SHORT_BIRDS table based on the output of a SELECT statement, from the BIRDS table itself. Only three columns are being pulled from the BIRDS table: BIRD_ID, BIRD_NAME, and HEIGHT. Notice the WHERE clause on line 4 of this INSERT statement. You are placing a condition on the query within the INSERT statement to consider only birds that have a height less than 20 inches. All rows of data from the BIRDS table will be inserted into the SHORT_BIRDS table if they meet this height criteria. This is a SELECT statement that is embedded within an INSERT statement. All the main components of the SELECT statement are discussed starting in Hour 12.

SQL> insert into short_birds
  2  select bird_id, bird_name, height
  3  from birds
  4  where height < 20;

6 rows created.

Finally, study the results of the following query. You now have a table with information about birds that are shorter than 20 inches in height.

SQL> select * from short_birds;

   BIRD_ID BIRD_NAME                          HEIGHT
---------- ------------------------------ ----------
         8 Belted Kingfisher                      13
        10 Pied-billed Grebe                      13
        11 American Coot                          16
        12 Common Sea Gull                        18
        13 Ring-billed Gull                       19
        17 American Crow                          18

6 rows selected.

Inserting NULL Values

Inserting a NULL value into a column of a table is simple. You might want to insert a NULL value into a column if the value of the column in question is unknown. For instance, not every bird in the database has a logged wingspan. You can insert a NULL value into a column of a table using the keyword NULL.

The syntax for inserting a NULL value follows:

insert into schema.table_name values
('column1', NULL, 'column3');

Use the NULL keyword in the proper sequence of the associated column that exists in the table. That column will not have data in it for that row if you enter NULL. In the syntax, a NULL value is entered in the place of COLUMN2.

Updating Existing Data

You can modify existing data in a table using the UPDATE command. This command does not add new records to a table, nor does it remove records—UPDATE simply updates the existing data. The update is generally used to update one table at a time in a database, but you also can use it to update multiple columns of a table at the same time. Depending on your needs, you can use a single statement to update an individual row of data in a table or numerous rows.

Updating the Value of a Single Column

The simplest use of the UPDATE statement is to update a single column in a table. Either a single row of data or numerous records can be updated when updating a single column in a table.

The syntax for updating a single column follows:

update table_name
set column_name = 'value'
[where condition];

Take a look at the MIGRATION table. This table has six rows of data that indicate different locations to which the birds in the database migrate.

SQL> select * from migration;
MIGRATION_ID MIGRATION_LOCATION
------------ ------------------------------
           1 Southern United States
           2 Mexico
           3 Central America
           4 South America
           5 No Significant Migration
           6 Partial, Open Water

6 rows selected.

Now let’s update the MIGRATION table for MIGRATION_ID of 5, which has a MIGRATION_LOCATION value of No significant migration. The following UPDATE statement updates the value of the MIGRATION_LOCATION to None for that specific MIGRATION_ID.

SQL> update migration
  2  set migration_location = 'None'
  3  where migration_id = 5;

1 row updated.

After the update, you can query the MIGRATION table with the SELECT statement to verify that the MIGRATION_LOCATION value was successfully changed as intended.

SQL> select * from migration;

MIGRATION_ID MIGRATION_LOCATION
------------ ------------------------------
           3 Central America
           2 Mexico
           5 None
           6 Partial, Open Water
           4 South America
           1 Southern United States

6 rows selected.

Now let’s look at an example of performing an UPDATE on a table without using the where clause. First, take a look at the table you previously created, called SHORT_BIRDS.

SQL> select * from short_birds;

   BIRD_ID BIRD_NAME                          HEIGHT
---------- ------------------------------ ----------
         8 Belted Kingfisher                      13
        10 Pied-billed Grebe                      13
        11 American Coot                          16
        12 Common Sea Gull                        18
        13 Ring-billed Gull                       19
        17 American Crow                          18

6 rows selected.

The following update statement is performed on the SHORT_BIRDS table, setting the BIRD_NAME to Some Sort of Bird for every row of data in the table. As you can see, six rows were updated. The SELECT statement that follows verifies this. All the rows were updated because criteria was not placed on the update using the where clause.

SQL> update short_birds
  2  set bird_name = 'Some Sort of Bird';

6 rows updated.

SQL> select * from short_birds;

   BIRD_ID BIRD_NAME                          HEIGHT
---------- ------------------------------ ----------
         8 Some Sort of Bird                      13
        10 Some Sort of Bird                      13
        11 Some Sort of Bird                      16
        12 Some Sort of Bird                      18
        13 Some Sort of Bird                      19
        17 Some Sort of Bird                      18

6 rows selected.

In this example, all six rows of data were updated. We set the value for BIRD_NAME to Some Sort of Bird for every row of data in the table. Was this actually the intention of the table update? Perhaps in some cases, but rarely do you issue an UPDATE statement without a WHERE clause. An easy way to check to see whether you will be updating the correct data set is to write a SELECT statement for the same table with your WHERE clause that you are using in the INSERT statement. Then you can physically verify that these are the rows you want to update.

Caution

Test Your UPDATE and DELETE Statements

Use extreme caution when using the UPDATE statement without a WHERE clause. The target column is updated for all rows of data in the table if conditions are not designated using the WHERE clause. In most situations, using of the WHERE clause with a DML command is appropriate.

Updating Multiple Columns in One or More Records

Next, you see how to update multiple columns with a single UPDATE statement. Study the following syntax:

update table_name
set column1 = 'value',
   [column2 = 'value',]
   [column3 = 'value']
[where condition];

Notice the use of SET in this syntax—there is only one SET, but multiple columns. Each column is separated by a comma. You should start to see a trend in SQL: The comma usually separates different types of arguments in SQL statements. In the following code, a comma separates the two columns being updated. Again, the WHERE clause is optional, but it is usually necessary.

As you take another look at the SHORT_BIRDS table, you will update the table and set the value of multiple columns to a new value using the UPDATE statement. You start by looking at BIRD_ID number 8.

SQL> select * from short_birds;

   BIRD_ID BIRD_NAME                          HEIGHT
---------- ------------------------------ ----------
         8 Belted Kingfisher                      13
        10 Pied-billed Grebe                      13
        11 American Coot                          16
        12 Common Sea Gull                        18
        13 Ring-billed Gull                       19
        17 American Crow                          18

6 rows selected.

Here, an update is performed on SHORT_BIRDS, setting the value for BIRD_NAME to Kingfisher and the height to 12 instead of 13 for BIRD_ID number 8, or the Belted Kingfisher.

SQL> update short_birds
  2  set bird_name = 'Kingfisher',
  3      height = 12
  4  where bird_id = 8;

1 row updated.

The following query verifies that this update was successful.

SQL> select * from short_birds;

   BIRD_ID BIRD_NAME                          HEIGHT
---------- ------------------------------ ----------
         8 Kingfisher                             12
        10 Pied-billed Grebe                      13
        11 American Coot                          16
        12 Common Sea Gull                        18
        13 Ring-billed Gull                       19
        17 American Crow                          18

6 rows selected.

Note

When to Use the SET Keyword

The SET keyword is used only once for each UPDATE statement. If you want to update more than one column, use a comma to separate the columns to be updated.

In Hour 14, “Joining Tables in Queries,” you learn how to write more complex statements through a construct known as a JOIN. This construct enables you to update values in one table using values from one or more outside tables.

Deleting Data from Tables

The DELETE command removes entire rows of data from a table. It does not remove values from specific columns; a full record, including all columns, is removed. Use the DELETE statement with caution.

To delete a single record or selected records from a table, use the DELETE statement with the following syntax:

delete from table_name
[where condition];

In the following example, suppose that you want to delete the row of data for the Belted Kingfisher altogether. The BIRD_ID for the Belted Kingfisher is number 8. Study the following DELETE statement.

SQL> delete from short_birds
  2  where bird_id = 8;

1 row deleted.

The following query verifies that one row of data was successfully deleted from the SHORT_BIRDS table. BIRD_ID number 8, Belted Kingfisher, no longer exists in that table.

SQL> select * from short_birds;

   BIRD_ID BIRD_NAME                          HEIGHT
---------- ------------------------------ ----------
        10 Pied-billed Grebe                      13
        11 American Coot                          16
        12 Common Sea Gull                        18
        13 Ring-billed Gull                       19
        17 American Crow                          18

5 rows selected.

So what happens if you perform a DELETE statement with no WHERE clause? The situation is similar to that of using an UPDATE without the WHERE clause. All rows of data are affected within the table. The following DELETE statement from the SHORT_BIRDS table lacks a WHERE clause; you can see that the remaining five rows in the SHORT_BIRDS table were deleted.

SQL> delete from short_birds;

5 rows deleted.

SQL> select * from short_birds;

no rows selected

Caution

Don’t Omit the WHERE Clause

If the WHERE clause is omitted from the DELETE statement, all rows of data are deleted from the table. As a general rule, always use a WHERE clause with the DELETE statement. In addition, first test your WHERE clause with a SELECT statement.

Also remember that the DELETE command might have a permanent effect on the database. Ideally, recovering erroneously deleted data should be possible using a backup, but in some cases, recovering data is difficult or even impossible. If you cannot recover data, you must re-enter it into the database—this is trivial if you are dealing with only one row of data, but not so trivial if you have thousands of rows of data. You can see the importance of the WHERE clause.

The temporary table that was populated from the original table earlier in this hour can be useful for testing the DELETE and UPDATE commands before issuing them against the original table. Also remember the technique discussed earlier when you looked at the UPDATE command: Write a SELECT statement using the same WHERE clause that you want to use for the DELETE statement. That way, you can verify that the data being deleted is actually the data you want.

Summary

During this hour, you learned about the three basic commands in DML: the INSERT, UPDATE, and DELETE statements. As you have seen, data manipulation is a powerful part of SQL that allows the database user to populate tables with new data, update existing data, and delete data.

An important lesson when updating or deleting data from tables in a database is sometimes learned when neglecting the use of the WHERE clause. Remember that the WHERE clause places conditions on a SQL statement—particularly in the case of UDPATE and DELETE operations, when you specify rows of data that are affected during a transaction. All target table data rows are affected if the WHERE clause is not used, which could be disastrous to the database. Protect your data and be cautious during data manipulation operations.

Q&A

Q. With all the warnings about DELETE and UPDATE, I’m apprehensive about using them. If I accidentally update all the records in a table because I didn’t use the WHERE clause, can I reverse the changes?

A. You have no reason to be afraid: You can correct almost anything you do to the database, although it might involve considerable time and work. Hour 11, “Managing Database Transactions,” discusses the concept of transactional control, which allows you to finalize or undo data manipulation operations.

Q. Is the INSERT statement the only way to enter data into a table?

A. No, but remember that the INSERT statement is an ANSI standard. Various implementations have their own tools for entering data into tables. For example, Oracle has a SQL*Loader utility, whereas SQL Server has a SQL Server Integration Services (SSIS) utility. Many other implementations have IMPORT utilities that can insert data.

Workshop

The following workshop consists of a series of quiz questions and practical exercises. The quiz questions are designed to test your overall understanding of the current material. The practical exercises give you the opportunity to apply the concepts discussed during the current hour, as well as build on the knowledge you acquired in previous hours of study. Be sure to complete the quiz questions and exercises before continuing to the next hour. Refer to Appendix C, “Answers to Quizzes and Exercises,” for answers.

Quiz

  1. 1. Do you always need to supply a column list for the table that you use an INSERT statement on?

  2. 2. What can you do if you do not want to enter a value for one particular column?

  3. 3. Why is it important to use a WHERE clause with UPDATE and DELETE?

  4. 4. What is an easy way to check that an UPDATE or DELETE will affect the rows that you want?

Exercises

  1. 1. Review the table structures in the BIRDS database, particularly the data in the BIRDS table itself. You will use this data to perform the exercises.

  2. 2. Use the SELECT statement to display all the data currently in the BIRDS table.

  3. 3. Create a new table called TALL_BIRDS that is based on the BIRDS table itself, with the following columns: BIRD_ID, BIRD_NAME, and WINGSPAN.

  4. 4. Insert data from the BIRDS table into the TALL_BIRDS table for birds taller than 30 inches.

  5. 5. Use the SELECT statement to display all the new data in the TALL_BIRDS table.

  6. 6. Insert the following data into the TALL_BIRDS table:

    BIRD_NAME = Great Egret
    HEIGHT = 40
    WINGSPAN = 66
  7. 7. Update every data value in the BIRDS table for the bird name column to read Bird. Was this command successful? Why or why not?

  8. 8. Update the wingspan of every bird in the TALL_BIRDS table to a NULL value.

  9. 9. Delete the record for Great Egret from the TALL_BIRDS table.

  10. 10. Delete every remaining row of data from the TALL_BIRDS table.

  11. 11. Drop the TALL_BIRDS table.