Adding Data with INSERT and COPY

Once you have created your table with the necessary specifications, the next logical step is to fill the table with data. There are generally three methods in PostgreSQL with which you can fill a table with data:

The following is the syntax of the INSERT INTO command, when used to insert new values, which is subsequently described in detail:

INSERT INTO table_name
       [ ( column_name [, ...] ) ]
       VALUES ( value [, ...] )
table_name

The INSERT SQL command initiates an insertion of data into the table called table_name.

( column_name [, ...] )

An optional grouped expression which describes the targeted columns for the insertion.

VALUES

The SQL clause which instructs PostgreSQL to expect a grouped expression of values to follow.

( value [, ...] )

The required grouped expression that describes the values to be inserted. There should be one value for each specified column, separated by commas. These values may be expressions themselves (e.g., an operation between two values), or constants.

Each value following the VALUES clause must be of the same data type as the column it is being inserted into. If the optional column-target expression is omitted, PostgreSQL will expect there to be one value for each column in the literal order of the table’s structure. If there are fewer values to be inserted than columns, PostgreSQL will attempt to insert a default value (or the NULL value, if there is no default) for each omitted value.

To demonstrate, Example 4-16 illustrates the insertion of a new book into Book Town’s books table.

The SQL statement in Example 4-16 inserts a new book with an id of 41472, a title of Practical PostgreSQL, an author identifier of 1212, and a subject identifier of 4. Note the feedback beginning with INSERT, which indicates that the insertion was successful. The first number following INSERT is the OID (object identifier) of the freshly inserted row. The second number following INSERT represents the number of rows inserted (in this case, 1).

Notice that the optional column target list is specified identically to the physical structure of the table, from left to right. In this case, omitting the grouped expression would have no effect on the statement since the INSERT statement assumes that you are inserting values in the natural order of the table’s columns. You can re-arrange the names of the columns in the grouped column target list if you wish to specify the values in a different order following the VALUES clause, as demonstrated in Example 4-17.

If you already have values within one table (or across several other tables) that you wish to insert into a separate table, this can also be achieved with the INSERT INTO command. The following syntax is used for this technique:

INSERT INTO table_name
       [ ( column_name [, ...] ) ]
       query

Similar to the syntax of INSERT INTO presented in the previous section, you may optionally specify which columns you wish to insert into, and in what order the query returns their values. However, with this form of INSERT INTO, you provide a complete SQL SELECT statement in the place of the VALUES keyword.

For example, imagine that Book Town keeps a table called book_queue, which holds books waiting to be approved for sale. When approved, those values need to be moved from the queue, into the normal books table. This can be achieved with the syntax demonstrated in Example 4-18.

The preceding example demonstrates the insertion of two rows from the table book_queue into the books table by way of a SELECT statement that is passed to the INSERT INTO command. Any valid SELECT statement may be used in this context. In this case, the query selects the result of a function called nextval() from a sequence called book_ids, followed by the title, author_id and subject_id columns from the book_queue table.

Since more than one row is being inserted, the INSERT result indicating success returns 0 in place of the OID that would be returned if a single row had been inserted. The second number, as with a normal INSERT INTO command, returns the number of rows inserted (in this case, 2).

A useful technique within PostgreSQL is to use the COPY command to insert values directly into tables from external files. Files used for input by COPY must either be in standard ASCII text format, whose fields are delimited by a uniform symbol, or in PostgreSQL’s binary table format. Common delimiters for ASCII files are tabs and commas. When using an ASCII formatted input file with COPY, each line within the file will be treated as a row of data to be inserted and each delimited field will be treated as a column value.

The COPY FROM command operates much faster than a normal INSERT command because the data is read as a single transaction directly to the target table. On the other hand, it is a very strict format, and the entire COPY procedure will fail if just one line is malformed.

The following is the syntax for using the COPY FROM command, where table_name is the table that you wish to insert values into and filename is the absolute system path to the file to be read:

COPY [ BINARY ] table_name [ WITH OIDS ]
     FROM { 'filename' | stdin }
     [ [USING] DELIMITERS 'delimiter' ]
     [ WITH NULL AS 'null_string' ]
BINARY

Indicates that input will come from a binary file previously created by the COPY TO command.

table_name

The name of the table you are copying.

WITH OIDS

Instructs PostgreSQL to retrieve all of the OIDs of the table represented by filename from the first line of the file.

FROM { 'filename' | stdin }

Indicates that either the file specified with filename or standard input (stdin) should be read by PostgreSQL.

[ USING ] DELIMITERS 'delimiter'

Indicates the character provided with delimiter should be used as a delimiter when parsing input. This clause is not applicable to files that were output in PostgreSQL’s binary format.

WITH NULL AS 'null_string'

Indicates that the character(s) provided with null_string should be interpreted as NULL values. This clause is not applicable to files that were output in PostgreSQL’s binary format.

When preparing to copy a file from the underlying operating system, remember that the file specified must be readable by the postmaster process (i.e., the user which PostgreSQL is running as), since the backend reads the file directly. Additionally, the filename must be provided with an absolute path; an attempt to use a relative path will result in an error.

If you are using an ASCII formatted input file, a delimiter value may be passed to the DELIMITERS clause, which defines the character which delimits columns on a single line in the filename. If omitted, PostgreSQL will assume that the ASCII file is tab-delimited. The optional WITH NULL clause allows you to specify in what form to expect NULL values. If omitted, PostgreSQL interprets the \N sequence as a NULL value to be inserted (e.g., blank fields in a source file will be treated as blank string constants, rather than NULL, by default).

The stdin term may be supplied as the source for the FROM clause if you wish to type values in manually or paste from another location directly into a terminal session. If you choose to enter values from stdin, you must terminate the input stream with a \. sequence (backslash-period) followed immediately by a newline.

Example 4-19 shows the contents of a file that was output in ASCII format by PostgreSQL. The file in Example 4-19 is comma-delimited and uses \null to represent NULL values. It contains row data from the Book Town subjects table.

The statement in Example 4-20 copies the file (/tmp/subjects.sql) into a table within the booktown database’s subjects table.