Adding Data with INSERT and COPY

Once you have created your table with the necessary specifications, the next logical step is to fill the table with data. There are generally three methods in PostgreSQL with which you can fill a table with data:

Use the INSERT INTO command with a grouped set of data to insert new values.
Use the INSERT INTO command in conjunction with a SELECT statement to insert existing values from another table.
Use the COPY (or \copy) command to insert values from a system file.

Inserting New Values

The following is the syntax of the INSERT INTO command, when used to insert new values, which is subsequently described in detail:

INSERT INTO table_name
       [ ( column_name [, ...] ) ]
       VALUES ( value [, ...] )

table_name: The INSERT SQL command initiates an insertion of data into the table called table_name.
( column_name [, ...] ): An optional grouped expression which describes the targeted columns for the insertion.
VALUES: The SQL clause which instructs PostgreSQL to expect a grouped expression of values to follow.
( value [, ...] ): The required grouped expression that describes the values to be inserted. There should be one value for each specified column, separated by commas. These values may be expressions themselves (e.g., an operation between two values), or constants.

Each value following the VALUES clause must be of the same data type as the column it is being inserted into. If the optional column-target expression is omitted, PostgreSQL will expect there to be one value for each column in the literal order of the table’s structure. If there are fewer values to be inserted than columns, PostgreSQL will attempt to insert a default value (or the NULL value, if there is no default) for each omitted value.

To demonstrate, Example 4-16 illustrates the insertion of a new book into Book Town’s books table.

Example 4-16. Inserting new values into the books table

booktown=# INSERT INTO books (id, title, author_id, subject_id)
booktown-#        VALUES (41472, 'Practical PostgreSQL', 1212, 4);
INSERT 3574037 1

The SQL statement in Example 4-16 inserts a new book with an id of 41472, a title of Practical PostgreSQL, an author identifier of 1212, and a subject identifier of 4. Note the feedback beginning with INSERT, which indicates that the insertion was successful. The first number following INSERT is the OID (object identifier) of the freshly inserted row. The second number following INSERT represents the number of rows inserted (in this case, 1).

Notice that the optional column target list is specified identically to the physical structure of the table, from left to right. In this case, omitting the grouped expression would have no effect on the statement since the INSERT statement assumes that you are inserting values in the natural order of the table’s columns. You can re-arrange the names of the columns in the grouped column target list if you wish to specify the values in a different order following the VALUES clause, as demonstrated in Example 4-17.

Example 4-17. Changing the order of target columns

booktown=# INSERT INTO books (subject_id, author_id, id, title)
booktown-#        VALUES (4, 7805, 41473, 'Programming Python');
INSERT 3574041 1

Inserting Values from Other Tables with SELECT

If you already have values within one table (or across several other tables) that you wish to insert into a separate table, this can also be achieved with the INSERT INTO command. The following syntax is used for this technique:

INSERT INTO table_name
       [ ( column_name [, ...] ) ]
       query

Similar to the syntax of INSERT INTO presented in the previous section, you may optionally specify which columns you wish to insert into, and in what order the query returns their values. However, with this form of INSERT INTO, you provide a complete SQL SELECT statement in the place of the VALUES keyword.

For example, imagine that Book Town keeps a table called book_queue, which holds books waiting to be approved for sale. When approved, those values need to be moved from the queue, into the normal books table. This can be achieved with the syntax demonstrated in Example 4-18.

Example 4-18. Inserting values from another table

booktown=# INSERT INTO books (id, title, author_id, subject_id)
booktown-#        SELECT nextval('book_ids'), title, author_id, subject_id
booktown-#               FROM book_queue WHERE approved;
INSERT 0 2

The preceding example demonstrates the insertion of two rows from the table book_queue into the books table by way of a SELECT statement that is passed to the INSERT INTO command. Any valid SELECT statement may be used in this context. In this case, the query selects the result of a function called nextval() from a sequence called book_ids, followed by the title, author_id and subject_id columns from the book_queue table.

Since more than one row is being inserted, the INSERT result indicating success returns 0 in place of the OID that would be returned if a single row had been inserted. The second number, as with a normal INSERT INTO command, returns the number of rows inserted (in this case, 2).

Copying Values from External Files with COPY

A useful technique within PostgreSQL is to use the COPY command to insert values directly into tables from external files. Files used for input by COPY must either be in standard ASCII text format, whose fields are delimited by a uniform symbol, or in PostgreSQL’s binary table format. Common delimiters for ASCII files are tabs and commas. When using an ASCII formatted input file with COPY, each line within the file will be treated as a row of data to be inserted and each delimited field will be treated as a column value.

The COPY FROM command operates much faster than a normal INSERT command because the data is read as a single transaction directly to the target table. On the other hand, it is a very strict format, and the entire COPY procedure will fail if just one line is malformed.

The following is the syntax for using the COPY FROM command, where table_name is the table that you wish to insert values into and filename is the absolute system path to the file to be read:

COPY [ BINARY ] table_name [ WITH OIDS ]
     FROM { 'filename' | stdin }
     [ [USING] DELIMITERS 'delimiter' ]
     [ WITH NULL AS 'null_string' ]

BINARY: Indicates that input will come from a binary file previously created by the COPY TO command.
table_name: The name of the table you are copying.
WITH OIDS: Instructs PostgreSQL to retrieve all of the OIDs of the table represented by filename from the first line of the file.
FROM { 'filename' | stdin }: Indicates that either the file specified with filename or standard input (stdin) should be read by PostgreSQL.
[ USING ] DELIMITERS 'delimiter': Indicates the character provided with delimiter should be used as a delimiter when parsing input. This clause is not applicable to files that were output in PostgreSQL’s binary format.
WITH NULL AS 'null_string': Indicates that the character(s) provided with null_string should be interpreted as NULL values. This clause is not applicable to files that were output in PostgreSQL’s binary format.

When preparing to copy a file from the underlying operating system, remember that the file specified must be readable by the postmaster process (i.e., the user which PostgreSQL is running as), since the backend reads the file directly. Additionally, the filename must be provided with an absolute path; an attempt to use a relative path will result in an error.

If you are using an ASCII formatted input file, a delimiter value may be passed to the DELIMITERS clause, which defines the character which delimits columns on a single line in the filename. If omitted, PostgreSQL will assume that the ASCII file is tab-delimited. The optional WITH NULL clause allows you to specify in what form to expect NULL values. If omitted, PostgreSQL interprets the \N sequence as a NULL value to be inserted (e.g., blank fields in a source file will be treated as blank string constants, rather than NULL, by default).

The stdin term may be supplied as the source for the FROM clause if you wish to type values in manually or paste from another location directly into a terminal session. If you choose to enter values from stdin, you must terminate the input stream with a \. sequence (backslash-period) followed immediately by a newline.

Example 4-19 shows the contents of a file that was output in ASCII format by PostgreSQL. The file in Example 4-19 is comma-delimited and uses \null to represent NULL values. It contains row data from the Book Town subjects table.

Example 4-19. An example ASCII copy file

1,Business,Productivity Ave
2,Children's Books,Kids Ct
3,Classics,Academic Rd
4,Computers,Productivity Ave
5,Cooking,Creativity St
12,Religion,\null
8,History,Academic Rd
9,Horror,Black Raven Dr
10,Mystery,Black Raven Dr
11,Poetry,Sunset Dr
13,Romance,Main St
14,Science,Productivity Ave
15,Science Fiction,Main St
0,Arts,Creativity St
6,Drama,Main St
7,Entertainment,Main St

The statement in Example 4-20 copies the file (/tmp/subjects.sql) into a table within the booktown database’s subjects table.

Example 4-20. Copying an ASCII file

booktown=# COPY subjects FROM '/tmp/subjects.sql'
booktown-#               USING DELIMITERS ',' WITH NULL AS '\null';
COPY

Binary format

The COPY command can also input and output binary formatted data. Specifying to the COPY FROM command the BINARY keyword requires that the input file specified was created with the COPY TO command in PostgreSQL’s binary format. Binary files can be read more quickly than ASCII files, but are not readable or modifiable with plain-text editors as ASCII files are.

Example 4-21 uses the COPY command to insert the rows in the binary output file from the subjects table within the booktown database.

Example 4-21. Copying a binary file

booktown=# COPY BINARY subjects FROM '/tmp/subjects.sql';
COPY

The difference between COPY and \copy

The COPY command is not the same as the psql \copy command. The \copy command accepts the same syntax (though without a terminating semicolon), and therefore performs the operation via the psql client, rather than the postmaster server. The result is that \copy operates with the permissions of the user running psql rather than of the user the postmaster is running as.

COPY TO

The syntax of COPY FROM may be used with nearly identical syntax to send a table’s data to a file. You need only replace the FROM keyword with the TO keyword. Additionally, the stdin keyword may be replaced with stdout if you wish to redirect to standard output rather than to a file (e.g., to the screen, in psql). Example 4-22 shows how you would copy the books table to an ASCII formatted file.

Example 4-22. Copying the books table to an ASCII file

booktown=# COPY books TO 'filename';
COPY

Copying WITH OIDS

Files containing row data with object identifier values (created with the COPY TO command, involving the WITH OIDS clause) can be read by a COPY FROM command, if the WITH OIDS clause is specified. Attempts to use the COPY FROM command with the WITH OIDS clause on a file that wasn’t given OIDs during its creation will fail.

The ability to copy values into a table with object-identifiers is a special capability reserved for COPY. These values cannot be modified by INSERT or UPDATE, as they are system values. If you are not careful, you may end up with two rows which have the same OID, which potentially negates their usefulness.