INSERT INTOdatabase_name
.table_name
(col1
,col2
) VALUES (val1
,val2
); INSERT INTOtable_name
VALUES (val1
,val2
,val3
... ); INSERT INTOtable_name
(col1
,col2
) SELECTc1, c2 FROM
...; INSERT INTOtable_name
DEFAULT VALUES; INSERT OR IGNORE INTOtable_name
(col1
,col2
) VALUES (val1
,val2
); REPLACE INTOtable_name
(col1
,col2
) VALUES (val1
,val2
);
The INSERT
command adds new rows to tables. An
individual INSERT
command can
only insert rows into one table, and in most cases can only
insert one row at a time. There are several variants of the
INSERT
command. The
primary differences relate to how the columns and values are
specified.
The basic format of the INSERT
command starts with the
command word INSERT
, followed
by an optional conflict resolution clause (OR ROLLBACK
, etc.). The INSERT
conflict resolution
clause is identical to the one found in the UPDATE
command. See UPDATE for more information on the
different conflict resolution options and how they behave. The
command word REPLACE
can also
be used, which is just a shortcut for INSERT OR REPLACE
. This is discussed in more
detail below.
After the conflict clause, the
command declares which table it is acting upon. This is
generally followed by a list of columns in parentheses. This
column list defines which columns will have values set in the
newly inserted row. If no column list is given, a default column
list is assumed to include all of the table’s columns, in order,
as they appear in the table definition. A default list will not
include the raw ROWID
column,
but any ROWID
alias (INTEGER PRIMARY KEY
) column is
included. If an explicit list of columns is given, the list may
contain any column (including the ROWID
column) in any order.
Following the column list is a
description of the values to insert into the new row. The values
are most commonly defined by the keyword VALUES
, followed by an explicit
list of values in parentheses. The values are matched to columns
by position. No matter how the column list is defined, the
column list and the value list must have the same number of
entries so that one value can be matched to each column in the
column list.
The INSERT
values can also be defined with a
subquery. Using a subquery is the only case when a single
INSERT
command can
insert more than one row. The result set generated by the
subquery must have the same number of columns as the column
list. As with the value list, the values of the subquery result
set are matched to the insert columns by position.
Any table column that does not
appear in the insert column list is assigned a default value.
Unless the table definition says otherwise, the default value is
a NULL. If you have a ROWID
alias column that you want assigned an automatic value, you must
use an explicit column list, and that list must not include the
ROWID
alias. If a
column is contained in the column list, either explicitly, or by
default, a value must be provided for that column. There is no
way to specify a default value except by leaving the column out
of the column list, or knowing what the default value is and
explicitly inserting it.
Alternatively, the column list and
value list can be skipped all together. The DEFAULT VALUES
variant provides
neither a column list nor a source of values and can be used to
insert a new row that consists entirely of default
values.
Because a typical INSERT
command only allows a
single row to be inserted, it is not uncommon to have a string
of INSERT
commands that are
used to bulk-load or otherwise populate a new table. Like any
other command, each INSERT
is
normally wrapped in its own transaction. Committing and
synchronizing this transaction can be quite expensive, often
limiting the number of inserts to a few dozen a second.
Due of the expense associated with
committing a transaction, it is very common to batch multiple
inserts into a single transaction. Especially in the case of
bulk inserts, it is not uncommon to batch 1,000 or even 10,000
or more INSERT
commands into
a single transaction, allowing a much higher insert rate. Just
understand that if an error is encountered, there are situations
where the whole transaction will be rolled back. While this may
be acceptable for loading bulk data from a file (that can simply
be rerun), it may not be acceptable for data that is inserted
from a real-time data stream. Batch transactions greatly speed
things up, but they can also make it significantly more
difficult to recover from an error.
One final word on the INSERT OR REPLACE
command. This
type of command is frequently used in event-tracking systems
where a “last seen” timestamp is required. When an event is
processed, the system needs to either insert a new row (if this
type of event has never been seen before) or it needs to update
an existing record. While the INSERT OR
REPLACE
variant seems perfect for this, it has
some specific limitations. Most importantly, the command truly
is an “insert or replace” and not an “insert or update.” The
REPLACE
option fully
deletes any old rows before processing the INSERT
request, making it
ineffective to update a subset of columns. In order to
effectively use the INSERT OR
REPLACE
option, the INSERT
needs to be capable of completely
replacing the existing row, and not simply updating a subset of
columns.