Description

The CREATE TABLE command is used to define a new table. It is one of the most complex SQL commands understood by SQLite, though nearly all of the syntax is optional.

A new table can be created in a specific database by qualifying the table name with an explicit database name. If one of the optional keywords TEMP or TEMPORARY is present, any database name given as part of the table name will be ignored, and the new table will be created in the temp database.

Creating a table that already exists will normally generate an error. If the optional IF NOT EXISTS clause is provided, this error is silently ignored. This leaves the original definition (and data) in place.

There are two variations of CREATE TABLE. The difference is in how the columns are defined. The least common variation uses a simple AS SELECT subquery to define the structure and initial contents of the table. The number of columns and the column names will be taken from the result set of the subquery. The rows of the result set will be loaded into the table as part of the table creation process. Because this variation provides no way to define column affinities (typical datatypes), keys, or constraints, it is typically limited to defining “quick and dirty” temporary tables. To quickly create and load structured data, it is often better to create a table using the standard notation and then use an INSERT...SELECT command to load the table. The standard notation explicitly defines a list of columns and table constraints.

Basic format

The most common way to define a table structure is to provide a list of column definitions. Column definitions consist of a name and a type, plus zero or more column-level constraint definitions.

The list of column definitions is followed by a list of table-level constraints. For the most part, column-level constraints and table-level constraints are very similar. The main difference is that column constraints apply to the values found in a single column, while table constraints can deal with one or more columns. It is possible to define most column constraints as table-level constraints that only reference a single column. For example, a multicolumn primary key must be defined as a table constraint, but a single-column primary key can be defined as either a table constraint or a column constraint.

The column name is a standard identifier. If nonstandard characters (such as a space or a hyphen) are used, the identifier must be quoted in the CREATE TABLE statement as well as any other reference.

The column name is followed by a type indicator. In SQLite, the type is optional, since nearly any column can hold any datatype. SQLite columns do not technically have types, but rather have type affinities. An affinity describes the most favored type for the column and allows SQLite to do implicit conversions in some cases. An affinity does not limit a column to a specific type, however. The use of affinities also accounts for the fact that the type format is extremely flexible, allowing type names from nearly any dialect of SQL. For more specifics on how type affinities are determined and used, see Column types.

If you want to make sure a specific affinity is used, the most straightforward type names are INT, REAL, TEXT, or BLOB. SQLite does not use precision or size limits internally. All integer values are signed 64-bit values, all floating-point values are 64-bit values, and all text and BLOB values are variable length.

All tables have an implied root column, known as ROWID, that is used internally by the database to index and store the database table structure. This column is not normally displayed or returned in queries, but can be accessed directly using the name ROWID, _ROWID_, or OID. The alternate names are provided for compatibility with other database engines. Generally, ROWID values should never be used or manipulated directly, nor should the ROWID column be directly used as a table key. To use a ROWID as a key value, it should be aliased to a user-defined column. See PRIMARY KEY constraint.

Column constraints

Each column definition can include zero or more column constraints. Column constraints follow the column type indicator; there is no comma or other delimiter between basic column definitions and the column constraints. The constraints can be given in any order.

Most of the column constraints are easy to understand. The PRIMARY KEY constraint is a bit unique, however, and is discussed below, in its own section.

The NOT NULL constraint prohibits the column from containing NULL entries. The UNIQUE constraint requires all the values of the column to be unique. An automatic unique index will be created on the column to enforce this constraint. Be aware that UNIQUE does not imply NOT NULL, and unique columns are allowed to have more than one NULL entry. This means there is a tendency for columns with a UNIQUE constraint to also have a NOT NULL constraint.

The CHECK constraint provides an arbitrary user-defined expression that must remain true. The expression can safely access any column in the row. The CHECK constraint is very useful to enforce specific data formats, ranges or values, or even specific datatypes. For example, if you want to be absolutely sure nothing but integer values are entered into a column, you can add a constraint such as:

CHECK ( typeof( column_name ) == 'integer' )

The DEFAULT constraint defines a default value for the column. This value is used when an INSERT statement does not include a specific value for this column. A DEFAULT can either be a literal value or, if enclosed in parentheses, an expression. Any expression must evaluate to a constant value. You can also use the special values CURRENT_TIME, CURRENT_DATE, or CURRENT_TIMESTAMP. These will insert an appropriate text value indicating the time the row was first created. If no DEFAULT constraint is given, the default value will be NULL.

The COLLATION constraint is used to assign a specific collation to a column. This not only defines the sort order for the column, it also defines how values are tested for equality (which is important for things such as UNIQUE constraints). SQLite includes three built-in collations: BINARY (the default), NOCASE, and RTRIM. BINARY treats all values as binary data that must match exactly. NOCASE is similar to binary, only it is case-insensitive for ASCII text values (in specific, character codes < 128). Also included is RTRIM (right-trim), which is like BINARY, but will trim any trailing whitespace from TEXT values before doing comparisons.

Finally, columns can contain a REFERENCES foreign key constraint. If given as a column constraint, the foreign table reference can contain no more than one foreign column name. If no column references are given, the foreign table must have a single-column primary key. For more information on foreign keys, see the section Foreign Keys. Note that a column-level foreign key constraint does not actually contain the words FOREIGN KEY. That syntax is for table-level foreign key constraints.

PRIMARY KEY constraint

The PRIMARY KEY constraint is used to define the primary key (or PK) for the table. From a database design and theory standpoint, it is desirable for every table to have a primary key. The primary key defines the core purpose of the table by defining the specific data points that make each row a unique and complete record.

From a practical standpoint, SQL does not require a table to have a PK. In fact, SQL does not require that rows within a table be unique. Nonetheless, there are some advantages to defining a primary key, especially when using foreign keys. In most cases a foreign key in one table will refer to the primary key of another table, and explicitly defining a primary key can make it easier to establish this relationship. SQLite also provides some additional features for single-column primary keys.

There can be only one PRIMARY KEY constraint per table. It can be defined at either the column level or the table level, but each table can have only one. A PRIMARY KEY constraint implies a UNIQUE constraint. As with a standalone UNIQUE constraint, this will cause the creation of an automatic unique index (with one exception). In most database systems, PRIMARY KEY also implies NOT NULL, but due to a long-standing bug, SQLite allows the use of NULL entries in a primary key column. For proper behavior, be sure to define at least one column of the primary key to be NOT NULL.

If a column has the type identifier INTEGER (it can be upper- or lowercase, but must be the exact word “integer”), an ascending collation (the default), and has a single-column PRIMARY KEY constraint, then that column will become an alias for the ROWID column. Behind the scenes, this makes an INTEGER PRIMARY KEY column the root column, used internally to index and store the database table. Using a ROWID alias allows for very fast row access without requiring a secondary index. Additionally, SQLite will automatically assign an unused ROWID value to any row that is inserted without an explicit column value.

Columns defined as INTEGER PRIMARY KEY can really truly hold only integer values. Additionally, unlike other primary key columns, they have an inherent NOT NULL constraint. Default values are assigned using the standard ROWID allocation algorithm. This algorithm will automatically assign a value that is one larger than the largest currently used ROWID value. If the maximum value is met, a random (unused) ROWID value will be chosen. As rows are added and removed from a table, this allows ROWID values to be recycled.

While recycling values is not a problem for internal ROWID values, it can cause problems for reference values that might be lurking elsewhere in the database. To avoid problems, the keyword AUTOINCREMENT can be used with an INTEGER PRIMARY KEY to indicate that automatically generated values should not be recycled. Default values assigned by AUTOINCREMENT will be one larger than the largest ROWID value that was ever used, but don't depend on each and every value being used. If the maximum value is reached, an error is returned.

When using a ROWID alias to automatically generate keys, it is a common practice to insert a new row and call the SQL function last_insert_rowid(), or the C function sqlite3_last_insert_rowid(), to retrieve the ROWID value that was just assigned. This value can be used to insert or update rows that reference the newly inserted row. It is also always possible to insert a row with a specific ROWID (or ROWID alias) value.