Overview

A relational database is a collection of tables, each of which has a fixed number of columns and a variable number of rows. Each column in a table has a name and contains values of the same data type, such as integer or string. Each row, or record, contains values that are related to each other, such as a particular patient’s name, date of birth, and blood type.

images/db/table.png

Superficially, each table looks like a spreadsheet or a file with one record per line (see The Readline Technique), but behind the scenes, the database does a lot of work to keep track of which values are where and how different tables relate to one another.

Many different brands of databases are available to choose from, including commercial systems like Oracle, IBM’s DB2, and Microsoft Access and open source databases like MySQL and PostgreSQL. Our examples use one called SQLite. It isn’t fast enough to handle the heavy loads that sites like Amazon.com experience, but it is free, it is simple to use, and as of Python 3.3.0, the standard library includes a module called sqlite3 for working with it.

A database is usually stored in a file or in a collection of files. These files aren’t formatted as plain text—if you open them in an editor, they will look like garbage, and any changes you make will probably corrupt the data and make the database unusable. Instead you must interact with the database in one of two ways:

In the examples in this chapter, our programs all start with this line:

 >>>​​ ​​import​​ ​​sqlite3

To put data into a database or to get information out, we’ll write commands in a special-purpose language called SQL, which stands for Structured Query Language and is pronounced either as the three letters “S-Q-L” or as the word “sequel.”