Simple Example: dblist Module

Create and Connect

The first set of functions we’ll be looking at are used to create or connect a virtual table instance. The functions you need to provide are:

int xCreate( sqlite3 *db, void *udp, int argc, char **argv, sqlite3_vtab **vtab, char **errMsg )

Required. This function is called by SQLite in response to a CREATE VIRTUAL TABLE command. This function creates a new instance of a virtual table and initializes any required data structures and database objects.

The first parameter is the database connection where the table needs to be created. The second parameter is the user-data pointer passed into sqlite3_create_module(). The third and fourth parameters pass in a set of creation arguments. The fifth parameter is a reference to a virtual table (vtab) structure pointer. Your function must allocate and return one of these structures. The final parameter is a reference to an error message. This allows you to pass back a custom error message if something goes wrong.

If everything works as planned, this function returns SQLITE_OK. If the return code is anything other than SQLITE_OK, the vtab structure should not be allocated.

Every module is passed at least three arguments. The variable argv[0] will always contain the name of the module used to create the virtual table. This allows the same xCreate() function to be used with similar modules. The logical name of the database (main, temp, etc.) is passed in as argc[1], and argv[2] contains the user-provided table name. Any additional arguments given to the CREATE VIRTUAL TABLE statement will be passed in, starting with argv[3], as text values.

int xConnect( sqlite3 *db, void *udp, int argc, char **argv, sqlite3_vtab **vtab, char **errMsg )

Required. The format and parameters of this function are identical to xCreate(). The main difference is that xCreate() is only called when a virtual table instance is first created. xConnect(), on the other hand, is called any time a database is opened. The function still needs to allocate and return a vtab structure, but it should not need to initialize any database objects.

If no object creation step is required, many modules use the same function for both xCreate() and xConnect().

These functions bring a virtual table instance into being. For each virtual table, only one of these functions will be called over the lifetime of a database connection.

The create and connect functions have two major tasks. First, they must allocate an sqlite3_vtab structure and pass that back to the SQLite engine. Second, they must define the table structure with a call to sqlite3_declare_vtab(). The xCreate() call must also create and initialize any storage, be it shadow tables, external files, or whatever is required by the module design. The order of these tasks is not important, so long as all of the tasks are accomplished before the xCreate() or xConnect() function returns.

Allocate the vtab structure

The xCreate() and xConnect() functions are responsible for allocating and passing back an sqlite3_vtab structure. That structure looks like this:

struct sqlite3_vtab {
    const sqlite3_module  *pModule;  /* module used by table */
    int                   nRef;      /* SQLite internal use only */
    char                  *zErrMsg;  /* Return error message */
};

The module is also (eventually) responsible for deallocating this structure, so you can technically use whatever memory management routines you want. However, for maximum compatibility, it is strongly suggested that modules use sqlite3_malloc(). This will allows the module to run in any SQLite environment.

The only field of interest in the sqlite3_vtab structure is the zErrMsg field. This field allows a client to pass a custom error message back to the SQLite core if any of the functions (other than xCreate() or xConnect()) return an error code. The xCreate() and xConnect() functions return any potential error message through their sixth parameter, since they are unable to allocate and pass back a vtab structure (including the zErrMsg pointer) unless the call to xCreate() or xConnect() was successful. The xCreate() and xConnect() functions initialize the vtab error message pointer to NULL after allocating the vtab structure.

Typically, a virtual table needs to carry around a lot of state. In many systems, this is done with some kind of user-data pointer or other generic pointer. While the sqlite3_create_module() function does allow you to pass in a user-data pointer, that pointer is only made available to the xCreate() and xConnect() functions. Additionally, the same user-data pointer is provided to every table instance managed by a given module, so it isn’t a good place to keep instance-specific data.

The standard way of providing instance-specific state data is to wrap and extend the sqlite3_vtab structure. For example, our dblist module will define a custom vtab data structure that looks like this:

typedef struct dblist_vtab_s {
    sqlite3_vtab          vtab;  /* this must go first */
    sqlite3               *db;   /* module-specific fields then follow */
} dblist_vtab;

By defining a custom data structure, the module can effectively extend the standard sqlite3_vtab structure with its own data. This will only work if the sqlite3_vtab structure is the first field, however. It must also be a vanilla C struct, and not a C++ class or some other managed object. Also, note that the vtab field is a full instance of the sqlite3_vtab structure, and not a pointer. That is, the custom vtab structure “contains-a” sqlite3_vtab structure, and does not “references-a” sqlite3_vtab structure.

It might seem a bit ugly to append module instance data to the sqlite3_vtab structure in this fashion, but this is how the virtual table interface is designed to work. In fact, this is the whole reason why the xCreate() and xConnect() functions are responsible for allocating the memory required by the sqlite3_vtab structure. By having the module allocating the memory, it can purposely overallocate the structure for its own means.

In the case of the dblist module, the only additional parameter the module requires is the database connection. Most modules require a lot more information. In specific, the xCreate() and xConnect() functions are the only time the module code will have access to the user-data pointer, the database connection pointer (the sqlite3 pointer), the database name, or the virtual table name. If the module needs access to these later, it needs to stash copies of this data in the vtab structure. The easiest way to make copies of these strings is with sqlite3_mprintf(). (See sqlite3_mprintf() in Appendix G.)

Most internal modules will need to make a copy of the database connection, the name of the database that contains the virtual module, and the virtual table name. Not only is this information required to create any shadow tables (which happens in xCreate()), but this information is also required to prepare any internal SQL statements (typically in xOpen()), as well as deal with xRename() calls. One of the most common bugs in module design is to assume there is only one database attached to the database connection, and that the virtual table is living in the main database. Be sure to test your module when virtual tables are created and manipulated in other databases that have been opened with ATTACH DATABASE.

The dblist module doesn’t have any shadow tables, and the data we need to return is specific to the database connection, not any specific database. The module still needs to keep a copy of the database connection, so that it can prepare the PRAGMA statement, but that’s it. As a result, the dblist_vtab structure is much simpler than most internal structures.

Define the table structure

The other major responsibility of the xCreate() and xConnect() functions is to define the structure of the virtual table.

int sqlite3_declare_vtab( sqlite3 *db, const char *sql ): This function is used to declare the format of a virtual table. This function may only be called from inside a user-defined xCreate() or xConnect() function. The first parameter is the database connection passed into xCreate() or xConnect(). The second parameter is a string that should contain a single, properly formed CREATE TABLE statement.

Although the module must provide a table name in the CREATE TABLE statement, the table name (and database name, if provided) is ignored. The given name does not need to be the name of the virtual table instance. In addition, any constraints, default values, or key definitions within the table definition are also ignored—this includes any definition of an INTEGER PRIMARY KEY as a ROWID alias. The only parts of the CREATE TABLE statement that really matters are the column names and column types. Everything else is up to the virtual table module to enforce.

Like standard tables, virtual tables have an implied ROWID column that must be unique across all of the rows in the virtual table. Most of the virtual table operations reference rows by their ROWID, so a module will need some way to keep track of that value or generate a unique ROWID key value for every row the virtual table manages.

The dblist virtual table definition is quite simple, reflecting the same structure as the PRAGMA database_list output. Since the table structure is also completely static, the code can just define the SQL statement as a static string:

static const char *dblist_sql = 
"CREATE TABLE dblist ( seq INTEGER, name TEXT, file TEXT );";

/* ... */
    sqlite3_declare_vtab( db, dblist_sql );

Depending on the design requirements, a module might need to dynamically build the table definition based off the user-provided CREATE VIRTUAL TABLE arguments.

It was already decided that the dblist example will simply use the seq values returned by the PRAGMA command as the source of both the seq output column and the ROWID values. Virtual tables have no implicit way of aliasing a standard column to the ROWID column, but a module is free to do this explicitly in the code.

It makes sense for a virtual table to define its own structure, rather than having it defined directly by the CREATE VIRTUAL TABLE statement. This allows the application to adapt to fit its own needs, and tends to greatly simplify the CREATE VIRTUAL TABLE statements. There is one drawback, however, in that if you want to look up the structure of a virtual table, you cannot simply look in the sqlite_master system table. Each virtual table instance will have an entry in this table, but the only thing you’ll find there is the original CREATE VIRTUAL TABLE statement. If you want to look up the column names and types of a virtual table instance, you’ll need to use the command PRAGMA table_info( table_name ). This will provide a full list of all the column names and types in a table, even for a virtual table. See table_info in Appendix F for more details.

Storage initialization

If a virtual table module manages its own storage, the xCreate() function needs to allocate and initialize the required storage structure. In the case of an internal module that uses shadow tables, the module will need to create the appropriate tables. Only the xCreate() function needs to do this. The next time the database is opened, xConnect(), and not xCreate(), will be called. The xConnect() function may want to verify the correct shadow tables exist in the correct database, but it should not create them.

If you’re writing an internal module that uses shadow tables, it is customary to name the shadow tables after the virtual table. In most cases you’ll also want to be sure to create the shadow tables in the same database as the virtual table. For example, if your module requires three shadow tables per virtual table instance, such as Data, IndexA, and IndexB, a typical way to create the tables within your xCreate() function would be something like this (see sqlite3_mprintf() in Appendix G for details on the %w format):

sql_cmd1 = sqlite3_mprintf(
         "CREATE TABLE \"%w\".\"%w_Data\"   (...)", argv[1], argv[2],... );
sql_cmd2 = sqlite3_mprintf(
         "CREATE TABLE \"%w\".\"%w_IndexA\" (...)", argv[1], argv[2],... );
sql_cmd3 = sqlite3_mprintf(
         "CREATE TABLE \"%w\".\"%w_IndexB\" (...)", argv[1], argv[2],... );

This format will properly create per-instance shadow tables in the same database as the virtual table. The double quotes also ensure you can handle nonstandard identifier names. You should use a similar format (with a fully quoted database name and table name) in every SQL statement your module may generate.

If you’re writing an external module that manages its own files, or something similar, you should try to follow some similar convention. Just remember not to use the database name (argv[1]) in your naming convention, as this can change, depending on how the database was open, or attached to the current database connection.

Create/connect dblist example

Although the dblist module could be considered an internal module, the module does not manage storage for any of the data it uses. This means there is no requirement to create shadow tables. This allows the module to use the same function for both the xCreate() and xConnect() function pointers.

Here is the full dblist create and connect function:

static int dblist_connect( sqlite3 *db, void *udp, int argc, 
        const char *const *argv, sqlite3_vtab **vtab, char **errmsg )
{
    dblist_vtab    *v = NULL;

    *vtab = NULL;
    *errmsg = NULL;
    if ( argc != 3 )  return SQLITE_ERROR;
    if ( sqlite3_declare_vtab( db, dblist_sql ) != SQLITE_OK ) {
        return SQLITE_ERROR;
    }

    v = sqlite3_malloc( sizeof( dblist_vtab ) ); /* alloc our custom vtab */
    *vtab = (sqlite3_vtab*)v;
    if ( v == NULL ) return SQLITE_NOMEM;

    v->db = db;                                  /* stash this for later */
    (*vtab)->zErrMsg = NULL;                     /* initalize this */
    return SQLITE_OK;
}

The create/connect function walks through the required steps point by point. We verify the argument count (in this case, only allowing the standard three arguments), define the table structure, and finally allocate and initialize our custom vtab structure. Remember that you should not pass back an allocated vtab structure unless you’re returning an SQLITE_OK status.

Disconnect and Destroy

Not surprisingly, the xCreate() and xConnect() functions each have their own counterparts:

int xDisconnect( sqlite3_vtab *vtab )

Required. This is the counterpart to xConnect(). It is called every time a database that contains a virtual table is detached or closed. This function should clean up any process resources used by the virtual table implementation and release the vtab data structure.

int xDestroy( sqlite3_vtab *vtab )

Required. This is the counterpart to xCreate(), and is called in response to a DROP TABLE command. If an internal module has created any shadow tables to store module data, this function should call DROP TABLE on those tables. As with xDisconnect(), this function should also release any process resources and release the virtual table structure.

Many modules that do not manage their own storage use the same function for xDisconnect() and xDestroy().

As with the xCreate() and xConnect() functions, only one of these functions will be called within the context of a given database connection. Both functions should release the memory allocated to the vtab pointer. The xDestroy() function should also delete, drop, or deallocate any storage used by the virtual table. Make sure you use fully qualified and quoted database and table names.

The dblist version of this function—which covers both xDisconnect() and xDestroy()—is very simple:

static int dblist_disconnect( sqlite3_vtab *vtab )
{
    sqlite3_free( vtab );
    return SQLITE_OK;
}

The code frees the vtab memory, and that’s about it.

Query Optimization

Virtual tables present a challenge for the query optimizer. In order to optimize SELECT statements and choose the most efficient query plan, the optimizer must weigh a number of factors. In addition to understanding the constraints on the query (such as WHERE conditions), optimization also requires some understanding of how large a table is, what columns are indexed, and how the table can be sorted.

There is no automatic way for the query optimizer to deduce this information from a virtual table. Virtual tables cannot have traditional indexes, and if the internal virtual table implements a fancy custom indexing system, the optimizer has no way of knowing about it or how to best take advantage of it. While every query could perform a full table scan on a virtual table, that largely defeats the usefulness of many internal modules that are specifically designed to provide an optimized type of lookup.

The solution is to allow the query optimizer to ask the virtual table module questions about the cost and performance of different kinds of lookups. This is done through the xBestIndex() function:

int xBestIndex( sqlite3_vtab *vtab, sqlite3_index_info *idxinfo ): Required. When an SQL statement that references a virtual table is prepared, the query optimizer calls this function to gather information about the structure and capabilities of the virtual table. The optimizer is basically asking the virtual table a series of questions about the most efficient access patterns, indexing abilities, and natural ordering provided by the module. This function may be called several times when a statement is prepared.

Communication between the query optimizer and the virtual table is done through the sqlite3_index_info structure. This data structure contains an input and output section. The SQLite library fills out the input section (input to your function), essentially asking a series of questions. You can fill out the output section of the structure, providing answers and expense weightings to the optimizer.

If xBestIndex() sounds complicated, that’s because it is. The good news is that if you ignore the optimizer, it will revert to a full table scan for all queries and perform any constraint checking on its own. In the case of the dblist module, we’re going to take the easy way out, and more or less ignore the optimizer:

static int dblist_bestindex( sqlite3_vtab *vtab, sqlite3_index_info *info )
{
    return SQLITE_OK;
}

Given how simple the module is, and the fact that it will never return more than 30 rows (there is an internal limit on the number of attached databases), this is a fair trade off between performance and keeping the code simple. Even with fairly large datasets, SQLite does a pretty good job at processing full table scans with surprising speed.

Of course, not all modules—especially internal ones—can get away with this. Most larger modules should try to provide an intelligent response to the optimizer’s questions. To see how this is done, we’ll take a more in-depth look at this function later on in the chapter. See Best Index and Filter.

Custom Functions

As much as possible, a good module will attempt to make virtual table instances look and act exactly like standard tables. Functions like xBestIndex() help enforce that abstraction, so that virtual tables can interact with the optimizer to correctly produce more efficient lookups—especially in the case of an internal module trying to provide a better or faster indexing method.

There are a few other cases when SQLite needs a bit of help to hide the virtual table abstraction from the database user and other parts of the SQLite engine. SQL function calls and expression processing is one such area.

The xFindFunction() allows a module to override an existing function and provide its own implementation. In most cases, this is not needed (or even recommended). The major exceptions are the SQL functions like() and match(), which are used to implement the SQL expressions LIKE and MATCH (see Appendix D for more details).

A text-search engine is likely to override the like() and match() functions to provide an implementation that can directly access the search string and base its index optimization off the provided arguments. Without the ability to override these functions, it would be very difficult to optimize text searches, as the standard algorithm would require a full stable scan, extracting each row value and doing an external comparison.

int xFindFunction( sqlite3_vtab *vtab, int arg, const char *func_name, custom_function_ref, void **udp_ref)

Optional. This function allows a module to override an existing function. It is called when preparing an SQL statement that uses a virtual table column as the first parameter in an SQL function (or the second, in the case of like(), glob(), match(), or regexp()). The first parameter is the vtab structure for this table instance. The second parameter indicates how many parameters are being passed to the SQL function, and the third parameter holds the name of the function. The fourth parameter is a reference to a scalar function pointer (see Scalar Functions), and the fifth parameter is a reference to a user-data pointer.

Using data from the first three parameters, a virtual table module needs to decide if it wants to override the existing function or not. If the module does not want to override the function, it should simply return zero. If the module does want to provide a custom function, it needs to set the function pointer reference (the fourth parameter) to the scalar function pointer of its choice and set the user-data pointer reference to a user-data pointer. The new function (and user-data pointer) will be called in the same context as the original function.

Most modules will not need to implement this function, and those that do should only need to override a few key functions. The dblist module does not provide an implementation.

Table Rename

Many modules, especially internal modules, key specific information off the name of the virtual table. This means that if the name of the virtual table is changed, the module needs to update any references to that name. This is done with the xRename() function.

int xRename( sqlite3_vtab *vtab, const char *new_name )

Required. This function is called in response to the SQL command ALTER TABLE...RENAME. The first parameter is the table instance being renamed, and the second parameter is the new table name.

In the case of an internal module, the most likely course of action is to rename any shadow tables to match the new name. Doing this properly will require knowing the original table name, as well as the database (main, temp, etc.), that was passed into xCreate() or xConnect().

In the case of an external module, this function can usually just return SQLITE_OK, unless the table name has significance to some external data mapping.

As with any virtual table function that deals with table names, the module needs to properly qualify any SQL operation with a full database and table name, both properly quoted.

The dblist module has a very short xRename() function:

static int dblist_rename( sqlite3_vtab *vtab, const char *newname )
{
    return SQLITE_OK;
}

The dblist module does not use the table name for anything, so it can safely do nothing.

Opening and Closing Table Cursors

We will now look at the process of scanning a table and retrieving the rows and column values from the virtual table. This is done by opening a table cursor. The cursor holds all the state data required for the table scan, including SQL statements, file handles, and other data structures. After the cursor is created, it is used to step over each row in the virtual table and extract any required column values. When the module indicates that no more rows are available, the cursor is either reset or released. Virtual table cursors can only move forward through the rows of a table, but they can be reset back to the beginning for a new table scan.

A cursor is created using the xOpen() function and released with the xClose() function. Like the vtab structure, it is the responsibility of the module to allocate an sqlite3_vtab_cursor structure and return it back to the SQLite engine.

int xOpen( sqlite3_vtab *vtab, sqlite3_vtab_cursor **cursor ): Required. This function must allocate, initialize, and return a cursor.
int xClose( sqlite3_vtab_cursor *cursor ): Required. This function must clean up and release the cursor structure. Basically, it should undo everything done by xOpen().

The native sqlite3_vtab_cursor structure is fairly minimal, and looks like this:

struct sqlite3_vtab_cursor {
    sqlite3_vtab    *pVtab;  /* pointer to table instance */
};

As with the sqlite3_vtab structure, a module is expected to extend this structure with whatever data the module requires. The custom dblist cursor looks like this:

typedef struct dblist_cursor_s {
    sqlite3_vtab_cursor   cur;    /* this must go first */
    sqlite3_stmt          *stmt;  /* PRAGMA database_list statement */
    int                   eof;    /* EOF flag */
} dblist_cursor;

For the dblist module, the only cursor-specific data that is needed is an SQLite statement pointer and an EOF flag. The flag is used to indicate when the module has reached the end of the PRAGMA database_list output.

Outside of allocating the dblist_cursor, the only other task the dblist xOpen() function needs to do is prepare the PRAGMA SQL statement:

static int dblist_open( sqlite3_vtab *vtab, sqlite3_vtab_cursor **cur )
{
    dblist_vtab    *v = (dblist_vtab*)vtab;
    dblist_cursor  *c = NULL;
    int            rc = 0;

    c = sqlite3_malloc( sizeof( dblist_cursor ) );
    *cur = (sqlite3_vtab_cursor*)c;
    if ( c == NULL ) return SQLITE_NOMEM;

    rc = sqlite3_prepare_v2( v->db, "PRAGMA database_list", -1, &c->stmt, NULL );
    if ( rc != SQLITE_OK ) {
        *cur = NULL;
        sqlite3_free( c );
        return rc;
    }
    return SQLITE_OK;
}

As with xCreate() and xConnect(), no sqlite3_vtab_cursor should be allocated or passed back unless an SQLITE_OK is returned. There is no need to initialize the pVtab field of the cursor—SQLite will take care of that for us.

The dblist version of xClose() is very simple. The module must make sure the prepared statement is finalized before releasing the cursor structure:

static int dblist_close( sqlite3_vtab_cursor *cur )
{
    sqlite3_finalize( ((dblist_cursor*)cur)->stmt );
    sqlite3_free( cur );
    return SQLITE_OK;
}

You may be wondering why the module puts the statement pointer into the cursor. This requires the module to reprepare the PRAGMA statement for each cursor. Wouldn’t it make more sense to put the statement pointer in the vtab structure? That way it could be prepared only once, and then reused for each cursor.

At first, that looks like an attractive option. It would be more efficient and, in most cases, work just fine—right up to the point were SQLite needs to create more than one cursor on the same table instance at the same time. Since the module depends on the statement structure to keep track of the position in the virtual table data (that is, the output of PRAGMA database_list), the module design needs each cursor to have its own statement. The easiest way to do this is simply to prepare and store the statement with the cursor, binding the statement lifetime to the cursor lifetime.

Filtering Rows

The xFilter() function works in conjunction with the xBestIndex() function, providing the SQLite query engine a means to communicate any specific constraints or conditions put on the query. The xBestIndex() function is used by the query optimizer to ask the module questions about different lookup patterns or limits. Once SQLite decides what to do, the xFilter() function is used to tell the module which plan of action is being taken for this particular scan.

int xFilter( sqlite3_vtab_cursor *cursor, int idx_num, const char *idx_str, int argc, sqlite3_value **argv ): Required. This function is used to reset a cursor and initiate a new table scan. SQLite will communicate any constraints that have been placed on the current cursor. The module may choose to skip over any rows that do not meet these constraints. All of the parameters are determined by actions taken by the xBestIndex() function. The first row of data must also be fetched.

The idea is to allow the module to “pre-filter” as many rows as it can. Each time the SQLite library asks the module to advanced the table cursor to the next row, the module can use the information provided to the xFilter() function to skip over any rows that do not meet the stated criteria for this table scan.

The xBestIndex() and xFilter() functions can also work together to specify a specific row ordering. Normally, SQLite makes no assumptions about the order of the rows returned by a virtual table, but xBestIndex() can be used indicate the ability to support one or more specific orderings. If one of those orderings is passed into xFilter(), the table is required to return rows in the specified order.

To get any use out of the xFilter() function, a module must also have a fully implemented xBestIndex() function. The xBestIndex() function sets up the data that is passed to the xFilter() function. Most of the data passed into xFilter() has no specific meaning to SQLite, it is simply based off code agreements between xBestIndex() and xFilter().

Implementing all this can be quite cumbersome. Thankfully, as with xBestIndex(), it is perfectly valid for a module to ignore the filtering system. If a user query applies a set of conditions on the rows it wants returned from a virtual table, but the module does not filter those out, the SQLite engine will be sure to do it for us. This greatly simplifies the module code, but with the trade-off that any operation against that module turns into a full table scan.

Full table scans may be acceptable for many types of external modules, but if you’re developing a customized index system, you have little choice but to tackle writing robust xBestIndex() and xFilter() functions. To get a better idea on how to do this, see Best Index and Filter.

Even if the actual filtering process is ignored, the xFilter() function is still required to do two important tasks. First, it must reset the cursor and prepare it for a new table scan. Second, xFilter() is responsible for fetching the first row of output data. Since the dblist module doesn’t utilize the filtering system, these are pretty much the only things the xFilter() function ends up doing:

static int dblist_filter( sqlite3_vtab_cursor *cur,
            int idxnum, const char *idxstr,
            int argc, sqlite3_value **value )
{
    dblist_cursor  *c = (dblist_cursor*)cur;
    int            rc = 0;

    rc = sqlite3_reset( c->stmt );     /* start a new scan */
    if ( rc != SQLITE_OK ) return rc;
    c->eof = 0;                        /* clear EOF flag */

    dblist_get_row( c );               /* fetch first row */
    return SQLITE_OK;
}

Although the dblist module does not utilize the xBestIndex() data, there are still important things to do. The xFilter() function must first reset the statement. This “rewinds” the pragma statement, putting our cursor at the head of the table. There are situations where sqlite3_reset() may be called on a freshly prepared (or freshly reset) statement, but that is not a problem. There are other calling sequences which may require xFilter() to reset the statement.

Because both xFilter() and the xNext() function (which we’ll look at next) need to fetch row data, we’ve broken that out into its own function:

static int dblist_get_row( dblist_cursor *c )
{
    int  rc;
    
    if ( c->eof ) return SQLITE_OK;
    rc = sqlite3_step( c->stmt );
    if ( rc == SQLITE_ROW ) return SQLITE_OK;     /* we have a valid row */

    sqlite3_reset( c->stmt );
    c->eof = 1;
    return ( rc == SQLITE_DONE ? SQLITE_OK : rc );  /* DONE -> OK */
}

The main thing this function does is call sqlite3_step() on the cursor’s SQL statement. If the module gets a valid row of data (SQLITE_ROW), everything is good. If the module gets anything else (including SQLITE_DONE) it considers the scan done. In that case, the module resets the statement before returning SQLITE_OK (if it got to the end of the table) or the error. Although the module could wait for xFilter() to reset the statement or xClose() to finalize it, it is best to reset the statement as soon as we know we’ve reached the end of the available data.

Extracting and Returning Data

We now, finally, get to the core of any virtual table implementation. Once the module has a valid cursor, it needs to be able to advanced that cursor over the virtual table data and return column values. This core set of four functions is used to do just that:

int xNext( sqlite3_vtab_cursor *cursor )

Required. This function is used to advance the cursor to the next row. When the SQLite engine no longer needs data from the current row, this is called to advance the virtual table scan to the next row. If a virtual table is already on the last row of the table when xNext() is called, it should not return an error.

Note that xNext() truly is a “next” function and not a “get row” function. It is not called to fetch the first row of data. The first row of data should be fetched and made available by the xFilter() function.

If the module is filtering rows via xBestIndex() and xFilter(), it is legitimate for xNext() to skip over any rows in the virtual table that do not meet the conditions put forth to xFilter(). Additionally, if xBestIndex() indicated an ability to return the data in a specific order, xNext() is obligated to do so. Otherwise, xNext() may return rows in any order it wishes, so long as they are all returned.

int xEof( sqlite3_vtab_cursor *cursor )

Required. This function is used to determine if the virtual table has reached the end of the table. Every call to xFilter() and xNext() will immediately be followed by a call to xEof(). If the previous call to xNext() advanced the cursor past the end of the table, xEof() should return a true (nonzero) value, indicating that the end of the table has been reached. If the cursor still points to a valid table row, xEof() should return false (zero).

xEof() is also called right after xFilter(). If a table is empty or will return no rows under the conditions defined by xFilter(), then xEof() needs to return true at this time.

There is no guarantee xNext() will keep being called until xEof() returns true. The query may decide to terminate the table scan at any time.

int xRowid( sqlite3_vtab_cursor *cursor, sqlite_int64 *rowid )

Required. This function is used to retrieve the ROWID value of the current row. The ROWID value should be passed back through the rowid reference provided as the second parameter.

int xColumn( sqlite3_vtab_cursor *cursor, sqlite3_context *ctx, int cidx )

Required. This function is used to extract column values from the cursor’s current row. The parameters include the virtual table cursor, an sqlite3_context structure, and a column index. Values should be returned using the sqlite3_context and the sqlite3_result_xxx() functions. The column index is zero-based, so the first column defined in the virtual table definition will have a column index of zero. This function is typically called multiple times between calls to xNext().

The first two functions, xNext() and xEof(), are used to advance a cursor through the virtual table data. A cursor can only be advanced through the data, it cannot be asked to back up, save for a full reset back to the beginning of the table. Unless xBestIndex() and xFilter() agreed on a specific filtering or ordering, xNext() is under no obligation to present the data in a specific order. The only requirement is that continuous calls to xNext() will eventually visit each row exactly once.

At each row, xRowid() and xColumn() can be used to extract values from the current row. xRowid() is used to extract the virtual ROWID value, while xColumn() is used to extract values from all the other columns. While a cursor is at a specific row, the xRowid() and xColumn() functions may be called any number of times in any order.

Since the dblist module depends on executing the PRAGMA statement to return data, most of these functions are extremely simple. For example, the dblist xNext() function calls the dblist_get_row() function, which in turn calls sqlite3_step() on the cursor’s statement:

static int dblist_next( sqlite3_vtab_cursor *cur )
{
    return dblist_get_row( (dblist_cursor*)cur );
}

The dblist xEof() function returns the cursor EOF flag. This flag is set by dblist_get_row() when the module reaches the end of the PRAGMA database_list data. The flag is simply returned:

static int dblist_eof( sqlite3_vtab_cursor *cur )
{
    return ((dblist_cursor*)cur)->eof;
}

The data extraction functions for the dblist module are also extremely simple. The dblist module uses the seq column from the PRAGMA database_list output as its virtual ROWID. This means that it can return the value of the seq column as our ROWID. As it happens, the seq column is the first column, so it has an index of zero:

static int dblist_rowid( sqlite3_vtab_cursor *cur, sqlite3_int64 *rowid )
{
    *rowid = sqlite3_column_int64( ((dblist_cursor*)cur)->stmt, 0 );
    return SQLITE_OK;
}

The xColumn() function is nearly as simple. Since there is a one-to-one mapping between the output columns of the PRAGMA statement and the dblist virtual table columns, the module can extract values directly from the PRAGMA output and pass them back as column values for our virtual table:

static int dblist_column( sqlite3_vtab_cursor *cur, sqlite3_context *ctx, int cidx )
{
    dblist_cursor   *c = (dblist_cursor*)cur;
    sqlite3_result_value( ctx, sqlite3_column_value( c->stmt, cidx ) );
    return SQLITE_OK;
}

In most cases, these functions would be considerably more complex than what the dblist module has here. The fact that the dblist module depends on only a single SQL command to return all of the required data makes the design of these functions quite simple—even more so, since the output of the SQL command exactly matches the data format we need.

To get a better idea of what a more typical module might look like, have a look at the implementation of these functions in the other example module (Advanced Example: weblog Module).

Virtual Table Modifications

As with any other table, modules support the ability to make modifications to a virtual table using the standard INSERT, UPDATE, and DELETE commands. All three operations are supported by the xUpdate() function. This is a table-level function that operates on a table instance, not a cursor.

int xUpdate( sqlite3_vtab *vtab, int argc, sqlite3_value **argv, sqlite_int64 *rowid ): Optional. This call is used to support all virtual table modifications. It will be called in response to any INSERT, UPDATE, or DELETE command. The first parameter is the table instance. The second and third parameters pass in a series of database values. The fourth parameter is a reference to a ROWID value, and is used to pass back the newly defined ROWID when a new row is inserted.

The argv parameter will have a valid sqlite3_value structure for each argument, although some of those values may have the type SQLITE_NULL. Rows are always inserted or updated as whole sets. Even if the SQL UPDATE command only updates a single column of a row, the xUpdate() command will always be provided with a value for every column in a row.

If only a single argument is provided, this is a DELETE request. The sole argument (argv[0]) will be an SQLITE_INTEGER that holds the ROWID of the row that needs to be deleted.

In all other cases, exactly n+2 arguments will be provided, where n is the number of columns, including HIDDEN ones (see Create and Connect) in the table definition. The first argument (argv[0]) is used to refer to existing ROWID values, while the second (argv[1]) is used to refer to new ROWID values. These two arguments will be followed by a value for each column in a row, starting with argv[2]. Essentially, the arguments argv[1] through argv[n+1] represent a whole set of row values starting with the implied ROWID column followed by all of the declared columns.

If argv[0] has the type SQLITE_NULL, this is an INSERT request. If the INSERT statement provided an explicit ROWID value, that value will be in argv[1] as an SQLITE_INTEGER. The module should verify the ROWID is appropriate and unique before using it. If no explicit ROWID value was given, argv[1] will have a type of SQLITE_NULL. In this case, the module should assign an unused ROWID value and pass it back via the rowid reference pointer in the xUpdate() parameters.

If argv[0] has the type SQLITE_INTEGER, this is an UPDATE. In this case, both argv[0] and argv[1] will be SQLITE_INTEGER types with ROWID values. The existing row indicated in argv[0] should be updated with the values supplied in argv[1] through argv[n+1]. In most cases, argv[0] will be the same as argv[1], indicating no change in the ROWID value. However, if the UPDATE statement includes an explicit update to the ROWID column, it may be the case that the first two arguments do not match. In that case, the row indicated by argv[0] should have its ROWID value changed to argv[1]. In either case, all the other columns should be updated with the additional arguments.

It is the module’s responsibility to enforce any constraints or typing requirements on the incoming data. If the data is invalid or otherwise cannot be inserted or updated into the virtual table, xUpdate() should return an appropriate error, such as SQLITE_CONSTRAINT (constraint violation).

There may be times when a modification (including a DELETE) happens while an active cursor is positioned at the row in question. The module design must be able to handle this situation.

The xUpdate() function is optional. If no implementation is provided, all virtual table instances provided by this module will be read-only. That is the case with our dblist module, so there is no implementation for xUpdate().

Cursor Sequence

Most cursor functions have very specific tasks. Some of these, like xEof(), are typically very small, while others, like xNext(), can be quite complex. Regardless of size or complexity, they all need to work together to perform the proper tasks and maintain the state of the sqlite3_vtab_cursor structure, including any extensions your module might require.

In order to maintain the proper state, it is important to understand which functions can be called, and when. Figure 10-1 provides a sequence map of when cursor functions can be called. Your module needs to be prepared to properly deal with any of these transitions.

Some of the call sequences can catch people by surprise, such as having xClose() called before xEof() returns true. This might happen if an SQL query has a LIMIT clause. Also, it is possible for xRowid() to be called multiple times between calls to xNext(). Similarly, xColumn() may be called multiple times with the same column index between calls to xNext(). It is also possible that neither xRowid() nor xColumn() (nor both) may be called at all between calls to xNext().

In addition to cursor functions, some table-level functions may also be called throughout this sequence. In specific, xUpdate() may be called at any time, possibly altering the row a cursor is currently processing. Generally, this happens by having an update statement open a cursor, find the row it is looking to modify, and then call xUpdate() outside of the cursor context.

Figure 10-1. The lifespan of a virtual table cursor. This shows the possible calling sequences for the cursor functions of a virtual table module.

It can be tricky to test your module and confirm that everything is working properly. The only advice I can offer is to test your module with a known and relatively small set of data, running it through as many query types as possible. Try to include different variations of GROUP BY, ORDER BY, and any number of join operations (including self-joins). When you’re first starting to write a module, it might also help to put simple printf() or other debug statements at the top of each function. This will assist in understanding the call patterns.

Transaction Control

Like any other database element, virtual tables are expected to be aware of database transactions and support them appropriately. This is done through four optional functions. These functions are table-level functions, not cursor-level functions.

int xBegin( sqlite3_vtab *vtab ): Optional. Indicates the start of a transaction involving the virtual table. Any return code other than SQLITE_OK will cause the transaction to fail.
int xSync( sqlite3_vtab *vtab ): Optional. Indicates the start of a transactional commit that involves the virtual table. Any return code other than SQLITE_OK will cause the whole transaction to automatically be rolled back.
int xCommit( sqlite3_vtab *vtab ): Optional. Indicates the finalization of a transactional commit that involves the virtual table. The return code is ignored—if xSync() succeeded, this function must succeed.
int xRollback( sqlite3_vtab *vtab ): Optional. Indicates that a transaction involving the virtual table is being rolled back. The module should revert its state to whatever state it was in prior to the call to xBegin(). The return code is ignored.

These functions are optional and are normally only required by external modules that provide write capabilities to external data sources. Internal modules that record their data into standard tables are covered by the existing transaction engine (which will automatically begin, commit, or roll back under the control of the user SQL session). Modules that are limited to read-only functionality do not need transactional control, since they are not making any modifications.

Note

Internal modules (modules that store all their data in shadow database tables) do not need to implement transaction control functions. The existing, built-in transaction system will automatically be applied to any changes made to standard database tables.

If you do need to support your own transactions, it is important to keep the program flow in mind. xBegin() will always be the first function to be called.^[2] Typically, there will be calls to xUpdate() followed by a two-step sequence of calls to xSync() and xCommit() to close and commit the transaction. Once xBegin() has been called, it is also possible to get a call to xRollback() to roll the transaction back. The xRollback() function can also be called after xSync() (but before xCommit()) if the sync step fails.

Full transactions do not nest, and virtual tables do not support save-points. Once a call to xBegin() has been made, you will not get another one until either xCommit() or xRollback() has been called.

In keeping with the ACID properties of a transaction, any modifications made between calls to xBegin() and xSync() should only be visible to this virtual table instance in this database connection (that is, this vtab structure). This can be done by delaying any writes or modifications to external data sources until the xSync() function is called, or by somehow locking the data source to ensure other module instances (or other applications) cannot modify or access the data. If the data is written out in xSync(), the data source still needs to be locked until a call to xCommit() or xRollback() is made. If xSync() returns SQLITE_OK, it is assumed that any call to xCommit() will succeed, so you want to try to make your modifications in xSync() and verify and release them in xCommit().

Proper transactional control is extremely hard, and making your transactions fully atomic, consistent, isolated, and durable is no small task. Most external modules that attempt to implement transactions do so by locking the external data source. You still need to support some type of rollback ability, but exclusive access eliminates any isolation concerns.

Since the dblist module is read-only, it does not need to provide any transactional functions.

Register the Module

Now that we’ve had a look at all the functions required to define a module, we need to register them. As we’ve already seen, this is done with the sqlite3_create_module() function. To register the module, we need to fill out an sqlite3_module structure and pass that to the create function.

You may have noticed that all of our module functions were marked static. This is because the module was written as an extension (see the section SQLite Extensions). By structuring the code that way, we can easily build our virtual table module into an application, or we can create a dynamic extension.

Here is the initialization function for our extension:

static sqlite3_module dblist_mod = {
    1,                   /* iVersion        */
    dblist_connect,      /* xCreate()       */
    dblist_connect,      /* xConnect()      */
    dblist_bestindex,    /* xBestIndex()    */
    dblist_disconnect,   /* xDisconnect()   */
    dblist_disconnect,   /* xDestroy()      */
    dblist_open,         /* xOpen()         */
    dblist_close,        /* xClose()        */
    dblist_filter,       /* xFilter()       */
    dblist_next,         /* xNext()         */
    dblist_eof,          /* xEof()          */
    dblist_column,       /* xColumn()       */
    dblist_rowid,        /* xRowid()        */
    NULL,                /* xUpdate()       */
    NULL,                /* xBegin()        */
    NULL,                /* xSync()         */
    NULL,                /* xCommit()       */
    NULL,                /* xRollback()     */
    NULL,                /* xFindFunction() */
    dblist_rename        /* xRename()       */
};

int dblist_init( sqlite3 *db, char **error, const sqlite3_api_routines *api )
{
    int   rc;
    SQLITE_EXTENSION_INIT2(api);

    /* register module */
    rc = sqlite3_create_module( db, "dblist", &dblist_mod, NULL );
    if ( rc != SQLITE_OK ) {
        return rc;
    }

    /* automatically create an instance of the virtual table */
    rc = sqlite3_exec( db, 
        "CREATE VIRTUAL TABLE temp.sql_database_list USING dblist", 
        NULL, NULL, NULL  );
    return rc;
}

The most important thing to notice is that the sqlite3_module structure is given a static allocation. The SQLite library does not make a copy of this structure when the module is registered, so the sqlite3_module structure must remain valid for the duration of the database connection. In this case, we use a file-level global that is statically initialized with all the correct values.

The extension entry point function is a bit unique, in that it not only defines the module, but it also goes ahead and creates an instance of a dblist virtual table. Normally, an extension initialization function wouldn’t (and shouldn’t) do something like this, but in this case it makes sense. Like any other table, virtual tables are typically bound to a specific database. But the active database list we get from the PRAGMA database_list command is a function of the current state of the database connection (and all attached databases), and isn’t really specific to a single database. If you were to create a dblist table in multiple databases that were all attached to the same database connection, they would all return the same data. It is the database connection (and not a specific database) that is the real source of the data.

So, in the somewhat unique case of the dblist module, we only need one instance of the virtual table per database connection. Ideally, it would always be there, no matter which databases are attached. It would also be best if the table wasn’t “left behind” in a database file after that database was closed or detached. Not only would this tie the database file to our module, it is also unnecessary since a dblist table instance doesn’t have any state beyond the database connection.

To meet all these needs, the module goes ahead and just makes a single instance of the table in the temporary database. Every database connection has a temporary database, and it is always named temp. This makes the table instance easy to find. Creating it in the temp database also keeps the table instance out of any “real” database files, and ties the lifetime of the table to the lifetime of the database connection. All in all, it is a perfect, though somewhat unusual, fit for this specific module.

The end result is that if you load the dblist extension, it will not only register the dblist module, it will also create an instance of the dblist virtual table at temp.sql_database_list. System tables in SQLite have the prefix sqlite_, but those names are reserved and the extension cannot create a table with that prefix. The name sql_database_list gets the idea across, however.

Example Usage

After all that work, what do we get? First, lets have a look at what the PRAGMA database_list does by itself. Here is some example output:

sqlite> PRAGMA database_list;
seq        name       file
---------- ---------- -----------------------------
0          main       /Users/jak/sqlite/db1.sqlite3
1          temp
2          memory
3          two        /Users/jak/sqlite/db2.sqlite3

In this case, I ran the sqlite3 utility with the file db1.sqlite3, created an empty temporary table (so the temp database shows up), attached an in-memory database as memory, and finally attached a second database file as two.

Now let’s load our module extension and see what we get:

sqlite> .load dblist.sqlite3ext dblist_init
sqlite> SELECT * FROM sqlite3_database_list;
seq        name       file
---------- ---------- -----------------------------
0          main       /Users/jak/sqlite/db1.sqlite3
1          temp
2          memory
3          two        /Users/jak/sqlite/db2.sqlite3

And we get…the exact same thing! Actually, that’s a good thing—that was the whole point. The key thing is that, unlike the PRAGMA command, we can do this:

sqlite> SELECT * FROM sqlite3_database_list WHERE file == '';
seq        name       file
---------- ---------- -----------------------------
1          temp
2          memory

This shows us all of the databases that have no associated filename. You’ll note that PRAGMA database_list (and hence the dblist module) returns an empty string, and not a NULL, for a database that does not have an associated database file.

Perhaps most useful, we can also make queries like this to figure out what the logical database name is for a particular database file:

sqlite> SELECT name FROM sqlite3_database_list
   ...>   WHERE file LIKE '%db2.sqlite3';
name
----------
two

I’d be the first to admit this isn’t exactly ground-breaking work. Parsing the direct output of PRAGMA database_list isn’t that big of a deal—for a program or for a human. The main point of this example wasn’t to show the full power of virtual tables, but to give us a problem to work with where we could focus on the functions and interface required by the virtual table system, rather than focusing on the complex code required to implement it.

Now that you’ve seen the module interface and have a basic idea of how things work, we’re going to shift our focus to something a bit more practical, and a bit more complex.