To get started with LMDB, we need to perform three tasks, usually in this order: open an environment, store data, and retrieve data.
Fortunately, the LMDB API makes it straightforward to do each of these tasks and wrap them in idiomatic Scala functions.
To open an environment, we’ll use the following functions and data structures:
| type Env = Ptr[Byte] |
| type DB = UInt |
| def mdb_env_create(env:Ptr[Env]):Int = extern |
| def mdb_env_open(env:Env, path:CString, flags:Int, mode:Int):Int = extern |
Env here is an opaque pointer, and the helper function mdb_env_create allocates it for us. Once it is initialized, we use mdb_env_open to open an actual directory path. We can pass it a set of option flags packed into an integer (which we won’t be using), and we also need to provide a UNIX file permissions mode. We’ll use 0644 in octal notation (read-write for the owner, read-only for everyone else), which is 420 as a Scala integer literal. A helper function to do all this will look like so:
| def open(path:CString):Env = { |
| val env_ptr = stackalloc[Env] |
| check(mdb_env_create(env_ptr), "mdb_env_create") |
| val env = !env_ptr |
| // Unix permissions for octal 0644 (read/write) |
| check(mdb_env_open(env, path, 0, 420), "mdb_env_open") |
| env |
| } |
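Scala has no octal integer literals, which is why the mode is written as the decimal 420 rather than 0644. As a quick sanity check on that conversion, here is a minimal sketch in plain Scala; the object and the fromOctal helper are made up for this illustration and are not part of LMDB or the wrapper above.

```scala
object FileMode {
  // Parse a Unix permission string written in octal, such as "644".
  // (Hypothetical helper, for illustration only.)
  def fromOctal(octal: String): Int = Integer.parseInt(octal, 8)

  def main(args: Array[String]): Unit = {
    println(fromOctal("644")) // 420: owner read/write, everyone else read-only
    println(fromOctal("600")) // 384: owner read/write only
  }
}
```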
Likewise, to store data, we need to first allocate a Transaction object—again, an opaque pointer—with mdb_txn_begin.
Once we have a transaction object, we “open” a database within our environment with mdb_dbi_open. Passing null as the name selects the environment’s default, unnamed database. Once the transaction has begun and the database is open, we’re guaranteed a consistent view of the database’s contents, without the risk of modification by other processes.
Now we can call mdb_put to store data. This function takes the Transaction object as well as a Key and a Value. The key and value structs are simple and almost identical to libuv’s Buffer: a length paired with a pointer to the bytes. When we’re done, mdb_txn_commit commits the transaction. Here are all the definitions:
| type Transaction = Ptr[Byte] |
| type Key = CStruct2[Long,Ptr[Byte]] |
| type Value = CStruct2[Long,Ptr[Byte]] |
| def mdb_txn_begin(env:Env, parent:Ptr[Byte], flags:Int, |
| tx:Ptr[Transaction]):Int = extern |
| def mdb_dbi_open(tx:Transaction, name:CString, flags:Int, |
| db:Ptr[DB]):Int = extern |
| def mdb_put(tx:Transaction, db:DB, key:Ptr[Key], value:Ptr[Value], |
| flags:Int):Int = extern |
| def mdb_txn_commit(tx:Transaction):Int = extern |
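LMDB treats keys and values as opaque byte ranges, so the length field decides exactly which bytes are stored. The wrappers in this section set the length to strlen plus one, which stores the trailing NUL byte along with the text and lets a value read back from the database be printed directly as a C string. The plain-Scala sketch below only illustrates that byte count; the withTerminator helper is hypothetical and is not used by the wrappers.

```scala
import java.nio.charset.StandardCharsets

object CStringBytes {
  // The bytes of `s` followed by a single NUL terminator, mirroring
  // what a C string occupies in memory. (Illustrative helper only.)
  def withTerminator(s: String): Array[Byte] =
    s.getBytes(StandardCharsets.UTF_8) :+ 0.toByte

  def main(args: Array[String]): Unit = {
    val bytes = withTerminator("foo")
    println(bytes.length)    // 4: three characters plus the terminator
    println(bytes.last == 0) // true
  }
}
```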
A straightforward Scala wrapper might work like this:
| def put(env:Env,key:CString,value:CString):Unit = { |
| val db_ptr = stackalloc[DB] |
| val tx_ptr = stackalloc[Transaction] |
| |
| check(mdb_txn_begin(env, null, 0, tx_ptr), "mdb_txn_begin") |
| val tx = !tx_ptr |
| check(mdb_dbi_open(tx,null,0,db_ptr), "mdb_dbi_open") |
| val db = !db_ptr |
| |
| val k = stackalloc[Key] |
| k._1 = string.strlen(key) + 1 |
| k._2 = key |
| val v = stackalloc[Value] |
| v._1 = string.strlen(value) + 1 |
| v._2 = value |
| |
| check(mdb_put(tx, db, k,v,0), "mdb_put") |
| check(mdb_txn_commit(tx), "mdb_txn_commit") |
| } |
Finally, we can retrieve the data we’ve written. The steps for reading data are similar to those for writing it: we still have to create a transaction, and we are still guaranteed a consistent view of all data. However, if we mark the transaction as read-only by passing the MDB_RDONLY flag to mdb_txn_begin, we can read without blocking other readers or writers. The signature of mdb_get is straightforward:
| def mdb_get(tx:Transaction, db:DB, key:Ptr[Key], |
| value:Ptr[Value]):Int = extern |
And we can wrap it in much the same way as mdb_put:
| def get(env:Env,key:CString):CString = { |
| val db_ptr = stackalloc[DB] |
| val tx_ptr = stackalloc[Transaction] |
| |
| check(mdb_txn_begin(env, null, 0, tx_ptr), "mdb_txn_begin") |
| val tx = !tx_ptr |
| |
| check(mdb_dbi_open(tx,null,0,db_ptr), "mdb_dbi_open") |
| val db = !db_ptr |
| |
| val rk = stackalloc[Key] |
| rk._1 = string.strlen(key) + 1 |
| rk._2 = key |
| val rv = stackalloc[Value] |
| |
| check(mdb_get(tx,db, rk, rv), "mdb_get") |
| |
| stdio.printf(c"key: %s value: %s\n", rk._2, rv._2) |
| val output = stdlib.malloc(rv._1) |
| string.strncpy(output,rv._2,rv._1) |
| check(mdb_txn_abort(tx), "mdb_txn_abort") |
| output |
| } |
Although there are quite a few more functions and capabilities in the full LMDB API, this is enough for us to write many useful programs!
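Both wrappers repeat the same sequence: begin a transaction, do the work, then commit or abort. If a step in the middle fails, the transaction is never finished. One way to factor this out is a small "loan" helper that commits on success and aborts on failure. The sketch below is plain Scala with stand-in functions so that it runs anywhere; withTx and its begin, commit, and abort parameters are hypothetical names, assumed to be thin wrappers over mdb_txn_begin, mdb_txn_commit, and mdb_txn_abort.

```scala
object TxPattern {
  // Run `body` inside a transaction: commit if it returns normally,
  // abort and rethrow if it throws. The three function parameters are
  // hypothetical stand-ins for the real LMDB calls.
  def withTx[T, A](begin: () => T, commit: T => Unit, abort: T => Unit)(body: T => A): A = {
    val tx = begin()
    val result =
      try body(tx)
      catch {
        case e: Throwable =>
          abort(tx)
          throw e
      }
    commit(tx)
    result
  }
}
```

With a helper along these lines, put and get would each shrink to the code that fills in the Key and Value structs and calls mdb_put or mdb_get.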
Before we integrate with our HTTP server framework, let’s write a simple command-line utility to interact with a database. Since LMDB is a library, and not a server application, it doesn’t come with an interactive client for ad hoc queries. This means a tool like this one will stay useful for database testing and maintenance, even after we’ve completed the HTTP integration. To keep the design simple, we’ll parse lines of input into two types of commands, one to store data and one to look it up:
| put $key $value |
| get $key |
With the Scala wrapper functions we’ve already prepared, this is just a few lines of code to implement:
| val line_buffer = stdlib.malloc(1024) |
| val get_key_buffer = stdlib.malloc(512) |
| val put_key_buffer = stdlib.malloc(512) |
| val value_buffer = stdlib.malloc(512) |
| |
| def main(args:Array[String]):Unit = { |
| val env = LMDB.open(c"./db") |
| stdio.printf(c"opened db %p\n", env) |
| stdio.printf(c"> ") |
| |
| while (stdio.fgets(line_buffer, 1024, stdio.stdin) != null) { |
| // Width limits keep sscanf from overrunning the 512-byte buffers |
| val put_scan_result = stdio.sscanf(line_buffer,c"put %511s %511s", |
| put_key_buffer, value_buffer) |
| val get_scan_result = stdio.sscanf(line_buffer,c"get %511s", |
| get_key_buffer) |
| |
| if (put_scan_result == 2) { |
| stdio.printf(c"storing value %s into key %s\n", |
| value_buffer, put_key_buffer) |
| LMDB.put(env,put_key_buffer,value_buffer) |
| stdio.printf(c"saved key: %s value: %s\n", put_key_buffer, value_buffer) |
| } else if (get_scan_result == 1) { |
| stdio.printf(c"looking up key %s\n", get_key_buffer) |
| val lookup = LMDB.get(env,get_key_buffer) |
| stdio.printf(c"retrieved key: %s value: %s\n", get_key_buffer,lookup) |
| // get returns a malloc'd copy, so release it once printed |
| stdlib.free(lookup) |
| } else { |
| println("didn't understand input") |
| } |
| stdio.printf(c"> ") |
| } |
| println("done") |
| } |
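The loop above relies on sscanf to recognize the two commands. For comparison, the same grammar can be sketched in plain Scala with pattern matching. This is an illustrative alternative rather than the approach used in this program, and the CommandParser, Put, and Get names are invented for the example. Unlike sscanf, it rejects lines with extra trailing words.

```scala
object CommandParser {
  sealed trait Command
  final case class Put(key: String, value: String) extends Command
  final case class Get(key: String) extends Command

  // Split the line on runs of whitespace and match the two supported shapes.
  def parse(line: String): Option[Command] =
    line.trim.split("\\s+") match {
      case Array("put", key, value) => Some(Put(key, value))
      case Array("get", key)        => Some(Get(key))
      case _                        => None
    }

  def main(args: Array[String]): Unit = {
    println(parse("put foo bar")) // Some(Put(foo,bar))
    println(parse("get foo"))     // Some(Get(foo))
    println(parse("delete foo"))  // None
  }
}
```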
Now let’s test it out. First, we’ll need to create an empty directory named db to serve as our database (mkdir db); then we can store and retrieve some data:
| $ ./target/scala-2.11/lmdb_simple-out |
| mdb_env_create returned 0 |
| mdb_env_open returned 0 |
| opened db 0x7fa0e6500000 |
| > put foo bar |
| storing value foo into key bar for db 0x7fab02d00000 |
| mdb_txn_begin returned 0 |
| mdb_dbi_open returned 0 |
| mdb_put returned 0 |
| mdb_txn_commit returned 0 |
| saved key: foo value: bar |
| > get foo |
| looking up key foo for db 0x7fab02d00000 |
| mdb_txn_begin returned 0 |
| mdb_dbi_open returned 0 |
| mdb_get returned 0 |
| key: foo value: bar |
| mdb_txn_abort returned 0 |
| retrieved key: foo value: bar |
| > done |
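The "returned 0" lines in this transcript come from the check helper wrapped around each LMDB call; 0 is LMDB’s success code. The helper itself is not shown in this section, so the following is only a minimal sketch of what it might look like, assuming it logs the result and raises an exception on any nonzero code. The real helper may behave differently.

```scala
object Checks {
  // Hypothetical stand-in for the `check` helper: log the return code,
  // then fail loudly on anything other than 0 (success).
  def check(result: Int, label: String): Unit = {
    println(s"$label returned $result")
    if (result != 0) {
      throw new RuntimeException(s"$label failed with code $result")
    }
  }

  def main(args: Array[String]): Unit = {
    check(0, "mdb_env_create") // prints "mdb_env_create returned 0"
  }
}
```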
Now, if you exit the program, you can observe that two files have been created in ./db:
| $ ls -al db/* |
| -rw-r--r-- 1 rwhaling staff 32768 Mar 2 12:10 db/data.mdb |
| -rw-r--r-- 1 rwhaling staff 8192 Mar 2 12:07 db/lock.mdb |
That’s our data! Now, if you run the program again, you can query for the key we set before:
| $ ./target/scala-2.11/lmdb_simple-out |
| mdb_env_create returned 0 |
| mdb_env_open returned 0 |
| opened db 0x7fa0e6500000 |
| > get foo |
| looking up key foo for db 0x7ffa20c03070 |
| mdb_txn_begin returned 0 |
| mdb_dbi_open returned 0 |
| mdb_get returned 0 |
| key: foo value: bar |
| mdb_txn_abort returned 0 |
| retrieved key: foo value: bar |
| > done |
And you get back exactly what we stored previously. However, we’re still only storing and retrieving plain strings. Next, we’ll extend what we’ve built to handle other kinds of data.