Working with the LMDB API

To get started with LMDB, we need to perform three tasks, usually in this order:

  1. Create a new environment object.
  2. Store data.
  3. Retrieve data.

Fortunately, the LMDB API makes it straightforward to do each of these tasks and wrap them in idiomatic Scala functions.

To open an environment, we’ll use the following functions and data structures:

LMDB/lmdb_simple/main.scala
 type Env = Ptr[Byte]
 type DB = UInt
 def mdb_env_create(env: Ptr[Env]): Int = extern
 def mdb_env_open(env: Env, path: CString, flags: Int, mode: Int): Int = extern

Env here is an opaque pointer, and the helper function mdb_env_create allocates it for us. Once the environment is initialized, we use mdb_env_open to open an actual directory path. We can pass it a set of option flags packed into an integer (which we won’t be using), and we also need to provide UNIX file permissions for the files it creates. We’ll use 0644 in octal notation (read-write for the owner, read-only for everyone else), which is 420 as a Scala integer literal. A helper function to do all this will look like so:

LMDB/lmdb_simple/main.scala
 def open(path: CString): Env = {
   val env_ptr = stackalloc[Env]
   check(mdb_env_create(env_ptr), "mdb_env_create")
   val env = !env_ptr
   // Unix permissions 0644 in octal: owner read/write, others read-only
   check(mdb_env_open(env, path, 0, 420), "mdb_env_open")
   env
 }
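
The open helper leans on two details that the listing doesn’t show: the check function and the decimal literal 420. The sketch below is plain Scala with no native dependencies. The body of check is an assumption inferred from the “returned 0” lines in the program output later in this section, not necessarily the project’s actual definition, and octalMode is a hypothetical helper added only to show where 420 comes from.

```scala
object OpenHelpers {
  // Assumed behavior: log the return code of each LMDB call and throw on
  // any nonzero result (LMDB functions return 0 on success).
  def check(result: Int, label: String): Unit = {
    if (result != 0) {
      throw new Exception(s"bad LMDB call: $label returned $result")
    } else {
      println(s"$label returned $result")
    }
  }

  // Hypothetical helper: current Scala versions have no octal integer
  // literals, so parse the familiar chmod-style string in base 8 instead.
  def octalMode(mode: String): Int = Integer.parseInt(mode, 8)

  def main(args: Array[String]): Unit = {
    check(0, "mdb_env_create")  // prints "mdb_env_create returned 0"
    println(octalMode("644"))   // 6*64 + 4*8 + 4 = 420
  }
}
```

Throwing on the first failure keeps the wrappers short, at the cost of ending the program on any error. That seems a reasonable trade for a small command-line tool.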

Likewise, to store data, we first need to allocate a Transaction object (again, an opaque pointer) with mdb_txn_begin.

Once we have a transaction object, we “open” a database within our environment with mdb_dbi_open; passing null as the name selects the environment’s default, unnamed database. Once the transaction has begun and the database is open, we’re guaranteed a consistent view of the database’s contents, without the risk of modification by other processes.

Now we can call mdb_put to store data. This function takes the Transaction object as well as a Key and a Value. The key and value structs are simple: each is a length paired with a pointer to the data, almost identical to libuv’s Buffer. Finally, mdb_txn_commit commits the transaction and makes the write visible to later transactions. Here are all the definitions:

LMDB/lmdb_simple/main.scala
 type Transaction = Ptr[Byte]
 type Key = CStruct2[Long, Ptr[Byte]]
 type Value = CStruct2[Long, Ptr[Byte]]
 def mdb_txn_begin(env: Env, parent: Ptr[Byte], flags: Int,
   tx: Ptr[Transaction]): Int = extern
 def mdb_dbi_open(tx: Transaction, name: CString, flags: Int,
   db: Ptr[DB]): Int = extern
 def mdb_put(tx: Transaction, db: DB, key: Ptr[Key], value: Ptr[Value],
   flags: Int): Int = extern
 def mdb_txn_commit(tx: Transaction): Int = extern

A straightforward Scala wrapper might work like this:

LMDB/lmdb_simple/main.scala
 def put(env: Env, key: CString, value: CString): Unit = {
   val db_ptr = stackalloc[DB]
   val tx_ptr = stackalloc[Transaction]

   check(mdb_txn_begin(env, null, 0, tx_ptr), "mdb_txn_begin")
   val tx = !tx_ptr
   check(mdb_dbi_open(tx, null, 0, db_ptr), "mdb_dbi_open")
   val db = !db_ptr

   // Lengths include the trailing NUL so the data reads back as C strings
   val k = stackalloc[Key]
   k._1 = string.strlen(key) + 1
   k._2 = key
   val v = stackalloc[Value]
   v._1 = string.strlen(value) + 1
   v._2 = value

   check(mdb_put(tx, db, k, v, 0), "mdb_put")
   check(mdb_txn_commit(tx), "mdb_txn_commit")
 }

Finally, we can retrieve the data we’ve written. The steps for reading data are similar to those for writing: we still have to create a transaction, and we’re still guaranteed a consistent view of all data. However, if we mark the transaction as read-only by passing the MDB_RDONLY flag to mdb_txn_begin, it never takes LMDB’s writer lock, so a read neither blocks a writer nor waits for one. The signature of mdb_get is straightforward:

LMDB/lmdb_simple/main.scala
 def mdb_get(tx: Transaction, db: DB, key: Ptr[Key],
   value: Ptr[Value]): Int = extern

And we can wrap it in much the same way as mdb_put. Because a lookup changes nothing, the wrapper ends its transaction with mdb_txn_abort instead of mdb_txn_commit. It copies the value into a freshly allocated buffer first, since the memory LMDB hands back is only valid while the transaction is open:

LMDB/lmdb_simple/main.scala
 def get(env: Env, key: CString): CString = {
   val db_ptr = stackalloc[DB]
   val tx_ptr = stackalloc[Transaction]

   check(mdb_txn_begin(env, null, 0, tx_ptr), "mdb_txn_begin")
   val tx = !tx_ptr

   check(mdb_dbi_open(tx, null, 0, db_ptr), "mdb_dbi_open")
   val db = !db_ptr

   // The key must match the stored bytes exactly, including the NUL
   val rk = stackalloc[Key]
   rk._1 = string.strlen(key) + 1
   rk._2 = key
   val rv = stackalloc[Value]

   check(mdb_get(tx, db, rk, rv), "mdb_get")

   stdio.printf(c"key: %s value: %s\n", rk._2, rv._2)
   // Copy the value out before the transaction ends; the caller frees it
   val output = stdlib.malloc(rv._1)
   string.strncpy(output, rv._2, rv._1)
   check(mdb_txn_abort(tx), "mdb_txn_abort")
   output
 }
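
Both wrappers set the length field to strlen + 1, so the terminating NUL byte is stored along with the text. That is why get can hand rv._2 straight to printf’s %s, and why put and get must agree on the convention: LMDB compares keys byte for byte, so a key stored with its NUL is only found by a lookup that includes the NUL too. The plain-Scala sketch below illustrates the arithmetic. Byte arrays stand in for native memory, and all helper names are hypothetical.

```scala
import java.nio.charset.StandardCharsets.UTF_8

object ValSizing {
  // Mirrors `strlen(s) + 1`: the byte length of the text plus one for
  // the trailing NUL that terminates a C string.
  def storedSize(s: String): Int = s.getBytes(UTF_8).length + 1

  // The bytes LMDB would be handed for a string under that convention.
  def storedBytes(s: String): Array[Byte] = s.getBytes(UTF_8) :+ 0.toByte

  // Reads bytes back the way printf("%s") would: stop at the first NUL.
  def readCString(bytes: Array[Byte]): String =
    new String(bytes.takeWhile(_ != 0), UTF_8)

  def main(args: Array[String]): Unit = {
    println(storedSize("foo"))                // 3 bytes of text + 1 NUL = 4
    println(readCString(storedBytes("bar")))  // round-trips to "bar"
  }
}
```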

Although there are quite a few more functions and capabilities in the full LMDB API, this is enough for us to write many useful programs!

Before we integrate with our HTTP server framework, let’s write a simple command-line utility to interact with a database. Since LMDB is a library, and not an application, it doesn’t include an interactive client for reading and writing individual keys. This means we’ll need this kind of tool for database testing and maintenance, even after we’ve completed the HTTP integration. To keep the design simple, we’ll parse lines of input into two types of commands to store and look up data:

 put $key $value
 get $key

With the Scala wrapper functions we’ve already prepared, this is just a few lines of code to implement:

LMDB/lmdb_simple/main.scala
 val line_buffer = stdlib.malloc(1024)
 val get_key_buffer = stdlib.malloc(512)
 val put_key_buffer = stdlib.malloc(512)
 val value_buffer = stdlib.malloc(512)

 def main(args: Array[String]): Unit = {
   val env = LMDB.open(c"./db")
   stdio.printf(c"opened db %p\n", env)
   stdio.printf(c"> ")

   while (stdio.fgets(line_buffer, 1024, stdio.stdin) != null) {
     // %511s caps each token so it fits a 512-byte buffer with its NUL
     val put_scan_result = stdio.sscanf(line_buffer, c"put %511s %511s",
       put_key_buffer, value_buffer)
     val get_scan_result = stdio.sscanf(line_buffer, c"get %511s",
       get_key_buffer)

     if (put_scan_result == 2) {
       stdio.printf(c"storing value %s into key %s\n",
         value_buffer, put_key_buffer)
       LMDB.put(env, put_key_buffer, value_buffer)
       stdio.printf(c"saved key: %s value: %s\n", put_key_buffer, value_buffer)
     } else if (get_scan_result == 1) {
       stdio.printf(c"looking up key %s\n", get_key_buffer)
       val lookup = LMDB.get(env, get_key_buffer)
       stdio.printf(c"retrieved key: %s value: %s\n", get_key_buffer, lookup)
       // get returns a malloc'd copy, so release it once printed
       stdlib.free(lookup)
     } else {
       println("didn't understand input")
     }
     stdio.printf(c"> ")
   }
   println("done")
 }
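
The dispatch logic in the loop can also be sketched in plain Scala, which is handy for testing the command grammar without a database. This is not the sscanf-based approach the listing uses, and it is slightly stricter (sscanf would accept “getfoo” as a get, for example). The Command, Put, Get, and parse names are hypothetical.

```scala
object CommandParser {
  sealed trait Command
  final case class Put(key: String, value: String) extends Command
  final case class Get(key: String) extends Command

  // Split the line on whitespace, much as %s does, then match the two
  // command shapes. Extra trailing tokens are ignored.
  def parse(line: String): Option[Command] =
    line.trim.split("\\s+").toList match {
      case "put" :: key :: value :: _ => Some(Put(key, value))
      case "get" :: key :: _          => Some(Get(key))
      case _                          => None
    }

  def main(args: Array[String]): Unit = {
    println(parse("put foo bar"))  // Some(Put(foo,bar))
    println(parse("get foo"))      // Some(Get(foo))
    println(parse("put foo"))      // None: a put needs a key and a value
  }
}
```

Returning an Option keeps the “didn’t understand input” case explicit, so a caller has to handle it rather than fall through silently.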

Now let’s test it out. First, we’ll need to create an empty directory with mkdir to serve as our database (the program opens ./db). Then we can store and retrieve some data:

 $ ./target/scala-2.11/lmdb_simple-out
 mdb_env_create returned 0
 mdb_env_open returned 0
 opened db 0x7fa0e6500000
 > put foo bar
 storing value foo into key bar for db 0x7fab02d00000
 mdb_txn_begin returned 0
 mdb_dbi_open returned 0
 mdb_put returned 0
 mdb_txn_commit returned 0
 saved key: foo value: bar
 > get foo
 looking up key foo for db 0x7fab02d00000
 mdb_txn_begin returned 0
 mdb_dbi_open returned 0
 mdb_get returned 0
 key: foo value: bar
 mdb_txn_abort returned 0
 retrieved key: foo value: bar
 > done

Now, if you exit the program, you can observe that two files have been created in ./db:

 $ ls -al db/*
 -rw-r--r-- 1 rwhaling staff 32768 Mar 2 12:10 db/data.mdb
 -rw-r--r-- 1 rwhaling staff 8192 Mar 2 12:07 db/lock.mdb

That’s our data! Now, if you run the program again, you can query for the key we set before:

 $ ./target/scala-2.11/lmdb_simple-out
 mdb_env_create returned 0
 mdb_env_open returned 0
 opened db 0x7fa0e6500000
 > get foo
 looking up key foo for db 0x7ffa20c03070
 mdb_txn_begin returned 0
 mdb_dbi_open returned 0
 mdb_get returned 0
 key: foo value: bar
 mdb_txn_abort returned 0
 retrieved key: foo value: bar
 > done

And you get back exactly what we stored previously. However, we’re still only storing and retrieving plain strings. Next, we’ll extend what we’ve built to handle other kinds of data.