The Lock API

Whenever we process data on a regular basis, especially if the processing takes a while to complete, we can run into a situation in which a parallel request tries to trigger that process again while the first run is still in progress. Most of the time this is not a good thing, as it can lead to conflicts and/or data corruption. A good example from Drupal core is cron. Once started, a cron run can take a good few seconds. Remember, it needs to pull together all the hook_cron() implementations and run them. If we trigger another cron run while that is happening, Drupal gives us a nice message asking us to chill because cron is already running. It does this with the help of the Lock API.

The Lock API is a low-level Drupal solution for ensuring that processes don't trample each other. Since in this chapter we are talking about things such as batch operations, queues, and other kinds of potentially time-consuming processes, let's look at the Lock API to see how we can leverage it for our custom code. But first, let's get an understanding of how this locking works.

The concept is very simple. Before starting a process, we try to acquire a lock under a given name. In other words, we check whether this process has already been started by another request. If we get the green light (we acquired the lock), we go ahead and start the process. From this point on, the API locks down this named process so that other requests cannot acquire it until the initial one has released it. This normally happens when the process has finished, after which other requests may start it up again. Until then, they get a red light telling them they cannot start it, to maintain the analogy of traffic lights. Speaking of which, the main Lock API implementation in Drupal, namely the one using the database, takes this analogy to heart: it names the table where the locks are stored semaphore.
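To make the green light/red light idea concrete, here is a minimal sketch of a named lock store. It is not Drupal code: the NamedLockTable class, its owner argument, and the in-memory array are all hypothetical stand-ins for the semaphore table, and the expiry handling is a simplifying assumption.

```php
<?php

/**
 * Hypothetical in-memory stand-in for a named lock store (not Drupal code).
 */
final class NamedLockTable {

  /** @var array<string, array{owner: string, expire: float}> */
  private array $rows = [];

  public function acquire(string $name, string $owner, float $timeout = 30.0): bool {
    $now = microtime(TRUE);
    $row = $this->rows[$name] ?? NULL;
    // Red light: someone else holds an unexpired lock with this name.
    if ($row !== NULL && $row['owner'] !== $owner && $row['expire'] > $now) {
      return FALSE;
    }
    // Green light: record (or renew) the lock for this owner.
    $this->rows[$name] = ['owner' => $owner, 'expire' => $now + $timeout];
    return TRUE;
  }

  public function release(string $name, string $owner): void {
    // Only the owner of a lock may remove it.
    if (($this->rows[$name]['owner'] ?? NULL) === $owner) {
      unset($this->rows[$name]);
    }
  }

}

$table = new NamedLockTable();
// The first request gets the lock.
var_dump($table->acquire('cron', 'request-a'));
// The second request is refused while the first holds it.
var_dump($table->acquire('cron', 'request-b'));
$table->release('cron', 'request-a');
// After the release, the second request can acquire it.
var_dump($table->acquire('cron', 'request-b'));
```

The important part is that the lock is keyed by name only: any two requests that agree on the name exclude each other, and requests using different names do not interfere.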

The API is actually pretty simple. We have a lock service, which is an implementation of LockBackendInterface. By default, Drupal 8 comes with two: DatabaseLockBackend and PersistentDatabaseLockBackend. Usually, the former is used. The difference between the two is that the latter can keep a lock across multiple requests, whereas the former releases all of its locks at the end of the request. We'll be using DatabaseLockBackend to demonstrate how the API works, as that is what Drupal core mostly uses as well.

If you remember from Chapter 7, Your Own Custom Entity and Plugin Types, we created a Drush command that runs all of our Product importers. Of course, so far we have only created one plugin. What we want to do now is ensure that if this Drush command is executed multiple times at more or less the same time (before the actual import finishes), the imports do not run simultaneously. It's probably not the most realistic example, as Drush commands have to be run by someone, so there is good control over their timing. However, as we will see, the same approach can be applied to processes triggered by unpredictable requests.

We defined the ProductCommands::runPluginImport() helper method that runs the import for a specific plugin. We can wrap this call in a lock. First, though, we need to inject the service, which is registered under the name lock (if we cannot inject it, we can use the static shorthand \Drupal::lock()). By now you should know how to inject a new service, so I will not repeat that step here.

So instead of just running the import() method on the plugin, we can first have this:

// Bail out if another request already holds the lock for this plugin.
if (!$this->lock->acquire($plugin->getPluginId())) {
  $this->logger()->log('notice', t('The plugin @plugin is already running.', ['@plugin' => $plugin->getPluginDefinition()['label']]));
  return;
}

We try to acquire the lock by passing an arbitrary name (in this case, our plugin ID). Because the lock name is the plugin ID, each plugin gets its own lock, so different plugins can still run at the same time. If the acquire() method returns FALSE, we have a red light: the lock has already been acquired by another request. In this case, we log a message to that effect and get out of there. If it returns TRUE, we have a green light and can proceed with the rest of our code as it was. The acquire() method has locked the name down, and other requests can no longer acquire it until we release it. Speaking of which, there is one thing we need to add at the end (after the import):

$this->lock->release($plugin->getPluginId());  

We need to release the lock so other requests can run the import again if they like. That is pretty much it. If we run our Drush command twice, more or less simultaneously, only one of the calls actually goes through. The other one logs the notice that the plugin is already running and exits, as expected.
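One thing to keep in mind: if import() throws an exception, the release() line is never reached and the lock stays in place until the request ends. A minimal sketch of one way to guard against that is to wrap the work in try/finally. The run_with_lock() helper name is hypothetical, and $lock is deliberately left untyped so the sketch can run outside Drupal; in our command it would be the injected lock service.

```php
<?php

/**
 * Runs $task while holding the named lock, releasing it even on failure.
 *
 * @param \Drupal\Core\Lock\LockBackendInterface $lock
 *   The lock service (untyped here so the sketch runs outside Drupal).
 * @param string $name
 *   The lock name, for example a plugin ID.
 * @param callable $task
 *   The work to run exclusively.
 *
 * @return bool
 *   TRUE if the task ran, FALSE if the lock was held by another request.
 */
function run_with_lock($lock, string $name, callable $task): bool {
  if (!$lock->acquire($name)) {
    // Red light: leave the other request's lock untouched.
    return FALSE;
  }
  try {
    $task();
  }
  finally {
    // Runs on both success and exception, so the lock never lingers.
    $lock->release($name);
  }
  return TRUE;
}
```

In the Drush command this could be called along the lines of run_with_lock($this->lock, $plugin->getPluginId(), function () use ($plugin) { $plugin->import(); }), with the notice logged when it returns FALSE.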

But we can also do it a bit differently. Let's say we want the second request to wait until the first one has finished, and then still run. After all, we don't want to miss out on any updates. We can do this using the wait() method of LockBackendInterface. The rework is minor:

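if (!$this->lock->acquire($plugin->getPluginId())) {
  $this->logger()->log('notice', t('The plugin @plugin is already running. Waiting for it to finish.', ['@plugin' => $plugin->getPluginDefinition()['label']]));
  // wait() returns TRUE if the lock is still taken after the delay. It never
  // acquires the lock itself, so we have to call acquire() again afterwards.
  if ($this->lock->wait($plugin->getPluginId()) || !$this->lock->acquire($plugin->getPluginId())) {
    $this->logger()->log('notice', t('The wait is killing me. Giving up.'));
    return;
  }
}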

So basically, if we don't acquire the lock, we log a message that we are waiting for the go-ahead. Then we use the wait() method, which puts the request to sleep for a maximum of 30 seconds by default. Within that time frame, it repeatedly checks whether the lock has become available: the first check comes after 25 milliseconds, and the pause between checks grows by 25 milliseconds each time until it reaches 500 milliseconds, after which it keeps checking every 500 milliseconds. If the lock becomes available, wait() breaks out of the loop and returns FALSE (meaning that we no longer need to wait). Otherwise, once the 30 seconds have passed, it returns TRUE, which means that we would still need to wait. At this point we give up. Keep in mind that wait() only reports that the lock may be available; it does not acquire it, so acquire() has to be called again before the work starts. The second parameter of the wait() method is the maximum number of seconds to wait, so we can control that as well. I recommend you check out the code to better understand what it does.
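The polling rhythm is easier to see with numbers. The following sketch lists the pauses a wait would go through if the lock never became available. The lock_wait_schedule() function is hypothetical and only models the timing as I understand it from core's wait() implementation; it does not sleep or check any lock, so do verify it against the actual code.

```php
<?php

/**
 * Lists the pauses (in microseconds) of a wait that runs to its deadline.
 *
 * Hypothetical helper: it models the timing only, without sleeping.
 */
function lock_wait_schedule(int $delay_seconds): array {
  // Work in microseconds throughout.
  $delay = $delay_seconds * 1000000;
  // The first pause is 25 ms.
  $sleep = 25000;
  $pauses = [];
  while ($delay > 0) {
    $pauses[] = $sleep;
    $delay -= $sleep;
    // Grow by 25 ms per round, cap at 500 ms, never overshoot the deadline.
    $sleep = min(500000, $sleep + 25000, $delay);
  }
  return $pauses;
}

$pauses = lock_wait_schedule(30);
printf(
  "%d checks, longest pause %d ms, total %d s\n",
  count($pauses),
  max($pauses) / 1000,
  array_sum($pauses) / 1000000
);
```

The growing pause is a trade-off: short pauses at the start let a quick process hand over the lock almost immediately, while the 500 millisecond cap keeps a long wait from hammering the database with checks.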

This way, we can run our two Drush commands in parallel and ensure that the second one only runs after the first has finished. If the wait takes longer than 30 seconds, we give up, because something probably went wrong. And there we have the Lock API.
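As a recap, the wait-and-retry flow can be folded into a small helper. This is only a sketch: the acquire_or_wait() name is hypothetical, and $lock is left untyped so the example runs outside Drupal, although it is meant to receive the lock service and only calls its acquire() and wait() methods.

```php
<?php

/**
 * Tries to get a named lock, waiting for the current holder if needed.
 *
 * @param \Drupal\Core\Lock\LockBackendInterface $lock
 *   The lock service (untyped here so the sketch runs outside Drupal).
 * @param string $name
 *   The lock name.
 * @param int $max_wait
 *   Maximum number of seconds to wait for the lock to free up.
 *
 * @return bool
 *   TRUE if we now hold the lock, FALSE if we gave up.
 */
function acquire_or_wait($lock, string $name, int $max_wait = 30): bool {
  if ($lock->acquire($name)) {
    return TRUE;
  }
  // wait() returns TRUE when the lock is still taken after $max_wait seconds.
  if ($lock->wait($name, $max_wait)) {
    return FALSE;
  }
  // The lock looks free, but another request may grab it first, so the
  // result of this second acquire() is what decides whether we proceed.
  return $lock->acquire($name);
}
```

A caller that gets TRUE back is responsible for calling release() once its work is done, exactly as before.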