The first thing we are going to look at is update hooks, revisiting the Sports module we created in Chapter 8, The Database API. We will focus on the &$sandbox parameter we didn't use then. The goal is to update every record in the players table and mark it as retired. The point is to illustrate how we can process these records one at a time, in individual requests, to prevent a PHP timeout. This is handy when we have many records.
So to get us going, here is all the code, and we'll see right after what everything means:
```php
/**
 * Update all the players to mark them as retired.
 */
function sports_update_8002(&$sandbox) {
  $database = \Drupal::database();

  // First call: collect the IDs to process and initialize the counters.
  if (empty($sandbox)) {
    $results = $database->query("SELECT id FROM {players}")->fetchAllAssoc('id');
    $sandbox['progress'] = 0;
    $sandbox['ids'] = array_keys($results);
    $sandbox['max'] = count($results);
  }

  // Nothing to process: finish right away and avoid a division by zero.
  if (empty($sandbox['max'])) {
    $sandbox['#finished'] = 1;
    return;
  }

  // Take the next ID off the list and mark that player as retired.
  $id = $sandbox['ids'] ? array_shift($sandbox['ids']) : NULL;
  $player = $database->query("SELECT * FROM {players} WHERE id = :id", [':id' => $id])->fetch();
  $data = $player->data ? unserialize($player->data) : [];
  $data['retired'] = TRUE;
  $database->update('players')
    ->fields(['data' => serialize($data)])
    ->condition('id', $id)
    ->execute();

  // Report progress; a value of 1 tells Drupal we are done.
  $sandbox['progress']++;
  $sandbox['#finished'] = $sandbox['progress'] / $sandbox['max'];
}
```
If you remember, the function name contains the new schema version for the module, which will be set once this is run. Refer back to Chapter 8, The Database API for more information.
When this hook is fired, the $sandbox argument (passed by reference) is empty. Its goal is to act as temporary storage between the requests needed to process everything inside the function. We can use it to store arbitrary data, but we should be mindful of the size as it has to fit inside a LONGBLOB table column.
The first thing we are doing is getting our hands on the database service to make queries to our players table. But more importantly, we are checking whether the $sandbox variable is empty, which indicates that this is the start of the process. If it is, we add some data to it that is specific to our process. In this case, we want to store the progress (this is quite common), the IDs of the players that need to be updated, and the total number of records (also quite common). To do this, we make a simple query.
Once the sandbox is set, we can get the first ID in the list while also removing it so that, iteratively, we have fewer records to process. Based on that ID, we load the relevant player, add our data to it, and update it back in the database. Once that is done, we increment the progress by 1 (as we processed one record). Finally, the #finished key in the sandbox is what Drupal looks at to determine whether the process is finished. It expects a number between 0 and 1 (usually a fraction), the latter signifying that we are done. If anything below 1 is found, the function gets called again and the $sandbox array will contain the data as we left it (incremented progress and one less ID to process). In that case, the main body of the function runs again, processing the next record, and so on, until the progress divided by the maximum number of records is equal to 1. If we have 100 records, when the progress reaches 100, the following is true: 100 / 100 = 1. Then, Drupal knows to finish the process and not call the function again.
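To make the #finished arithmetic concrete, here is a minimal sketch in plain PHP that runs outside Drupal. The names example_process_one() and example_run() are hypothetical and only stand in for the update hook and for the loop Drupal performs across requests; the real mechanism runs each call in a separate request and persists $sandbox in between.

```php
<?php

/**
 * Hypothetical stand-in for an update hook: handles one ID per call.
 */
function example_process_one(array &$sandbox, array $all_ids, array &$processed): void {
  // First call: the sandbox is empty, so set up our bookkeeping.
  if (empty($sandbox)) {
    $sandbox['progress'] = 0;
    $sandbox['ids'] = $all_ids;
    $sandbox['max'] = count($all_ids);
  }

  // Nothing to do: report completion and avoid dividing by zero.
  if ($sandbox['max'] === 0) {
    $sandbox['#finished'] = 1;
    return;
  }

  // "Process" the next ID by recording it, then report progress.
  $processed[] = array_shift($sandbox['ids']);
  $sandbox['progress']++;
  $sandbox['#finished'] = $sandbox['progress'] / $sandbox['max'];
}

/**
 * Hypothetical runner: keeps calling until #finished reaches 1.
 */
function example_run(array $all_ids): array {
  $sandbox = [];
  $processed = [];
  $calls = 0;
  do {
    example_process_one($sandbox, $all_ids, $processed);
    $calls++;
  } while ($sandbox['#finished'] < 1);

  return ['calls' => $calls, 'processed' => $processed];
}

$result = example_run([3, 7, 9, 12]);
```

With four IDs, the function is called four times, and #finished moves through 0.25, 0.5, 0.75, and finally 1, at which point the loop stops.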
This process is also called batching in Drupal terms and is very useful because Drupal will make as many requests as needed to finish it. We control how much work each request does. The previous example might be a bit of overkill in the sense that a request is perfectly capable of processing more than one player. We are actually losing time because, like this, Drupal needs to bootstrap itself again and again for each request. So, it's up to us to find that sweet spot. In our previous example, what we could have done was break up the array of IDs into chunks of maybe five, and allowed a request to process five records instead of one. That would have surely increased the speed, but I encourage you to go ahead and try that on your own now that you understand the principles behind using $sandbox for batching.
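If you want something to compare your attempt against, the following is one possible sketch of the chunking idea, written in plain PHP so it can run outside Drupal. The function example_retire_chunk(), the $chunk_size parameter, and the in-memory $players array (ID mapped to serialized data) are all assumptions made for illustration; in the real update hook, the loop body would use the same database queries as before.

```php
<?php

/**
 * Hypothetical chunked variant: handles up to $chunk_size records per call.
 *
 * $players is an in-memory stand-in for the players table, mapping each ID
 * to its serialized data string.
 */
function example_retire_chunk(array &$sandbox, array &$players, int $chunk_size = 5): void {
  if (empty($sandbox)) {
    $sandbox['progress'] = 0;
    $sandbox['ids'] = array_keys($players);
    $sandbox['max'] = count($players);
  }

  // Nothing to do: report completion and avoid dividing by zero.
  if ($sandbox['max'] === 0) {
    $sandbox['#finished'] = 1;
    return;
  }

  // array_splice() removes the first $chunk_size IDs and returns them.
  $ids = array_splice($sandbox['ids'], 0, $chunk_size);
  foreach ($ids as $id) {
    $data = $players[$id] ? unserialize($players[$id]) : [];
    $data['retired'] = TRUE;
    $players[$id] = serialize($data);
  }

  // Advance by the number of records actually handled in this call.
  $sandbox['progress'] += count($ids);
  $sandbox['#finished'] = $sandbox['progress'] / $sandbox['max'];
}

// Twelve fake players with empty data, processed five at a time.
$players = array_fill_keys(range(1, 12), '');
$sandbox = [];
$calls = 0;
do {
  example_retire_chunk($sandbox, $players);
  $calls++;
} while ($sandbox['#finished'] < 1);
```

Here twelve records are handled in three calls (five, five, and two) instead of twelve. Note that the progress is incremented by count($ids) rather than by the chunk size, so the last, smaller chunk still brings #finished to exactly 1.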