Chapter 5. Concurrency

by Martin Fowler and David Rice

Concurrency is one of the trickiest aspects of software development. Whenever you have multiple processes or threads manipulating the same data, you run into concurrency problems. Just thinking about concurrency is hard, since it’s difficult to enumerate the possible scenarios that can get you into trouble. Whatever you do, there always seems to be something you miss. Furthermore, concurrency is hard to test for. We’re great fans of a large body of automated tests acting as a foundation for software development, but it’s hard to get tests to give us the security we need for concurrency problems.

One of the great ironies of enterprise application development is that few branches of software development use concurrency more yet worry about it less. The reason enterprise developers can get away with a naive view of concurrency is transaction managers. Transactions provide a framework that helps avoid many of the most tricky aspects of concurrency in an enterprise application. As long as you do all your data manipulation within a transaction, nothing really bad will happen to you.

Sadly, this doesn’t mean we can ignore concurrency problems completely, for the primary reason that many interactions with a system can’t be placed within a single database transaction. This forces us to manage concurrency in situations where data spans transactions. The term we use is offline concurrency, that is, concurrency control for data that’s manipulated during multiple database transactions.

The second area where concurrency rears its ugly head for enterprise developers is application servers—supporting multiple threads in an application server system. We’ve spent much less time on this because dealing with it is much simpler. Indeed, you can use server platforms that take care of much of it for you.

Sadly, to understand these issues, you need to understand at least some of the general concurrency concepts. So we begin this chapter by going over these issues. We don’t pretend that this chapter is a general treatment of concurrency in software development—for that we’d need at least a complete book. What this chapter does is introduce concurrency issues for enterprise application development. Once we’ve done that we’ll introduce the patterns for handling offline concurrency and say our brief words on application server concurrency.

In much of this chapter we’ll illustrate the ideas with examples from an area that we hope you are very familiar with—the source code control systems used by teams to coordinate changes to a code base. We do this because it’s relatively easy to understand as well as familiar. After all, if you aren’t familiar with source code control systems, you really shouldn’t be developing enterprise applications.

Concurrency Problems

We’ll start by going through the essential problems of concurrency. We call them essential because they’re the fundamental problems that concurrency control systems try to prevent. They aren’t the only problems of concurrency, because the control mechanisms often create a new set of problems in their solutions! However, they do focus on the essential point of concurrency control.

Lost updates are the simplest idea to understand. Say Martin edits a file to make some changes to the checkConcurrency method—a task that takes a few minutes. While he’s doing this David alters the updateImportantParameter method in the same file. David starts and finishes his alteration very quickly, so quickly that, even though he starts after Martin, he finishes before him. This is unfortunate. When Martin read the file it didn’t include David’s update, so when Martin writes the file it writes over the version that David updated and David’s update is lost forever.
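
To make this concrete, here’s a minimal sketch in Java of how a lost update plays out in code. The two threads do a read-modify-write on a shared field with no concurrency control; the class names and timings are purely illustrative.

// A lost update in miniature: both threads do a read-modify-write on a
// shared field with no concurrency control, so one write silently
// overwrites the other. All names and timings here are illustrative.
public class LostUpdateDemo {
    private static int balance = 100;

    public static void main(String[] args) throws InterruptedException {
        Thread martin = new Thread(() -> {
            int read = balance;      // Martin reads 100
            sleep(50);               // ...and edits for a while
            balance = read + 10;     // writes 110, unaware of David's change
        });
        Thread david = new Thread(() -> {
            int read = balance;      // David also reads 100
            balance = read + 5;      // finishes quickly, writes 105
        });
        martin.start();
        david.start();
        martin.join();
        david.join();
        // Serialized updates would give 115; this almost always prints 110,
        // because David's +5 was overwritten and lost.
        System.out.println("balance = " + balance);
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}

Run it a few times and you’ll almost always see 110: David’s update vanished because Martin wrote back a value computed from a stale read.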

An inconsistent read occurs when you read two things that are correct pieces of information but not correct at the same time. Say Martin wishes to know how many classes are in the concurrency package, which contains two subpackages for locking and multiphase. Martin looks in the locking package and sees seven classes. At this point he gets a phone call from Roy on some abstruse question. While Martin’s answering the phone, David finishes dealing with that pesky bug in the four-phase lock code and adds two classes to the locking package and three classes to the five that were in the multiphase package. His phone call over, Martin looks in the multiphase package to see how many classes there are and sees eight, producing a grand total of fifteen.

Sadly, fifteen classes was never the right answer. The correct answer was twelve before David’s update and seventeen afterward. Either answer would have been correct, even if not current, but fifteen was never correct. This problem is called an inconsistent read because the data that Martin read was inconsistent.

Both of these problems cause a failure of correctness (or safety), and they result in incorrect behavior that would not have occurred without two people trying to work with the same data at the same time. However, if correctness were the only issue, these problems wouldn’t be that serious. After all, we can arrange things so that only one of us can work with the data at one time. While this helps with correctness, it reduces the ability to do things concurrently. The essential problem of any concurrent programming is that it’s not enough to worry about correctness; you also have to worry about liveness: how much concurrent activity can go on. Often people need to sacrifice some correctness to gain more liveness, depending on the seriousness and likelihood of the failures and the need for people to work on their data concurrently.

These aren’t all the problems you get with concurrency, but we think of these as the basic ones. To solve them we use various control mechanisms. Alas, there’s no free lunch. The solutions introduce problems of their own, although these problems are less serious than the basic ones. Still, this does bring up an important point: If you can tolerate the problems, you can avoid any form of concurrency control. This is rare, but occasionally you find circumstances that permit it.

Execution Contexts

Whenever processing occurs in a system, it occurs in some context and usually in more than one. There’s no standard terminology for execution contexts, so here we’ll define the ones that we’re assuming in this book.

From the perspective of interacting with the outside world, two important contexts are the request and the session. A request corresponds to a single call from the outside world which the software works on and for which it optionally sends back a response. During a request the processing is largely in the server’s court and the client is assumed to wait for a response. Some protocols allow the client to interrupt a request before it gets a response, but this is fairly rare. More often a client may issue another request that may interfere with one it just sent. So a client may ask to place an order and then issue a separate request to cancel that order. From the client’s view the two requests may be obviously connected, but depending on your protocol that may not be so obvious to the server.

A session is a long-running interaction between a client and a server. It may consist of a single request, but more commonly it consists of a series of requests that the user regards as a consistent logical sequence. Commonly a session will begin with a user logging in and doing various bits of work that may involve issuing queries and one or more business transactions (to be discussed shortly). At the end of the session the user logs out, or he may just go away and assume that the system interprets that as logging out.

Server software in an enterprise application sees both requests and sessions from two angles, as the server from the client and as the client to other systems. Thus, you’ll often see multiple sessions: HTTP sessions from the client and database sessions with various databases.

Two important terms from operating systems are processes and threads. A process is a (usually heavyweight) execution context that provides a lot of isolation for the internal data it works on. A thread is a lighter-weight active agent, set up so that multiple threads can operate in a single process. People like threads because they support multiple requests within a single process—which is good utilization of resources. However, threads usually share memory, and such sharing leads to concurrency problems. Some environments allow you to control what data a thread may access, allowing you to have isolated threads that don’t share memory.

The difficulty with execution contexts comes when they don’t line up as well as we might like. In theory each session would have an exclusive relationship with a process for its whole lifetime. Since processes are properly isolated from each other, this would help reduce concurrency conflicts. Currently we don’t know of any server tools that allow you to work this way. A close alternative is to start a new process for each request, which was the common mode for early Perl Web systems. People tend to avoid that now because starting processes ties up a lot of resources, but it’s quite common for systems to have a process handle only one request at a time—and that can save many concurrency headaches.

When you’re dealing with databases there’s another important context—a transaction. Transactions pull together several requests that the client wants treated as if they were a single request. They can occur from the application to the database (a system transaction) or from the user to an application (a business transaction). We’ll dig into these terms more later on.

Isolation and Immutability

The problems of concurrency have been around for a while, and software people have come up with various solutions. For enterprise applications two solutions are particularly important: isolation and immutability.

Concurrency problems occur when more than one active agent, such as a process or thread, has access to the same piece of data. One way to deal with this is isolation: Partition the data so that any piece of it can only be accessed by one active agent. Processes work like this in operating system memory: The operating system allocates memory exclusively to a single process, and only that process can read or write the data linked to it. Similarly you find file locks in many popular productivity applications. If Martin opens a file, nobody else can open it. They may be allowed to open a read-only copy of the file as it was when Martin started, but they can’t change it and they don’t get to see the file between his changes.

Isolation is a vital technique because it reduces the chance of errors. Too often we’ve seen people get themselves into trouble because they use a technique that forces everyone to worry about concurrency all the time. With isolation you arrange things so that the program enters an isolated zone, within which you don’t have to worry about concurrency. Good concurrency design is thus to find ways of creating such zones and to ensure that as much programming as possible is done in one of them.

You only get concurrency problems if the data you’re sharing can be modified. So one way to avoid concurrency conflicts is to recognize immutable data. Obviously we can’t make all data immutable, as the whole point of many systems is data modification. But by identifying some data as immutable, or at least immutable almost all the time, we can relax our concurrency concerns for it and share it widely. Another option is to separate applications that are only reading data, and have them use copied data sources, from which we can then relax all concurrency controls.
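
A cheap way to exploit immutability in Java is a value object whose fields are all final. This sketch is illustrative rather than taken from any particular system.

// An immutable value object: final fields and no setters, so it can be
// shared freely between threads without any concurrency control.
public final class Money {
    private final long amount;     // in cents
    private final String currency;

    public Money(long amount, String currency) {
        this.amount = amount;
        this.currency = currency;
    }

    public long amount() { return amount; }
    public String currency() { return currency; }

    // "Modification" returns a new object rather than changing this one.
    public Money add(Money other) {
        if (!currency.equals(other.currency)) {
            throw new IllegalArgumentException("currency mismatch");
        }
        return new Money(amount + other.amount, currency);
    }
}

Because no thread can ever observe a Money object changing, instances can be cached and passed around between threads without locks.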

Optimistic and Pessimistic Concurrency Control

What happens when we have mutable data that we can’t isolate? In broad terms there are two forms of concurrency control that we can use: optimistic and pessimistic.

Let’s suppose that Martin and David both want to edit the Customer file at the same time. With optimistic locking both of them can make a copy of the file and edit it freely. If David is the first to finish, he can check in his work without trouble. The concurrency control kicks in when Martin tries to commit his changes. At this point the source code control system detects a conflict between Martin’s changes and David’s changes. Martin’s commit is rejected and it’s up to him to figure out how to deal with the situation. With pessimistic locking whoever checks out the file first prevents anyone else from editing it. So if Martin is first to check out, David can’t work with the file until Martin is finished with it and commits his changes.

A good way of thinking about this is that an optimistic lock is about conflict detection while a pessimistic lock is about conflict prevention. As it turns out real source code control systems can use either type, although these days most source code developers prefer to work with optimistic locks. (There is a reasonable argument that says that optimistic locking isn’t really locking, but we find the terminology too convenient, and widespread, to ignore.)

Both approaches have their pros and cons. The problem with the pessimistic lock is that it reduces concurrency. While Martin is working on a file he locks it, so everybody else has to wait. If you’ve worked with pessimistic source code control mechanisms, you know how frustrating this can be. With enterprise data it’s often worse because, if someone is editing data, nobody else is allowed to read it, let alone edit it.

Optimistic locks allow people to make much better progress, because the lock is only held during the commit. The problem with them is what happens when you get a conflict. Essentially everybody after David’s commit has to check out the version of the file that David checked in, figure out how to merge their changes with David’s changes, and then check in a newer version. With source code this happens not to be too difficult. Indeed, in many cases the source code control system can automatically do the merge for you, and even when it can’t automerge, tools can make it much easier to see the differences. But business data is usually too difficult to automerge, so often all you can do is throw away everything and start again.

The essence of the choice between optimistic and pessimistic locks is the frequency and severity of conflicts. If conflicts are sufficiently rare, or if the consequences are no big deal, you should usually pick optimistic locks because they give you better concurrency and are usually easier to implement. However, if the results of a conflict are painful for users, you’ll need to use a pessimistic technique instead.

Neither of these approaches is exactly free of problems. Indeed, by using them you can easily introduce problems that cause almost as much trouble as the basic concurrency problems that you’re trying to solve in the first place. We’ll leave a detailed discussion of these ramifications to a proper book on concurrency, but here are a few highlights to bear in mind.

Preventing Inconsistent Reads

Consider this situation. Martin edits the Customer class, which makes calls on the Order class. Meanwhile David edits the Order class and changes the interface. David compiles and checks in; Martin then compiles and checks in. Now the shared code is broken because Martin didn’t realize that the Order class was altered underneath him. Some source code control systems will spot this inconsistent read, but others require some kind of manual discipline to enforce consistency, such as updating your files from the trunk before you check in.

In essence this is the inconsistent read problem, and it’s often easy to miss because most people tend to focus on lost updates as the essential problem in concurrency. Pessimistic locks have a well-worn way of dealing with this problem through read and write locks. To read data you need a read (or shared) lock; to write data you need a write (or exclusive) lock. Many people can have read locks on the data at one time, but if anyone has a read lock nobody can get a write lock. Conversely, once somebody has a write lock, then nobody else can have any lock. With this system you can avoid inconsistent reads with pessimistic locks.
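
Most thread libraries support this discipline directly. Here’s a sketch using ReentrantReadWriteLock from Java’s standard library; the SharedDocument class itself is illustrative.

import java.util.concurrent.locks.ReentrantReadWriteLock;

// The read/write lock discipline described above: many readers may hold
// the read lock at once, but the write lock is exclusive.
public class SharedDocument {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private String contents = "";

    public String read() {
        lock.readLock().lock();      // shared: many readers at once
        try {
            return contents;
        } finally {
            lock.readLock().unlock();
        }
    }

    public void write(String newContents) {
        lock.writeLock().lock();     // exclusive: waits for all readers
        try {
            contents = newContents;
        } finally {
            lock.writeLock().unlock();
        }
    }
}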

Optimistic locks usually base their conflict detection on some kind of version marker on the data. This can be a timestamp or a sequential counter. To detect lost updates the system compares the version marker of your update with the version marker of the shared data. If they’re the same, the system allows the update and updates the version marker.

Detecting an inconsistent read is essentially similar: In this case every bit of data that was read also needs to have its version marker compared with the shared data. Any differences indicate a conflict.
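
A common way to implement this version checking is to fold the comparison into the update itself. The following JDBC sketch assumes a hypothetical customer table with a version counter column; an update that matches zero rows means somebody else got there first.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Optimistic conflict detection with a version counter. The customer
// table and its version column are assumptions made for this sketch.
public class CustomerUpdater {
    public void rename(Connection conn, long customerId,
                       long versionReadEarlier, String newName) throws SQLException {
        String sql = "UPDATE customer SET name = ?, version = version + 1 " +
                     "WHERE id = ? AND version = ?";
        try (PreparedStatement stmt = conn.prepareStatement(sql)) {
            stmt.setString(1, newName);
            stmt.setLong(2, customerId);
            stmt.setLong(3, versionReadEarlier);  // marker from our earlier read
            if (stmt.executeUpdate() == 0) {
                // No row matched: someone committed after we read. Conflict.
                throw new IllegalStateException(
                    "customer " + customerId + " was changed by another session");
            }
        }
    }
}

This works because the comparison and the increment happen atomically inside the database’s own system transaction.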

Controlling access to every bit of data that’s read often causes unnecessary problems due to conflicts or waits on data that doesn’t actually matter that much. You can reduce this burden by separating out data you’ve used from data you’ve merely read. With a pick list of products it doesn’t matter if a new product appears in it after you start your changes. But a list of charges that you’re summarizing for a bill may be more important. The difficulty is that this requires some careful analysis of what it’s used for. A zip code in a customer’s address may seem innocuous, but, if a tax calculation is based on where somebody lives, that address has to be controlled for concurrency. As you can see, figuring out what you need to control and what you don’t is an involved exercise whichever form of concurrency control you use.

Another way to deal with inconsistent read problems is to use Temporal Reads. These prefix each read of data with some kind of timestamp or immutable label, and the database returns the data as it was according to that time or label. Very few databases have anything like this, but developers often come across it in source code control systems. The problem is that the data source needs to provide a full temporal history of changes, which takes time and space to process. This is reasonable for source code but both more difficult and more expensive for databases. You may need to provide this capability for specific areas of your domain logic: see [Snodgrass] and [Fowler TP] for ideas on how to do that.

Deadlocks

A particular problem with pessimistic techniques is deadlock. Say Martin starts editing the Customer file and David starts editing the Order file. David realizes that to complete his task he needs to edit the Customer file too, but Martin has a lock on it so he has to wait. Then Martin realizes he has to edit the Order file, which David has locked. They are now deadlocked—neither can make progress until the other completes. Described like this, deadlocks sound easy to prevent, but they can occur with many people involved in a complex chain, and that makes them more tricky.

There are various techniques you can use to deal with deadlocks. One is to have software that can detect a deadlock when it occurs. In this case you pick a victim, who has to throw away his work and his locks so the others can make progress. Deadlock detection is very difficult and causes pain for victims. A similar approach is to give every lock a time limit. Once you hit that limit you lose your locks and your work—essentially becoming a victim. Timeouts are easier to implement than a deadlock detection mechanism, but if anyone holds locks for a while some people will be victimized when there actually is no deadlock present.
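
Lock libraries often support timeouts directly. In Java, for example, ReentrantLock’s tryLock takes a time limit; this sketch (with illustrative names) gives up and becomes the victim rather than waiting forever.

import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// The timeout approach in miniature: rather than wait forever (and risk
// deadlock), give up after a time limit and become the victim.
public class TimedLocking {
    private final ReentrantLock customerLock = new ReentrantLock();

    public void editCustomer() throws InterruptedException {
        if (!customerLock.tryLock(2, TimeUnit.SECONDS)) {
            // Timed out: throw away our work so others can make progress.
            throw new IllegalStateException("gave up waiting for customer lock");
        }
        try {
            // ... edit the customer data here ...
        } finally {
            customerLock.unlock();
        }
    }
}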

Timeouts and detection deal with a deadlock when it occurs; other approaches try to stop deadlocks from occurring at all. Deadlocks essentially occur when people who already have locks try to get more (or to upgrade from read locks to write locks). Thus, one way of preventing them is to force people to acquire all their locks at once at the beginning of their work and then prevent them from gaining more.

You can force an order on how everybody gets locks. An example might be to always get locks on files in alphabetical order. This way, once David has a lock on the Order file, he can’t try to get a lock on the Customer file because it’s earlier in the sequence. At that point he essentially becomes a victim.
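
Here’s a sketch of that ordering rule in Java, using illustrative names. As long as every agent acquires its locks through a routine like this, no cycle of waiting can form.

import java.util.concurrent.locks.ReentrantLock;

// Lock ordering in miniature: every agent acquires file locks in
// alphabetical order of file name. Callers unlock both locks when done.
public class OrderedLocking {
    public static void lockInOrder(String nameA, ReentrantLock lockA,
                                   String nameB, ReentrantLock lockB) {
        if (nameA.compareTo(nameB) <= 0) {
            lockA.lock();   // "Customer" before "Order"
            lockB.lock();
        } else {
            lockB.lock();
            lockA.lock();
        }
    }
}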

You can also make it so that, if Martin tries to acquire a lock and David already has one, Martin automatically becomes a victim. It’s a drastic technique, but it’s simple to implement. And in many cases such a scheme works just fine.

If you’re very conservative, you can use multiple schemes. For example, you force everyone to get all their locks at the beginning, but add a timeout in case something goes wrong. That may seem like using a belt and braces, but such conservatism is often wise with deadlocks because they are pesky things that are easy to get wrong.

It’s very easy to think you have a deadlock-proof scheme and then find some chain of events you didn’t consider. As a result we prefer very simple and conservative schemes for enterprise application development. They may cause unnecessary victims, but that’s usually much better than the consequences of missing a deadlock scenario.

Transactions

The primary tool for handling concurrency in enterprise applications is the transaction. The word “transaction” often brings to mind an exchange of money or goods. Walking up to an ATM, entering your PIN, and withdrawing cash is a transaction. Paying the $3 toll at the Golden Gate Bridge is a transaction. Buying a beer at the local pub is a transaction.

Looking at typical financial dealings such as these provides a good definition for the term. First, a transaction is a bounded sequence of work, with both start and endpoints well defined. An ATM transaction begins when the card is inserted and ends when cash is delivered or an inadequate balance is discovered. Second, all participating resources are in a consistent state both when the transaction begins and when the transaction ends. A man purchasing a beer has a few bucks less in his wallet but has a nice pale ale in front of him. The sum value of his assets hasn’t changed. It’s the same for the pub—pouring free beer would be no way to run a business.

In addition, each transaction must complete on an all-or-nothing basis. The bank can’t subtract from an account holder’s balance unless the ATM actually delivers the cash. While the human element might make this last property optional during the above transactions, there is no reason software can’t make a guarantee on this front.

ACID

Software transactions are often described in terms of the ACID properties:

Atomicity: Each step in the sequence of actions performed within the boundaries of a transaction must complete successfully or all work must roll back. Partial completion is not a transactional concept. Thus, if Martin is transferring some money from his savings to his checking account and the server crashes after he’s withdrawn the money from his savings, the system behaves as if he never did the withdrawal. Committing says both things occurred; a rollback says neither occurred. It has to be both or neither; the sketch after these definitions shows this in code.

Consistency: A system’s resources must be in a consistent, noncorrupt state at both the start and the completion of a transaction.

Isolation: The result of an individual transaction must not be visible to any other open transactions until that transaction commits successfully.

Durability: Any result of a committed transaction must be made permanent. This translates to “Must survive a crash of any sort.”
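
Here’s the savings-to-checking transfer from the atomicity discussion above, sketched over plain JDBC. The connection URL and the account table are assumptions made for the example; the point is that both updates commit together or neither does.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Atomicity over plain JDBC: both updates commit together or neither
// does. The connection URL and account table are illustrative.
public class TransferExample {
    public void transfer(long fromId, long toId, int amount) throws SQLException {
        try (Connection conn = DriverManager.getConnection("jdbc:h2:mem:bank")) {
            conn.setAutoCommit(false);    // open the system transaction
            try (PreparedStatement debit = conn.prepareStatement(
                     "UPDATE account SET balance = balance - ? WHERE id = ?");
                 PreparedStatement credit = conn.prepareStatement(
                     "UPDATE account SET balance = balance + ? WHERE id = ?")) {
                debit.setInt(1, amount);
                debit.setLong(2, fromId);
                debit.executeUpdate();
                credit.setInt(1, amount);
                credit.setLong(2, toId);
                credit.executeUpdate();
                conn.commit();            // both changes become visible at once
            } catch (SQLException e) {
                conn.rollback();          // neither change happened
                throw e;
            }
        }
    }
}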

Transactional Resources

Most enterprise applications run into transactions in terms of databases. But there are plenty of other things that can be controlled using transactions, such as message queues, printers, and ATMs. As a result, technical discussions of transactions use the term “transactional resource” to mean anything that’s transactional—that is, that uses transactions to control concurrency. “Transactional resource” is a bit of a mouthful, so we just use “database,” since that’s the most common case. But when we say “database,” the same applies for any other transactional resource.

To handle the greatest throughput, modern transaction systems are designed to keep transactions as short as possible. As a result the general advice is to never make a transaction span multiple requests. A transaction that spans multiple requests is generally known as a long transaction.

For this reason a common approach is to start a transaction at the beginning of a request and complete it at the end. This request transaction is a nice simple model, and a number of environments make it easy to do declaratively, by just tagging methods as transactional.
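
Spring is one environment that supports this declaratively; the sketch below assumes Spring’s transaction management is configured, and the service itself is illustrative. Tagging the method opens a transaction when it’s called and commits it (or rolls back on an exception) when it returns.

import org.springframework.transaction.annotation.Transactional;

// A declarative request transaction, assuming Spring's transaction
// management is configured. The service and its method are illustrative.
public class OrderService {
    @Transactional
    public void placeOrder(long customerId, long productId) {
        // Everything in here runs inside one request-scoped transaction.
        // ... load the customer, create the order, adjust stock ...
    }
}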

A variation on this is to open a transaction as late as possible. With a late transaction you may do all the reads outside it and only open it up when you do updates. This has the advantage of minimizing the time spent in a transaction. If there’s a lengthy time lag between the opening of the transaction and the first write, this may improve liveness. However, this means that you don’t have any concurrency control until you begin the transaction, which leaves you open to inconsistent reads. As a result it’s usually not worth doing this unless you have very heavy contention or you’re doing it anyway because of business transactions that span multiple requests (which is the next topic).

When you use transactions, you need to be somewhat aware of what exactly is being locked. For many database actions the transaction system locks the rows involved, which allows multiple transactions to access the same table. However, if a transaction locks a lot of rows in a table, then the database has more locks than it can handle and escalates the locking to the entire table, locking out other transactions. This lock escalation can have a serious effect on concurrency, and it’s a particular reason why you shouldn’t have some “object” table for data at the domain’s Layer Supertype (475) level. Such a table is a prime candidate for lock escalation, and locking that table shuts everybody else out of the database.

Reducing Transaction Isolation for Liveness

It’s common to restrict the full protection of transactions so that you can get better liveness. This is particularly the case when it comes to handling isolation. If you have full isolation, you get serializable transactions. Transactions are serializable if they can be executed concurrently and you get a result that’s the same as you get from some sequence of executing the transactions serially. Thus, if we take our earlier example of Martin counting his files, serializability guarantees that he gets a result that corresponds to completing his transaction either entirely before David’s transaction starts (twelve) or entirely after David’s finishes (seventeen). Serializability can’t guarantee which result, as in this case, but at least it guarantees a correct one.

Most transactional systems use the SQL standard, which defines four levels of isolation. Serializable is the strongest level, and each level below allows a particular kind of inconsistent read to enter the picture. We’ll explore these with the example of Martin counting files while David modifies them. There are two packages: locking and multiphase. Before David’s update there are seven files in the locking package and five in the multiphase package; after his update there are nine in the locking package and eight in the multiphase package. Martin looks at the locking package and David then updates both; then Martin looks at the multiphase package.

If the isolation level is serializable, the system guarantees that Martin’s answer is either twelve or seventeen, both of which are correct. Serializability can’t guarantee that every run through this scenario will give the same result, but it always gets either the number before David’s update or the number afterwards.

The first isolation level below serializable is repeatable read, which allows phantoms. Phantoms occur when you add some elements to a collection and the reader sees only some of them. The case here is that Martin looks at the files in the locking package and sees seven. David then commits his transaction, after which Martin looks at the multiphase package and sees eight. Hence, Martin gets an incorrect result. Phantoms occur because the inserted data is valid for some of Martin’s transaction but not all of it, and phantoms are always things that are inserted.

Next down the list is the isolation level of read committed, which allows unrepeatable reads. Imagine that Martin looks at a total rather than the actual files. An unrepeatable read allows him to read a total of seven for locking. Next David commits; then Martin reads a total of eight for multiphase. It’s called an unrepeatable read because, if Martin were to reread the total for the locking package after David committed, he would get the new number of nine. His original read of seven can’t be repeated after David’s update. It’s easier for databases to spot unrepeatable reads than phantoms, so repeatable read gives you more correctness than read committed but less liveness.

The lowest level of isolation is read uncommitted, which allows dirty reads. At read uncommitted you can read data that another transaction hasn’t actually committed yet. This causes two kinds of errors. Martin might look at the locking package when David adds the first of his files but before he adds the second. As a result he sees eight files in the locking package. The second kind of error comes if David adds his files but then rolls back his transaction—in which case Martin sees files that were never really there.

Table 5.1 lists the read errors caused by each isolation level.

Table 5.1. Isolation Levels and the Inconsistent Read Errors They Allow

Isolation Level      Dirty Read    Unrepeatable Read    Phantom
Read uncommitted     Yes           Yes                  Yes
Read committed       No            Yes                  Yes
Repeatable read      No            No                   Yes
Serializable         No            No                   No

To be sure of correctness you should always use the serializable isolation level. The problem is that choosing serializable really messes up the liveness of a system, so much so that you often have to reduce serializability in order to increase throughput. You have to decide what risks you want to take and make your own trade-off of errors versus performance.

You don’t have to use the same isolation level for all transactions, so you should look at each transaction and decide how to balance liveness versus correctness for it.
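
JDBC, for example, lets you make that choice per connection. The sketch below uses standard JDBC constants; the connection URL and the work inside each transaction are assumptions.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Choosing an isolation level per transaction over JDBC. The isolation
// constants are standard JDBC; the URL and the queries are illustrative.
public class IsolationChoices {
    public void countFiles() throws SQLException {
        try (Connection conn = DriverManager.getConnection("jdbc:h2:mem:app")) {
            // This count must be correct, so pay the liveness cost.
            conn.setTransactionIsolation(Connection.TRANSACTION_SERIALIZABLE);
            conn.setAutoCommit(false);
            // ... count the files in the locking and multiphase packages ...
            conn.commit();
        }
    }

    public void readPickList() throws SQLException {
        try (Connection conn = DriverManager.getConnection("jdbc:h2:mem:app")) {
            // A casual read where an unrepeatable read does no real harm.
            conn.setTransactionIsolation(Connection.TRANSACTION_READ_COMMITTED);
            // ... read the product pick list ...
        }
    }
}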

Business and System Transactions

What we’ve talked about so far, and most of what most people talk about, is what we call system transactions, or transactions supported by RDBMS systems and transaction monitors. A database transaction is a group of SQL commands delimited by instructions to begin and end it. If the fourth statement in the transaction results in an integrity constraint violation, the database must roll back the effects of the first three statements and notify the caller that the transaction has failed. If all four statements complete successfully, all the changes are made visible to other users at the same time rather than one at a time. RDBMS systems and application server transaction managers are so commonplace that they can pretty much be taken for granted. They work well and are well understood by application developers.

However, a system transaction has no meaning to the user of a business system. To an online banking system user a transaction consists of logging in, selecting an account, setting up some bill payments, and finally clicking the OK button to pay the bills. This is what we call a business transaction, and that it displays the same ACID properties as a system transaction seems a reasonable expectation. If the user cancels before paying the bills, any changes made on previous screens should be canceled. Setting up payments shouldn’t result in a system-visible balance change until the OK button is pressed.

The obvious answer to supporting the ACID properties of a business transaction is to execute the entire business transaction within a single system transaction. Unfortunately business transactions often take multiple requests to complete, so using a single system transaction to implement one results in a long system transaction. Most transaction systems don’t work very efficiently with long transactions.

This doesn’t mean that you should never use long transactions. If your database has only modest concurrency needs, you may well be able to get away with it. And if you can get away with it, we suggest you do it. Using a long transaction means you avoid a lot of awkward problems. However, the application won’t be scalable because long transactions will turn the database into a major bottleneck. In addition, the refactoring from long to short transactions is both complex and not well understood.

For this reason many enterprise applications can’t risk long transactions. In this case you have to break the business transaction down into a series of short transactions. This means that you are left to your own devices to support the ACID properties of business transactions between system transactions—a problem we call offline concurrency. System transactions are still very much part of the picture. Whenever the business transaction interacts with a transactional resource, such as a database, that interaction will execute within a system transaction in order to maintain the integrity of that resource. However, as you’ll read below, it’s not enough to string together a series of system transactions to properly support a business transaction. The business application must provide a bit of glue between them.

Atomicity and durability are the ACID properties most easily supported for business transactions. Both are supported by running the commit phase of the business transaction, when the user hits Save, within a system transaction. Before the session attempts to commit all its changes to the record set, it first opens a system transaction. The system transaction guarantees that the changes will commit as a unit and will be made permanent. The only potentially tricky part here is maintaining an accurate change set during the life of the business transaction. If the application uses a Domain Model (116), a Unit of Work (184) can track changes accurately. Placing business logic in a Transaction Script (110) requires a manual tracking of changes, but that’s probably not much of a problem as the use of transaction scripts implies rather simple business transactions.
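
To show the shape of this, here’s a much-simplified Unit of Work sketch. DomainObject and Database are illustrative stand-ins: changes are collected during the business transaction and written in one short system transaction at commit time.

import java.util.LinkedHashSet;
import java.util.Set;

// A much-simplified Unit of Work: remember every object changed during
// the business transaction, then write them all inside one short system
// transaction when the user hits Save.
public class UnitOfWork {
    public interface DomainObject { void writeTo(Database db); }
    public interface Database { void begin(); void commit(); void rollback(); }

    private final Set<DomainObject> dirty = new LinkedHashSet<>();

    public void registerDirty(DomainObject obj) {
        dirty.add(obj);     // track the change set as the user works
    }

    public void commit(Database db) {
        db.begin();                      // the short system transaction
        try {
            for (DomainObject obj : dirty) {
                obj.writeTo(db);
            }
            db.commit();                 // the whole change set, all or nothing
            dirty.clear();
        } catch (RuntimeException e) {
            db.rollback();
            throw e;
        }
    }
}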

The tricky ACID property to enforce with business transactions is isolation. Failures of isolation lead to failures of consistency. Consistency dictates that a business transaction not leave the record set in an invalid state. Within a single transaction the application’s responsibility in supporting consistency is to enforce all available business rules. Across multiple transactions the application’s responsibility is to ensure that one session doesn’t step all over another session’s changes, leaving the record set in the invalid state of having lost a user’s work.

As well as the obvious problems of clashing updates, there are the more subtle problems of inconsistent reads. When data is read over several system transactions, there’s no guarantee that it will be consistent. The different reads can even introduce data in memory that’s sufficiently inconsistent to cause application failures.

Business transactions are closely tied to sessions. In the user’s view each session is a sequence of business transactions (unless they’re only reading data), so we usually make the assumption that all business transactions execute in a single client session. While it’s certainly possible to design a system that has multiple sessions for one business transaction, that’s a very good way of getting yourself badly confused—so we’ll assume that you won’t do that.

Patterns for Offline Concurrency Control

As much as possible, you should let your transaction system deal with concurrency problems. Handling concurrency control that spans system transactions plonks you firmly in the murky waters of dealing with concurrency yourself. This water is full of virtual sharks, jellyfish, piranhas, and other, less friendly creatures. Unfortunately, the mismatch between business and system transactions means you sometimes just have to wade in. The patterns that we’ve provided here are some techniques that we’ve found helpful in dealing with concurrency control that spans system transactions.

Remember that these are techniques you should only use if you have to. If you can make all your business transactions fit into a system transaction by ensuring that they fit within a single request, then do that. If you can get away with long transactions by forsaking scalability, then do that. By leaving concurrency control in the hands of your transaction software you’ll avoid a great deal of trouble. These techniques are what you have to use when you can’t do that. Because of the tricky nature of concurrency, we have to stress again that the patterns are a starting point, not a destination. We’ve found them useful, but we don’t claim to have found a cure for all concurrency ills.

Our first choice for handling offline concurrency problems is Optimistic Offline Lock (416), which essentially uses optimistic concurrency control across the business transactions. We like this as a first choice because it’s an easier approach to program and yields the best liveness. The limitation of Optimistic Offline Lock (416) is that you only find out that a business transaction is going to fail when you try to commit it, and in some circumstances the pain of that late discovery is too much. Users may have put an hour’s work into entering details about a lease, and if you get lots of failures users lose faith in the system. Your alternative is Pessimistic Offline Lock (426), with which you find out early if you’re in trouble but lose out because it’s harder to program and it reduces your liveness.

With either of these approaches you can save considerable complexity by not trying to manage locks on every object. A Coarse-Grained Lock (438) allows you to manage the concurrency of a group of objects together. Another way you can make life easier for application developers is to use Implicit Lock (449), which saves them from having to manage locks directly. Not only does this save work, it also avoids bugs when people forget—and these bugs are hard to find.

A common statement about concurrency is that it’s a purely technical decision that can be decided on after requirements are complete. We disagree. The choice of optimistic or pessimistic controls affects the whole user experience of the system. An intelligent design of Pessimistic Offline Lock (426) needs a lot of input about the domain from the users of the system. Similarly domain knowledge is needed to choose good Coarse-Grained Locks (438).

Futzing with concurrency is one of the most difficult programming tasks. It’s very difficult to test concurrent code with confidence. Concurrency bugs are hard to reproduce and very difficult to track down. The patterns we’ve described have worked for us so far, but this is particularly difficult territory. If you need to go down this path, it’s worth getting some experienced help. At the very least consult the books we mention at the end of this chapter.

Application Server Concurrency

So far we’ve talked about concurrency mainly in terms of multiple sessions running against a shared data source. Another form of concurrency is the process concurrency of the application server itself: How does that server handle multiple requests concurrently and how does this affect the design of the application on the server? The big difference from the other concurrency issues we’ve talked about so far is that application server concurrency doesn’t involve transactions, so working with them means stepping away from the relatively controlled transactional world.

Explicit multithreaded programming, with locks and synchronization blocks, is complicated to do well. It’s easy to introduce defects that are very hard to find—concurrency bugs are almost impossible to reproduce—resulting in a system that works correctly 99 percent of the time but throws random fits. Such software is incredibly frustrating to use and debug, so our policy is to avoid the need for explicit handling of synchronization and locks as much as possible. Application developers should almost never have to deal with these explicit concurrency mechanisms.

The simplest way to handle this is to use process-per-session, where each session runs in its own process. Its great advantage is that the state of each process is completely isolated from the other processes so application programmers don’t have to worry at all about multithreading. As far as memory isolation goes, it’s almost equally effective to have each request start a new process or to have one process tied to the session that’s idle between requests. Many early Web systems would start a new Perl process for each request.

The problem with process-per-session is that it uses up a lot of resources, since processes are expensive beasties. To be more efficient you can pool the processes, such that each one only handles a single request at one time but can handle multiple requests from different sessions in a sequence. This approach of pooled process-per-request will use many fewer processes to support a given number of sessions. Your isolation is almost as good: You don’t have many of the nasty multithreading issues. The main problem of process-per-request over process-per-session is that you have to ensure that any resources used to handle a request are released at the end of the request. The current Apache mod_perl uses this scheme, as do a lot of serious large-scale transaction processing systems.

Even process-per-request will need many processes running to handle a reasonable load. You can further improve throughput by having a single process run multiple threads. With this thread-per-request approach, each request is handled by a single thread within a process. Since threads use far fewer server resources than processes, you can handle more requests with less hardware this way, so your server is more efficient. The problem with thread-per-request is that there’s no isolation between the threads and any thread can touch any piece of data that it can get access to.

In our view there’s a lot to be said for using process-per-request. Although it’s less efficient than thread-per-request, process-per-request is equally scalable. You also get better robustness: if one thread goes haywire it can bring down an entire process, so process-per-request limits the damage to a single request. Particularly with a less experienced team, the reduction of threading headaches (and the time and cost of fixing bugs) is worth the extra hardware costs. We find that few people actually do any performance testing to assess the relative costs of thread-per-request and process-per-request for their application.

Some environments provide a middle ground of allowing isolated areas of data to be assigned to a single thread. COM does this with the single-threaded apartment, and J2EE does it with Enterprise Java Beans (and will in the future with isolates). If your platform has something like this available, it can allow you to have your cake and eat it—whatever that means.

If you use thread-per-request, the most important thing is to create and enter an isolated zone where application developers can mostly ignore multithreaded issues. The usual way to do this is to have the thread create new objects as it starts handling the request and to ensure that these objects aren’t put anywhere (such as in a static variable) where other threads can see them. That way the objects are isolated because other threads have no way of referencing them.

Many developers are concerned about creating new objects because they’ve been told that object creation is an expensive process. As a result they often pool objects. The problem with pooling is that you have to synchronize access to the pooled objects in some way. But the cost of object creation is very dependent on the virtual machine and memory management strategies. In modern environments object creation is actually pretty fast [Peckish]. (Off the top of your head: how many Java date objects do you think we can create in one second on Martin’s 600MHz P3 with Java 1.3? We’ll tell you later.) Creating fresh objects for each session avoids a lot of concurrency bugs and can actually improve scalability.

While this tactic works in many cases, there are still some areas that developers need to avoid. One is static, class-based variables or global variables, because any use of these has to be synchronized. This is also true of singletons. If you need some kind of global memory, use a Registry (480), which you can implement in such a way that it looks like a static variable but actually uses thread-specific storage.
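
In Java the usual trick for that is ThreadLocal. This sketch is illustrative; UserSession is a made-up class, not part of any library.

// A Registry that looks like a static variable but keeps the data in
// thread-specific storage, so no synchronization is needed.
public class Registry {
    public static final class UserSession {
        private final String userName;
        public UserSession(String userName) { this.userName = userName; }
        public String userName() { return userName; }
    }

    private static final ThreadLocal<UserSession> current = new ThreadLocal<>();

    public static void setSession(UserSession session) { current.set(session); }
    public static UserSession getSession() { return current.get(); }
    public static void clear() { current.remove(); }   // call at end of request
}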

Even if you’re able to create objects for the session, and thus make a comparatively safe zone, some objects are expensive to create and thus need to be handled differently—the most common example of this is a database connection. To deal with this you can place these objects in an explicit pool where you acquire a connection while you need it and return it when done. These operations will need to be synchronized.
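
A minimal pool can be built with nothing more than synchronized methods and wait/notify, as in this sketch (the pool’s construction and sizing are left out for brevity).

import java.sql.Connection;
import java.util.ArrayDeque;
import java.util.Deque;

// A minimal explicit pool: acquire and release are synchronized, but the
// work done with a connection in between is not.
public class ConnectionPool {
    private final Deque<Connection> idle = new ArrayDeque<>();

    public ConnectionPool(Iterable<Connection> connections) {
        for (Connection conn : connections) {
            idle.push(conn);
        }
    }

    public synchronized Connection acquire() throws InterruptedException {
        while (idle.isEmpty()) {
            wait();          // block until another thread releases one
        }
        return idle.pop();
    }

    public synchronized void release(Connection conn) {
        idle.push(conn);
        notify();            // wake one waiting acquirer
    }
}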

Further Reading

In many ways, this chapter only skims the surface of a much more complex topic. To investigate further we suggest starting with [Bernstein and Newcomer], [Lea], and [Schmidt et al.].