Asynchronous Programming

Some things are intrinsically slow. Reading all of the audio data off a CD, downloading a large file from a server at the end of a low-bandwidth connection on the opposite side of the world, or playing a sound—all of these processes have constraints that mean they’ll take a long time to complete, maybe seconds, minutes, or even hours. How should these sorts of operations look to the programmer?

One simple answer is that they don’t have to look any different from faster operations. Our code consists of a sequence of statements, one thing after another, and some statements take longer than others. This has the useful property of being easy to understand. For example, if our code calls the WebClient class’s DownloadString method, our program doesn’t move on to the next step until the download is complete, so we know not just what our code does, but also the order in which it does it.

This style of API is sometimes described as synchronous—the time at which the API returns is determined by the time at which the operation finishes; execution progresses through the code in sync with the work being done. These are also sometimes known as blocking APIs, because they block the calling thread from further progress until work is complete.

Blocking APIs are problematic for user interfaces because the blocked thread can’t do anything else while slow work is in progress. Thread affinity means that code which responds to user input has to run on the UI thread, so if you keep that thread busy, the UI becomes unresponsive. Applications that behave this way seem to freeze anytime something takes too long, which makes them very frustrating to use. Failing to respond to user input within 100 ms is enough to disrupt the user’s concentration. (And it gets worse if your program’s user interface uses animation: an occasional glitch of just 15 ms is enough to turn a smooth animation into something disappointingly choppy.)

Threads offer one solution to this: if you do all your potentially slow work on threads that aren’t responsible for handling user input, your application can remain responsive. However, this can sometimes seem like an overcomplicated solution, because in a lot of cases slow operations don’t work synchronously under the covers. Take fundamental operations such as reading data from, or writing data to, devices such as network cards or disks. The operating system instructs the kernel-mode device drivers that manage disk and network I/O to start doing some work, and it expects the driver to configure the hardware to perform that work and then return control almost immediately. On the inside, Windows is built around the assumption that most slow work proceeds asynchronously, and that there’s no need for code to progress strictly in sync with the work.

This asynchronous model is not limited to the internals of Windows—there are asynchronous public APIs. These typically return very quickly, long before the work in question is complete, and you then use either a notification mechanism or polling to discover when the work is finished. The exact details vary from one API to another, but these basic principles are universal. Many synchronous APIs really are just some code that starts an asynchronous operation and then makes the thread sleep until the operation completes.

An asynchronous API sounds like a pretty good fit for what we need to build responsive interactive applications.[45] So it seems somewhat ludicrous to create multiple threads in order to use synchronous APIs without losing responsiveness, when those synchronous APIs are just wrappers on top of intrinsically asynchronous underpinnings. Rather than creating new threads, we may as well use asynchronous APIs directly where they are available, cutting out the middleman.

.NET defines two common patterns for asynchronous operations. There’s a low-level pattern which is powerful and corresponds efficiently to how Windows does things under the covers. And then there’s a slightly higher-level pattern which is less flexible but considerably simpler to use in GUI code.

The Asynchronous Programming Model (APM) is a pattern that many asynchronous APIs in the .NET Framework conform to. It defines common mechanisms for discovering when work is complete, for collecting the results of completed work, and for reporting errors that occurred during the asynchronous operation.

APIs that use the APM offer pairs of methods, starting with Begin and End. For example, the Socket class in the System.Net.Sockets namespace offers numerous instances of this pattern: BeginAccept and EndAccept, BeginSend and EndSend, BeginConnect and EndConnect, and so on.

The exact signature of the Begin method depends on what it does. For example, a socket’s BeginConnect needs the address to which you’d like to connect, whereas BeginReceive needs to know where you’d like to put the data and how much you’re ready to receive. But the APM requires all Begin methods to have the same final two parameters: the method must take an AsyncCallback delegate and an object. And it also requires the method to return an implementation of the IAsyncResult interface. Here’s an example from the Dns class in System.Net:

public static IAsyncResult BeginGetHostEntry(
    string hostNameOrAddress,
    AsyncCallback requestCallback,
    object stateObject
)

Callers may pass a null AsyncCallback. But if they pass a non-null reference, the type implementing the APM is required to invoke the callback once the operation is complete. The AsyncCallback delegate signature requires the callback method to accept an IAsyncResult argument—the APM implementation will pass in the same IAsyncResult to this completion callback as it returns from the Begin method. This object represents an asynchronous operation in progress—many classes can have multiple operations in progress simultaneously, and the IAsyncResult distinguishes between them.

Example 16-16 shows one way to use this pattern. It calls the asynchronous BeginGetHostEntry method provided by the Dns class. This looks up the IP address for a computer, so it takes a string: the name of the computer to find. It then takes the two standard final APM arguments, a delegate and an object. We can pass anything we like as the object; the function we call doesn’t actually use it, and just hands it back to us later. We could pass null because our example doesn’t need the argument, but we’re passing a number just to demonstrate where it comes out. The APM offers this argument so that if you have multiple asynchronous operations in progress at once, you have a convenient way to associate information with each operation. (This mattered much more in older versions of C#, which didn’t offer anonymous methods or lambdas. Back then, this argument was the easiest way to pass data into the callback.)

The Main method waits until a key is pressed—much like with work items in the thread pool, having active asynchronous requests will not keep the process alive, so the program would exit before finishing its work without that ReadKey. (A more robust approach for a real program that needed to wait for work to complete would be to use the CountdownEvent described earlier.)

The Dns class will call the OnGetHostEntryComplete method once it has finished its lookup. Notice that the first thing we do is call the EndGetHostEntry method—the other half of the APM. The End method always takes the IAsyncResult object corresponding to the call—recall that this identifies the call in progress, so this is how EndGetHostEntry knows which particular lookup operation we want to get the results for.

The End method in the APM returns any data that comes out of the operation. In this case, there’s a single return value of IPHostEntry, but some implementations may return more by having out or ref arguments. Example 16-16 then prints the results, and finally prints the AsyncState property of the IAsyncResult, which will be 42—this is where the value we passed as the final argument to BeginGetHostEntry pops out.
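A listing consistent with the walkthrough above might look like the following minimal sketch. The host name "localhost" is a placeholder chosen for illustration, and the loop that prints each address is an assumption about what printing the results involves; the OnGetHostEntryComplete method name and the state value 42 come from the description.

```csharp
using System;
using System.Net;

class Program
{
    static void Main()
    {
        // "localhost" is a placeholder host name for this sketch.
        // 42 is the arbitrary state object; it comes back out as AsyncState.
        Dns.BeginGetHostEntry("localhost", OnGetHostEntryComplete, 42);

        // Pending asynchronous requests do not keep the process alive,
        // so wait for a keypress before letting Main return.
        Console.ReadKey();
    }

    static void OnGetHostEntryComplete(IAsyncResult iar)
    {
        // The End method collects the result for this particular lookup.
        IPHostEntry result = Dns.EndGetHostEntry(iar);

        foreach (IPAddress address in result.AddressList)
        {
            Console.WriteLine(address);
        }

        // The state object passed to the Begin method reappears here.
        Console.WriteLine(iar.AsyncState);
    }
}
```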

This is not the only way to use the Asynchronous Programming Model—you are allowed to pass null as the delegate argument. You have three other options, all revolving around the IAsyncResult object returned by the Begin call. You can poll the IsCompleted property to test for completion. You can call the End method at any time—if the work is not finished this will block until it completes.[46] Or you can use the AsyncWaitHandle property—this returns an object that is a wrapper around a Win32 synchronization handle that will become signaled when the work is complete. (That last one is rarely used, and has some complications regarding ownership and lifetime of the handle, which are described in the MSDN documentation. We mention this technique only out of a pedantic sense of duty to completeness.)

Note

You are required to call the End method at some point, no matter how you choose to wait for completion. Even if you don’t care about the outcome of the operation you must still call the End method. If you don’t, the operation might leak resources.
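The three alternatives to a completion callback could be sketched roughly as follows. This is an illustration rather than a recommended design: the WaitingDemo class and its method names are invented for the example, "localhost" is a placeholder host, and the sketch ignores the handle ownership and lifetime issues mentioned above. Note that every path still finishes by calling the End method.

```csharp
using System;
using System.Net;
using System.Threading;

public static class WaitingDemo
{
    // Option 1: poll IsCompleted, doing something else between checks.
    public static IPHostEntry LookupByPolling(string host)
    {
        IAsyncResult iar = Dns.BeginGetHostEntry(host, null, null);
        while (!iar.IsCompleted)
        {
            Thread.Sleep(10); // stand-in for other useful work
        }
        return Dns.EndGetHostEntry(iar);
    }

    // Option 2: call End straightaway; it blocks until the work is done.
    public static IPHostEntry LookupByBlockingEnd(string host)
    {
        IAsyncResult iar = Dns.BeginGetHostEntry(host, null, null);
        return Dns.EndGetHostEntry(iar);
    }

    // Option 3: wait on the handle, which allows a timeout.
    public static IPHostEntry LookupByWaitHandle(string host)
    {
        IAsyncResult iar = Dns.BeginGetHostEntry(host, null, null);
        if (!iar.AsyncWaitHandle.WaitOne(TimeSpan.FromSeconds(30)))
        {
            Console.WriteLine("Still waiting after 30 seconds");
        }
        return Dns.EndGetHostEntry(iar); // End is still required
    }

    public static void Main()
    {
        Console.WriteLine(LookupByPolling("localhost").HostName);
        Console.WriteLine(LookupByBlockingEnd("localhost").HostName);
        Console.WriteLine(LookupByWaitHandle("localhost").HostName);
    }
}
```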

Asynchronous operations can throw exceptions. If the exception is the result of bad input, such as a null reference where an object is required, the Begin method will throw an exception. But it’s possible that something failed while the operation was in progress—perhaps we lost network connectivity partway through some work. In this case, the End method will throw an exception.
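The two failure points can be seen with the Dns class, as in the sketch below. The ApmErrors class and its method names are invented for this illustration, and "no-such-host.invalid" is an assumed unresolvable name (it relies on the reserved .invalid top-level domain); the exact exception types are those documented for the Dns methods, not something the APM itself dictates.

```csharp
using System;
using System.Net;
using System.Net.Sockets;

public static class ApmErrors
{
    // Bad input: the Begin method itself throws, before any work starts.
    public static bool BeginRejectsNullHost()
    {
        try
        {
            Dns.BeginGetHostEntry((string)null, null, null);
            return false;
        }
        catch (ArgumentNullException)
        {
            return true;
        }
    }

    // Failure during the operation: Begin succeeds, and the error
    // surfaces only when we call the End method.
    public static bool EndReportsLookupFailure(string host)
    {
        IAsyncResult iar = Dns.BeginGetHostEntry(host, null, null);
        try
        {
            Dns.EndGetHostEntry(iar);
            return false;
        }
        catch (SocketException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(BeginRejectsNullHost());
        // Assumed to be unresolvable; substitute any name that does not exist.
        Console.WriteLine(EndReportsLookupFailure("no-such-host.invalid"));
    }
}
```

When a completion callback is in use, the End call sits inside the callback, so that is where the try block for operation failures belongs.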

The Asynchronous Programming Model is widely used in the .NET Framework class library, and while it is an efficient and flexible way to support asynchronous operations, it’s slightly awkward to use in user interfaces. The completion callback typically runs on an arbitrary thread, usually one from the thread pool, so you can’t update the UI directly in that callback. And the support for multiple simultaneous operations, possible because each operation is represented by a distinct IAsyncResult object, may be useful in server environments, but it’s often just an unnecessary complication for client-side code. So there’s an alternative pattern better suited to the UI.

Some classes offer an alternative, event-based pattern for asynchronous programming. You start an operation by calling a method whose name typically ends in Async; for example, the WebClient class’s DownloadDataAsync method. Unlike with the APM, you do not pass a delegate to the method. Instead, completion is indicated through an event, such as the DownloadDataCompleted event. Classes that implement this pattern are required to use the SynchronizationContext class (or the related AsyncOperationManager) to ensure that the event is raised in the same context in which the operation was started. So in a user interface, this means that completion events are raised on the UI thread.
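As a rough sketch of the event-based pattern, the following console program downloads a temporary local file through WebClient so that it does not depend on a network connection. The EventBasedDownload class and its Download helper are invented for this illustration. A console application has no UI thread, so the helper blocks on an event until completion; in a real user interface you would simply update the UI from the DownloadDataCompleted handler and never block like this. (Recent versions of .NET mark WebClient as obsolete, so expect a compiler warning there.)

```csharp
using System;
using System.IO;
using System.Net;
using System.Threading;

public static class EventBasedDownload
{
    // Starts the download, then waits for the completion event.
    // Console-only: blocking like this on a UI thread would hang the UI.
    public static byte[] Download(Uri address)
    {
        byte[] data = null;
        Exception error = null;

        using (var done = new ManualResetEventSlim())
        using (var client = new WebClient())
        {
            // Subscribe before starting, so the event cannot be missed.
            client.DownloadDataCompleted += (sender, e) =>
            {
                if (e.Error != null) { error = e.Error; }
                else { data = e.Result; }
                done.Set();
            };

            client.DownloadDataAsync(address); // returns immediately
            done.Wait();
        }

        if (error != null) { throw error; }
        return data;
    }

    public static void Main()
    {
        string path = Path.GetTempFileName();
        File.WriteAllText(path, "hello");
        try
        {
            byte[] bytes = Download(new Uri(path));
            Console.WriteLine("{0} bytes downloaded", bytes.Length);
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```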

This is, in effect, a single-threaded asynchronous model. You have the responsiveness benefits of asynchronous handling of slow operations, with fewer complications than multithreaded code. So in scenarios where this pattern is an option, it’s usually the best choice, as it is far simpler than the alternatives. It’s not always available, because some classes offer only the APM. (And some don’t offer any kind of asynchronous API, in which case you’d need to use one of the other multithreading mechanisms in this chapter to maintain a responsive UI.)

The event-based asynchronous model has two optional features. Some classes offer progress change notification events, such as the WebClient class’s DownloadProgressChanged event. (Such events are also raised on the original thread.) And some support cancellation. For example, WebClient offers a CancelAsync method.
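Both optional features might be used along these lines. This is a sketch under assumptions: the ProgressAndCancel class, its Describe and StartDownload helpers, and the idea of taking the address from the command line are all invented for the example. The one detail worth noting is that a completion handler should check Cancelled and Error before reading any result property, because the result is unavailable in those cases.

```csharp
using System;
using System.ComponentModel;
using System.Net;

public static class ProgressAndCancel
{
    // Classifies a completion event without touching its result.
    public static string Describe(AsyncCompletedEventArgs e)
    {
        if (e.Cancelled) { return "cancelled"; }
        if (e.Error != null) { return "failed: " + e.Error.Message; }
        return "completed";
    }

    public static WebClient StartDownload(Uri address)
    {
        var client = new WebClient();
        client.DownloadProgressChanged += (sender, e) =>
            Console.WriteLine("{0}% ({1} bytes)",
                e.ProgressPercentage, e.BytesReceived);
        client.DownloadDataCompleted += (sender, e) =>
            Console.WriteLine(Describe(e));
        client.DownloadDataAsync(address);
        return client;
    }

    public static void Main(string[] args)
    {
        if (args.Length == 0)
        {
            Console.WriteLine("Pass the address to download as an argument");
            return;
        }

        using (WebClient client = StartDownload(new Uri(args[0])))
        {
            Console.WriteLine("Press a key to cancel, then a key to exit");
            Console.ReadKey();
            client.CancelAsync(); // harmless if the download already finished
            Console.ReadKey();
        }
    }
}
```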

There’s no fundamental need for code to use either the APM or the event-based asynchronous pattern. These are just conventions. You will occasionally come across code that uses its own unusual solution for asynchronous operation. This can happen when the design of the code in question is constrained by external influences. For example, the System.Threading namespace defines an Overlapped class that provides a managed representation of a Win32 asynchronous mechanism. Win32 does not have any direct equivalent to either of the .NET asynchronous patterns, and just tends to use function pointers for callbacks. .NET’s Overlapped class mimics this by accepting a delegate as an argument to a method. Conceptually, this isn’t very different from the APM; it just happens not to conform exactly to the pattern.

The standard asynchronous patterns are useful, but they are somewhat low-level. If you need to coordinate multiple operations, they leave you with a lot of work to do, particularly when it comes to robust error handling or cancellation. The Task Parallel Library provides a more comprehensive scheme for working with multiple concurrent operations.



[45] Asynchronous APIs tend to be used slightly differently in server-side code in web applications. There, they are most useful for when an application needs to communicate with multiple different external services to handle a single request.

[46] This isn’t always supported. For example, if you attempt such an early call on an End method for a networking operation on the UI thread in a Silverlight application, you’ll get an exception.