Some things are intrinsically slow. Reading all of the audio data off a CD, downloading a large file from a server at the end of a low-bandwidth connection on the opposite side of the world, or playing a sound—all of these processes have constraints that mean they’ll take a long time to complete, maybe seconds, minutes, or even hours. How should these sorts of operations look to the programmer?
One simple answer is that they don’t have to look different from faster operations. Our code consists of a sequence of statements—one thing after another—and some statements take longer than others. This has the useful property of being easy to understand. For example, if our code calls the WebClient class’s DownloadString method, our program doesn’t move on to the next step until the download is complete, and so we can know not just what our code does, but also the order in which it does it.
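For instance, a minimal sketch of such a blocking call might look like this (the URL is just an illustrative placeholder):

using System;
using System.Net;

class SynchronousExample
{
    static void Main()
    {
        WebClient client = new WebClient();

        // Execution does not reach the next line until the whole
        // response body has been downloaded.
        string html = client.DownloadString("http://oreilly.com/");

        Console.WriteLine(html.Length);
    }
}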
This style of API is sometimes described as synchronous—the time at which the API returns is determined by the time at which the operation finishes; execution progresses through the code in sync with the work being done. These are also sometimes known as blocking APIs, because they block the calling thread from further progress until work is complete.
Blocking APIs are problematic for user interfaces because the blocked thread can’t do anything else while slow work is in progress. Thread affinity means that code which responds to user input has to run on the UI thread, so if you’re keeping that thread busy, the UI will become unresponsive. Programs that stop responding to user input while they work appear to freeze whenever something takes too long, which makes them frustrating to use. Failing to respond to user input within 100 ms is enough to disrupt the user’s concentration. (And it gets worse if your program’s user interface uses animation—a glitch of just 15 ms is enough to turn a smooth animation into something disappointingly choppy.)
Threads offer one solution to this: if you do all your potentially slow work on threads that aren’t responsible for handling user input, your application can remain responsive. However, this can seem like an overcomplicated solution, because in many cases slow operations don’t actually work synchronously under the covers. Take fundamental operations such as reading and writing data to and from devices such as network cards or disks. The operating system instructs the kernel-mode device drivers that manage disk and network I/O to start doing some work, and it expects each driver to configure the hardware to perform the necessary work and then return control almost immediately. On the inside, Windows is built around the assumption that most slow work proceeds asynchronously: there’s no need for code to progress strictly in sync with the work.
This asynchronous model is not limited to the internals of Windows—there are asynchronous public APIs. These typically return very quickly, long before the work in question is complete, and you then use either a notification mechanism or polling to discover when the work is finished. The exact details vary from one API to another, but these basic principles are universal. Many synchronous APIs really are just some code that starts an asynchronous operation and then makes the thread sleep until the operation completes.
An asynchronous API sounds like a pretty good fit for what we need to build responsive interactive applications.[45] So it seems somewhat ludicrous to create multiple threads in order to use synchronous APIs without losing responsiveness, when those synchronous APIs are just wrappers on top of intrinsically asynchronous underpinnings. Rather than creating new threads, we may as well just use asynchronous APIs directly where they are available, cutting out the middle man.
.NET defines two common patterns for asynchronous operations. There’s a low-level pattern which is powerful and corresponds efficiently to how Windows does things under the covers. And then there’s a slightly higher-level pattern which is less flexible but considerably simpler to use in GUI code.
The Asynchronous Programming Model (APM) is a pattern that many asynchronous APIs in the .NET Framework conform to. It defines common mechanisms for discovering when work is complete, for collecting the results of completed work, and for reporting errors that occurred during the asynchronous operation.
APIs that use the APM offer pairs of methods, starting with Begin and End. For example, the Socket class in the System.Net.Sockets namespace offers numerous instances of this pattern: BeginAccept and EndAccept, BeginSend and EndSend, BeginConnect and EndConnect, and so on.
The exact signature of the Begin method depends on what it does. For example, a socket’s BeginConnect needs the address to which you’d like to connect, whereas BeginReceive needs to know where you’d like to put the data and how much you’re ready to receive. But the APM requires all Begin methods to have the same final two parameters: the method must take an AsyncCallback delegate and an object. And it also requires the method to return an implementation of the IAsyncResult interface. Here’s an example from the Dns class in System.Net:
public static IAsyncResult BeginGetHostEntry(
    string hostNameOrAddress,
    AsyncCallback requestCallback,
    object stateObject)
Callers may pass a null AsyncCallback. But if they pass a non-null reference, the type implementing the APM is required to invoke the callback once the operation is complete. The AsyncCallback delegate signature requires the callback method to accept an IAsyncResult argument—the APM implementation will pass in the same IAsyncResult to this completion callback as it returns from the Begin method. This object represents an asynchronous operation in progress—many classes can have multiple operations in progress simultaneously, and the IAsyncResult distinguishes between them.
Example 16-16 shows one way to use this pattern. It calls the asynchronous BeginGetHostEntry method provided by the Dns class. This looks up the IP address for a computer, so it takes a string—the name of the computer to find. And then it takes the two standard final APM arguments—a delegate and an object. We can pass anything we like as the object—the function we call doesn’t actually use it; it just hands it back to us later. We could pass null because our example doesn’t need the argument, but we’re passing a number just to demonstrate where it comes out. The reason the APM offers this argument is that if you have multiple asynchronous operations in progress at once, you have a convenient way to associate information with each operation. (This mattered much more in older versions of C#, which didn’t offer anonymous methods or lambdas—back then this argument was the easiest way to pass data into the callback.)
Example 16-16. Using the Asynchronous Programming Model
using System;
using System.Net;

class Program
{
    static void Main(string[] args)
    {
        Dns.BeginGetHostEntry("oreilly.com", OnGetHostEntryComplete, 42);

        Console.ReadKey();
    }

    static void OnGetHostEntryComplete(IAsyncResult iar)
    {
        IPHostEntry result = Dns.EndGetHostEntry(iar);
        Console.WriteLine(result.AddressList[0]);
        Console.WriteLine(iar.AsyncState);
    }
}
The Main method waits until a key is pressed—much like with work items in the thread pool, having active asynchronous requests will not keep the process alive, so without that ReadKey the program would exit before finishing its work. (A more robust approach for a real program that needed to wait for work to complete would be to use the CountdownEvent described earlier.)
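As a hedged sketch, assuming just this one outstanding operation, Example 16-16 could be adapted to wait on a CountdownEvent instead of a key press:

using System;
using System.Net;
using System.Threading;

class Program
{
    static readonly CountdownEvent _done = new CountdownEvent(1);

    static void Main(string[] args)
    {
        Dns.BeginGetHostEntry("oreilly.com", OnGetHostEntryComplete, 42);

        // Blocks until the callback signals that the lookup has finished.
        _done.Wait();
    }

    static void OnGetHostEntryComplete(IAsyncResult iar)
    {
        IPHostEntry result = Dns.EndGetHostEntry(iar);
        Console.WriteLine(result.AddressList[0]);
        Console.WriteLine(iar.AsyncState);

        // Let Main know the work is done.
        _done.Signal();
    }
}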
The Dns class will call the OnGetHostEntryComplete method once it has finished its lookup. Notice that the first thing we do is call the EndGetHostEntry method—the other half of the APM. The End method always takes the IAsyncResult object corresponding to the call—recall that this identifies the call in progress, so this is how EndGetHostEntry knows which particular lookup operation we want to get the results for.
The APM says nothing about which thread your callback will be called on. In practice, it’s often a thread pool thread, but not always. Some individual implementations might make guarantees about what sort of thread you’ll be called on, but most don’t. And since you don’t usually know what thread the callback occurred on, you will need to take the same precautions you would when writing multithreaded code where you explicitly create new threads. For example, in a WPF or Windows Forms application, you’d need to use the SynchronizationContext class or an equivalent mechanism to get back to a UI thread if you wanted to make updates to the UI when an asynchronous operation completes.
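Here’s a hedged sketch of marshaling an update back to the UI thread; the fragment is assumed to live inside a window or form class, the _uiContext field, StartLookup method, and myLabel control are hypothetical names, and _uiContext is assumed to have been captured on the UI thread:

// Captured on the UI thread, e.g., in the window's constructor.
SynchronizationContext _uiContext = SynchronizationContext.Current;

void StartLookup()
{
    Dns.BeginGetHostEntry("oreilly.com", OnGetHostEntryComplete, null);
}

void OnGetHostEntryComplete(IAsyncResult iar)
{
    IPHostEntry result = Dns.EndGetHostEntry(iar);

    // We're most likely on a thread pool thread here, so marshal the
    // UI update back to the thread that captured _uiContext.
    _uiContext.Post(_ => myLabel.Text = result.AddressList[0].ToString(), null);
}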
The End method in the APM returns any data that comes out of the operation. In this case, there’s a single return value of IPHostEntry, but some implementations may return more by having out or ref arguments. Example 16-16 then prints the results, and finally prints the AsyncState property of the IAsyncResult, which will be 42—this is where the value we passed as the final argument to BeginGetHostEntry pops out.
This is not the only way to use the Asynchronous Programming Model—you are allowed to pass null as the delegate argument. You have three other options, all revolving around the IAsyncResult object returned by the Begin call. You can poll the IsCompleted property to test for completion. You can call the End method at any time—if the work is not finished, this will block until it completes.[46] Or you can use the AsyncWaitHandle property—this returns an object that is a wrapper around a Win32 synchronization handle that will become signaled when the work is complete. (That last one is rarely used, and has some complications regarding ownership and lifetime of the handle, which are described in the MSDN documentation. We mention this technique only out of a pedantic sense of duty to completeness.)
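For example, here’s a minimal sketch of using the IAsyncResult directly, without a callback (the host name and the sleep interval are just illustrative):

using System;
using System.Net;
using System.Threading;

class PollingExample
{
    static void Main()
    {
        // Start the lookup with no callback; hold on to the IAsyncResult.
        IAsyncResult iar = Dns.BeginGetHostEntry("oreilly.com", null, null);

        // Polling: check IsCompleted periodically while doing other work.
        while (!iar.IsCompleted)
        {
            Thread.Sleep(100); // stand-in for useful work
        }

        // Alternatively, we could have called End at any point; if the
        // lookup hadn't finished yet, the call would simply have blocked.
        IPHostEntry result = Dns.EndGetHostEntry(iar);
        Console.WriteLine(result.AddressList[0]);
    }
}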
You are required to call the End method at some point, no matter how you choose to wait for completion. Even if you don’t care about the outcome of the operation, you must still call the End method. If you don’t, the operation might leak resources.
Asynchronous operations can throw exceptions. If the exception is the result of bad input, such as a null reference where an object is required, the Begin method will throw an exception. But it’s possible that something failed while the operation was in progress—perhaps we lost network connectivity partway through some work. In this case, the End method will throw an exception.
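So a robust version of the callback in Example 16-16 might wrap the End call in a try block; this sketch assumes the DNS failure surfaces as a SocketException and needs a using directive for System.Net.Sockets:

static void OnGetHostEntryComplete(IAsyncResult iar)
{
    try
    {
        // Failures that occurred while the lookup was in progress
        // emerge here, from the End method.
        IPHostEntry result = Dns.EndGetHostEntry(iar);
        Console.WriteLine(result.AddressList[0]);
    }
    catch (SocketException x)
    {
        Console.WriteLine("Lookup failed: " + x.Message);
    }
}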
The Asynchronous Programming Model is widely used in the .NET Framework class library, and while it is an efficient and flexible way to support asynchronous operations, it’s slightly awkward to use in user interfaces. The completion callback typically happens on some random thread, so you can’t update the UI in that callback. And the support for multiple simultaneous operations, possible because each operation is represented by a distinct IAsyncResult object, may be useful in server environments, but it’s often just an unnecessary complication for client-side code. So there’s an alternative pattern better suited to the UI.
Some classes offer an alternative pattern for asynchronous programming. You start an operation by calling a method whose name typically ends in Async; for example, the WebClient class’s DownloadDataAsync method. And unlike the APM, you do not pass a delegate to the method. Completion is indicated through an event, such as the DownloadDataCompleted event. Classes that implement this pattern are required to use the SynchronizationContext class (or the related AsyncOperationManager) to ensure that the event is raised in the same context in which the operation was started. So in a user interface, this means that completion events are raised on the UI thread.
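As a hedged sketch, assuming this method lives in a Windows Forms form (resultLabel is a hypothetical control, and the URL is a placeholder), the event-based pattern looks like this:

void StartDownload()
{
    WebClient client = new WebClient();

    client.DownloadDataCompleted += (s, e) =>
    {
        // DownloadDataAsync was called from the UI thread, so this event
        // is raised on the UI thread, and touching controls is safe.
        if (e.Error != null)
        {
            resultLabel.Text = "Download failed: " + e.Error.Message;
        }
        else
        {
            resultLabel.Text = e.Result.Length + " bytes downloaded";
        }
    };

    // Returns almost immediately; the UI stays responsive.
    client.DownloadDataAsync(new Uri("http://oreilly.com/"));
}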
This is, in effect, a single-threaded asynchronous model. You have the responsiveness benefits of asynchronous handling of slow operations, with fewer complications than multithreaded code. So in scenarios where this pattern is an option, it’s usually the best choice, as it is far simpler than the alternatives. It’s not always available, because some classes offer only the APM. (And some don’t offer any kind of asynchronous API, in which case you’d need to use one of the other multithreading mechanisms in this chapter to maintain a responsive UI.)
Single-threaded asynchronous code is more complex than sequential code, of course, so there’s still scope for trouble. For example, you need to be careful not to start multiple asynchronous operations that might conflict with one another. Also, components that implement this pattern call you back on the right thread only if you use them from the right thread in the first place—if you use a mixture of this pattern and other multithreading mechanisms, be aware that operations you kick off from worker threads will not complete on the UI thread.
There are two optional features of the event-based asynchronous model. Some classes also offer progress change notification events, such as the WebClient class’s DownloadProgressChanged event. (Such events are also raised on the original thread.) And there may be cancellation support. For example, WebClient offers a CancelAsync method.
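Here’s a hedged sketch of both optional features, again assuming hypothetical controls (progressBar, statusLabel) and a WebClient held in a field so that a Cancel button’s handler can reach it:

WebClient _client;

void StartDownloadWithProgress()
{
    _client = new WebClient();

    _client.DownloadProgressChanged += (s, e) =>
    {
        // Raised on the original (UI) thread.
        progressBar.Value = e.ProgressPercentage;
    };

    _client.DownloadDataCompleted += (s, e) =>
    {
        statusLabel.Text = e.Cancelled ? "Cancelled" : "Done";
    };

    _client.DownloadDataAsync(new Uri("http://oreilly.com/"));
}

void CancelButton_Click(object sender, EventArgs e)
{
    // Asks WebClient to abandon the outstanding operation; the Completed
    // event still fires, with its Cancelled property set to true.
    _client.CancelAsync();
}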
There’s no fundamental need for code to use either the APM or the event-based asynchronous pattern. These are just conventions. You will occasionally come across code that uses its own unusual solution for asynchronous operation. This can happen when the design of the code in question is constrained by external influences—for example, the System.Threading namespace defines an Overlapped class that provides a managed representation of a Win32 asynchronous mechanism. Win32 does not have any direct equivalent to either of the .NET asynchronous patterns, and just tends to use function pointers for callbacks. .NET’s Overlapped class mimics this by accepting a delegate as an argument to a method. Conceptually, this isn’t very different from the APM; it just happens not to conform exactly to the pattern.
The standard asynchronous patterns are useful, but they are somewhat low-level. If you need to coordinate multiple operations, they leave you with a lot of work to do, particularly when it comes to robust error handling or cancellation. The Task Parallel Library provides a more comprehensive scheme for working with multiple concurrent operations.
[45] Asynchronous APIs tend to be used slightly differently in server-side code in web applications. There, they are most useful when an application needs to communicate with multiple external services to handle a single request.
[46] This isn’t always supported. For example, if you attempt such an early call on an End method for a networking operation on the UI thread in a Silverlight application, you’ll get an exception.