A primary reason for Trey Research migrating and reconfiguring the Orders application to run on the Windows Azure™ technology platform was to take advantage of the improved scalability and availability that this environment provides. As Trey Research expand their operations, they expect to attract an ever-increasing number of customers, so they designed their solution to be able to handle a large volume of orders from clients located anywhere in the world. This was a challenging proposition as the Orders application had to remain responsive but cost-effective, regardless of the number of customers requesting service at any given point in time. Trey Research analyzed the operations of their system, and identified three principal requirements. The solution should:
- Automatically scale up as demand increases (to process orders in a timely manner), but then scale back when demand drops to keep running costs down.
- Reduce the network latency that customers experience when accessing the Orders web application and its resources, which are hosted remotely and reached across a public network, such as the Internet, that Trey Research cannot control.
- Optimize the response time and throughput of the web application to maintain a satisfactory user experience.
This chapter describes how Trey Research met these requirements in their solution.
Scenario and Context
Trey Research implemented their solution as a web application primarily targeting customers located in the United States, so they initially deployed it to two Windows Azure datacenters; US North and US South. However, Trey Research plans to expand their operations and can foresee a time when the application will have to be deployed to other datacenters worldwide to satisfy the demands of overseas customers. For this reason, when Trey Research constructed the Orders application, they designed it to allow customers to connect to any instance, and to provide functionality that was consistent across all instances. In this way, Trey Research can start and stop instances of the Orders application and customers will always be able to view products, place orders, and query the status of their orders regardless of which instance they are connected to at any point in time. This approach also enables Trey Research to balance the load evenly across the available instances of the Orders application and maintain throughput. Additionally, if an instance becomes unavailable or a connection to an instance fails, customers can be directed to other working instances of the application.
In practice, this solution depends on a variety of components to provide the necessary infrastructure to determine the optimal instance of the Orders application to which a customer should connect, transparently route customer requests (and reroute them if the connection to an instance fails), and maintain consistent data across all datacenters. Additionally, accessing the resources that the Orders application uses involves making further requests across the network; for example, to retrieve the Products catalog, Customer details, or Order information. Access to these resources must be accomplished in a scalable and timely manner to provide customers with a responsive user experience.
Controlling Elasticity in the Orders Application
Trey Research noticed that the Orders application experienced peaks and troughs in demand throughout the working day. Some of these patterns were predictable, such as low usage during the early hours of the morning and high usage during the latter part of the working day, while others were more unexpected; sometimes demand increased due to a specific product being reviewed in a technical journal, but occasionally the volume of use changed for no foreseeable reason.
Trey Research needed a solution that would allow them to scale the Orders application so that a highly variable number of customers could log in, browse products, and place orders without experiencing extended response times, while at the same time remaining cost-efficient.
Bharath says: | |
---|---|
Windows Azure provides elasticity and scale by allowing you to start and stop instances of roles on demand. However, unless you actually do manage your role instance count proactively, you are missing out on some of the major benefits that cloud computing offers. |
Choosing How to Manage Elasticity in the Orders Application
Trey Research considered a number of options for determining how best to scale the Orders application. These options, together with their advantages and limitations, are described in the following sections.
Do Not Scale the Application
This is the simplest option. The Orders application has been designed and implemented to take advantage of concurrent web and worker role instances, and utilizes asynchronous messaging to send and receive messages while minimizing the response time to users. In this case, why not simply deploy the application to a number of web and worker role instances in each possible datacenter, and allocate each role the largest possible virtual machine size, with the maximum number of CPU cores and the largest available volume of memory?
This approach is attractive because it involves the least amount of maintenance on the part of the operations staff at Trey Research. It is also very straightforward to implement. However, it could be very expensive; hosting a web or worker role using the "Extra Large" virtual machine size (as defined by the Windows Azure pricing model) is currently 24 times more expensive on an hourly rate than hosting the same role in an "Extra Small" virtual machine. If the volume of customers for much of the time does not require the processing or memory capabilities of an extra-large virtual machine, then Trey Research would be paying to host a largely idle virtual machine. If you multiply the charges by the number of instances being hosted across all datacenters, the final sum can be a significant amount of money.
There is one other question that this approach poses; how many web and worker role instances should Trey Research create? If the number selected is too large, then the issues of cost described in the previous paragraph become paramount. However, if Trey Research create too few instances, then although the company is not necessarily paying for wasted resources, customers are likely to be unhappy due to extended response times and slow service, possibly resulting in lost business.
For these reasons this approach is probably not going to be cost effective or desirable.
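To make the cost argument concrete, the following sketch compares relative hosting costs measured in units of one Extra Small instance-hour. Only the 24:1 price ratio comes from the text above; the instance counts, the 30-day month, and the names `HostingCostSketch` and `RelativeCost` are illustrative assumptions rather than Trey Research figures.

```csharp
using System;

// Relative hosting cost, in units of one Extra Small instance-hour.
// Only the 24:1 ratio is taken from the text; all other values are assumed.
public static class HostingCostSketch
{
    public const int ExtraSmallRate = 1;   // baseline unit
    public const int ExtraLargeRate = 24;  // 24 times the Extra Small hourly rate

    // Cost of running `instances` role instances in each of `datacenters`
    // datacenters for `hours` hours at the given relative hourly rate.
    public static long RelativeCost(
        int ratePerHour, int instances, int datacenters, int hours)
    {
        return (long)ratePerHour * instances * datacenters * hours;
    }

    public static void Main()
    {
        const int hoursPerMonth = 24 * 30;  // assumed 30-day month

        // Assumed deployment: 4 role instances in each of 2 datacenters.
        long large = RelativeCost(ExtraLargeRate, 4, 2, hoursPerMonth);
        long small = RelativeCost(ExtraSmallRate, 4, 2, hoursPerMonth);

        Console.WriteLine("Extra Large: {0} units", large);  // 138240
        Console.WriteLine("Extra Small: {0} units", small);  // 5760
    }
}
```

Under these assumptions the permanently over-provisioned deployment costs 24 times as much every month, whether or not customers are using the capacity.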
Implement Manual Scaling
Clearly, some kind of scale-up and scale-down solution is required. By using the Windows Azure Management Portal it is possible to start and stop instances of web and worker roles manually; or even deploy new instances of the Orders application to datacenters around the world. Decisions about when to start, stop, or deploy new instances could be made based on usage information gathered by monitoring the application; Chapter 7, "Monitoring and Managing the Orders Application" contains more information on how to perform tasks such as these. However, this is potentially a very labor intensive approach, and may require an operator to determine when best to perform these tasks.
Some of these operations can be scripted using the Windows Azure PowerShell Cmdlets, but there is always the possibility that, having started up a number of expensive instances to handle a current peak in demand, the operator may forget to shut them down again at the appropriate time, leaving Trey Research liable for additional costs.
Implement Automatic Scaling using a Custom Service
Starting and stopping role instances manually was considered to be too inefficient and error prone, so Trey Research turned their attention to crafting an automated solution. Theoretically, it should be able to follow the same pattern and implement the same practices as the manual approach, except in a more reliable and less labor intensive manner. To this end, Trey Research considered configuring the web and worker roles to gather key statistical information, such as the number of concurrent requests, the average response time, the activity of the CPU and disks, and the memory utilization. This information could be obtained by using Windows Azure Diagnostics and other trace event sources available to code running in the cloud, and then periodically downloaded and analyzed by a custom application running on-premises within the Trey Research Head Office. This custom application could then determine whether to start or stop role instances, and in which datacenters.
The downside of this approach is the possible complexity of the on-premises application; it could take significant effort to design, build, and test such an application thoroughly (especially the logic that determines whether to start and stop role instances). Additionally, gathering the diagnostic information and downloading it from each datacenter could impose a noticeable overhead on each role, impacting the performance.
Poe says: | |
---|---|
External services that can manage autoscaling are also available. These services remove the overhead of developing your own custom solution but you must provide these services with your Windows Azure management certificate so that they can access the role instances, which may not be acceptable to your organization. |
Implement Automatic Scaling using the Enterprise Library Autoscaling Application Block
The Microsoft Enterprise Library Autoscaling Application Block provides a facility that you can integrate directly into your web and worker roles running in the cloud, and also into on-premises applications. It is part of the Microsoft Enterprise Library 5.0 Integration Pack for Windows Azure, and can automatically scale your Windows Azure application or service based on rules that you define specifically for that application or service. You can use these rules to help your application or service maintain its throughput in response to changes in its workload, while at the same time minimizing and controlling hosting costs. The application block enables a cloud application to start and stop role instances, change configuration settings to allow the application to throttle its functionality and reduce its resource usage, and send notifications according to a defined schedule.
The key advantages of this approach include the low implementation costs and ease of use; all you need to do is to provide the configuration information that specifies the circumstances (in terms of a schedule and performance measures) under which the block will apply instance scaling or application throttling actions.
How Trey Research Controls Elasticity in the Orders Application
To simplify installation and setup, and reduce the prerequisites and the requirements for users to establish extensive Windows Azure accounts, the Trey Research example solution provided with this guide is designed to be deployed to a single datacenter, and is not configured to support autoscaling or request rerouting. Consequently, the sections in this chapter describing how Trey Research implemented the Enterprise Library Autoscaling Application Block and Windows Azure Traffic Manager are provided for information only.
Trey Research decided to use the Enterprise Library Autoscaling Application Block to start and stop instances of web and worker roles as the load from users changes. Initially, Trey Research deployed the Orders application and made it available to a small but statistically significant number of users, evenly spread across an area representing the geographical location of their expected market. By gathering statistical usage information based on this pilot, Trey Research identified how the system functions at specific times in the working day, and identified periods during which performance suffered due to an increased load from customers. Specifically, Trey Research noticed that:
- Many customers tended to place their orders towards the end of the working day (between 15:30 and 20:30 Central time, allowing for the spread of users across the United States), with an especially large number placed between 17:30 and 18:30.
- On Fridays, the peak loading tended to start and finish two hours earlier (13:30 to 18:30 Central time).
- On the weekend, very few customers placed orders.
To cater for these peaks and troughs in demand, Trey Research decided to implement the Enterprise Library Autoscaling Application Block as follows:
- The developers configured constraint rules for the application to start an additional three instances of the web and worker roles at each datacenter at 15:15 Central time (it can take 10 or 15 minutes for the new instances to become available), and to shut these instances down at 20:30 between Monday and Thursday.
- At 17:15 the application was configured to start a further two instances of each role, which are shut down at 18:30.
- On Fridays, the times at which the extra instances start and stop are two hours earlier.
- To handle unexpected demand, Trey Research also configured reactive rules to monitor the load on the web role, and start an additional instance whenever the average CPU usage for the web role, measured over a 10 minute period, exceeds 85%, up to a maximum of 12 instances per datacenter. When the average CPU usage drops below 50%, instances are shut down, subject to a minimum of two instances per datacenter.
- On weekends, the system is constrained to allow a maximum of four instances of each role at each datacenter, and any additional instances above this number are shut down to reduce running costs.
- When the system is inactive or lightly loaded, the system returns to its baseline configuration comprising two instances of each role per datacenter.
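The schedule in the list above can be sketched as a function that returns the permitted instance range for one role in one datacenter at a given Central time. This is only an illustration of the intended behavior, not code from the Trey Research solution: the type and method names are hypothetical, and the minimums assume a baseline of two instances plus the "additional three" and "further two" instances described above.

```csharp
using System;

public struct InstanceRange
{
    public readonly int Min;
    public readonly int Max;
    public InstanceRange(int min, int max) { Min = min; Max = max; }
}

public static class OrdersScheduleSketch
{
    // Permitted instance range for one role in one datacenter.
    // `centralTime` is assumed to be expressed in US Central time already.
    public static InstanceRange RangeAt(DateTime centralTime)
    {
        DayOfWeek day = centralTime.DayOfWeek;
        if (day == DayOfWeek.Saturday || day == DayOfWeek.Sunday)
        {
            return new InstanceRange(2, 4);  // weekend cap of four instances
        }

        // On Fridays the whole pattern runs two hours earlier, so shift the
        // clock forward by two hours and reuse the Monday-Thursday windows.
        TimeSpan shift = day == DayOfWeek.Friday
            ? TimeSpan.FromHours(2) : TimeSpan.Zero;
        TimeSpan t = centralTime.TimeOfDay + shift;

        int min = 2;  // baseline configuration
        if (t >= new TimeSpan(15, 15, 0) && t < new TimeSpan(20, 30, 0))
        {
            min = 5;  // baseline plus three additional instances
        }
        if (t >= new TimeSpan(17, 15, 0) && t < new TimeSpan(18, 30, 0))
        {
            min = 7;  // a further two instances during the peak hour
        }
        return new InstanceRange(min, 12);
    }

    public static void Main()
    {
        InstanceRange monday = RangeAt(new DateTime(2012, 3, 5, 17, 30, 0));
        InstanceRange friday = RangeAt(new DateTime(2012, 3, 9, 15, 30, 0));
        InstanceRange saturday = RangeAt(new DateTime(2012, 3, 10, 17, 30, 0));
        Console.WriteLine("{0}-{1}", monday.Min, monday.Max);      // 7-12
        Console.WriteLine("{0}-{1}", friday.Min, friday.Max);      // 7-12
        Console.WriteLine("{0}-{1}", saturday.Min, saturday.Max);  // 2-4
    }
}
```

The reactive rules then move the actual instance count up or down inside whichever range applies at that moment.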
Hosting the Autoscaling Application Block
The Autoscaling Application Block monitors the performance of one or more roles, starting and stopping roles, applying throttling changes to configuration, or sending notifications as specified by the various constraint rules and reactive rules. The Autoscaling Application Block also generates diagnostic information and captures data points indicating the work that it has performed. For more information about the information collected, see "Autoscaling Application Block Logging" on MSDN.
To perform this work, the Autoscaling Application Block uses an Autoscaler object (defined in the Microsoft.Practices.EnterpriseLibrary.WindowsAzure.Autoscaling namespace), and you must arrange for this object to start running when your application executes. The Trey Research solution performs this task in the Run method in the WorkerRole class (in the Orders.Workers project), and stops the Autoscaler in the OnStop method:
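using Microsoft.Practices.EnterpriseLibrary.Common.Configuration;
using Microsoft.Practices.EnterpriseLibrary.WindowsAzure.Autoscaling;
using Microsoft.WindowsAzure.ServiceRuntime;

public class WorkerRole : RoleEntryPoint
{
  private Autoscaler autoscaler;
  ...
  public override void Run()
  {
    // Resolve the Autoscaler from the Enterprise Library container
    // and start evaluating the autoscaling rules.
    this.autoscaler = EnterpriseLibraryContainer.Current.
      GetInstance<Autoscaler>();
    this.autoscaler.Start();
    ...
  }
  ...
  public override void OnStop()
  {
    // Stop evaluating rules before the role instance shuts down.
    this.autoscaler.Stop();
    ...
  }
  ...
}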
The information about which roles to monitor, the storage account to use for storing diagnostic data, and the location of the rules defining the behavior of the Autoscaler object are specified in the <serviceModel> section of the service information store file. This file was uploaded to blob storage and stored in the blob specified by the <serviceInformationStores> section of the app.config file for the worker role. For more information and an example of defining the service model for the Autoscaling Application Block, see Chapter 5, "Making Tailspin Surveys More Elastic," of the Developer's Guide to the Enterprise Library 5.0 Integration Pack for Windows Azure on MSDN.
Defining the Autoscaling Rules
The Trey Research solution implements a combination of constraint rules and reactive rules. The constraint rules specify the schedule the Autoscaler object should use, in addition to the maximum and minimum number of instances of roles during each scheduled period. The Autoscaler initiates creation of instances of the web and worker roles, or stops existing instances, when the schedule changes the boundaries and when the instance count is outside that new boundary. The reactive rules start further instances of the web role or stop them, according to the CPU loading of the web role. As an initial starting point, Trey Research defined the following set of rules:
<?xml version="1.0" encoding="utf-8" ?>
<rules xmlns="http://schemas.microsoft.com/practices/2011/entlib/autoscaling/rules">
  <constraintRules>
    <!-- Baseline ranges; the lowest rank lets the scheduled rules override them. -->
    <rule name="Weekday" enabled="true" rank="1">
      <timetable startTime="00:00:00" duration="23:59:59"
                 utcOffset="-06:00">
        <weekly days="Monday Tuesday Wednesday Thursday Friday"/>
      </timetable>
      <actions>
        <range target="Orders.Workers" min="2" max="12"/>
        <range target="Orders.Website" min="2" max="12"/>
      </actions>
    </rule>
    <rule name="Weekend" enabled="true" rank="1">
      <timetable startTime="00:00:00" duration="23:59:59"
                 utcOffset="-06:00">
        <weekly days="Sunday Saturday"/>
      </timetable>
      <actions>
        <range target="Orders.Workers" min="2" max="4"/>
        <range target="Orders.Website" min="2" max="4"/>
      </actions>
    </rule>
    <!-- 15:15 to 20:30: baseline plus three instances. -->
    <rule name="MondayToThursday" enabled="true" rank="2">
      <timetable startTime="15:15:00" duration="05:15:00"
                 utcOffset="-06:00">
        <weekly days="Monday Tuesday Wednesday Thursday"/>
      </timetable>
      <actions>
        <range target="Orders.Workers" min="5" max="12"/>
        <range target="Orders.Website" min="5" max="12"/>
      </actions>
    </rule>
    <!-- 17:15 to 18:30: a further two instances. -->
    <rule name="MondayToThursdayPeak" enabled="true" rank="3">
      <timetable startTime="17:15:00" duration="01:15:00"
                 utcOffset="-06:00">
        <weekly days="Monday Tuesday Wednesday Thursday"/>
      </timetable>
      <actions>
        <range target="Orders.Workers" min="7" max="12"/>
        <range target="Orders.Website" min="7" max="12"/>
      </actions>
    </rule>
    <!-- On Fridays the same pattern runs two hours earlier. -->
    <rule name="Friday" enabled="true" rank="2">
      <timetable startTime="13:15:00" duration="05:15:00"
                 utcOffset="-06:00">
        <weekly days="Friday"/>
      </timetable>
      <actions>
        <range target="Orders.Workers" min="5" max="12"/>
        <range target="Orders.Website" min="5" max="12"/>
      </actions>
    </rule>
    <rule name="FridayPeak" enabled="true" rank="3">
      <timetable startTime="15:15:00" duration="01:15:00"
                 utcOffset="-06:00">
        <weekly days="Friday"/>
      </timetable>
      <actions>
        <range target="Orders.Workers" min="7" max="12"/>
        <range target="Orders.Website" min="7" max="12"/>
      </actions>
    </rule>
  </constraintRules>
  <reactiveRules>
    <rule name="HotCPU" enabled="true" rank="4">
      <when>
        <greater operand="CPU" than="85" />
      </when>
      <actions>
        <scale target="Orders.Website" by="1"/>
      </actions>
    </rule>
    <rule name="CoolCPU" enabled="true" rank="4">
      <when>
        <less operand="CPU" than="50" />
      </when>
      <actions>
        <scale target="Orders.Website" by="-1"/>
      </actions>
    </rule>
  </reactiveRules>
  <operands>
    <!-- Average total CPU of the web role over the last 10 minutes. -->
    <performanceCounter alias="CPU"
        source="Orders.Website"
        performanceCounterName="\Processor(_Total)\% Processor Time"
        timespan="00:10:00"
        aggregate="Average"/>
  </operands>
</rules>
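The way these two kinds of rule interact can be illustrated with a small sketch. It assumes, in line with our reading of the Autoscaling Application Block documentation, that when constraint rules overlap the rule with the highest rank supplies the permitted range, and that a reactive scaling action is then limited to that range. The classes, the rule names, and the sample values below are hypothetical and are not part of the block's API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a constraint rule that is currently active.
public class ConstraintRule
{
    public string Name;
    public int Rank;
    public int Min;
    public int Max;
}

public static class RuleEvaluationSketch
{
    // Among overlapping constraint rules, the highest rank wins.
    // At least one rule must be active.
    public static ConstraintRule Effective(IEnumerable<ConstraintRule> active)
    {
        return active.OrderByDescending(r => r.Rank).First();
    }

    // Applies the HotCPU (+1 above 85%) and CoolCPU (-1 below 50%) logic
    // to the average of the CPU samples, then clamps to the range.
    public static int NextInstanceCount(
        int current, IEnumerable<double> cpuSamples, ConstraintRule range)
    {
        List<double> samples = cpuSamples.ToList();
        int desired = current;
        if (samples.Count > 0)
        {
            double average = samples.Average();
            if (average > 85) desired = current + 1;
            else if (average < 50) desired = current - 1;
        }
        return Math.Max(range.Min, Math.Min(range.Max, desired));
    }

    public static void Main()
    {
        var active = new List<ConstraintRule>
        {
            new ConstraintRule { Name = "Baseline", Rank = 1, Min = 2, Max = 12 },
            new ConstraintRule { Name = "Afternoon", Rank = 2, Min = 5, Max = 12 },
            new ConstraintRule { Name = "Peak", Rank = 3, Min = 7, Max = 12 }
        };
        ConstraintRule range = Effective(active);
        Console.WriteLine(range.Name);  // Peak

        double[] hot = { 90, 88, 92 };
        double[] cool = { 30, 40 };
        Console.WriteLine(NextInstanceCount(7, hot, range));   // 8
        Console.WriteLine(NextInstanceCount(12, hot, range));  // 12 (capped at max)
        Console.WriteLine(NextInstanceCount(7, cool, range));  // 7 (held at min)
    }
}
```

This also shows why the baseline rules need a lower rank than the scheduled rules: a baseline rule that covers the whole day would otherwise always take precedence and the scheduled minimums would never apply.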
Note: |
---|
Hosting costs for Windows Azure services are calculated on an hourly basis, with each part hour charged as a complete hour. Charges are based on clock hours, so if, for example, you start a new service instance at 14:50 and shut it down at 16:10, you will be charged for 3 hours (the hours beginning at 14:00, 15:00, and 16:00) even though the instance ran for only 80 minutes. You should keep this in mind when configuring the schedule for the Autoscaler. For more information, see the "Pricing Overview" page for Windows Azure. |
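The arithmetic in the note can be checked with a short sketch. It assumes that an instance is charged for every clock hour in which it runs for any length of time, which is the interpretation that yields the 3 hours quoted in the note; the `BillingSketch` class and its method are hypothetical names used only for illustration.

```csharp
using System;

public static class BillingSketch
{
    // Truncates a point in time to the start of its clock hour.
    private static DateTime HourOf(DateTime t)
    {
        return new DateTime(t.Year, t.Month, t.Day, t.Hour, 0, 0);
    }

    // Counts the clock hours touched by the interval [start, stop).
    public static int BilledClockHours(DateTime start, DateTime stop)
    {
        if (stop <= start)
        {
            return 0;
        }
        // The last billed hour is the one containing the final instant
        // before `stop`, so stopping exactly on the hour adds no charge.
        DateTime firstHour = HourOf(start);
        DateTime lastHour = HourOf(stop.AddTicks(-1));
        return (int)(lastHour - firstHour).TotalHours + 1;
    }

    public static void Main()
    {
        DateTime start = new DateTime(2012, 3, 5, 14, 50, 0);
        DateTime stop = new DateTime(2012, 3, 5, 16, 10, 0);
        Console.WriteLine(BilledClockHours(start, stop));  // 3
    }
}
```

Under the same assumption, starting the instance at 15:00 and stopping it at 16:00 would be charged as a single hour, which is why aligning scheduled rules with hour boundaries can reduce costs.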
The rules were uploaded to blob storage, to the blob specified by the <rulesStores> section of the app.config file for the worker role.
The CPU operand referenced by the reactive rules calculates the average processor utilization over a 10 minute period, using the \Processor(_Total)\% Processor Time performance counter. The Orders.Website web role was modified to collect this information by adding the performance counter configuration shown in the StartDiagnostics method (called from the OnStart method) in the file WebRole.cs:
using System;
using Microsoft.WindowsAzure.Diagnostics;
using Microsoft.WindowsAzure.ServiceRuntime;

public class WebRole : RoleEntryPoint
{
  ...
  private static void StartDiagnostics()
  {
    var config =
      DiagnosticMonitor.GetDefaultInitialConfiguration();
    ...
    // Transfer the collected counter data to storage every 10 minutes.
    config.PerformanceCounters.ScheduledTransferPeriod =
      TimeSpan.FromMinutes(10);
    // Sample the total processor utilization every 30 seconds.
    config.PerformanceCounters.DataSources.Add(
      new PerformanceCounterConfiguration
      {
        CounterSpecifier =
          @"\Processor(_Total)\% Processor Time",
        SampleRate = TimeSpan.FromSeconds(30)
      });
    ...
    DiagnosticMonitor.Start(
      "DiagnosticsConnectionString", config);
  }
  ...
}
The performance counter data is written to the WADPerformanceCountersTable table in Windows Azure Table storage. The web role must be running in full trust mode to successfully write to this table.
Poe says: | |
---|---|
Using automatic scaling is an iterative process. The configuration defined by the team at Trey Research is kept constantly under review, and the performance of the solution is continuously monitored as the pattern of use by customers evolves. The operators may modify the configuration and autoscaling rules in the future to change the times and circumstances under which additional instances are created and destroyed. |
Managing Network Latency and Maximizing Connectivity to the Orders Application
Trey Research initially deployed the Orders application to two datacenters, US North and US South, both located in the United States. The rationale behind this decision was that the US North datacenter is located just a few miles from Trey Research, affording reasonable network response times for the compliance application hosted in this datacenter (as described in Chapter 4, "Implementing Reliable Messaging and Communications with the Cloud"), while the majority of Trey Research's customers are expected to be located in the continental United States.
However, as Trey Research expands its customer base, it expects users connecting to the Orders application to be situated farther afield, perhaps on a different continent. The distance between customers and the physical location in which the Orders application is deployed can have a significant bearing on the response time of the system. Therefore Trey Research felt it necessary to adopt a strategy that minimizes this distance and reduces the associated network latency for users accessing the Orders application.
As its customers became distributed around the world, Trey Research considered hosting additional instances of the Orders application in datacenters that are similarly distributed. Customers could then connect to the closest available instance of the application. The question that Trey Research needed to address in this scenario was how to route each customer to the closest instance of the Orders application.
Choosing How to Manage Network Latency and Maximize Connectivity to the Orders Application
Trey Research investigated a number of solutions for directing customers to the closest instance of the Orders application, including deploying and configuring a number of DNS servers around the world (in conjunction with a number of network partners) that would direct each request based on the address of the machine from which it originated. However, many of these solutions proved impractical or expensive, leaving Trey Research to consider the two options described in the following sections.
Build a Custom Service to Redirect Traffic
Trey Research examined the possibility of building a custom service through which all customers would connect, and then deploying this service to the cloud. The purpose of this service would be to examine each request and forward it on to the Orders application running in the most appropriate datacenter. This approach would enable Trey Research to filter and redirect requests based on criteria such as the IP address of each request. The custom service could also detect whether the Orders application at each datacenter was still running, and if it was currently unavailable it could transparently redirect customer requests to functioning instances of the application. Additionally, the custom service could attempt to distribute requests evenly across datacenters that are equally close (in network terms) to the customer, implementing a load-balancing mechanism to ensure that no one instance of the Orders application became unduly overloaded while others remained idle.
This type of custom service is reasonably common, and can be implemented by using the System.ServiceModel.Routing.RoutingService class of Windows Communication Foundation. However, a custom service such as this is non-trivial to design, build, and test; and the routing rules that determine how the service redirects messages can quickly become complex and difficult to maintain. Additionally, this service must itself be hosted somewhere with sufficient power to handle every customer request, and with good network connectivity to all customers. If the service is underpowered it will become a bottleneck, and if customers cannot connect to the service quickly then the advantages of using this service are nullified. Furthermore, this service constitutes a single point of failure; if it becomes unavailable then customers may not be able to connect to any instance of the Orders application.
Use Windows Azure Traffic Manager to Route Customers' Requests
Windows Azure Traffic Manager is a Windows Azure service that enables you to set up request routing and load balancing based on predefined policies and configurable rules. It provides a mechanism for routing requests to multiple deployments of your Windows Azure-hosted applications and services, regardless of the datacenter location. The applications or services could be deployed in one or more datacenters.
Traffic Manager is effectively a DNS resolver. When you use Traffic Manager, web browsers and services accessing your application will perform a DNS query to Traffic Manager to resolve the IP address of the endpoint to which they will connect, just as they would when connecting to any other website or resource.
Traffic Manager addresses the network latency and application availability issues by providing three mechanisms, or policies, for routing requests:
- The Performance policy redirects requests from users to the application in the closest datacenter. This may not be the application in the datacenter that is closest in purely geographical terms, but instead the one that provides the lowest network latency. Traffic Manager also detects failed applications and does not route to these, instead choosing the next closest working application deployment.
- The Failover policy allows you to configure a prioritized list of applications, and Traffic Manager will route requests to the first one in the list that it detects is responding to requests. If that application fails, Traffic Manager will route requests to the next application in the list, and so on.
- The Round Robin policy routes requests to each application in turn, although it detects failed applications and does not route to these. This policy evens out the loading on each application, but may not provide users with the best possible response times as it ignores the relative locations of the user and datacenter.
You select which one of these policies is most appropriate to your requirements; Performance to minimize network latency, Failover to maximize availability, or Round Robin to distribute requests evenly (and possibly improve response time as a result).
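The selection logic behind the three policies can be sketched as follows. This is a simplified model for illustration only: Traffic Manager makes its decision when it answers a DNS query rather than by handling each request, and the `Deployment` class, the method names, and any latency figures you supply are hypothetical and not part of any Traffic Manager API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model of one deployment of the application.
public class Deployment
{
    public string Name;
    public bool Healthy;
    public int LatencyMs;  // latency as seen from the requesting client
}

public class RoutingPolicySketch
{
    private int next;  // round-robin cursor

    // Performance: the healthy deployment with the lowest latency.
    public static Deployment Performance(IList<Deployment> deployments)
    {
        return deployments.Where(d => d.Healthy)
                          .OrderBy(d => d.LatencyMs)
                          .FirstOrDefault();
    }

    // Failover: the first healthy deployment in the prioritized list.
    public static Deployment Failover(IList<Deployment> prioritized)
    {
        return prioritized.FirstOrDefault(d => d.Healthy);
    }

    // Round Robin: each deployment in turn, skipping failed ones.
    public Deployment RoundRobin(IList<Deployment> deployments)
    {
        for (int i = 0; i < deployments.Count; i++)
        {
            Deployment candidate = deployments[next % deployments.Count];
            next = (next + 1) % deployments.Count;
            if (candidate.Healthy)
            {
                return candidate;
            }
        }
        return null;  // no healthy deployment is available
    }

    public static void Main()
    {
        var north = new Deployment { Name = "US North", Healthy = true, LatencyMs = 20 };
        var south = new Deployment { Name = "US South", Healthy = true, LatencyMs = 45 };
        var all = new List<Deployment> { north, south };

        Console.WriteLine(Performance(all).Name);  // US North
        north.Healthy = false;
        Console.WriteLine(Performance(all).Name);  // US South
        Console.WriteLine(Failover(all).Name);     // US South
    }
}
```

All three methods return null when no deployment is healthy, which corresponds to the situation in which no policy can help and customers cannot reach the application at all.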
Traffic Manager is managed and maintained by Microsoft, and the service is hosted in their datacenters. This means that there is no maintenance overhead.
How Trey Research Minimizes Network Latency and Maximizes Connectivity to the Orders Application
Using the Enterprise Library Autoscaling Application Block helps to ensure that sufficient instances of the Orders application web and worker roles are running to service the volume of customers connecting to a specific datacenter at a given point in time. With this in mind, the operations staff at Trey Research decided to use Traffic Manager simply to route customers' requests to the nearest responsive datacenter by implementing the Performance policy.
Bharath says: | |
---|---|
The selection of the Performance policy was very easy; the Failover policy is not suitable for the Trey Research scenario, and because the Enterprise Library Autoscaling Application Block ensures that an appropriate number of instances of the Orders application roles will be available at each datacenter to facilitate good performance, the Round Robin policy is unnecessary. |
Implementing the Round Robin policy may be detrimental to customers as they might be routed to a more distant datacenter, incurring additional network latency and impacting the response time of the application. Additionally, the Round Robin policy may conceivably route two consecutive requests from the same customer to different datacenters, possibly leading to confusion if the data cached at each datacenter is not completely consistent. The Performance policy has the advantage of reducing the network latency while ensuring that requests from the same customer are much more likely to be routed to the same datacenter.
The operations staff configured the policy to include the DNS addresses of the Orders application deployed to the US North and US South datacenters, and to monitor the Home page of the web application to determine availability. The operations staff selected the DNS prefix ordersapp.treyresearch, and mapped the resulting address (ordersapp.treyresearch.trafficmanager.net) to the public address used by customers, store.treyresearch.net. In this way, a customer connecting to the URL http://store.treyresearch.net is transparently rerouted by Traffic Manager to the Orders application running in the US North datacenter or the US South datacenter. Figure 1 shows the structure of this configuration.
Notice that the Orders application in both datacenters must connect to the head office audit log listener service. Both deployments of the Orders application must also connect to all of Trey Research's transport partners; although, for simplicity, this is not shown in the diagram. Some features of the application, such as the use of the SQL Azure™ technology platform Reporting Service and the deployment of the compliance application in a Windows Azure VM Role, are not duplicated in both datacenters. The Orders data is synchronized across both datacenters and so one instance of the Reporting Service and the compliance application will provide the required results, without incurring additional hosting and service costs.
However, the designers at Trey Research realized that using a mechanism that may route users to different deployments of the application in different datacenters will have some impact. For example, data such as the user's current shopping cart is typically stored in memory or local storage (such as Windows Azure table storage or SQL Azure). When a user is re-routed to a different datacenter, this data is lost unless the application specifically synchronizes it across all datacenters.
In addition, if Trey Research configured ACS in more than one datacenter to protect against authentication issues should ACS in one datacenter be unavailable, re-routing users to another datacenter would mean they would have to sign in again.
However, Trey Research considered that both of these scenarios were unlikely to occur often enough to be an issue.
Optimizing the Response Time of the Orders Application
Windows Azure is a highly scalable platform that offers high performance for applications. However, available computing power alone does not guarantee that an application will be responsive; an application that is designed to function in a serial manner will not make best use of this platform and may spend a significant period blocked waiting for slower, dependent operations to complete. The solution is to perform these operations asynchronously, and the techniques that Trey Research adopted to implement this approach have been described in Chapter 4, "Implementing Reliable Messaging and Communications with the Cloud" and Chapter 5, "Processing Orders in the Trey Research Solution."
Aside from the design and implementation of the application logic, the key factor that governs the response time and throughput of a service is the speed with which it can access the resources and data that it needs. In the case of the Orders application, the primary data source is the SQL Azure database containing the customer, order, and product details. Chapter 2, "Deploying the Orders Application and Data in the Cloud" described how Trey Research positioned the data within each datacenter to try to minimize the network overhead associated with accessing this information. However, databases are still relatively slow when compared to other forms of data storage. So, Trey Research was left facing the question: How do you provide scalable, reliable, and fast access to the customer, order, and product data, given that this access could be key to minimizing the response time of the Orders application?
Choosing How to Optimize the Response Time of the Orders Application
Upon investigating the issues surrounding response times in more detail, Trey Research found that there were two complementary approaches available (both can be used, if appropriate).
Implement Windows Azure Caching
Windows Azure Caching is a service that enables you to cache data in the cloud, and provides scalable, reliable, and shared access to this data.
On profiling the Orders application, the developers at Trey Research found that it spent a significant proportion of its time querying the SQL Azure database, and the latency associated with connecting to this database, together with the inherent overhead of querying and updating data in the database, accounted for a large part of this time. By caching data with the Windows Azure Caching service, Trey Research hoped to reduce the overhead associated with repeatedly accessing remote data, eliminate the network latency associated with remote data access, and improve the response times for applications referencing this data.
The overhead associated with querying and updating data in SQL Azure is not a criticism of this database management system (DBMS). All DBMSs that support concurrent multiuser access have to ensure consistency and integrity of data, typically by serializing concurrent requests from different users and locking data. SQL Azure meets these requirements very efficiently. However, accessing data in a cache does not have this overhead; the data is simply retrieved or updated. This efficiency comes at a cost, as the application itself now has to take responsibility for ensuring the integrity and consistency of cached data.
Additionally, sooner or later any updates to cached data must be copied back to the database, otherwise the cache and the database will be inconsistent with each other or data may be lost; the cache has a finite size, and the Windows Azure Caching service may evict data if there is insufficient space available, or expire data that has remained in the cache for a lengthy period of time.
The Windows Azure Caching Service is also chargeable; it is hosted and maintained by Microsoft in their datacenters, and they offer guarantees concerning the availability of this service and the cached data, but you will be charged depending upon the size of the cache and the volume of traffic read from or written to the cache. For more information, see "Caching, based on cache size per month."
Configure the Content Delivery Network
The Windows Azure Content Delivery Network (CDN) is a service designed to improve the response time of web applications by caching the static output generated by hosted services, and also frequently accessed blob data, closer to the users that request them. While Windows Azure Caching is primarily useful for improving the performance of web applications and services running in the cloud, users will frequently be invoking these web applications and services from their desktop, either by using a custom application that connects to them or by using a web browser. The data returned from a web application or service may be of a considerable size, and if the user is very distant it may take a significant time for this data to arrive at the user's desktop. The CDN enables you to cache the output of web pages and frequently queried data at a variety of locations around the world. When a user makes a request, the web content and data can be served from the most optimal location based on the current volume of traffic at the various Internet nodes through which the request is routed.
Note: Detailed information, samples, and exercises showing how to configure CDN are available on MSDN; see the topic "Windows Azure CDN" at http://msdn.microsoft.com/en-us/gg405416. Additionally, Chapter 3, "Accessing the Surveys Application" in the guide "Developing Applications for the Cloud, 2nd Edition" provides further implementation details.
While CDN is a useful technology, investigation by the developers at Trey Research suggested that it would not be applicable in the current version of the Orders application; CDN is ideally suited to caching web pages with static content and blob data for output or streaming to client applications, while many of the pages generated by the Orders application may be relatively dynamic, and the application does not store or emit blob data.
How Trey Research Optimizes the Response Time of the Orders Application
The Orders application uses several types of data: customer information, order details, and the product catalog. Order information is relatively dynamic, and customer details are accessed infrequently compared to other data (only when the customer logs in). Furthermore, the same customer and order information tends not to be required by concurrent instances of the Orders application. However, the product catalog is queried by every instance of the Orders application when the user logs in. It is also reasonably static; product information is updated very infrequently. Additionally, the product catalog can comprise a large number of items. For these reasons, the developers at Trey Research elected to cache the product catalog by using a shared Windows Azure cache in each datacenter, while they decided that caching order and customer details would bring few benefits.
Defining and Configuring the Windows Azure Cache
The Windows Azure Caching service runs in the cloud, and an application should really connect only to an instance of the Windows Azure Caching service located in the same datacenter that hosts the application code. Therefore, Trey Research used the Windows Azure Caching service to create separate caches in the US North and US South datacenters, called TreyResearchCacheUSN (for the US North datacenter) and TreyResearchCacheUSS (for the US South datacenter). This ensures that each cache has a unique and easily recognizable name. The developers estimated that a 128MB cache (the minimum size available, with the cheapest cost) would be sufficient. However, the caches can easily be increased in size if necessary, without impacting the operation of the Orders application.
The web application, implemented in the Orders.Website project, defines the configuration parameters for accessing the cache in the service configuration file for the solution (ServiceConfiguration.cscfg).
The Trey Research sample application is deployed to only a single datacenter, and the cache is named TreyResearchCache.
<?xml version="1.0" encoding="utf-8"?>
<ServiceConfiguration serviceName="Orders.Azure" ...>
  ...
  <Role name="Orders.Website">
    ...
    <ConfigurationSettings>
      ...
      <Setting name="CacheHost"
        value="TreyResearchCache.cache.windows.net" />
      <Setting name="CachePort" value="22233" />
      <Setting name="CacheAcsKey" value="[data omitted]" />
      <Setting name="IsLocalCacheEnabled" value="false" />
      <Setting name="LocalCacheObjectCount" value="1000" />
      <Setting name="LocalCacheTtlValue" value="60" />
      <Setting name="LocalCacheSync"
        value="TimeoutBased" />
      ...
    </ConfigurationSettings>
    ...
  </Role>
</ServiceConfiguration>
Synchronizing the Caches and Databases in the Orders Application
The Orders application was modified to retrieve and update data from the local instance of the Windows Azure Cache, only fetching data from the SQL Azure database if the data is not currently available in the cache. Any changes made to cached data are copied back to SQL Azure. The following subsections describe how Trey Research implemented this approach.
Trey Research also had to consider the effects of caching on their data synchronization strategy. Each datacenter has a copy of the SQL Azure database holding the customers, orders, and products data. The Orders application can amend customers and orders information, and when it does so the cached copy of this information is copied back to the local SQL Azure database. This database is subsequently synchronized with the SQL Azure databases located in the other datacenters, as described in Chapter 2, "Deploying the Orders Application and Data in the Cloud."
However, suppose that the details of an order or customer have been cached by the Orders application running in the US North datacenter, and the same details are queried and cached by the Orders application running in the US South datacenter. At this point the two caches hold the same data. If the information in the US North datacenter is changed and written back to the SQL Azure database in the US North datacenter, and this database is subsequently synchronized with the US South datacenter, then the cached data in the US South datacenter is now out of date. However, when the cached data held in the US South datacenter expires or is evicted, the cache will be populated with the fresh data the next time it is queried.
So, although caching can improve the response time for many operations, it can also lead to issues of consistency if two instances of an item of data are not identical. Consequently, applications that use caching should be designed to cope with data that may be stale but that eventually becomes consistent.
This issue can become more acute if the same cached data is updated simultaneously in the US North and US South datacenters; SQL Azure Data Sync will ensure consistency between the different databases, but at least one of the caches will hold inconsistent data. For more advice and guidance on how to address these problems refer to the section "Guidelines for Using Windows Azure Caching" in "Appendix E - Maximizing Scalability, Availability, and Performance."
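The stale-then-consistent behavior described above can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions: the Datacenter class, the ten-minute lifetime, the order ID, and the status strings are all hypothetical, the clock is advanced by hand, and the line that copies one database to the other merely stands in for SQL Azure Data Sync. It does not use the Windows Azure Caching API.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of one datacenter: a database plus a cache whose
// entries expire after a fixed lifetime. Time is passed in as whole minutes
// so that the example is deterministic.
public class Datacenter
{
    private const int CacheLifetimeMinutes = 10; // assumed value

    public readonly Dictionary<int, string> Database =
        new Dictionary<int, string>();

    // Cached value together with the minute at which it expires.
    private readonly Dictionary<int, Tuple<string, int>> cache =
        new Dictionary<int, Tuple<string, int>>();

    public string Read(int orderId, int nowMinutes)
    {
        Tuple<string, int> entry;
        if (this.cache.TryGetValue(orderId, out entry) && entry.Item2 > nowMinutes)
        {
            return entry.Item1; // Served from the cache; may be stale.
        }

        // Missing or expired: reload from the local database.
        string fresh = this.Database[orderId];
        this.cache[orderId] = Tuple.Create(fresh, nowMinutes + CacheLifetimeMinutes);
        return fresh;
    }

    public void Update(int orderId, string status, int nowMinutes)
    {
        // Update the cached copy and write it back to the local database.
        this.Database[orderId] = status;
        this.cache[orderId] = Tuple.Create(status, nowMinutes + CacheLifetimeMinutes);
    }
}

public static class Program
{
    public static void Main()
    {
        var usNorth = new Datacenter();
        var usSouth = new Datacenter();
        usNorth.Database[42] = "Pending";
        usSouth.Database[42] = "Pending";

        // Minute 0: both datacenters read, and therefore cache, the order.
        usNorth.Read(42, 0);
        usSouth.Read(42, 0);

        // Minute 1: the order changes in US North.
        usNorth.Update(42, "Shipped", 1);

        // Minute 2: stand-in for database synchronization between datacenters.
        usSouth.Database[42] = usNorth.Database[42];

        Console.WriteLine(usSouth.Read(42, 5));  // Pending (stale cached copy)
        Console.WriteLine(usSouth.Read(42, 11)); // Shipped (entry expired, reloaded)
    }
}
```

Between minutes 2 and 10 the US South database is already correct but its cache is not, which is the window of staleness that an application using this approach has to tolerate.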
Retrieving and Managing Data in the Orders Application
The Orders application uses a set of classes for storing and retrieving each of the types of information it references. These classes are located in the DataStores folder of the Orders.Website project. For example, the ProductStore class in the ProductStore.cs file provides methods for querying products. These methods are defined by the IProductStore interface:
public interface IProductStore
{
  IEnumerable<Product> FindAll();
  Product FindOne(int productId);
}
The FindAll method returns a list of all available products from the SQL Azure database, and the FindOne method fetches the product with the specified product ID. In a similar vein, the OrderStore class implements the IOrdersStore interface which defines methods for retrieving and managing orders. None of these classes implements any form of caching.
Implementing Caching Functionality for the Products Catalog
The Orders.Website project contains a generic library of classes for caching data, located in the DataStores\Caching folder. This library is capable of caching any of the data items defined by the types in the DataStores folder, but for the reasons described earlier caching is only implemented for the ProductStore class.
The DataStores\Caching folder contains the ICachingStrategy interface, the CachingStrategy class, and the ProductStoreWithCache class. The following sections describe these classes.
The ICachingStrategy Interface
This is a simple interface that abstracts the caching functionality implemented by the library. It exposes a property named DefaultTimeout and a method called Get, as follows:
public interface ICachingStrategy
{
  TimeSpan DefaultTimeout
  {
    get;
    set;
  }
  object Get<T>(string key, Func<T> fallbackAction,
    TimeSpan? timeout) where T : class;
}
The key parameter of the Get method specifies the unique identifier of the object to retrieve from the cache. If the object is not currently cached, the fallbackAction parameter specifies a delegate for a method to run to retrieve the corresponding data, and the timeout parameter specifies the lifetime of the object if it is added to the cache. If the timeout parameter is null, an implementation of this interface should set the lifetime of the object to the value specified by the DefaultTimeout property.
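To make this contract concrete, the following sketch implements the interface over an in-memory dictionary instead of the Windows Azure cache. The InMemoryCachingStrategy class and the Program harness are hypothetical and are not part of the Trey Research solution; they only illustrate the hit, miss, and default-timeout rules described above.

```csharp
using System;
using System.Collections.Generic;

// Same shape as the ICachingStrategy interface shown above.
public interface ICachingStrategy
{
    TimeSpan DefaultTimeout { get; set; }

    object Get<T>(string key, Func<T> fallbackAction,
        TimeSpan? timeout) where T : class;
}

// Hypothetical in-memory implementation, for illustration only.
public class InMemoryCachingStrategy : ICachingStrategy
{
    // Cached object together with the UTC time at which it expires.
    private readonly Dictionary<string, Tuple<object, DateTime>> entries =
        new Dictionary<string, Tuple<object, DateTime>>();
    private readonly object gate = new object();

    public InMemoryCachingStrategy()
    {
        this.DefaultTimeout = TimeSpan.FromMinutes(10);
    }

    public TimeSpan DefaultTimeout { get; set; }

    public object Get<T>(string key, Func<T> fallbackAction,
        TimeSpan? timeout) where T : class
    {
        lock (this.gate)
        {
            Tuple<object, DateTime> entry;
            if (this.entries.TryGetValue(key, out entry)
                && entry.Item2 > DateTime.UtcNow)
            {
                return entry.Item1; // Cache hit: the fallback is not invoked.
            }

            // Cache miss or expired entry: fetch from the underlying store.
            T value = fallbackAction();
            if (value != null)
            {
                // A null timeout means "use DefaultTimeout".
                TimeSpan lifetime = timeout ?? this.DefaultTimeout;
                this.entries[key] = Tuple.Create(
                    (object)value, DateTime.UtcNow.Add(lifetime));
            }

            return value;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var cache = new InMemoryCachingStrategy();
        int storeCalls = 0;
        Func<string> loadFromStore = () => { storeCalls++; return "catalog"; };

        cache.Get("ProductStore/FindAll", loadFromStore, null);
        cache.Get("ProductStore/FindAll", loadFromStore, null);

        Console.WriteLine(storeCalls); // 1: the second call was a cache hit
    }
}
```

For simplicity this sketch holds a lock while the fallback delegate runs, which serializes all callers; a shared, out-of-process cache such as the one Trey Research uses does not need this, but it does have to handle connection failures.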
The CachingStrategy Class
This class implements the ICachingStrategy interface. The constructor for this class uses the Windows Azure caching APIs to authenticate and connect to the Windows Azure cache using the values provided as parameters. The web application retrieves these values from the service configuration file, and invokes the constructor by using the Unity framework as described later in this chapter, in the section "Instantiating and Using a ProductsStoreWithCache Object."
The Get method of the CachingStrategy class queries the cache using the specified key, and if the object is found it is returned. If the object is not found, the method invokes the delegate to retrieve the missing data and adds it to the cache, specifying either the timeout value provided as the parameter to the Get method (if it is not null) or the default timeout value for the CachingStrategy object. The following code sample shows the important elements of this class:
public class CachingStrategy :
ICachingStrategy, IDisposable
{
private readonly RetryPolicy cacheRetryPolicy;
private DataCacheFactory cacheFactory;
...
private TimeSpan defaultTimeout =
TimeSpan.FromMinutes(10);
public CachingStrategy(string host, int port,
string key, bool isLocalCacheEnabled,
long objectCount, int ttlValue, string sync)
{
// Declare array for cache host.
var servers = new DataCacheServerEndpoint[1];
servers[0] = new DataCacheServerEndpoint(
host, port);
// Setup DataCacheSecurity configuration.
var secureAcsKey = new SecureString();
foreach (char a in key)
{
secureAcsKey.AppendChar(a);
}
secureAcsKey.MakeReadOnly();
var factorySecurity =
new DataCacheSecurity(secureAcsKey);
// Setup the DataCacheFactory configuration.
var factoryConfig =
new DataCacheFactoryConfiguration
{
Servers = servers,
SecurityProperties = factorySecurity
};
...
this.cacheFactory =
new DataCacheFactory(factoryConfig);
this.cacheRetryPolicy = RetryPolicyFactory.
GetDefaultAzureCachingRetryPolicy();
...
}
public TimeSpan DefaultTimeout
{
get { return this.defaultTimeout; }
set { this.defaultTimeout = value; }
}
public virtual object Get<T>(string key,
Func<T> fallbackAction, TimeSpan? timeout)
where T : class
{
...
try
{
var dataCache =
this.cacheFactory.GetDefaultCache();
var cachedObject =
this.cacheRetryPolicy.ExecuteAction(
() => dataCache.Get(key));
if (cachedObject != null)
{
...
return cachedObject;
}
...
var objectToBeCached = fallbackAction();
if (objectToBeCached != null)
{
try
{
this.cacheRetryPolicy.ExecuteAction(() =>
dataCache.Put(key, objectToBeCached,
timeout != null ?
timeout.Value : this.DefaultTimeout));
...
return objectToBeCached;
}
...
}
}
}
}
...
}
Notice that this class traps transient errors that may occur when fetching an item from the cache, by using the Transient Fault Handling Application Block. The static GetDefaultAzureCachingRetryPolicy method of the RetryPolicyFactory class referenced in the constructor returns the default policy for detecting a transient caching exception, and provides a construct for indicating how such an exception should be handled. The default policy implements the "Fixed Interval Retry Strategy" defined by the Transient Fault Handling Application Block, and the web.config file configures this strategy to retry the failing operation up to six times with a five second delay between attempts.
The Get method of the CachingStrategy class invokes the ExecuteAction method of the retry policy object, passing it a delegate that attempts to read the requested data from the cache (this is the code that may exhibit a transient error, and if necessary will be retried based on the settings defined by the retry policy object). If a non-transient error occurs or an attempt to read the cache fails after six attempts, the exception handling strategy in the Get method (omitted from the code above) will return the value from the underlying store, retrieved by calling the fallbackAction delegate.
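The retry-then-fall-back behavior can be sketched without the Transient Fault Handling Application Block. The FixedIntervalRetry helper and the ReadWithFallback method below are hypothetical stand-ins, not the block's API. The sketch treats the limit of six as the total number of attempts, which is an assumption, and it uses a ten millisecond delay rather than five seconds so that the example finishes quickly.

```csharp
using System;
using System.Threading;

// Hypothetical helper that mimics a fixed-interval retry strategy.
public static class FixedIntervalRetry
{
    // Runs the action up to maxAttempts times, pausing for the same delay
    // after each failure that isTransient classifies as worth retrying.
    public static T Execute<T>(Func<T> action, int maxAttempts,
        TimeSpan delay, Func<Exception, bool> isTransient)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return action();
            }
            catch (Exception ex)
            {
                // Give up on non-transient errors or when attempts run out.
                if (!isTransient(ex) || attempt >= maxAttempts)
                {
                    throw;
                }
            }

            Thread.Sleep(delay);
        }
    }
}

public static class Program
{
    // Tries the cache with retries; on final failure uses the fallback,
    // in the same spirit as the error handling described above.
    public static string ReadWithFallback(
        Func<string> readFromCache, Func<string> fallbackAction)
    {
        try
        {
            return FixedIntervalRetry.Execute(
                readFromCache, 6, TimeSpan.FromMilliseconds(10),
                ex => ex is TimeoutException);
        }
        catch (Exception)
        {
            return fallbackAction();
        }
    }

    public static void Main()
    {
        int attempts = 0;
        string result = ReadWithFallback(
            () => { attempts++; throw new TimeoutException("cache unavailable"); },
            () => "value from database");

        Console.WriteLine(attempts + " " + result); // 6 value from database
    }
}
```

The point of the design is that a cache outage degrades performance rather than availability: the caller still receives data, only more slowly, because every request ends up reading the underlying store.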
The ProductStoreWithCache Class
This class provides the caching version of the ProductStore class. It implements the IProductStore interface, but internally employs an ICachingStrategy object to fetch data in the FindAll and FindOne methods, as shown by the following code sample:
public class ProductStoreWithCache : IProductStore
{
  private readonly IProductStore productStore;
  private readonly ICachingStrategy cachingStrategy;

  public ProductStoreWithCache(
    IProductStore productStore,
    ICachingStrategy cachingStrategy)
  {
    this.productStore = productStore;
    this.cachingStrategy = cachingStrategy;
  }

  public IEnumerable<Product> FindAll()
  {
    ...
    return (IEnumerable<Product>)
      this.cachingStrategy.Get(
        "ProductStore/FindAll",
        () => this.productStore.FindAll(),
        TimeSpan.FromMinutes(10));
  }

  public Product FindOne(int productId)
  {
    ...
    return (Product)this.cachingStrategy.Get(
      string.Format(
        "ProductStore/Product/{0}", productId),
      () => this.productStore.FindOne(productId),
      TimeSpan.FromMinutes(10));
  }
}
Instantiating and Using a ProductsStoreWithCache Object
The Orders application creates a ProductStoreWithCache object by using the Unity Application Block. The static ContainerBootstrapper class contains the following code:
public static class ContainerBootstrapper
{
  public static void RegisterTypes(
    IUnityContainer container)
  {
    ...
    container.RegisterType<IProductStore,
      ProductStoreWithCache>(
        new InjectionConstructor(
          new ResolvedParameter<ProductStore>(),
          new ResolvedParameter<ICachingStrategy>()));
    container.RegisterType<ProductStore>();

    // To change the caching strategy, replace the
    // CachingStrategy class with the strategy that
    // you want to use instead.
    var cacheAcsKey = CloudConfiguration.
      GetConfigurationSetting("CacheAcsKey", null);
    var port = Convert.ToInt32(CloudConfiguration.
      GetConfigurationSetting("CachePort", null));
    var host = CloudConfiguration.
      GetConfigurationSetting("CacheHost", null);
    var isLocalCacheEnabled = Convert.ToBoolean(
      CloudConfiguration.GetConfigurationSetting(
        "IsLocalCacheEnabled", null));
    var localCacheObjectCount = Convert.ToInt64(
      CloudConfiguration.GetConfigurationSetting(
        "LocalCacheObjectCount", null));
    var localCacheTtlValue = Convert.ToInt32(
      CloudConfiguration.GetConfigurationSetting(
        "LocalCacheTtlValue", null));
    var localCacheSync =
      CloudConfiguration.GetConfigurationSetting(
        "LocalCacheSync", null);

    container.RegisterType<ICachingStrategy,
      CachingStrategy>(
        new ContainerControlledLifetimeManager(),
        new InjectionConstructor(host, port, cacheAcsKey,
          isLocalCacheEnabled, localCacheObjectCount,
          localCacheTtlValue, localCacheSync));
  }
}
These statements register the ProductStore and CachingStrategy objects, and the Unity Application Block uses them to create a ProductStoreWithCache object whenever the application instantiates an IProductStore object. Notice that the CachingStrategy class is configured to use the ContainerControlledLifetimeManager class of the Unity framework. This effectively ensures that the CachingStrategy object used by the application is created as a singleton that spans the life of the application. This is useful as the DataCacheFactory object that the CachingStrategy class encapsulates is very expensive and time consuming to create, so it is best to create a single instance of this class that is available throughout the duration of the application. Additionally, the parameters for the constructor for the CachingStrategy object are read from the configuration file and are passed to the CachingStrategy class by using a Unity InjectionConstructor object.
Markus says: The RegisterTypes method of the ContainerBootstrapper class is called from the SetupDependencies method in the Global.asax.cs file when the Orders application starts running. The SetupDependencies method also assigns the dependency resolver for the Orders application to the Unity container that registered these types. For more information about using the Unity Application Block see "Unity Application Block" on MSDN.
The StoreController class calls the FindAll method of the ProductStoreWithCache object when it needs to fetch and display the entire product catalog, and the FindOne method when it needs to retrieve the details for a single product:
public class StoreController : Controller
{
  private readonly IProductStore productStore;

  public StoreController(IProductStore productStore)
  {
    ...
    this.productStore = productStore;
  }

  public ActionResult Index()
  {
    var products = this.productStore.FindAll();
    return View(products);
  }

  public ActionResult Details(int id)
  {
    var p = this.productStore.FindOne(id);
    return View(p);
  }
}
This code transparently accesses the Windows Azure cache, populating it if the requested data is not currently available in the cache. You can change the caching configuration, and even elect not to cache data if caching is found to have no benefit, without modifying the business logic for the Orders application; all you need to do is switch the type registered for the IProductStore interface in the ContainerBootstrapper class to ProductStore, as shown in the following code example:
public static class ContainerBootstrapper
{
  public static void RegisterTypes(
    IUnityContainer container)
  {
    ...
    container.RegisterType<IProductStore, ProductStore>();
    ...
  }
}
Summary
This chapter has described the Windows Azure technologies that Trey Research used to improve the scalability, availability, and performance of the Orders application.
Windows Azure Traffic Manager can play an important role in reducing the network latency associated with sending requests to a web application by transparently routing these requests to the most appropriate deployment of the web application relative to the location of the client submitting these requests. Traffic Manager can also help to maximize availability by intelligently detecting whether the application is responsive, and if not, re-routing requests to a different deployment of the application.
Windows Azure provides a highly scalable environment for hosting web applications and services, and the Enterprise Library Autoscaling Application Block implements a mechanism that can take full advantage of this scalability by monitoring web applications and automatically starting and stopping instances as the demand from clients requires.
Finally, Windows Azure caching is an essential element in improving the responsiveness of web applications and services. It enables Trey Research to cache data locally to these applications, in the same datacenter. This technique removes much of the network latency associated with remote data access. However, as Trey Research discovered, you must be prepared to balance this improvement in performance against the possible complexity introduced by maintaining multiple copies of data.
More Information
All links in this book are accessible from the book's online bibliography available at: http://msdn.microsoft.com/en-us/library/hh968447.aspx.
- "Autoscaling Application Block Logging" at http://msdn.microsoft.com/en-us/library/hh680883(v=pandp.50).aspx.
- Chapter 5, "Making Tailspin Surveys More Elastic," of the Developer's Guide to the Enterprise Library 5.0 Integration Pack for Windows Azure at http://msdn.microsoft.com/en-us/library/hh680942(PandP.50).aspx.
- "Pricing Overview" at http://www.windowsazure.com/en-us/pricing/details/.
- "Caching, based on cache size per month" at http://www.windowsazure.com/en-us/pricing/details/#caching.
- "Windows Azure CDN" at http://msdn.microsoft.com/en-us/gg405416.
- Chapter 3, "Accessing the Surveys Application" in the guide "Developing Applications for the Cloud, 2nd Edition" at http://msdn.microsoft.com/en-us/library/ff966499.aspx.
- "Unity Application Block" at http://msdn.microsoft.com/en-us/library/ff647202.aspx.
- "Windows Azure Traffic Manager" at http://msdn.microsoft.com/en-us/gg197529.
- "Windows Azure Service Instances Auto Scaling" at http://azureautoscaling.codeplex.com/releases/view/62421.
- "Windows Azure Caching Service" at http://msdn.microsoft.com/en-us/library/gg278356.aspx.