When the design and implementation of the hybrid Orders application was completed, Trey Research considered how to monitor and manage the application as it runs on the Windows Azure™ technology platform.

The Orders application comprises a number of components, built using a variety of technologies, and distributed across a range of sites and connected by networks of varying bandwidth and reliability. With this complexity, it was very important for Trey Research to be able to monitor how well the system is functioning, and quickly take any necessary restorative action in the event of failure. However, monitoring a complex system is itself a complex task, requiring tools that can quickly gather performance data to help analyze throughput and pinpoint the causes of any errors, failures, or other shortcomings in the system. The range of problems can vary significantly, from simple failures caused by application errors in a service running in the cloud, through issues with the environment hosting individual elements, to complete systemic failure and loss of connectivity between components whether they are running on-premises or in the cloud.

This chapter focuses on the challenges associated with monitoring the Orders application, and the decisions Trey Research made when tackling these challenges.

Scenario and Context

The hybrid Orders application has components running remotely from the on-premises services, including a website, background order processing code, and databases. The application also communicates with transport partners as it processes orders, listens for status messages from these partners, and sends messages to the on-premises Audit Log service.

The designers at Trey Research had to decide how to monitor the application as it runs so that administrators can measure performance, ensure it meets Service Level Agreements, and verify that it provides acceptable response times to visitors. Administrators must also be able to retrieve data about errors or exceptions that occur at runtime, and be able to trace operations to assist in debugging the application. Developers had to add some code to the application before it was deployed in order to accomplish many of these tasks.

Trey Research also had to consider how to deploy the application to Windows Azure, and how to manage factors such as re-configuration and management of the individual Windows Azure services it uses while the application is deployed and running. Trey Research created a set of scripts and other executable programs that allow these kinds of tasks to be performed repeatedly, accurately, and securely.

Monitoring Services, Logging Activity, and Measuring Performance

Even though the Orders application runs remotely from Trey Research's head office, it is still possible for Trey Research administrators to obtain the same kinds of information about its operation and any exceptions or errors that occur as they would when administering an application deployed locally in their own data center. However, the way that this data is collected and accessed is very different in Windows Azure compared to a local server deployment.

You can configure Windows Azure Diagnostics to collect performance and diagnostics information. This data is stored in memory in the worker or web role being monitored, but it can be transferred to Windows Azure storage on a scheduled basis or on demand, so that it can be accessed from on-premises applications and monitoring solutions.

Bharath says:
	You can configure the Windows Azure Diagnostics mechanism to collect the data you need to monitor and debug applications, and to transfer this data to Windows Azure storage so that you can access it.

Figure 1 shows a high-level view of the monitoring mechanism in Windows Azure and some of the ways that Trey Research considered using it. The Windows Azure Diagnostics mechanism can be configured to collect data from a range of sources, such as Windows event logs and performance counters. It is also possible to use a third party logging mechanism, such as the Enterprise Library Logging Application Block, or custom code that writes events to the Windows Azure Diagnostics mechanism.

Figure 1 - Monitoring approaches that Trey Research considered for the Orders application

Choosing a Monitoring and Logging Solution

Trey Research considered four ways of collecting information for monitoring services, logging activity, and measuring performance in the Orders application: Windows Azure Diagnostics, the Enterprise Library Logging Application Block, a third party monitoring solution, and using custom code in the application to generate logging messages. The following sections describe each of these options.

Poe says:
	A range of comprehensive ready-built monitoring solutions designed to work with applications deployed in Windows Azure is available. These products typically provide functions for collecting and analyzing monitoring information, displaying it in a dashboard, and notifying operators of significant events. Such solutions include Microsoft System Center Operations Manager and products from third parties.

Windows Azure Diagnostics

Windows Azure Diagnostics is the built-in mechanism for collecting all kinds of monitoring and diagnostic information in Windows Azure. It requires no additional code or assemblies. Windows Azure Diagnostics can collect data from Windows event logs and performance counters; the IIS log and failed request log; infrastructure logs; crash dump files; and custom error logs. Developers or administrators at Trey Research simply configure the diagnostics mechanism to collect the required data, and specify the intervals for this data to be transferred to Windows Azure storage. They can also use Windows Azure PowerShell cmdlets to reconfigure the diagnostics settings as the application runs, initiate a transfer of the data to Windows Azure storage on demand, and download the logged data to an on-premises store.

However, there is only a limited set of options for filtering and categorizing the logged information, and it can only be stored in Windows Azure storage. There are no opportunities to store the data in a database, or in a custom format or repository.

Note:
For more information about using Windows Azure Diagnostics, see "Appendix F - Monitoring and Managing Hybrid Applications" of this guide.

Enterprise Library Logging Application Block

The Logging Application Block is a component of Enterprise Library, a framework of components for managing cross-cutting concerns in most types of applications. Trey Research could configure the Logging Application Block to send log entries to the Windows Azure Diagnostic trace listener, which is a component of the Windows Azure Diagnostics system that stores the log entries in memory so that they can be transferred to Windows Azure storage along with any other diagnostic data that the system collects.

Alternatively, Trey Research could configure the Logging Application Block to send log entries to other types of storage such as a database, text files in a range of formats, and XML files. One option that Trey Research considered was using the Logging Application Block to write log entries directly to a SQL Azure™ technology platform database located in the cloud, and then transfer this data back to an on-premises database for analysis. SQL Azure Data Sync could be used to simplify the task of synchronizing the data between the cloud and an on-premises database.

Markus says:
	You might also consider using the Enterprise Library Exception Handling Application Block to provide a structured policy-driven mechanism for collecting and managing exception information. The Exception Handling block can send its log entry messages to the Enterprise Library Logging Application Block for exposure through the Windows Azure Diagnostics mechanism.

The Logging Application Block is highly configurable and extensible, and includes a wide range of options for filtering and categorizing log messages. This would make it easy for developers at Trey Research to generate different types of log entries and provide useful additional support for administrators and operators.

The main limitation of the Enterprise Library Logging Application Block is that it cannot collect data from the host system, such as Windows event log entries, performance counters, or IIS log files. It is purely an activity logging mechanism where code generates the log entries in response to events occurring in the application. Using any of the Enterprise Library Application Blocks also means that external library assemblies must be uploaded and installed with the application code in Windows Azure.

Note:
For more information about the Enterprise Library Logging Application Block and Exception Handling Application Block, see "About This Release of Enterprise Library." There is also a whitepaper available that describes how you can use the Enterprise Library 5.0 application blocks with Windows Azure-hosted applications. You can download the whitepaper from the Enterprise Library CodePlex site.

Third Party Monitoring Solution

Trey Research could have adopted a third party monitoring solution. There are several solutions available that are aimed wholly or partly at monitoring Windows Azure applications and services. They include the following:

Windows Azure Management Pack for Microsoft System Center Operations Manager
Azure Diagnostics Manager from Cerebrata
AzureWatch from Paraleap Technologies
ManageAxis from Cumulux

These solutions can monitor role status, collect performance information, gather event data, and raise notifications to administration staff.

Custom Logging Solution

The developers at Trey Research considered building a custom logging and diagnostics solution for the Orders application. The Windows Azure Diagnostic trace listener exposes methods for creating and storing log entries, and so provides a way for Trey Research to monitor activity and expose these log entries through the standard Windows Azure Diagnostics mechanism. For example, the developers could add code to the Orders application that generates a message each time a visitor is initially authenticated and signs in. This code can call the methods of the diagnostic trace listener to store the message as a log entry. When the diagnostics data is later transferred to Windows Azure storage it will include entries created by the custom code.

It is also possible to create custom logging solutions that store data in other formats and locations. For example, like the Enterprise Library Logging Application Block, the code could store the log entries in a database, text file, or a repository in some other format. This approach will require a mechanism for accessing the data remotely from the on-premises applications and tools, or for transferring the data back to on-premises storage for future analysis.

Markus says:
	You can use the Windows Azure Diagnostic trace listener to generate log entries containing information you need for monitoring events and activity in your application. The data is exposed through the Windows Azure Diagnostics mechanism and can be transferred to Windows Azure storage for analysis as required.

The main limitation that Trey Research identified with a custom logging solution is that, like the Logging Application Block, it cannot collect data from the host system, such as Windows event log entries, performance counters, or IIS log files. It is purely an activity logging mechanism where code generates the log entries in response to events occurring in the application.

How Trey Research Chose a Monitoring and Logging Solution

Trey Research wanted to be able to generate some monitoring and activity tracing information in the Orders application all of the time it is running. However, the administrators did not want to collect full tracing information or operating system diagnostics information all of the time. Instead, they wanted to be able to change the configuration so that additional information can be collected when required, such as when debugging a problem with the application.

After careful consideration, Trey Research decided to use a custom solution for activity tracing and recording specific errors by generating these log entries and then writing them to the Windows Azure Diagnostics mechanism. While the Enterprise Library Logging Application Block (and the Exception Handling Application Block) would have been suitable, the types of information that Trey Research collects are limited, and so the additional complexity of using these blocks was not felt to be an advantage in Trey Research's scenario.

How Trey Research Uses Windows Azure Diagnostics

Trey Research implements diagnostic logging, and downloads the information from the cloud to their on-premises servers. Trey Research traces the execution of each role instance, and also records the details of any exceptions raised by the role instances using a combination of a custom TraceHelper class and the standard Windows Azure Diagnostics mechanism. The data is initially stored in a table named WADLogsTable, in table storage at each datacenter. Trey Research considered the following two options for monitoring this data and using it to manage the system:

Using System Center Operations Manager with the Windows Azure Management Pack, or another third party solution, to connect directly to each datacenter, examine the diagnostic data, generate the appropriate usage reports, and alert an operator if an instance failed or some other significant event occurred.
Periodically transferring the diagnostic data to a secure on-premises location, and then reformatting this data for use by their own custom reporting and analytical tools.

Although System Center Operations Manager and other third party solutions provide many powerful features, the investment that Trey Research has already made in developing and procuring custom analytical tools made the second option more appealing. Additionally, it meant that Trey Research could more easily retain a complete audit log of all significant operations and events locally, which might become a requirement as ever-stricter compliance regulations become legally binding. However, this solution does not preclude Trey Research from deploying System Center Operations Manager or another third party solution in the future.

Selecting the Data and Events to Record

Trey Research decided to record different types of events using trace messages and Windows Azure Diagnostics. Under normal operation, Trey Research collects only trace messages that have a severity of Warning or higher. However, the mechanism Trey Research implemented allows administrators to change the behavior to provide more detail when debugging the application or monitoring specific activity.

The following table shows the logging configuration that Trey Research uses. Notice that Trey Research does not collect Windows event log events or Windows performance counter data. Instead, Trey Research captures information at all stages of the operation of the application and sends this information to the Windows Azure Diagnostics mechanism through a custom class named TraceHelper.

Bharath says:
	To collect Windows event logs and performance counter data you must configure Windows Azure Diagnostics to transfer the required data to Windows Azure Table storage. Windows event log entries are transferred to a table named WADWindowsEventLogsTable, and performance counter data is transferred to a table named WADPerformanceCountersTable. If Trey Research needs to capture this data in the on-premises management application, its developers must write additional code to download the data in these tables.

Configuring the Diagnostics Mechanism

The worker role in the Orders.Workers project and the web role in the Orders.Website project are both configured to use Windows Azure Diagnostics. The configuration files for both roles contain the following settings:

XML

<?xml version="1.0" encoding="utf-8"?>
<configuration>
  ...
  <system.diagnostics>
    <sources>
      <source name="TraceSource">
        <listeners>
          <add type="Microsoft.WindowsAzure.Diagnostics
                     .DiagnosticMonitorTraceListener, ..."
               name="AzureDiagnostics">
            <filter type="" />
          </add>
        </listeners>
      </source>
    </sources>
  </system.diagnostics>
</configuration>

This configuration defines a trace source named TraceSource whose listener, an instance of the Windows Azure DiagnosticMonitorTraceListener class, receives the trace messages. There is no filter level defined in the configuration because this is set in the code that initializes the TraceSource instance.

To configure the diagnostics and schedule the transfer of diagnostic data to Windows Azure storage, Trey Research initially considered using an imperative approach by adding code such as that shown below to the OnStart methods of the classes implementing the web and worker roles.

...
// Get default initial configuration.
var config =
    DiagnosticMonitor.GetDefaultInitialConfiguration();

// Update the initial configuration.
config.Logs.ScheduledTransferLogLevelFilter
       = LogLevel.Undefined;
config.Logs.ScheduledTransferPeriod
       = TimeSpan.FromSeconds(60);

// Start the monitor with this configuration.
DiagnosticMonitor.Start("DiagnosticsConnectionString",
                        config);
...

However, although this approach is simple, Trey Research found that it could prove inflexible. Trey Research needs to be able to modify the diagnostics configuration and transfer schedule remotely when necessary (by using Windows Azure PowerShell cmdlets or one of the third party monitoring solutions listed earlier, such as Azure Diagnostics Manager from Cerebrata), but configuring the diagnostics in code causes any remote changes to the configuration to be lost if a role restarts.

Therefore Trey Research opted to configure the diagnostics by using the diagnostics configuration file, diagnostics.wadcfg. This file can be held in blob storage (and therefore survive role restarts) and read by the Windows Azure Diagnostics monitor when the role starts. For more information, see "Using the Windows Azure Diagnostics Configuration File" on MSDN.
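As a rough illustration, a diagnostics.wadcfg file that mirrors the imperative code shown earlier (no log level filter, transfer every 60 seconds) might resemble the following fragment. The quota values and the configuration polling interval are assumptions chosen for illustration, not the settings that Trey Research uses.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Illustrative sketch only: quotas and poll interval are assumed values. -->
<DiagnosticMonitorConfiguration
    xmlns="http://schemas.microsoft.com/ServiceHosting/2010/10/DiagnosticsConfiguration"
    configurationChangePollInterval="PT1M"
    overallQuotaInMB="4096">
  <!-- Transfer trace log entries to Windows Azure storage every minute,
       without filtering by severity. Durations use the ISO 8601 format. -->
  <Logs bufferQuotaInMB="1024"
        scheduledTransferLogLevelFilter="Undefined"
        scheduledTransferPeriod="PT1M" />
</DiagnosticMonitorConfiguration>
```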

Implementing Trace Message Logging and Specifying the Level of Detail

Trey Research collects trace messages generated by a custom class named TraceHelper (located in the Helpers folder of the Orders.Shared project). The TraceHelper class instantiates a TraceSource instance and exposes a set of static methods that make it easy to write trace messages with different severity levels.

public class TraceHelper
{
  private static readonly TraceSource Trace;

  static TraceHelper()
  {
    Trace = new TraceSource("TraceSource",
                            SourceLevels.Information);
  }

  [EnvironmentPermissionAttribute(
    SecurityAction.LinkDemand, Unrestricted = true)]
  public static void Configure(SourceLevels sourceLevels)
  {
    Trace.Switch.Level = sourceLevels;
  }

  public static void TraceVerbose(string format,
                          params object[] args)
  {
    Trace.TraceEvent(TraceEventType.Verbose, 0,
                     format, args);
  }

  public static void TraceInformation(string format,
                          params object[] args)
  {
    Trace.TraceEvent(TraceEventType.Information, 0,
                     format, args);
  }

  public static void TraceWarning(string format,
                          params object[] args)
  {
    Trace.TraceEvent(TraceEventType.Warning, 0,
                     format, args);
  }

  public static void TraceError(string format,
                          params object[] args)
  {
    Trace.TraceEvent(TraceEventType.Error, 0,
                     format, args);
  }
}

Data recorded in this way is directed to the Windows Azure Diagnostics monitor trace listener (due to the configuration shown in the section "Configuring the Diagnostics Mechanism" earlier in this chapter), and subsequently into the WADLogsTable. By default the TraceHelper class captures messages with the severity filter level of Information. However, this setting can be changed by calling the Configure method the TraceHelper class exposes, and supplying a value for the severity level of messages to trace. The worker roles and web roles both configure this setting in the OnStart method by reading it from the service configuration file.

public override bool OnStart()
{
  ...
  ConfigureTraceListener(
    RoleEnvironment.GetConfigurationSettingValue(
      "TraceEventTypeFilter"));
  ...
}

private static void ConfigureTraceListener(
                       string traceEventTypeFilter)
{
  SourceLevels sourceLevels;
  if (Enum.TryParse(traceEventTypeFilter, true,
                    out sourceLevels))
  {
    TraceHelper.Configure(sourceLevels);
  }
}
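The effect of the severity filter can be seen in isolation with a small console program. The following is a minimal sketch that uses only the System.Diagnostics classes from the .NET Framework; the source name DemoSource and the messages are hypothetical, and a ConsoleTraceListener stands in for the Windows Azure diagnostics listener.

```csharp
using System;
using System.Diagnostics;

public static class SourceLevelsDemo
{
    public static void Main()
    {
        // Start at the same default level that TraceHelper uses.
        var source = new TraceSource("DemoSource",
                                     SourceLevels.Information);
        source.Listeners.Add(new ConsoleTraceListener());

        // Verbose is below Information, so this event is filtered out.
        source.TraceEvent(TraceEventType.Verbose, 0, "verbose message");
        // Information meets the current level, so this event is written.
        source.TraceEvent(TraceEventType.Information, 0, "info message");

        // Parse a setting value the same way ConfigureTraceListener does
        // (case-insensitive), and apply it to the switch.
        SourceLevels parsed;
        if (Enum.TryParse("warning", true, out parsed))
        {
            source.Switch.Level = parsed;
        }

        // With the level raised to Warning, Information is now filtered out...
        source.TraceEvent(TraceEventType.Information, 0, "info message");
        // ...but Error is still written.
        source.TraceEvent(TraceEventType.Error, 0, "error message");
    }
}
```

Because the filter is applied by the trace source before the listener is invoked, events below the configured level never reach the diagnostics buffer.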

The roles also set up handlers for the RoleEnvironmentChanging and RoleEnvironmentChanged events. These handlers reconfigure the TraceHelper class for the role when the configuration changes. This enables administrators to change the severity filter level to obtain additional information for debugging and monitoring while the application is running.
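The following sketch shows one way such handlers could be wired up. It is not the code from the Orders solution: the class name, the helper method ApplyTraceLevel, and the policy of requesting a restart for any other setting change are assumptions made for illustration. It relies on the Windows Azure ServiceRuntime assembly and on the TraceHelper class shown earlier.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using Microsoft.WindowsAzure.ServiceRuntime;
// Also requires a using directive for the namespace that contains TraceHelper.

public class SampleWorkerRole : RoleEntryPoint
{
    private const string TraceSettingName = "TraceEventTypeFilter";

    public override bool OnStart()
    {
        RoleEnvironment.Changing += OnRoleEnvironmentChanging;
        RoleEnvironment.Changed += OnRoleEnvironmentChanged;
        ApplyTraceLevel();
        return base.OnStart();
    }

    private static void OnRoleEnvironmentChanging(
        object sender, RoleEnvironmentChangingEventArgs e)
    {
        // Assumed policy: recycle the instance only if a setting other
        // than the trace filter is changing. Setting Cancel to true asks
        // Windows Azure to restart the instance to apply the change.
        e.Cancel = e.Changes
            .OfType<RoleEnvironmentConfigurationSettingChange>()
            .Any(c => c.ConfigurationSettingName != TraceSettingName);
    }

    private static void OnRoleEnvironmentChanged(
        object sender, RoleEnvironmentChangedEventArgs e)
    {
        // The change has been applied; re-read the filter if it was affected.
        bool traceSettingChanged = e.Changes
            .OfType<RoleEnvironmentConfigurationSettingChange>()
            .Any(c => c.ConfigurationSettingName == TraceSettingName);
        if (traceSettingChanged)
        {
            ApplyTraceLevel();
        }
    }

    private static void ApplyTraceLevel()
    {
        SourceLevels level;
        string value =
            RoleEnvironment.GetConfigurationSettingValue(TraceSettingName);
        if (Enum.TryParse(value, true, out level))
        {
            TraceHelper.Configure(level);
        }
    }
}
```

Leaving Cancel set to false for the trace filter is what allows an administrator to change the level of detail without the role instance being taken offline.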

Markus says:
	The mechanism that Trey Research implements for specifying the trace level offers a high degree of control over the volume and nature of data that is captured. However, an alternative approach is to capture data for all events and then apply filters when transferring the trace data to Windows Azure storage by setting the ScheduledTransferLogLevelFilter property of the diagnostic monitor configuration. This property can be specified as part of the Windows Azure Diagnostics monitor configuration stored in the diagnostics.wadcfg file, and can be updated remotely without requiring the roles to be restarted.

Writing Trace Messages

The web and worker roles use the TraceHelper class to record information about events, errors, and other significant occurrences. For example, exceptions are captured using code such as that shown in the following example taken from the ReceiveNextMessage method of the ServiceBusReceiverHandler class in the Orders.Shared project. Note that this code calls the TraceError method of the TraceHelper class to write a trace message with severity "Error".

private void ReceiveNextMessage(
  CancellationToken cancellationToken)
{
  ...
  this.ReceiveNextMessage(cancellationToken);

  if (taskResult.Exception != null)
  {
    TraceHelper.TraceError(taskResult.Exception.Message);
    throw taskResult.Exception;
  }
  ...
}

The TraceHelper class is also used in the web role. Code in the CustomAttributes folder of the Orders.Website project defines a custom attribute called LogActionAttribute that calls the TraceInformation method of the TraceHelper class to write a trace message with severity "Information".

public class LogActionAttribute : ActionFilterAttribute
{
  public override void OnActionExecuting(
    ActionExecutingContext filterContext)
  {
    ...
    TraceHelper.TraceInformation(
      "Executing Action '{0}', from controller '{1}'",
      filterContext.ActionDescriptor.ActionName,
      filterContext.ActionDescriptor.
        ControllerDescriptor.ControllerName);
  }

  public override void OnActionExecuted(
    ActionExecutedContext filterContext)
  {
    ...
    TraceHelper.TraceInformation(
      "Action '{0}', from controller '{1}' " +
        "has been executed",
      filterContext.ActionDescriptor.ActionName,
      filterContext.ActionDescriptor.
        ControllerDescriptor.ControllerName);
  }
}

The controller classes in the Orders.Website project are tagged with this attribute. The following code shows the StoreController class, which retrieves products for display.

[LogAction]
public class StoreController : Controller
{
  ...
  public ActionResult Index()
  {
    var products = this.productStore.FindAll();
    return View(products);
  }

  public ActionResult Details(int id)
  {
    var p = this.productStore.FindOne(id);
    return View(p);
  }
}

This feature enables the application to generate a complete record of all tagged actions performed on behalf of every user simply by changing the TraceEventTypeFilter setting in the ServiceConfiguration.cscfg file to Information.
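For illustration, the relevant entry in ServiceConfiguration.cscfg might look like the following fragment. The role name is assumed from the project name, and the surrounding elements of the file are omitted.

```xml
<!-- Fragment of ServiceConfiguration.cscfg; role name is an assumption. -->
<Role name="Orders.Website">
  <ConfigurationSettings>
    <!-- Any System.Diagnostics.SourceLevels name is accepted,
         for example Warning, Information, or Verbose. -->
    <Setting name="TraceEventTypeFilter" value="Information" />
  </ConfigurationSettings>
</Role>
```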

Transferring Diagnostics Data from the Cloud

Trey Research uses a custom mechanism for collating and analyzing diagnostics information. It requires that all applications store event and trace messages in an on-premises database named DiagnosticsLog that the monitoring and analysis mechanism queries at preset intervals.

Trey Research could use a third-party tool to download the data from the WADLogsTable, or write scripts that use the Windows Azure PowerShell cmdlets (see http://wappowershell.codeplex.com). However, the Windows Azure SDK provides classes that make it easy to interact with Windows Azure storage through the management API using the .NET Framework. This is the approach that Trey Research chose.

The on-premises monitoring and management application (implemented in the HeadOffice project of the example) contains a page that administrators use to download and examine the diagnostics data collected in the Orders application.

The code that interacts with Windows Azure storage and updates the on-premises DiagnosticsLog database table is in the DiagnosticsController class, located in the Controllers folder of the HeadOffice project. The DiagnosticsController class uses the Enterprise Library Transient Fault Handling Block to retry any failed connection to Windows Azure storage and the on-premises database. The constructor of the DiagnosticsController class reads the retry policy from the application configuration file.

this.storageRetryPolicy
= RetryPolicyFactory.GetDefaultAzureStorageRetryPolicy();

When an administrator opens the Diagnostics page of the HeadOffice application, the TransferLogs action is executed. This action reads from the application configuration the list of datacenters from which it will download data, and then reads the corresponding account details (from the same configuration) for each datacenter. As the code iterates over the list of datacenters it creates a suitable CloudStorageAccount instance using the credentials collected earlier, and then calls a method named TransferLogs to download the data from this datacenter.

[HttpPost]
public ActionResult TransferLogs(
                    FormCollection formCollection)
{
  var deleteEntries
    = formCollection.GetValue("deleteEntries") != null;
  var dataCenters
    = WebConfigurationManager.AppSettings["dataCenters"]
                             .Split(',');
  ...
  // Get account details for accessing each datacenter.
  var dataCenters2 = dataCenters.Select(
       dc => dc.Trim()).Where(dc =>
             !string.IsNullOrEmpty(dc.Trim()));
  var accountNames = dataCenters2.Select(
    dc => string.Format(CultureInfo.InvariantCulture,
          "diagnosticsStorageAccountName.{0}", dc));
  var accountKeys = dataCenters2.Select(
    dc => string.Format(CultureInfo.InvariantCulture,
          "diagnosticsStorageAccountKey.{0}", dc));

  for (var i = 0; i < dataCenters2.Count(); i++)
  {
    // Create credentials for this datacenter.
    var cred = new StorageCredentialsAccountAndKey(
        WebConfigurationManager.AppSettings[
                        accountNames.ElementAt(i)],
        WebConfigurationManager.AppSettings[
                        accountKeys.ElementAt(i)]);
    var storageAccount = new CloudStorageAccount(cred,
                                                true);

    // Download the data from this datacenter.
    this.TransferLogs(dataCenters2.ElementAt(i),
                      storageAccount, deleteEntries);
  }
  ...
}
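The setting names that this action reads imply an appSettings section along the following lines. The key names are taken from the code above; the datacenter names and the bracketed values are placeholders, not the values used in the Trey Research deployment.

```xml
<appSettings>
  <!-- Comma-separated list of datacenters; the names are placeholders. -->
  <add key="dataCenters" value="DataCenterA, DataCenterB" />
  <!-- One account name and one account key entry per datacenter. -->
  <add key="diagnosticsStorageAccountName.DataCenterA"
       value="[storage account name]" />
  <add key="diagnosticsStorageAccountKey.DataCenterA"
       value="[storage account key]" />
  <add key="diagnosticsStorageAccountName.DataCenterB"
       value="[storage account name]" />
  <add key="diagnosticsStorageAccountKey.DataCenterB"
       value="[storage account key]" />
</appSettings>
```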

The TransferLogs method calls the CreateCloudTableClient method of the storage account to obtain a CloudTableClient instance for accessing Windows Azure Table storage. The code accesses the table service context and generates a query over the WADLogsTable in Windows Azure storage. For each entry returned from the query (each row in the table) it creates a new DiagnosticsLog instance and saves this instance in the DiagnosticsLog database by using the DiagnosticsLogStore repository class. Notice how this method can also delete the entries in the WADLogsTable in Windows Azure storage at the same time to reduce storage requirements in the cloud.

private void TransferLogs(string dataCenter,
               CloudStorageAccount storageAccount,
               bool deleteWADLogsTableEntries)
{
  var tableStorage
      = storageAccount.CreateCloudTableClient();
  ...
  var context = tableStorage.GetDataServiceContext();

  if (!deleteWADLogsTableEntries)
  {
    context.MergeOption = MergeOption.NoTracking;
  }

  IQueryable<WadLog> query
    = this.storageRetryPolicy.ExecuteAction(() =>
           context.CreateQuery<WadLog>("WADLogsTable"));

  foreach (var logEntry in query)
  {
    var diagLog = new DiagnosticsLog
      {
        Id = Guid.NewGuid(),
        PartitionKey = logEntry.PartitionKey,
        RowKey = logEntry.RowKey,
        DeploymentId = logEntry.DeploymentId,
        DataCenter = dataCenter,
        Role = logEntry.Role,
        RoleInstance = logEntry.RoleInstance,
        Message = logEntry.Message,
        TimeStamp = logEntry.Timestamp
      };
    this.store.Save(diagLog);

    if (deleteWADLogsTableEntries)
    {
      context.DeleteObject(logEntry);
      this.storageRetryPolicy.ExecuteAction(() =>
                              context.SaveChanges());
    }
  }
}
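The WadLog type used in the query is not shown in this chapter. A plausible shape for it, assuming that it derives from the TableServiceEntity class in the Windows Azure storage client library and maps the columns that Windows Azure Diagnostics writes to the WADLogsTable, is sketched below. Treat the property list as an assumption to check against the actual table rather than as the class from the Orders solution.

```csharp
using System;
using Microsoft.WindowsAzure.StorageClient;

// Hypothetical entity for rows in the WADLogsTable.
// PartitionKey, RowKey, and Timestamp are inherited from TableServiceEntity.
public class WadLog : TableServiceEntity
{
    public long EventTickCount { get; set; }  // Time of the event, in ticks.
    public string DeploymentId { get; set; }  // Deployment that logged it.
    public string Role { get; set; }          // Role name.
    public string RoleInstance { get; set; }  // Role instance name.
    public int Level { get; set; }            // Numeric severity level.
    public int EventId { get; set; }          // Event id passed to TraceEvent.
    public int Pid { get; set; }              // Process id.
    public int Tid { get; set; }              // Thread id.
    public string Message { get; set; }       // Formatted trace message.
}
```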

Bharath says:
	When accessing and performing operations on the Windows Azure Tables that store diagnostics information, consider the transaction charges that these operations will incur. It may be cheaper to pay for the storage, and then drop and recreate the table at appropriate intervals, than to pay for a large number of transactions that delete individual rows.

The strategy that Trey Research adopted for deleting diagnostic data from Windows Azure storage after it has been downloaded prevents this data from growing indefinitely, but it does come at a cost. Each record being deleted is counted as a single transaction against Windows Azure storage and Trey Research is billed accordingly. As the number of customers increases, the volume of diagnostics data that Trey Research captures will increase as well, and eventually the transaction charges associated with deleting each record individually as it is downloaded may become prohibitive.

To counter this overhead, Trey Research is currently evaluating an alternative approach: rather than deleting individual records from tables in Windows Azure storage, simply drop and recreate the tables themselves at an appropriate juncture after downloading the data. This approach involves far fewer transactions, but adds complexity to the code in the role that downloads the data; it may need to implement a locking mechanism to prevent a scheduled transfer from writing diagnostics data to a table that has just been dropped but has yet to be recreated. Additionally, dropping and creating tables may be more time consuming than removing individual records from an existing table, so this functionality may need to be implemented as a background task in a web or worker role.
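The difference between the two strategies can be sketched with some simple arithmetic. The following helper is hypothetical and is not part of the Trey Research sample code. It assumes one transaction per deleted row (as described above) and at least two transactions to drop and recreate a table; the price per 10,000 transactions is a parameter supplied by the caller, not a published rate.

```csharp
using System;

// Hypothetical helper for comparing cleanup strategies; not part of the
// Trey Research sample code.
public static class CleanupCostEstimator
{
    // Deleting rows one at a time: one storage transaction per row.
    public static long PerRowDeleteTransactions(long rowCount)
    {
        if (rowCount < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(rowCount));
        }
        return rowCount;
    }

    // Assumption: one transaction to drop the table and one to recreate it,
    // regardless of how many rows the table held. Retries while the table
    // is still being deleted would add to this figure.
    public static long DropAndRecreateTransactions()
    {
        return 2;
    }

    // Cost of a number of transactions, given a caller-supplied price
    // per 10,000 transactions.
    public static decimal Cost(long transactions, decimal pricePerTenThousand)
    {
        return transactions / 10000m * pricePerTenThousand;
    }
}
```

For example, with 250,000 downloaded rows the per-row strategy incurs 250,000 transactions, whereas the drop-and-recreate strategy incurs a small fixed number no matter how much data has accumulated.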

Deployment and Management

Trey Research wanted to be able to configure and manage all of the services within its Windows Azure account that are used by the Orders application. Trey Research required that the configuration of features such as ACS and Service Bus be automated and reliably repeatable. This configuration is necessarily complex, and includes configuration of services in multiple datacenters where the application is deployed.

Trey Research also wanted to automate the deployment and redeployment of the application as it is updated and extended. Automating the deployment reduces the chances of errors, and helps to control the permissions for the employees that can perform these tasks.

Choosing Deployment and Management Solutions

Trey Research considered a range of solutions for deploying and managing the Orders application. These options included using the Windows Azure Management Portal, the Windows Azure Service Management REST API and Windows Azure SDK, and the Windows Azure PowerShell cmdlets. The following sections describe each of these options.

Windows Azure Management Portal

The Management Portal is the primary location for creating service namespaces, and can also be used to configure all of the Windows Azure services for a subscription. It provides a graphical and intuitive interface that is easy to use, and provides feedback on the state of each service. However, all users of the Management Portal must access it by providing the administrative credentials for the Trey Research Windows Azure subscription, which means that they will have access to all features of the subscription.

The administrators at Trey Research are aware that using the Management Portal is the only way to create new namespaces for services such as Service Bus, ACS, and Traffic Manager, although these namespaces can be configured afterwards using the portal, scripts, or code.

Windows Azure Service Management REST API and Windows Azure SDK

With the exception of creating namespaces for services, all of the features of the Windows Azure services can be accessed using the Windows Azure Service Management REST API and the Windows Azure SDK. The Windows Azure SDK contains assemblies used for performing service management tasks against the Service Management REST API. Alternatively, you can use third party tools or create your own code that accesses the REST interfaces of the Service Management API to automate management tasks. This approach is useful if you are building a solution based on a language that is not supported by the Windows Azure SDK for .NET; for example, you can install the Windows Azure SDK for Java and use the Java programming language.

Administrators and developers at Trey Research realized that they could use code inside their applications and management tools to perform complex tasks by using the Service Management API, including creation of setup programs and tools for managing most aspects of the application and the services it uses.

The major limitation with the Windows Azure Service Management REST API is that it cannot be used directly in scripts.

Note:
For more information about the Windows Azure Service Management REST API, see "About the Service Management API" on MSDN.

Windows Azure PowerShell Cmdlets

The Windows Azure PowerShell cmdlets library that you can download from the CodePlex website contains almost one hundred PowerShell cmdlets that can accomplish most common Windows Azure management and configuration tasks.

These cmdlets are extremely useful for performing a wide range of management tasks, and they can be used within other scripts as well as being executed directly from the command line. The administrators at Trey Research realized that the cmdlets provide an ideal solution for accomplishing simple, everyday tasks.

However, while the Windows Azure PowerShell cmdlets can be used in scripts for complex management tasks, this can be more difficult than using the Windows Azure Service Management REST API.

Note:
For more information about the Windows Azure PowerShell cmdlets, see "Windows Azure PowerShell Cmdlets" at http://wappowershell.codeplex.com.

How Trey Research Chose Deployment and Management Solutions

Managing Windows Azure applications and services typically involves two types of tasks:

Occasional or infrequent tasks, such as configuring namespaces and features such as ACS, Windows Azure Service Bus, and Windows Azure Traffic Manager. These tasks are often complex.
Frequent tasks, such as deploying and updating services, downloading logging data, adding and removing certificates, manipulating storage services, setting firewall rules, and interacting with SQL Azure. These tasks are typically more straightforward.

For the occasional tasks, Trey Research decided to create applications and tools that use the Windows Azure Service Management REST API through objects exposed by the Windows Azure SDK and by a separate library. For example, Trey Research decided to create a setup program that can be executed to set many of the configuration options in Windows Azure instead of using the Management Portal.

For the more frequently performed tasks, administrators at Trey Research decided to use the Windows Azure PowerShell cmdlets within scripts to provide a repeatable, reliable, and automated process. For example, Trey Research uses a PowerShell script that is executed before deployment to set the appropriate values for namespaces, user names, passwords, and keys in the source files. These items are different for each datacenter in which the application is deployed.

Administrators at Trey Research also use PowerShell scripts to perform tasks such as changing the Windows Azure Diagnostics configuration for tracing and debugging, managing certificates, starting and stopping instances of the application roles, and managing SQL Azure databases.
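The pre-deployment script itself is not shown here. As a rough illustration of the idea of substituting per-datacenter values into source files, the following C# sketch replaces placeholders in a template string. The placeholder syntax, the class name, and the key names are assumptions made for this example; they do not come from the Trey Research scripts.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration of per-datacenter value substitution; the
// "{{Key}}" placeholder syntax is an assumption made for this sketch.
public static class SettingsSubstitution
{
    // Replaces each "{{Key}}" placeholder in the template with the value
    // supplied for the target datacenter. Placeholders that have no entry
    // in the dictionary are left unchanged.
    public static string Apply(string template,
                               IDictionary<string, string> values)
    {
        foreach (var pair in values)
        {
            template = template.Replace(
                "{{" + pair.Key + "}}", pair.Value);
        }
        return template;
    }
}
```

Keeping one dictionary of values per datacenter and applying it to the same template means that the files deployed to each datacenter differ only in the substituted values.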

How Trey Research Deploys and Manages the Orders Application

The following sections describe how Trey Research uses the Windows Azure Service Management REST API, both through a management wrapper library and directly, to automate much of the configuration of Windows Azure services such as ACS and Service Bus.

Configuring Windows Azure by Using the Service Management Wrapper Library

Trey Research uses a library of functions that was originally developed by the Windows Azure team to help automate configuration of Windows Azure namespaces through the REST-based Management API. The library code is included in the ACS.ServiceManagementWrapper project of the sample code, and you can reuse this library in your own applications.

Markus says:
	The workings of the Service Management wrapper library are not described here, but you can examine the source code and modify it if you wish. It is a complex project, and exposes a great deal of functionality that makes many tasks for configuring Windows Azure much easier than writing your own custom code.

The setup program Trey Research created instantiates a ServiceManagementWrapper object and then calls several separate methods within the setup program to configure ACS and Service Bus for the Orders application. The Service Bus configuration Trey Research uses depends on ACS to authenticate the identities that post messages to, or subscribe to, queues and topics.

internal static void Main(string[] args)
{
  try
  {
    var acs = new ServiceManagementWrapper(
                  acsServiceNamespace,
                  acsUsername, acsPassword);

    Console.WriteLine("Setting up ACS namespace:"
                      + acsServiceNamespace);

    // ACS namespace setup for the Orders Website
    CleanupIdenityProviders(acs);
    CleanupRelyingParties(acs);
    CreateIdentityProviders(acs);
    CreateRelyingPartysWithRules(acs);

    // Create Service Bus topic, subscriptions and queue.
    SetupServiceBusTopicAndQueue();
  }
  catch (Exception ex)
  {
    // ... display exception information ...
  }
  Console.ReadKey();
}

The values used by the setup program for namespace names, account IDs, passwords, and keys are stored in the App.config file of the setup program project named TreyResearch.Setup. Notice that the code first cleans up the current ACS namespace by removing any existing settings so that the new settings replace the old ones. Without this cleanup step, the setup program might attempt to add duplicate settings or features such as identity providers or rule sets, which could cause an error.
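The clean-up-then-create sequence is what makes the setup program safe to run repeatedly. The following sketch illustrates the pattern with a hypothetical in-memory stand-in for a namespace that rejects duplicates. The FakeNamespace and SetupSketch types are assumptions made for this example and are not part of the Service Management wrapper library.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in for a namespace that rejects duplicate
// settings; this is not the ServiceManagementWrapper API.
public sealed class FakeNamespace
{
    private readonly HashSet<string> identityProviders
        = new HashSet<string>();

    public int Count
    {
        get { return identityProviders.Count; }
    }

    public void AddIdentityProvider(string name)
    {
        // HashSet.Add returns false if the name is already present.
        if (!identityProviders.Add(name))
        {
            throw new InvalidOperationException(
                "Duplicate identity provider: " + name);
        }
    }

    public void RemoveAllIdentityProviders()
    {
        identityProviders.Clear();
    }
}

public static class SetupSketch
{
    // Removes existing settings first and then creates the new ones, so
    // that the routine gives the same result however often it is run.
    public static void Configure(FakeNamespace ns,
                                 IEnumerable<string> providers)
    {
        ns.RemoveAllIdentityProviders();
        foreach (var name in providers)
        {
            ns.AddIdentityProvider(name);
        }
    }
}
```

Calling Configure twice with the same list of providers leaves the namespace in the same state as calling it once, whereas adding the same provider twice without the cleanup step raises an exception.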

To illustrate how easy the Service Management wrapper library is to use, the following code from the CleanupRelyingParties method of the setup program removes all existing relying parties except the one named "AccessControlManagement" from the current ACS namespace.

var rps = acsWrapper.RetrieveRelyingParties();
foreach (var rp in rps)
{
  if (rp.Name != "AccessControlManagement")
  {
    acsWrapper.RemoveRelyingParty(rp.Name);
  }
}

The setup program creates service identities by first removing any identity with the same name, and then adding a new one with the specified name and password. The values used in this code come from the App.config file.

acsWrapper.RemoveServiceIdentity(ContosoDisplayName);
acsWrapper.AddServiceIdentity(ContosoDisplayName,
                              ContosoPassword);

The setup program also uses the Service Management wrapper library to create rules that map claims from identity providers to the claims required by the Orders application. For example, the following code creates a pass-through rule for the Windows Live ID® identity provider that maps the NameIdentifier claim provided by Windows Live ID to a new Name claim.

var identityProviderName
    = SocialIdentityProviders.WindowsLiveId.DisplayName;

// pass nameidentifier as name
acsWrapper.AddPassThroughRuleToRuleGroup(
           defaultRuleGroup.RuleGroup.Name,
           identityProviderName,
           ClaimTypes.NameIdentifier, ClaimTypes.Name);

Configuring Windows Azure by Using the Built-in Management Objects

It is also possible to use the built-in objects that are part of the Windows Azure SDK to configure Windows Azure namespaces. For example, the setup program configures the access rules for the Service Bus endpoints in ACS. To do this it uses the classes in the Microsoft.ServiceBus.AccessControlExtensions.AccessControlManagement namespace.

The following code shows how the setup program creates a rule group for the Orders Statistics service. You can see that it sets ACS as the claim issuer, and adds two rules to the rule group. The first rule allows the authenticated identity externaldataanalyzer (a small and simple demonstration program that can display order statistics) to send requests. The second rule allows the authenticated identity headoffice (the on-premises management and monitoring application) to listen for requests. The code then adds the rule group to the OrdersStatisticsService relying party, and saves all the changes.

var settings = new AccessControlSettings(
                ServiceBusNamespace, DefaultKey);
ManagementService serviceClient = ManagementServiceHelper
               .CreateManagementServiceClient(settings);

serviceClient.DeleteRuleGroupByNameIfExists(
             "Rule group for OrdersStatisticsService");
serviceClient.SaveChanges(SaveChangesOptions.Batch);

var ruleGroup = new RuleGroup {
      Name = "Rule group for OrdersStatisticsService" };
serviceClient.AddToRuleGroups(ruleGroup);

// Equivalent to selecting "Access Control Service" as
// the input claim issuer in the Management portal.
var issuer = serviceClient.GetIssuerByName(
                                 "LOCAL AUTHORITY");

serviceClient.CreateRule(
    issuer,
    "http://schemas.xmlsoap.org/ws/2005/05/identity/"
      + "claims/nameidentifier",
    "externaldataanalyzer",
    "net.windows.servicebus.action",
    "Send",
    ruleGroup,
    string.Empty);

serviceClient.CreateRule(
    issuer,
    "http://schemas.xmlsoap.org/ws/2005/05/identity/"
      + "claims/nameidentifier",
    "headoffice",
    "net.windows.servicebus.action",
    "Listen",
    ruleGroup,
    string.Empty);

var relyingParty = serviceClient.GetRelyingPartyByName(
                       "OrdersStatisticsService", true);
var relyingPartyRuleGroup = new
  Microsoft.ServiceBus.AccessControlExtensions
  .AccessControlManagement.RelyingPartyRuleGroup();
relyingParty.RelyingPartyRuleGroups.Add(
                                relyingPartyRuleGroup);
serviceClient.AddToRelyingPartyRuleGroups(
                                relyingPartyRuleGroup);

serviceClient.AddLink(relyingParty,
      "RelyingPartyRuleGroups", relyingPartyRuleGroup);
serviceClient.AddLink(ruleGroup,
      "RelyingPartyRuleGroups", relyingPartyRuleGroup);

serviceClient.SaveChanges(SaveChangesOptions.Batch);

Markus says:
	There is a great deal of code in the TreyResearch.Setup project. You may find it useful to examine this project when you create your own setup programs, and reuse some of the generic routines it contains.

Summary

This chapter described how Trey Research tackled the issues surrounding deploying, configuring, monitoring, and managing a hybrid application. The important point to realize is that the complexity of the environment and its distributed nature mean that performance issues and failures will inevitably occur in such a system. The key to maintaining a good level of service is detecting these issues and failures, and responding quickly in a controlled, secure, and repeatable manner.

Windows Azure Diagnostics provides the basic tools to enable you to detect and determine the possible causes of errors and performance problems, but it is important that you understand how to relate the diagnostic information that is generated to the structure of your application. Analyzing this information is a task for a domain expert who not only understands the architecture and business operations of your application, but also has a thorough familiarity with the way in which this architecture maps to the services provided by Windows Azure.

The Windows Azure Service Management API provides controlled access to Windows Azure features, enabling you to build scripts and applications that an operator can use to deploy and manage the elements that comprise your application. This approach eliminates the need for an operator to have the same level of expertise with Windows Azure as the solution architect, and can also reduce the scope for errors by automating and sequencing many of the tasks involved in a complex deployment.

More Information

All links in this book are accessible from the book's online bibliography available at: http://msdn.microsoft.com/en-us/library/hh968447.aspx.

"About This Release of Enterprise Library" at http://msdn.microsoft.com/en-us/library/ff664636(v=PandP.50).aspx.

Whitepaper on the Enterprise Library Codeplex site that describes how you can use the Enterprise Library 5.0 application blocks with Windows Azure-hosted applications at
http://entlib.codeplex.com/releases/view/75025#DownloadId=336804.

Windows Azure Management Pack for Microsoft System Center Operations Manager at
http://pinpoint.microsoft.com/en-us/applications/system-center-monitoring-pack-for-windows-azure-applications-12884907699.
Azure Diagnostics Manager from Cerebrata at http://www.cerebrata.com/Products/AzureDiagnosticsManager/Default.aspx.
AzureWatch from Paraleap Technologies at http://www.paraleap.com/.
ManageAxis from Cumulux at
http://www.cumulux.com/products-and-services/cloud-operations/.

"Using the Windows Azure Diagnostics Configuration File" at
http://msdn.microsoft.com/en-us/library/gg604918.aspx.

"About the Service Management API" at http://msdn.microsoft.com/en-us/library/windowsazure/ee460807.aspx.

"Collecting Logging Data by Using Windows Azure Diagnostics" at http://msdn.microsoft.com/en-us/library/windowsazure/gg433048.aspx.

"Monitoring Windows Azure Applications" at http://msdn.microsoft.com/en-us/library/windowsazure/gg676009.aspx.

"Windows Azure Service Management REST API Reference" at http://msdn.microsoft.com/en-us/library/windowsazure/ee460799.aspx.

"Take Control of Logging and Tracing in Windows Azure" at http://msdn.microsoft.com/en-us/magazine/ff714589.aspx.

"Windows Azure PowerShell Cmdlets" at http://wappowershell.codeplex.com.
