Appendix C - Implementing Cross-Boundary Communication


A key aspect of any solution that spans the on-premises infrastructure of an organization and the cloud concerns the way in which the elements that comprise the solution connect and communicate. A typical distributed application contains many parts running in a variety of locations, which must be able to interact in a safe and reliable manner. Although the individual components of a distributed solution typically run in a controlled environment, carefully managed and protected by the organizations responsible for hosting them, the network that joins these elements together commonly utilizes infrastructure, such as the Internet, that is outside of these organizations' realms of responsibility.

Consequently, the network is the weak link in many hybrid systems; performance is variable, connectivity between components is not guaranteed, and all communications must be carefully protected. Any distributed solution must be able to handle intermittent and unreliable communications while ensuring that all transmissions are subject to an appropriate level of security.

The Windows Azure™ technology platform provides technologies that address these concerns and help you to build reliable and safe solutions. This appendix describes these technologies.

Jana says:
Choosing the most appropriate way for components to communicate with each other is crucial, and can have a significant bearing on the entire system's design.

Use Cases and Challenges

In a hybrid cloud-based solution, the various applications and services will be running on-premises or in the cloud and interacting across a network. Communicating across the on-premises/cloud divide typically involves implementing one or more of the following generic use cases. Each of these use cases has its own series of challenges that you need to consider.

Accessing On-Premises Resources From Outside the Organization

Description: Resources located on-premises are required by components running elsewhere, either in the cloud or at partner organizations.

The primary challenge associated with this use case concerns finding and connecting to the resources that the applications and services running outside of the organization utilize. When running on-premises, the code for these items frequently has direct and controlled access to these resources by virtue of running in the same network segment. However, when this same code runs in the cloud it is operating in a different network space, and must be able to connect back to the on-premises servers in a safe and secure manner to read or modify the on-premises resources.

Accessing On-Premises Services From Outside the Organization

Description: Services running on-premises are accessed by applications running elsewhere, either in the cloud or at partner organizations.

In a typical service-based architecture running over the Internet, applications running on-premises within an organization access services through a public-facing network. The environment hosting the service makes access available through one or more well-defined ports and by using common protocols; in the case of most web-based services this will be port 80 over HTTP, or port 443 over HTTPS. If the service is hosted behind a firewall, you must open the appropriate port(s) to allow inbound requests. When your application running on-premises connects to the service it makes an outbound call through your organization's firewall. The local port selected for the outbound call from your on-premises application is assigned by the operating system's TCP stack (it will probably be some high-numbered port not currently in use), and any responses from the service return on the same channel through the same port. The important point is that to send requests from your application to the service, you do not have to open any additional inbound ports in your organization's firewall.
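The following sketch illustrates the point about outbound calls by using two sockets on the local loopback interface. It is a minimal illustration rather than a model of a real firewall: the function name demo_outbound_call and the choice of 127.0.0.1 are assumptions made for the example. It shows that the client's local port is picked automatically and that the response returns over the same connection.

```python
import socket


def demo_outbound_call():
    """Open a loopback 'service', call it, and report the ports involved.

    Hypothetical helper for illustration only; everything runs on 127.0.0.1.
    """
    # The "service" listens on a known port (port 0 asks the OS to pick one).
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    service_port = server.getsockname()[1]

    # The "application" makes an outbound call; it never binds a port itself.
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(("127.0.0.1", service_port))
    client_port = client.getsockname()[1]  # assigned automatically

    conn, peer = server.accept()
    client.sendall(b"request")
    conn.recv(1024)
    conn.sendall(b"response")    # the reply travels back on the same connection
    reply = client.recv(1024)

    for s in (conn, client, server):
        s.close()
    return service_port, client_port, peer[1], reply


service_port, client_port, peer_port, reply = demo_outbound_call()
print("service port:", service_port, "client local port:", client_port)
```

The port that the service sees as the origin of the request is the same automatically assigned local port on which the client receives the reply, which is why no additional inbound port has to be opened on the caller's side.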

When you run a service on-premises, you are effectively reversing the communication requirements; applications running in the cloud and partner organizations need to make an inbound call through your organization's firewall and, possibly, one or more Network Address Translation (NAT) routers to connect to your services. Remember that the purpose of this firewall is to help guard against unrestrained and potentially damaging access to the assets stored on-premises from an attacker located in the outside world. Therefore, for security reasons, most organizations implement a policy that restricts inbound traffic to their on-premises business servers, blocking access to your services. Even if you are allowed to open up various ports, you are then faced with the task of filtering the traffic to detect and deny access to malicious requests.

The vital question for this use case, therefore, is: how do you enable access to services running on-premises without compromising the security of your organization?

Poe says:
Opening ports in your corporate firewall without due consideration of the implications can render your systems liable to attack. Many hackers run automated port-scanning software to search for opportunities such as this. They then probe any services listening on open ports to determine whether they exhibit any common vulnerabilities that can be exploited to break into your corporate systems.

Implementing a Reliable Communications Channel across Boundaries

Description: Distributed components require a reliable communications mechanism that is resilient to network failure and enables the components to be responsive even if the network is slow.

When you depend on a public network such as the Internet for your communications, you are completely dependent on the various network technologies managed by third-party operators to transmit your data. Utilizing reliable messaging to connect the elements of your system in this environment requires that you understand not only the logical messaging semantics of your application, but also how you can meet the physical networking and security challenges that these semantics imply.

A reliable communications channel does not lose messages, although it may choose to discard information in a controlled manner in well-defined circumstances. Addressing this challenge requires you to consider the following issues:

Cross-Cutting Concerns

In conjunction with the functional aspects of connecting components to services and data, you also need to consider the common non-functional challenges that any communications mechanism must address.

Security

The first and most important of these challenges is security. You should treat the network as a hostile environment and be suspicious of all incoming traffic. Specifically, you must ensure that the communications channel used for connecting to a service is well protected. Requests may arrive from services and organizations running in a different security domain from your organization. You should be prepared to authenticate all incoming requests, and authorize them according to your organization's data access policy to guard your organization's resources from unauthorized access.

You must also take steps to protect all outgoing traffic, as the data that you are transmitting will be vulnerable as soon as it leaves the environs of your organization.

The questions that you must consider when implementing a safe communications channel include:

Poe says:
Robust security is a vital element of any application that is accessible across a network. If security is compromised, the results can be very costly and users will lose faith in your system.

Responsiveness

A well-designed solution ensures that the system remains responsive, even while messages are flowing across a slow, error-prone network between distant components. Senders and receivers will probably be running on different computers, hosted in different datacenters (either in the cloud, on-premises, or within a third-party partner organization), and located in different parts of the world. You must answer the following questions:

Interoperability

Hybrid applications combine components built using different technologies. Ideally, the communications channel that you implement should be independent of these technologies. Following this strategy not only reduces dependencies on the way in which existing elements of your solution are implemented, but also helps to ensure that your system is more easily extensible in the future.

Maintaining messaging interoperability inevitably involves adopting a standards-based approach, utilizing commonly accepted networking protocols such as TCP and HTTP, and message formats such as XML and SOAP. A common strategy to address this issue is to select a communications mechanism that layers neatly on top of a standard protocol, and then implement the appropriate libraries to format messages in a manner that components built using different technologies can easily parse and process.

Windows Azure Technologies for Implementing Cross-Boundary Communication

If you are building solutions based on direct access to resources located on-premises, you can use Windows Azure Connect to establish a safe, virtual network connection to your on-premises servers. Your code can utilize this connection to read and write the resources to which it has been granted access.

If you are following a service-oriented architecture (SOA) approach, you can build services to implement more functionally focused access to resources; you send messages to these services that access the resources in a controlled manner on your behalf. Communication with services in this architecture frequently falls into one of two distinct styles:

The following sections provide more details on Windows Azure Connect, Windows Azure Service Bus Relay, and Service Bus queues; and describe when you should consider using each of them.

Accessing On-Premises Resources from Outside the Organization Using Windows Azure Connect

Windows Azure Connect enables you to integrate your Windows Azure roles with your on-premises servers by establishing a virtual network connection between the two environments. It implements a network level connection based on standard IP protocols between your applications and services running in the cloud and your resources located on-premises, and vice versa.

Guidelines for Using Windows Azure Connect

Using Windows Azure Connect provides many benefits over common alternative approaches:

Windows Azure Connect is suitable for the following scenarios:

Note:
For up-to-date information about best practices for implementing Windows Azure Connect, visit the Windows Azure Connect Team Blog.

Windows Azure Connect Architecture and Security Model

Windows Azure Connect is implemented as an IPv6 virtual network by Windows Azure Connect endpoint software running on each server and role that participates in the virtual network. The endpoint software transparently handles DNS resolution and manages the IP connections between your servers and roles. It is installed automatically on roles running in the cloud that are configured as connect-enabled. For servers running on-premises, you download and install the Windows Azure Connect endpoint software manually. This software executes in the background as a Windows service. Similarly, if you are using Windows Azure Connect to connect from a VM role, you must install the Windows Azure Connect endpoint software in this role before you deploy it to the cloud.

You use the Windows Azure Management Portal to generate an activation token that you include as part of the configuration for each role and each instance of the Windows Azure Connect endpoint software running on-premises. Windows Azure Connect uses this token to link the connection endpoint to the Windows Azure subscription and ensures that the virtual network is only accessible to authenticated servers and roles. Network traffic traversing the virtual network is protected end-to-end by using certificate-based IPsec over Secure Socket Tunneling Protocol (SSTP). Windows Azure Connect provisions and configures the appropriate certificates automatically, and does not require any manual intervention on the part of the operator.

The Windows Azure Connect endpoint software establishes communications with each node by using the Windows Azure Connect Relay service, hosted and managed by Microsoft in their datacenters. The endpoint software uses outbound HTTPS connections only to communicate with the Windows Azure Connect Relay service. However, when it is installed, the Windows Azure Connect endpoint software creates a firewall rule for Internet Control Message Protocol for IPv6 (ICMPv6) communications that allows Router Solicitation (Type 133) and Router Advertisement (Type 134) messages. These messages are required to establish and maintain an IPv6 link. Do not disable this rule.

Bharath says:
Microsoft implements separate instances of the Windows Azure Connect Relay service in each region. For best performance, choose the relay region closest to your organization when you configure Windows Azure Connect.

Figure 2 - The security architecture of Windows Azure Connect

You manage the connection security policy that governs which servers and roles can communicate with each other using the Management Portal; you create one or more endpoint groups containing the host servers that comprise your solution (and that have the Windows Azure Connect endpoint software installed), and then specify the Windows Azure roles that they can connect to. This collection of host servers and roles constitutes a single virtual network.

Note:
For more information about configuring Windows Azure Connect and creating endpoint groups, see "Windows Azure Connect" on MSDN.

Limitations of Windows Azure Connect

Windows Azure Connect is intended for providing direct access to corporate resources, either located on-premises or in the cloud. It provides a general purpose solution, but is subject to some constraints as summarized by the following list:

Accessing On-Premises Services from Outside the Organization Using Windows Azure Service Bus Relay

Windows Azure Service Bus Relay provides the communication infrastructure that enables you to expose a service to the Internet from behind your firewall or NAT router. The Windows Azure Service Bus Relay service provides an easy-to-use mechanism for connecting applications and services running on either side of the corporate firewall, enabling them to communicate safely without requiring a complex security configuration or custom messaging infrastructure.

Guidelines for Using Windows Azure Service Bus Relay

Windows Azure Service Bus Relay is ideal for enabling secure communications with a service running on-premises, and for establishing peer-to-peer connectivity. Using Windows Azure Service Bus Relay brings a number of benefits:

You should note that Windows Azure Service Bus Relay is not suitable for implementing all communication solutions. For example, it imposes a temporal dependency between the services running on-premises and the client applications that connect to them; a service must be running before a client application can connect to it, otherwise the client application will receive an EndpointNotFoundException exception (this limitation applies even with the NetOnewayRelayBinding and NetEventRelayBinding bindings described in the section "Selecting a Binding for a Service" later in this appendix). Furthermore, Windows Azure Service Bus Relay is heavily dependent on the reliability of the network; a service may be running, but if a client application cannot reach it because of a network failure the client will again receive an EndpointNotFoundException exception. In these cases, using Windows Azure Service Bus queues may provide a better alternative; see the section "Implementing a Reliable Communications Channel across Boundaries Using Service Bus Queues" later in this appendix for more information.

You should consider using Windows Azure Service Bus Relay in the following scenarios:

Guidelines for Securing Windows Azure Service Bus Relay

Windows Azure Service Bus Relay endpoints are organized by using Service Bus namespaces. When you create a new service that communicates with client applications by using Windows Azure Service Bus Relay you can use the Management Portal to generate a new service namespace. This namespace must be unique, and it determines the uniform resource identifier (URI) that your service exposes; client applications specify this URI to connect to your service through Windows Azure Service Bus Relay. For example, if you create a namespace with the value TreyResearch and you publish a service named OrdersService in this namespace, the full URI of the service is sb://treyresearch.servicebus.windows.net/OrdersService.
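The way in which the namespace determines the service URI can be expressed as a small helper. This is only a sketch based on the TreyResearch example above; the function name relay_service_uri is hypothetical, and lowercasing the namespace simply mirrors the example rather than a documented rule.

```python
def relay_service_uri(namespace, service_name, scheme="sb"):
    """Build the URI that a relayed service exposes (illustrative helper only).

    The namespace is lowercased to mirror the example in the text; the
    service name is kept exactly as given.
    """
    host = "%s.servicebus.windows.net" % namespace.lower()
    return "%s://%s/%s" % (scheme, host, service_name.lstrip("/"))


print(relay_service_uri("TreyResearch", "OrdersService"))
```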

The services that you expose through Windows Azure Service Bus Relay can provide access to sensitive data, and are themselves valuable assets; therefore you should protect these services. There are several facets to this task:

Poe says:
Remember that even though Service Bus is managed and maintained by one or more Microsoft datacenters, applications connect to Windows Azure Service Bus Relay across the Internet. Unauthorized applications that can connect to your Service Bus namespaces can implement common attacks, such as denial of service to disrupt your operations, or Man-in-the-Middle to steal data as it is passed to your services. Therefore, you should protect your Service Bus namespaces and the services that use them as carefully as you would defend your on-premises assets.

Figure 9 illustrates the core recommendations for protecting services exposed through Windows Azure Service Bus Relay.

Note:
You can configure message authentication and encryption by configuring the WCF binding used by the service. For more information, see "Securing and Authenticating a Service Bus Connection" on MSDN.

Figure 9 - Recommendations for protecting services exposed through Windows Azure Service Bus Relay

Many organizations implement outbound firewall rules that are based on IP address allow-listing. In this configuration, to provide access to Service Bus or ACS you must add the addresses of the corresponding Windows Azure services to your firewall. These addresses vary according to the region hosting the services, and they may also change over time, but the following list shows the addresses for each region at the time of writing:

Poe says:
IP address allow-listing is not really a suitable security strategy for an organization when the target addresses identify a massively multi-tenant infrastructure such as Windows Azure (or any other public cloud platform, for that matter).

Guidelines for Naming Services in Windows Azure Service Bus Relay

If you have a large number of services, you should adopt a standardized convention for naming the endpoints for these services. This will help you manage, protect, and monitor services and the client applications that connect to them. Many organizations commonly adopt a hierarchical approach. For example, if Trey Research had sites in Chicago, New York, and Washington, each of which provided ordering and shipping services, an administrator might register URIs following the naming convention shown in this list:

However, when you register the URI for a service with Windows Azure Service Bus Relay, no other service can listen on any URI scoped at a lower level than your service. This means that if in the future Trey Research decided to implement an additional orders service for exclusive customers, they could not register it by using a URI such as sb://treyresearch.servicebus.windows.net/chicago/ordersservice/exclusive.

To avoid problems such as this, you should ensure that the initial part of each URI is unique. You can generate a new GUID for each service, and prepend the city and service name elements of the URI with this GUID. In the Trey Research example, the URIs for the Chicago services, including the exclusive orders service, could be:

For more information about naming guidelines for Windows Azure Service Bus Relay services, see "AppFabric Service Bus – Things You Should Know – Part 1 of 3 (Naming Your Endpoints)."
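The scoping rule and the GUID-prefix workaround described in this section can be modeled with a short sketch. The RelayNamespaceModel class and the unique_path helper are hypothetical names invented for this illustration; they are an in-memory model of the rule, not part of the Service Bus API. Treating an exact duplicate as a conflict is an additional assumption.

```python
import uuid


class RelayNamespaceModel:
    """Toy in-memory model of the URI scoping rule; not the Service Bus API."""

    def __init__(self, namespace):
        self.base = "sb://%s.servicebus.windows.net/" % namespace.lower()
        self.registered = []  # each entry is a tuple of path segments

    def register(self, path):
        segments = tuple(s.lower() for s in path.strip("/").split("/"))
        for existing in self.registered:
            # Reject a path at or below an already registered service.
            # (Rejecting an exact duplicate is an assumption of this model.)
            if segments[:len(existing)] == existing:
                raise ValueError("%s is scoped under registered service %s"
                                 % ("/".join(segments), "/".join(existing)))
        self.registered.append(segments)
        return self.base + "/".join(segments)


def unique_path(path):
    """Prepend a fresh GUID so the initial part of the URI is unique."""
    return "%s/%s" % (uuid.uuid4(), path.strip("/"))


ns = RelayNamespaceModel("TreyResearch")
print(ns.register("chicago/ordersservice"))
print(ns.register("chicago/shippingservice"))
try:
    ns.register("chicago/ordersservice/exclusive")
except ValueError as err:
    print("rejected:", err)
print(ns.register(unique_path("chicago/ordersservice/exclusive")))
```

Because every GUID-prefixed path starts with a different first segment, no service registered this way can fall within the scope of another, whatever city and service names follow the GUID.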

Selecting a Binding for a Service

The purpose of Windows Azure Service Bus Relay is to provide a safe and reliable connection to your services running on-premises for client applications executing on the other side of your corporate firewall. Once a service has registered with the Windows Azure Service Bus Relay service, much of the complexity associated with protecting the service and authenticating and authorizing requests can be handled transparently outside the scope of the business logic of the service. If you are using WCF to implement your services, you can use the same types and APIs that you are familiar with in the System.ServiceModel assembly. The Windows Azure SDK includes transport bindings, behaviors, and other extensions in the Microsoft.ServiceBus assembly for integrating a WCF service with Windows Azure Service Bus Relay.

Markus says:
If you are familiar with building services and client applications using WCF, you should find Windows Azure Service Bus Relay quite straightforward.

As with a regular WCF service, selecting an appropriate binding for a service that uses Windows Azure Service Bus Relay has an impact on the connectivity for client applications and the functionality and security that the transport provides. The Microsoft.ServiceBus assembly provides four sets of bindings:

Windows Azure Service Bus Relay and Windows Azure Connect Compared

There is some overlap in the features provided by Windows Azure Service Bus Relay and Windows Azure Connect. However, when deciding which of these technologies you should use, consider the following points:

Implementing a Reliable Communications Channel across Boundaries Using Service Bus Queues

Service Bus queues enable you to decouple services from the client applications that use them, both in terms of functionality (a client application does not have to implement any specific interface or proxy to send messages to a receiver) and time (a receiver does not have to be running when a client application posts it a message). Service Bus queues implement reliable, transactional messaging with guaranteed delivery, so messages are never inadvertently lost. Moreover, Service Bus queues are resilient to network failure; as long as a client application can post a message to a queue it will be delivered when the service is next able to connect to the queue.

When you are dealing with message queues, keep in mind that client applications and services can both send and receive messages. The descriptions in this section therefore refer to "senders" and "receivers" rather than client applications and services.

Service Bus Messages

A Service Bus message is an instance of the BrokeredMessage class. It consists of two elements: the message body, which contains the information being sent, and a collection of message properties, which can be used to add metadata to the message.

The message body is opaque to the Service Bus queue infrastructure and it can contain any application-defined information, as long as this data can be serialized. The message body may also be encrypted for additional security. The contents of the message are never visible outside of a sending or receiving application, not even in the Management Portal.

Markus says:
The data in a message must be serializable. By default the BrokeredMessage class uses a DataContractSerializer object with a binary XmlDictionaryWriter to perform this task, although you can override this behavior and provide your own XmlObjectSerializer object if you need to customize the way the data is serialized. The body of a message can also be a stream.

In contrast, the Service Bus queue infrastructure can examine the metadata of a message. Some of the metadata items define standard messaging properties that an application can set, and that the Service Bus queues infrastructure uses for performing tasks such as uniquely identifying a message, specifying the session for a message, indicating the expiration time for a message if it is undelivered, and many other common operations. Messages also expose a number of system-managed read-only properties, such as the size of the message and the number of times a receiver has retrieved the message in PeekLock mode but not completed the operation successfully. Additionally, an application can define custom properties and add them to the metadata. These items are typically used to pass additional information describing the contents of the message, and they can also be used by Service Bus to filter and route messages to message subscribers.

Guidelines for Using Service Bus Queues

Service Bus queues are perfect for implementing a system based on asynchronous messaging. You can build applications and services that utilize Service Bus queues by using the Windows Azure SDK. This SDK includes APIs for interacting directly with the Service Bus queues object model, but it also provides bindings that enable WCF applications and services to connect to queues in a similar way to consuming Microsoft Windows Message Queuing queues in an enterprise environment.

Bharath says:
Prior to the availability of Service Bus queues, Windows Azure provided message buffers. These are still available, but they are only included for backwards compatibility. If you are implementing a new system, you should use Service Bus queues instead.
You should also note that Service Bus queues are different from Windows Azure storage queues, which are used primarily as a communication mechanism between web and worker roles running on the same site.

Service Bus queues enable a variety of common patterns and can assist you in building highly elastic solutions as described in the following scenarios:

Guidelines for Sending and Receiving Messages Using Service Bus Queues

You can implement the application logic that sends and receives messages using a variety of technologies:

Sending and Receiving Messages Asynchronously

If you are using the Windows Azure SDK, you can implement applications that send and receive messages by using the MessageSender and MessageReceiver classes in the Microsoft.ServiceBus.Messaging namespace. These types expose the messaging operations described earlier in this appendix. The basic functionality for sending and receiving messages is available through the Send and Receive methods of these types. However, these operations are synchronous. For example, the Send method of the MessageSender class waits for the send operation to complete before continuing, and similarly the Receive method of the MessageReceiver class either waits for a message to be available or until a specified timeout period has expired. Remember that these methods are really just façades in front of a series of HTTP REST requests, and that the Service Bus queue is a remote service being accessed over the Internet. Therefore, your applications should assume that:

Scheduling, Expiring, and Deferring Messages

By default, when a sender posts a message to a queue, it is immediately available for a receiver to retrieve and process. However, you can arrange for a message to remain invisible when it is first sent and only appear on the queue at a later time. This technique is useful for scheduling messages that should only be processed after a particular point in time; for example, the data could be time sensitive and may not be released until after midnight. To specify the time when the message should appear on the queue and be available for processing, set the ScheduledEnqueueTimeUtc property of the BrokeredMessage object.

When a sender posts a message to a queue, that message might wait in the queue for some considerable time before a receiver picks it up. The message might have a lifetime after which it becomes stale and the information that it holds is no longer valid. In this case, if the message has not been received then it should be silently removed from the queue. You can achieve this by setting the TimeToLive property of the BrokeredMessage object when the sender posts the message.
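The effect of scheduling and expiring messages can be illustrated with a small in-memory model. ToyMessage and ToyQueue are hypothetical classes written for this sketch; they only mirror the ScheduledEnqueueTimeUtc and TimeToLive properties by name and do not call Service Bus. Measuring the lifetime from the moment the message is sent is an assumption of the model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class ToyMessage:
    body: str
    sent_at: datetime
    scheduled_enqueue_time_utc: Optional[datetime] = None  # mirrors ScheduledEnqueueTimeUtc
    time_to_live: Optional[timedelta] = None               # mirrors TimeToLive


class ToyQueue:
    """In-memory model of scheduling and expiry; not the Service Bus API."""

    def __init__(self):
        self._messages: List[ToyMessage] = []

    def send(self, message: ToyMessage) -> None:
        self._messages.append(message)

    def receive(self, now: datetime) -> Optional[ToyMessage]:
        # Expired messages are silently removed before anything is delivered.
        self._messages = [m for m in self._messages if not self._expired(m, now)]
        for m in self._messages:
            visible = (m.scheduled_enqueue_time_utc is None
                       or m.scheduled_enqueue_time_utc <= now)
            if visible:
                self._messages.remove(m)
                return m
        return None  # nothing is visible yet

    @staticmethod
    def _expired(m: ToyMessage, now: datetime) -> bool:
        # Assumption: the lifetime is measured from the time the message was sent.
        return m.time_to_live is not None and now >= m.sent_at + m.time_to_live


queue = ToyQueue()
sent = datetime(2012, 1, 1, 22, 0)
queue.send(ToyMessage("stale quote", sent, time_to_live=timedelta(minutes=5)))
queue.send(ToyMessage("release after midnight", sent,
                      scheduled_enqueue_time_utc=datetime(2012, 1, 2, 0, 0)))
queue.send(ToyMessage("ordinary", sent))
first = queue.receive(sent + timedelta(minutes=10))
print(first.body if first else None)
```

In this run the first message has expired by the time the receiver asks, the second is not yet visible, and so the receiver is handed the third; the scheduled message only becomes available to a receive call made after midnight.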

In some situations, an application may not want to process the next available message but skip over it, retrieve subsequent messages, and only return to the skipped message later. You can achieve this by deferring the message, using the Defer method of the BrokeredMessage class. To implement this mechanism, an application must retrieve messages by using PeekLock mode. The Defer method leaves the message on the queue, but it is locked and unavailable to other receivers. At the appropriate juncture, the application can return to the message to process it, and then finish by calling the Complete or Abandon methods as described earlier in this appendix. In the event that a message is no longer useful or valid at the time that it is processed, the application can optionally dead letter it. Note that if the application fails, the lock eventually times out and the message becomes available in the queue. You can specify the lock duration by setting the LockDuration property of the queue when it is created.
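The locking behavior that underpins PeekLock mode can be sketched in the same way. The ToyPeekLockQueue class below is a hypothetical in-memory model, not the Service Bus object model; it covers only locking, completing, abandoning, and lock expiry, and deliberately leaves out deferral and dead lettering. The delivery_count field is an illustrative stand-in for the system-managed count of unsuccessful retrievals.

```python
from datetime import datetime, timedelta


class ToyEntry:
    """A queued message plus its lock state (illustrative only)."""

    def __init__(self, body):
        self.body = body
        self.locked_until = None
        self.delivery_count = 0


class ToyPeekLockQueue:
    """In-memory model of PeekLock semantics; not the Service Bus API."""

    def __init__(self, lock_duration):
        self.lock_duration = lock_duration  # mirrors the queue's LockDuration
        self._entries = []

    def send(self, body):
        self._entries.append(ToyEntry(body))

    def receive(self, now):
        # Hand out the first message that is not currently locked.
        for entry in self._entries:
            if entry.locked_until is None or entry.locked_until <= now:
                entry.locked_until = now + self.lock_duration
                entry.delivery_count += 1
                return entry
        return None

    def complete(self, entry):
        self._entries.remove(entry)   # processing succeeded; message is gone

    def abandon(self, entry):
        entry.locked_until = None     # release the lock immediately


queue = ToyPeekLockQueue(lock_duration=timedelta(seconds=30))
queue.send("order-1")
start = datetime(2012, 1, 1, 12, 0, 0)
message = queue.receive(start)
# Suppose the receiver fails here without calling complete or abandon.
print(queue.receive(start + timedelta(seconds=10)))            # still locked
print(queue.receive(start + timedelta(seconds=31)).body)       # lock timed out
```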

Guidelines for Securing Service Bus Queues

Service Bus queues provide a messaging infrastructure for business applications. They are created and managed by Windows Azure, in the cloud. Consequently they are reliable and durable; once a sender has posted a message to a queue it will remain on the queue until it has been retrieved by a receiver or it has expired.

A Service Bus queue is held in a Service Bus namespace identified by a unique URI. You establish this URI when you create the namespace, and the URI structure is similar to that described in the section "Guidelines for Securing Windows Azure Service Bus Relay" earlier in this appendix. An application instantiates a MessagingFactory object using this URI. The MessagingFactory object can then be used to create a MessageSender or MessageReceiver object that connects to the queue.

The Service Bus namespace provides the protection context for a queue, and the namespace holding your queues should only be made available to authenticated senders and receivers. You protect namespaces by using ACS, in a manner very similar to that described in the section "Guidelines for Securing Windows Azure Service Bus Relay" earlier in this appendix, except that the realm of the relying party application is the URI of the Service Bus namespace with the name of the Service Bus queue, topic, or subscription appended (such as http://treyresearch.servicebus.windows.net/orderstatusupdatequeue) rather than the address of a WCF service.

You can create an ACS rule group for this URI and assign the net.windows.servicebus.action claim type values Send, Listen, and Manage to authenticated identities, as appropriate. You should note that the Send and Listen claims each confer a very minimal set of privileges, enabling an application to post messages to a queue or retrieve messages from a queue respectively, but very little else. If your application needs to perform tasks such as creating a new queue, querying the number of messages currently posted to a queue, or even simply determining whether a queue with a given name exists, the application must run with an identity that has been granted the rights associated with the Manage claim.
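The division of privileges among the three claim values can be summarized as a simple lookup. The operation names and the is_authorized function below are hypothetical and exist only for this sketch; the mapping restates the paragraph above and does not query ACS. The sketch makes no assumption about whether Manage also implies Send or Listen, so each operation is checked against exactly one claim.

```python
# Which claim value each operation needs, as described in the text above.
REQUIRED_CLAIM = {
    "post_message": "Send",
    "retrieve_message": "Listen",
    "create_queue": "Manage",
    "get_message_count": "Manage",
    "queue_exists": "Manage",
}


def is_authorized(granted_claims, operation):
    """Return True if the identity holds the claim that the operation needs."""
    return REQUIRED_CLAIM[operation] in granted_claims


sender_claims = {"Send"}
print(is_authorized(sender_claims, "post_message"))   # posting is allowed
print(is_authorized(sender_claims, "queue_exists"))   # needs the Manage claim
```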

All communications with a Service Bus queue occur over a TCP channel, encrypted by using SSL. If you need to implement additional security at the message level, you should encrypt the contents of messages and the receiver should be provided with the decryption key. In this way, if a message is somehow intercepted by a rogue receiver it will not be able to examine the contents of the message. Similarly, if the valid receiver of a message is not able to decrypt that message, it should be treated as a poison message from a rogue sender and moved to the dead letter queue.

Note:
You can also implement a mechanism to verify the identity of a sender posting a message to a Service Bus queue by adding an identity token to the header of the message. If this token is missing or unrecognized by the receiving application, the message should be treated as suspect. For an example of how to implement this approach, see the section "Securing Messages" in Chapter 4, "Implementing Reliable Messaging and Communications with the Cloud."
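One possible way to construct and check such an identity token is with a keyed hash. The sketch below is an assumption for illustration and is not the approach shown in Chapter 4: the property names SenderId and IdentityToken, the shared key, and the use of HMAC-SHA256 are all choices made for this example.

```python
import hashlib
import hmac


def make_identity_token(shared_key: bytes, sender_id: str, body: bytes) -> str:
    """Compute an HMAC-SHA256 token over the sender id and the message body."""
    data = sender_id.encode("utf-8") + b"\n" + body
    return hmac.new(shared_key, data, hashlib.sha256).hexdigest()


def is_suspect(shared_key: bytes, properties: dict, body: bytes) -> bool:
    """Treat a message as suspect if its token is missing or does not verify."""
    sender_id = properties.get("SenderId")        # hypothetical property name
    token = properties.get("IdentityToken")       # hypothetical property name
    if sender_id is None or token is None:
        return True
    expected = make_identity_token(shared_key, sender_id, body)
    return not hmac.compare_digest(expected, token)


key = b"example-shared-key"   # illustrative only; real keys need secure storage
body = b"order 42 shipped"
properties = {
    "SenderId": "chicago-orders",
    "IdentityToken": make_identity_token(key, "chicago-orders", body),
}
print(is_suspect(key, properties, body))   # a genuine message
print(is_suspect(key, {}, body))           # the token is missing
```

A shared-key scheme like this requires the sender and receiver to exchange the key out of band, which is a design trade-off rather than a requirement stated in the text.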

More Information

All links in this book are accessible from the book's online bibliography available at: http://msdn.microsoft.com/en-us/library/hh968447.aspx.