Getting Started with Service-Oriented Architectures
A major advantage of using a service-oriented architecture (SOA) with Web services is that it fits in with the general chaos of business. There are many forces contributing to this chaos: organizations are acquired and divested, organizations restructure themselves, new products need to be sold, competition forces quick responses, and, of course, the list goes on and is ever changing.
Historically, IT groups have struggled to respond to this business chaos. An SOA provides a way to respond more nimbly. If an SOA is designed properly, it will approach the type of plug-compatibility that I have been alluding to with the audio-video (AV) examples sprinkled throughout this book.
Nevertheless, just using Web services for making connections does not guarantee your organization will have a functional SOA. The trick is in how you design your SOA.
This chapter provides design concepts and considerations along with staffing and change issues to take into account when establishing an SOA. It illustrates how a properly designed service interface can make it easier for an organization to respond to the chaos of modern business. At the end, there is a discussion of SOA governance.
I have waited until this point before suggesting that you establish an architecture, because it is important to have the experiences experimenting with Web services described in Chapter 11. An architecture based on experience is much more likely to succeed than one that is based on just reading a book or thinking about the technology.
Anchor an SOA in what your organization really needs and what your people are capable of accomplishing. Recall from Chapter 9 that smaller projects are more focused and are more likely to succeed. Large projects are likely to fail. Since 1994, the Standish Group has conducted studies on IT development projects, compiling the results in the Chaos Reports. In 2005, Watts S. Humphrey of the Software Engineering Institute looked at the Standish Group’s data by project size. His research showed that half of the smaller projects succeeded, whereas none of the largest projects did.1
You may want to go back to Chapter 10 to review the approach to developing an SOA as a series of small incremental projects.
1. Adopt industry standards. These standards include Web services and industry-standard semantic vocabularies. (The industry group specifying the standard semantic vocabularies could also be identified. See page 179 for a sample of vocabularies by industry.)
2. Use commercial off-the-shelf software as much as possible. The software must provide Web services adapters.
3. Encapsulate legacy applications with interfaces that meet industry standards. Web services must be used for the interface.
4. Use a data-independent layer between applications and data to hide the structure of the underlying data. All interaction must be through Web services.
5. Design services for reusability. Chapter 10 suggests one technique for determining the right “size” or granularity of a service. Use some type of methodology to improve the chances that the services are reusable.
6. Every service must be able to receive the same message multiple times with no adverse effects. For example, a service that receives updates to customer data must be able to receive the same update more than once without affecting the data. The reason for this is that the sending service may, for various reasons, deliver a message more than once. A common case is recovery after downtime: if the sender checkpoints only after some multiple of messages go out and goes down between checkpoints, it must resend some messages to be sure they went out. Redelivery can also result from mistakes in programming, multiple data requests, or simply unforeseen actions.
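Design consideration 6 is often implemented by having each message carry a unique id and having the service remember which ids it has already applied. The following is a minimal sketch of that idea; the class, method, and field names are all hypothetical, and a production service would persist the set of processed ids rather than keep it in memory.

```python
# Sketch of an idempotent update service (illustrative names only).
# Each update message carries a unique message_id; the service records
# the ids it has already applied so a redelivered message has no
# further effect on the data.

class CustomerService:
    def __init__(self):
        self._customers = {}     # customer_id -> customer data
        self._processed = set()  # message ids already applied

    def update_customer(self, message_id, customer_id, data):
        # Receiving the same message a second time must not change the data.
        if message_id in self._processed:
            return "duplicate-ignored"
        self._customers[customer_id] = data
        self._processed.add(message_id)
        return "applied"
```

With this structure, a sender that resends its last batch of updates after recovering from an outage cannot corrupt the customer data.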
7. Track and manage the use of services. If people see services as useful, they are going to be used—perhaps in new and unusual ways (which often may be a good thing). It is important to consider tracking the use of services. Amazon Web Services (AWS) provides a model to consider:
Unique token. Each developer has a unique token in AWS that is used to track usage, payments, and so on.
Versioning. The incoming messages to AWS specify the version of the messaging and XML vocabulary to use. This allows AWS to change its messaging without requiring all users to make changes at the same time. Only users interested in the change are affected.
Response groups. The incoming messages to AWS specify the desired response groups. Each response group contains certain data. This is an option to consider since, like the decomposition of services shown in Figure 10.4, it is possible to have services returning multiple types of responses. This could be achieved using response groups.
This is extra work, but it brings benefits: increased flexibility for enhancing service responses and the ability to track usage.
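The three AWS-inspired ideas above can be sketched in a single hypothetical request handler. The token value, version string, and response group names below are purely illustrative; they do not reflect any real AWS API.

```python
# Sketch of a service that applies the token, versioning, and response
# group ideas. All names are illustrative assumptions.

USAGE_LOG = []  # records which developer token made each request

RESPONSE_GROUPS = {
    "summary": lambda cust: {"id": cust["id"], "name": cust["name"]},
    "detail":  lambda cust: dict(cust),  # everything the service knows
}

def get_customer(request, customers):
    USAGE_LOG.append(request["token"])       # unique token: track usage
    if request["version"] != "2012-01":      # versioning: accept known versions only
        return {"error": "unsupported version"}
    customer = customers[request["customer_id"]]
    build = RESPONSE_GROUPS[request.get("response_group", "summary")]
    return build(customer)                   # response group shapes the reply
```

A caller that never asks for the "detail" group is unaffected when that group gains new fields, which is the flexibility the text describes.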
8. High-volume, high-speed messages should be sent within a service and low-volume, low-speed messages should be sent between services. This is one of those “relative” design considerations. Web services, no matter what, are going to run significantly slower than communication within most internal systems. Try to keep the high-volume, high-speed messages within a service. Figure 12.1 illustrates keeping high-volume, high-speed messages within an internal system that is also being used as a service.
Figure 12.1 Keep high-volume, high-speed messages within a service.
9. Balance the conflict between indeterminate and operational access. This conflict is often quite apparent when using an existing system as a service. That existing system was not necessarily designed for indeterminate or erratic requests. Having to deal with those requests with the service responsiveness expected is sometimes difficult to do with an existing system. Figure 12.2 illustrates this issue. (This issue is explored more later in this chapter.)
Figure 12.2 Conflict between indeterminate and operational access.
By now you should have an effective team of seven or fewer people who are very capable of taking on short-term projects. Your methodology should also be well established.
The most likely change issues you will encounter at this stage are:
Feeling that jobs may be threatened. The you know what will really hit the fan at this stage. For many organizations, it will be obvious that the size of the IT staff will begin to decrease and jobs might genuinely be threatened. Be prepared to communicate openly about this as soon as possible so that people have time to make decisions.
Not invented here. You are likely to be considering external services at this stage. It is very human to resist this. Be sure to get this resistance out in the open and really listen to concerns to make sure any legitimate concerns are addressed.
Our problems are special. This relates to the feeling that jobs may be threatened. Be sure to get these concerns out in the open to see whether any special issues really are being overlooked. Most likely, the problems are not special. Be prepared to communicate this effectively, and keep management informed as word spreads through the grapevine that “special issues” are being overlooked.
The example development illustrated in Figure 11.7 addressed the design-related restraining forces for adopting an SOA. Those design issues appeared in the force field analysis illustrated by Figure 6.9 and are as follows:
Delays getting data updates distributed
Deciding what data to warehouse
Delays in getting data to the warehouse
Effects on operational systems for up-to-the-moment data requests
But what if things are not going as planned? I’ll go back to the story about C. R.’s organization to illustrate problems and possible responses.
Figure 12.3 represents, at one point, the systems supporting the SOA for C. R.’s organization (Figure 11.7 is essentially a subset of this figure). To add more detail to the story, let’s say two issues appeared at this point in the use of the SOA:
1. The data warehouse was growing much faster than expected.
2. The response time of the services provided by an internal system was inadequate and the indeterminate access requests were adversely impacting the operational system. This is the issue illustrated by Figure 12.2.
Figure 12.3 Systems supporting the SOA of C. R.’s organization.
The response to the first issue was described in the section on adopting a platform as a service (PaaS) starting on page 74. This described how C. R.’s organization moved to a virtual private cloud to provide for a big data store, and is illustrated by Figure 12.4.
Figure 12.4 Using a PaaS cloud provider for a big data store and BI/analytics.
The PaaS includes tools to help develop, manage, and analyze the data in big data stores. It provides an ESB within the virtual private cloud that is optimized for the big data store and the business intelligence (BI)/analytics software.
The Internet is represented by the horizontal shaded area. Web services are shown as a black line within the shaded area, indicating that Web services protocols (SOAP, REST, JSON, etc.) are a subset of the protocols that can be used on the Internet.
Note the adapters aligned with the big data and BI/analytics in the virtual private cloud. They are needed because those services use a somewhat different semantic vocabulary than the one used by C. R.’s organization.
The second issue can be problematic. C. R.’s organization, like many others, was not in a position to change the internal system that was being adversely affected when used as a service. One solution is a middle-tier architecture that uses persistent caching.
A middle-tier architecture is one way to leverage the use of existing systems and databases. The middle tier changes where integration occurs. Instead of directly integrating existing systems and databases, a new layer is developed so that the integration occurs in the middle tier. Moving integration to the middle tier is the solution used by C. R.’s organization to address the conflict between indeterminate and operational access.
Figure 12.5 illustrates the basics of a middle-tier architecture2 that uses an application server and a middle-tier database. The middle tier is above internal systems. One of the internal systems that we have covered so far is at the bottom of the figure. It is also used as a service.
Figure 12.5 Middle-tier architecture.
Note that the adapter is at the bottom of the middle tier, above the internal system as it was in Figure 12.4. Since this application server is presumably new development, it can use the same semantic vocabulary and Web services message format as the ESB. An adapter is not needed for the application server.
It is possible to add persistence to the middle tier. Adding persistence makes sense when there is too much data to keep in the application server cache, or when you need the protection of persistence to make sure no data is lost before it can be written to the internal system. It can also be a way to boost the performance of services provided by an application server when it needs to access data. Middle-tier persistence, however, will require additional development.
A persistent cache adds capabilities to the in-memory cache, such as an expanded cache that can exceed available memory and the recovery of cached updates after a failure. The examples that follow assume that a database will be used in the middle tier to provide the persistent cache. A database manager ensures that all transactions are recorded properly and provides recovery and backup capabilities, if needed.
There are several ways that a cache could be populated:
1. On an as-needed basis. An instance moves into the cache only when a program requests to read the values of the instance.
2. Fully populated at start time. All instances needed in the cache are populated when the system starts up.
3. A combination of the first two. An example is populating the cache with the most likely instances that are needed and then moving additional instances into the cache when a program requests to read the values of the instances.
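The three population strategies above can be sketched with one small cache class. The names are hypothetical, and `backing_store` stands in for the internal system or database that holds the instances.

```python
# Sketch of the three cache-population strategies (illustrative names).

class MiddleTierCache:
    def __init__(self, backing_store, preload_keys=()):
        self._store = backing_store
        self._cache = {}
        # Strategies 2 and 3: populate some or all instances at start time.
        for key in preload_keys:
            self._cache[key] = self._store[key]

    def read(self, key):
        # Strategies 1 and 3: move an instance into the cache on demand,
        # the first time a program requests to read its values.
        if key not in self._cache:
            self._cache[key] = self._store[key]
        return self._cache[key]
```

Strategy 1 is the constructor with no `preload_keys`, strategy 2 passes every key, and strategy 3 passes only the most likely keys and lets `read` fetch the rest on demand.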
In any of these cases, the cache simply could be too large to keep efficiently in memory. A middle-tier database could act as an expanded cache to offload some of the data cached in memory.
Using a middle-tier database as an expanded cache adds options when the underlying internal system is updated. The updates could occur as they happen or at intervals, depending on the needs of the organization. For example, one option would be to populate the middle-tier database from the internal system at the beginning of a business day. All updates could be kept in the middle-tier database. These updates could then be written to the internal system at the end of the day or at intervals during the day.
If all middle-tier cache updates are written to a middle-tier database, then the cached updates are not lost if the application server should fail. They can be recovered from the middle-tier database when the application server is restored. This, of course, would not be necessary if updates to the internal system are made every time an update occurs. That, however, can create a performance hit to the middle tier, as will be discussed in the next section.
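The interval-update option just described can be sketched as follows. All class and method names are illustrative assumptions; a dictionary stands in for the middle-tier database and for the internal system.

```python
# Sketch of interval updating with a persistent middle-tier cache.
# Updates go to the in-memory cache and the middle-tier database together;
# the internal system is written only when flush_to_internal_system is
# called (for example, at the end of the business day).

class PersistentCache:
    def __init__(self, internal_system):
        self._internal = internal_system  # stands in for the internal system
        self._memory = {}                 # in-memory cache
        self._middle_db = {}              # stands in for the middle-tier database

    def update(self, key, value):
        self._memory[key] = value
        self._middle_db[key] = value      # persisted: survives a server failure

    def recover(self):
        # After an application-server restart, rebuild the in-memory cache
        # from the middle-tier database so no cached updates are lost.
        self._memory = dict(self._middle_db)

    def flush_to_internal_system(self):
        # Interval (or end-of-day) write of the accumulated updates.
        self._internal.update(self._middle_db)
```

The key point is that an application-server failure between flushes loses nothing, because every update was also written to the middle-tier database.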
If the middle-tier database uses the same data model as the middle-tier cache, there is a good chance that performance will be significantly better than if updates were written to the internal system as they happened.
This performance gain is possible assuming:
The internal system uses a data structure that is different from what is needed for the service. Chances are that this is true if the internal system has been around for a while.
The application server uses a cache that matches the needs of the object program in the application server. This cache could use either an object, XML, or other NoSQL data structure.
The middle-tier database uses the same data model as the cache.
Given these assumptions, writing an update to the internal system will most likely take longer than writing it to the middle-tier database. As the complexity of the model used by the object program in the application server increases, the difference in the time it takes to write the update to the middle-tier database versus the internal system also increases. This is because the mapping complexity increases between the data model in the cache and the model in the internal system. The mapping simply takes time and costs performance.3 As a result, an update to a middle-tier database can be significantly faster and allow processing to resume much sooner than if the update went to the internal system directly.4 Figure 12.6 shows the sequence of this processing.
Figure 12.6 Using a persistent cache in the middle tier.
There are many database options available for middle-tier persistence, because middle-tier databases essentially store temporary data. This is in contrast to internal system databases that are often seen as databases of record, which are expected to last “forever.” When you are considering a database product for an internal system, it is reasonable to choose a database management product from a well-known, established vendor.
In contrast, middle-tier databases—because they are temporary—open up the possibilities of using technologies that might significantly improve performance and reduce development as well as maintenance costs.
There are many issues to consider in selecting a middle-tier database. A discussion of those issues goes beyond the scope of this book. More information on middle-tier persistence can be found at http://www.service-architecture.com/object-oriented-databases/articles/middle_tier_architecture.html.
Figure 12.7 shows the systems supporting the SOA for C. R.’s organization after addressing the two issues that appeared after developing the enterprise data warehouse (EDW) and using an internal system as a service. The customer relationship management (CRM) from a software as a service (SaaS) cloud provider was also added for completeness.
Figure 12.7 Systems used by C. R.’s organization that include a PaaS cloud provider, SaaS cloud provider, and middle-tier persistence.
In Chapter 3, a service was described as software and hardware—and that one or more services support or automate a business function. Much of this book has focused on the software and hardware systems that are needed to support services. Let’s now focus on services. Chapter 10 had a generated decomposition of services illustrated in Figure 10.4. A portion from the lower right corner of that services decomposition is shown in Figure 12.8.
Figure 12.8 Two data services.
Figure 12.8 represents two low-level data services. For discussion’s sake, let’s say these two services were part of the data services for the prior discussion of the two issues facing C. R.’s organization. The Get Customers service could relate to the data warehouse that was moved to a big data store in a PaaS cloud provider, and the Get Invoice Items could relate to the middle-tier architecture used to relieve an internal system that was adversely affected by indeterminate access because it was also used as a service.
Something had to change related to these services when the underlying systems were changed. The adapters may have needed to be changed or perhaps some code in these services needed to be changed.
Note that before these solutions were implemented, C. R.’s organization had implemented services for the data warehouse and the internal system. The important point is that only the code in the low-level data services, or other code below them, needed changing when the systems were changed. The rest of the services remained unchanged. Presumably, all that was noticed was that the services related to the upgraded systems performed better. This is one way C. R.’s organization was able to respond easily to what could be seen as the chaos of business: by concentrating work in specific areas, knowing that the structure of the services would keep the system changes isolated.
The structure of the services is what consumers of those services see. They do not see the underlying systems. The design and management of the structure of services is important. Figure 12.9 shows the generated decomposition of services from Figure 10.4. To help think about the structure of the services, they are divided into two layers: business services and data services.5 The business services layer supports or automates the business functions. The data services layer interacts with the software and hardware systems to access and update data stored in those systems.
Figure 12.9 Example layers of an SOA.
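The two layers can be sketched using the Get Customers and Get Invoice Items data services from Figure 12.8. The function bodies and the data they return are illustrative assumptions; the point is that the business service composes data services and knows nothing about the big data store or the internal system behind them.

```python
# Sketch of the business services and data services layers
# (illustrative names and data).

def get_customers():
    # Data service: hides the big data store in the PaaS cloud.
    return [{"id": "c-1", "name": "Pat"}]

def get_invoice_items(customer_id):
    # Data service: hides the internal system behind the middle tier.
    return [{"customer": customer_id, "item": "widget", "qty": 3}]

def invoice_items_by_customer():
    # Business service: supports a business function by composing the
    # data services; it is unaffected when the underlying systems change.
    return {c["name"]: get_invoice_items(c["id"]) for c in get_customers()}
```

Swapping the data warehouse for a big data store changes only the body of `get_customers`; `invoice_items_by_customer` and its consumers never notice.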
Since Figure 12.9 is actually a data flow diagram from the tool used in Chapter 10, it shows only the data flows and not the control flows (requests). Figure 12.10 generalizes Figure 12.9 and adds arrows that represent commands or requests. At the bottom are internal interface services. These could, for example, be the interface for the data services layer mentioned earlier. At the top are external interface services that could be the interface services for the business services layer. These are the interfaces used by many other services, systems, applications, and so on. As will be discussed in the next section on governance, it is important to control or minimize changes to the Web services/messaging that connects these external services.
Figure 12.10 Interfaces of services in an SOA.
There could, however, be many services at each layer. So it makes sense to classify services with more detail than layers. Figure 12.11 shows services related to Figure 12.7 organized into collections of services. The circles within the rectangles represent services and each rectangle represents a collection of services.
Figure 12.11 Collections of services in an SOA.
Notice that no hardware or systems appear in Figure 12.11. Likewise, the services appear the same whether or not they relate to a cloud provider or an internal system.
If you could enter Figure 12.7 and stand inside the enterprise service bus (ESB), you would see a bunch of services. One way to make sense of those services is to organize them into collections as shown in Figure 12.11.
While you were inside the ESB, you would see that all the services appear to use the same semantic vocabulary and message protocol (SOAP, REST, JSON, etc.). This is why it makes sense to establish a semantic vocabulary as early as possible and to actively maintain it as described in the incremental SOA analysis discussed in Chapter 10 and illustrated in Figure 10.5.
As you can well imagine, there will be many services in an SOA. Managing those services is an important part of SOA governance. This includes:
Providing a means to identify services for reuse
Managing access to services by various entities/services/users
Monitoring usage of services by various entities/services/users
Analyzing the impact of proposed changes to services
Adhering to messaging standards including appropriate industry-wide standards
Adhering to semantic vocabularies including appropriate industry-wide semantic vocabularies
Monitoring the performance and availability of services and the underlying systems and hardware supporting services
Tracking where services run in the supporting systems and hardware
I’m going to dwell a bit on the last bullet. Services are code. They can be written in any language. They can run anywhere code can run. Often, this is an application server, but that is not a requirement. Part of the design process is to determine the best way to implement a service and part of governance is keeping track of where the services run in the supporting systems and hardware.
Chapter 3 mentioned using a service repository for governance. Details such as tokens mentioned on page 147 can be used for managing access and monitoring usage of services. There are products on the market to aid governance that have features such as these. You should decide what tools you will use for governance early since they will impact the development of services. You will need to take into account such features as tokens (or whatever the products use).
Your organization may also add other areas for governance, such as government regulations, laws, internal architectural principles, and so on.
This chapter provided design concepts and considerations along with staffing and change issues to take into account when establishing a service-oriented architecture. Using big data in a private cloud and a middle-tier architecture with an internal system, this chapter illustrated how properly designed service interfaces can make it easier for an organization to respond to the chaos of modern business.
It is possible to have an SOA without cloud computing. But with the way technology is moving, it is increasingly likely that most SOAs will use cloud computing. Chapter 13 provides suggestions for getting started with cloud computing.
1Watts S. Humphrey, “Why Big Software Projects Fail: The 12 Key Questions,” CrossTalk: The Journal of Defense Software Engineering, March 2005.
2There are various terms for the tiers in systems architecture. For this discussion, middle tier is used because it is in the middle between user systems and the internal system.
3More on mapping issues can be found at http://www.service-architecture.com/object-relational-mapping/articles/mapping_layer.html.
4IBM published a benchmark that showed significant performance gains with a middle-tier database between the WebSphere application server and DB2. See http://www.service-architecture.com/application-servers/articles/benchmark_using_a_transaction_accelerator.html. The full benchmark paper is available from a link on that page.
5Other variants of the number and names of service layers are used. Nearly all models of service layers include a data service layer.