IT Disaster Recovery Planning For Dummies

Chapter 3

Developing and Using a Business Impact Analysis

In This Chapter

Understanding a Business Impact Analysis

Conducting a BIA

Looking into threat modeling and risk analysis

Deciding what’s critical in your business

Organizations have limited resources. You can do only so much with the people, budget, and equipment available. Software companies want to add more features and functionality to their products; financial services organizations want to have additional investment and management plans available for their customers; automobile manufacturers want to have additional models, features, and accessories available for their customers to choose from. But these organizations can’t always add what they want to their products and services.

Businesses that want to develop disaster recovery (DR) capabilities are also constrained by limited resources. At first glance, it makes sense that all business processes and information systems should have disaster recovery capabilities. However, an organization just can’t have DR plans for all of its processes and systems — it doesn’t have enough resources or enough time. So how does an organization decide which processes and systems warrant the expense and effort related to the development of DR plans? Most businesses use a Business Impact Analysis (BIA) to help them make this decision.

You use a structured, top-down approach to create a BIA. In this chapter, I take you on a tour of the ins and outs of BIA development and show how it supports long-term DR development.

Understanding the Purpose of a BIA

A Business Impact Analysis (BIA) is a detailed inventory of the primary processes, systems, assets, people, and suppliers that are associated with an organization’s principal business activities.

The BIA starts out as a list, but it becomes a web. You end up with a connected set of lists in which the entries in one list refer to entries in other lists — dependencies across the spectrum of processes, systems, assets, people, and suppliers. Process A depends on Systems K, L, and M; requires the use of Assets S and T; is operated by key personnel in Department Q; and depends on supplies delivered by Suppliers Y and Z. Think of the BIA as a sort of three-dimensional connect-the-dots, in which entries in various layers have connections to entries in other layers. Like in the organization itself, everything is interconnected.
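To make the web idea concrete, here’s a minimal sketch in Python of how one entry in a process list could point at entries in the other lists. The ProcessEntry class and its field names are invented for illustration and aren’t part of any BIA standard; the values come from the Process A example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProcessEntry:
    """One row in a hypothetical BIA process list."""
    name: str
    systems: List[str] = field(default_factory=list)
    assets: List[str] = field(default_factory=list)
    personnel: List[str] = field(default_factory=list)
    suppliers: List[str] = field(default_factory=list)


def dependencies(entry):
    """Return everything the process depends on, across all lists."""
    return entry.systems + entry.assets + entry.personnel + entry.suppliers


process_a = ProcessEntry(
    name="Process A",
    systems=["System K", "System L", "System M"],
    assets=["Asset S", "Asset T"],
    personnel=["Department Q"],
    suppliers=["Supplier Y", "Supplier Z"],
)

for item in dependencies(process_a):
    print(f"{process_a.name} depends on {item}")
```

A spreadsheet with one column per dependency type captures the same structure; the point is only that each entry names entries in other lists.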

The core purpose of a Business Impact Analysis is to identify which processes and systems are the most critical to the survival of an organization.

Here’s a closer look at two of the terms I use in the preceding paragraph:

Critical: This word refers to those processes and systems that your business absolutely needs in order to perform its main functions.

Survival: Saving your business from suffering a catastrophic blow that could result in substantial damage to the business, including closing its doors for the last time and shutting down for good. I’m not talking about avoiding a bad financial year or trying not to lose customers or market share.

Here’s what the Business Impact Analysis does:

Determines which business processes you need to recover and restart as soon as possible after a disaster

Determines how soon you need to restart business processes

Identifies the resources you need to restart these business processes

Without a BIA, you don’t know which processes are mission critical (crucial to the ongoing success of the organization) or time critical (those processes that negatively affect the organization when they’re not performed promptly), and you don’t know which ones require attention in the DR planning and testing phases. Without a BIA, you’re just guessing, and you’re liable to identify processes and systems as critical by some less-than-ideal criteria, including

Your favorites

Those with which you’re most familiar

The ones that the executives like best

The shiny, new ones

Pet processes (the favorites of others)

The easy ones

The criteria listed above don’t make good business sense. The BIA uses objective criteria to select those processes that are truly the most critical to the organization, instead of relying on subjective criteria.

Scoping the Effort

Early on, you need to clearly establish the scope of the entire project. If the scope of an organization’s DR project is unclear, members of the project team may arbitrarily cut out important components or increase the scope beyond what the project’s sponsors originally intended.

You need to first establish the boundaries of the DR project by addressing the following questions:

Project team: Which staff members will make up the DR project team? How much time per week do you expect each team member to work on the DR project?

Similarly, do you have any staff members on loan for the DR project, and do you need to obtain any contractors for the project?

Scope: Which sets of business functions are in the scope of the DR project, and which are out? Have you established a quick analysis of dependencies in order to firmly establish the scope?

Project plan: Have you established a high-level project plan that includes important dates?

Budget: What budget are you establishing for each phase of the DR project? Have you established the budget in such a way that you can use the results of the BIA to help shape the budget for the DR effort itself, after you know better how much investment you need in order to meet established recovery criteria?

Has management committed to establishing an annual budget that can help maintain the DR plan and keep it relevant and effective?

Should you hire a consultant?

You may want to hire an outside consultant who’s an expert at disaster recovery planning and performing a Business Impact Analysis. Hiring an outside consultant for this type of work has its pros and cons. The pros include that the consultant is

An expert at DR planning

An expert at creating Business Impact Analyses

Objective

Consultants do have their downside:

They’re not familiar with your business

They have few, if any, relationships with staff

Their services are costly

You need to weigh these factors and decide how much you want a consultant to do for you. You can have him or her give you a little up-front advice, or you can let him or her manage the entire process.

Executive support: What level of executive support do you have for the DR project? Are company executives firmly behind the DR project, or are they only lukewarm about it?

You need a formal, written charter that answers all the questions in the preceding list if you want a truly successful DR project. In all but the smallest organizations, a DR project can easily take a year or longer, from the inception of the BIA all the way to the investment in any necessary systems and equipment, training, and testing.

Conducting a BIA: Taking a Common Approach

The information-gathering stage of the BIA involves a great number of interviews that one or more people carry out. You need to develop a common approach so that every interviewer gathers the same information from every person he or she interviews about every process, system, asset, and supplier.

Instead of steaming headlong into interviews and other information-gathering activities without a plan, spend some time developing procedures and templates so that you can make the interviewing process (probably the most labor-intensive of all BIA activities) as efficient and effective as possible.

Your BIA should focus on identifying and inventorying several key aspects and characteristics of an organization, including

Business processes: This generic term refers to business activities that your business’s personnel carry out, often with the help of machinery — including information systems. Processes are made up of one or many procedures. Business processes can be fairly simple, one or two people carrying them out with minimum dependency on other resources; or they can be quite complex, involving people in many parts of the organization, as well as suppliers and other external resources.

Information systems: This generic term means computer systems, applications, databases, and devices. An information system can be as simple as an Excel spreadsheet on a desktop computer or as complex as an application running on dozens of servers in locations throughout the world.

Assets: The equipment needed to facilitate the production of whatever products or services your organization produces. Assets may consist of machinery or tools that are essential to the business. Assets can be servers (although you could argue that computers belong in the information systems category, but don’t get too nit-picky); mechanical devices, such as milling machines or lathes; tools, such as forklifts and electric generators; or equipment, such as X-ray machines and CAT scanners.

Personnel: The people who perform the processes or support them in some direct way. These people may be located anywhere, and they can include your employees, contractors, and temps.

Suppliers: The outside organizations that supply your business with goods or services that it needs in order to produce its goods or services. Suppliers include organizations that provide you raw materials, such as steel, lumber, or blank CDs; a public utility supplying electricity, natural gas, or water; and a service organization, such as an Internet colocation facility or a data storage provider.

Gathering information through interviews

The best approach to inventorying all of the items in the preceding section is to schedule discussions with key people in the business. Business processes and information systems can’t explain themselves, so you need to talk to the people who are responsible for those processes and systems.

You’re not going to be able to create a complete list of people you need to interview initially. When people describe their processes and systems to you, they may point out more names that you need to add to your list of interviewees. Don’t look at the incompleteness of your initial list of suspects as a sign of weakness — it’s just a simple fact. Few organizations have a single individual who has an exceedingly clear view of every critical process, system, and supplier.

Here are some tips for these interviews:

Arrange the interviews in advance with department or business unit owners, letting them know what to expect so they can be prepared.

Plan the interviews so that they’ll be effective and won’t waste time. For instance, create a list of standard questions so that you can get more consistent answers, particularly if more than one person does the interviewing.

Conduct the interviews in person when possible.

Using consistent forms and worksheets

You can make the information-gathering stage of the BIA most effective if you ask for the same types of information from each process or system owner. The easiest way to achieve this consistency is probably to develop forms that you (or whoever conducts the interview) use when interviewing each process owner. Using forms has several advantages, including

Completeness: You can be sure that you ask all the key questions in every interview.

Conciseness: You can capture more detail in each interview, and capture it more easily, when the form already prompts for those details.

Consistency: Whether one person or several people are conducting the interviews, they’re more likely to ask the same questions every time if you use a form.

You can use simple paper forms that the interviewer fills out in hardcopy and later enters into spreadsheets or databases, or you can use soft forms in Microsoft Word or Adobe Acrobat that the interviewer can fill out on-screen. Electronic note-taking may be more efficient because you don’t have to transcribe the written notes. Exactly how you conduct your interviews is up to you.

Here are some tips as you develop your information gathering forms and procedures:

Use one form per process, not one per interview. Unless an interviewee is responsible for only one process, use a separate form for each process (or system, asset, person, supplier, and so on). You want your information gathering to focus on the processes, not the personnel you’re interviewing.

Cross-reference. On a process intake form, list critical suppliers, personnel, assets, and systems. Likewise, on a critical supplier’s intake form, cross-reference the processes and systems that supplier supports.

Include metadata. Be sure that the form includes information such as the name of the interviewee, the interviewee’s contact information, who conducted the interview, and when it took place. You want to be able to trace data back to its source in case you come up with more questions later.

To help clarify this whole business of interviews and intake forms, Figure 3-1 shows a part of a sample intake form that you can use as a starting point. You can see the metadata (information about the information gathered, such as the names of the people interviewed, who did the interviewing, the date of interviews, and so on) and dependencies in this sample process intake form.

As you develop your forms, keep this in mind: You’ll probably want to transfer the data gathered on the forms into a spreadsheet, in which you’ll be able to view the data that you gather, as well as sort, filter, and merge that data with other related data that you might have on-hand.
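As a rough sketch of that transfer step, the following Python flattens intake-form data into CSV text that any spreadsheet program can open. The column names and the sample values are assumptions made up for this example; use whatever fields your own form captures.

```python
import csv
import io

# Hypothetical column names; adapt them to your own intake form.
FIELDS = ["process", "interviewee", "interviewer", "interview_date",
          "systems", "suppliers"]

# Invented sample data, for illustration only.
intake_forms = [
    {
        "process": "Process A",
        "interviewee": "Example Owner",
        "interviewer": "Example Interviewer",
        "interview_date": "2024-01-15",
        "systems": ["System K", "System L", "System M"],
        "suppliers": ["Supplier Y", "Supplier Z"],
    },
]


def forms_to_csv(forms):
    """Flatten intake forms into CSV text, one process per row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    for form in forms:
        row = dict(form)
        # Join list-valued dependencies so each fits in one spreadsheet cell.
        row["systems"] = "; ".join(form["systems"])
        row["suppliers"] = "; ".join(form["suppliers"])
        writer.writerow(row)
    return buffer.getvalue()


csv_text = forms_to_csv(intake_forms)
print(csv_text)
```

Keeping the metadata columns (interviewee, interviewer, date) in the same row as the process lets you trace any entry back to its source after sorting or filtering.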

The information-gathering stage of the BIA should help you build a high-level view of critical business processes and the systems that support them, a view that lets you examine the details without getting mired in them.

The bird’s eye detailed view of the business

Often, an organization doesn’t have personnel who have a comprehensive view of all an organization’s processes, suppliers, assets, and personnel until the organization undertakes its first Business Impact Analysis. Many in the organization will want to get their hands on your business’s completed BIA that describes the entire business in detail, including rank-ordered lists of critical processes, systems, suppliers, assets, and personnel.

No one knows the details of a business as well as DR and security people do. These personnel are responsible for reducing risk across the entire business, and in the process of doing so, they accumulate much knowledge about all aspects of the business.

Figure 3-1: Use this sample intake form for a critical process to help you create your own.

Capturing Data for the BIA

The BIA is all about gathering information and then analyzing it. Gather information for the BIA methodically, consistently, and in a way you can repeat. In a larger project, in which more than one person gathers information, you should get the same details, regardless of who’s doing the gathering.

You gather a lot of information on a variety of topics for the Business Impact Analysis. Even though this book focuses on the IT side of disaster recovery planning, you can’t ignore the fact that IT systems support business processes. Ultimately, you need to know about the business processes — which ones are the most critical and how quickly you need to recover them. The BIA helps you figure out your organization’s processes.

Business processes

Business processes, or just processes, are the activities that an organization performs in support of its primary purpose(s): the production and delivery of goods and/or services.

All businesses have processes, although they may not be called processes. The following list includes some possible features of a business’s processes:

Processes contain one or more procedures. Procedures are (usually) written instructions that people carry out. Simple processes may contain only a single procedure, whereas complex processes may have many procedures (which personnel don’t necessarily carry out sequentially). Examples of procedures include intubate the patient, install the operating system, and replace the brake pads.

Procedures consist of one or more tasks, which are the individual steps that you need to perform in a procedure. Example tasks include log out of the application, turn off the power supply, and fasten the sensor to the bracket.

A DR project can expose weaknesses in business processes, including when you don’t have procedures in writing. In a smaller, newer, or less formal organization, you may be able to get away with maintaining a procedure in more of an oral tradition, rather than in a formal written form. But an organization that wants to establish an effective disaster recovery plan needs to document its procedures, in both disaster and peacetime settings.

Processes are carried out by people. Examples of people who carry out processes include bank tellers, database administrators, and mechanics. In highly automated processes, such as oil refineries, the machinery does most of the real work, but operators and engineers are in there somewhere, turning equipment off and on, and making adjustments to machinery as it continues operation.

Processes may depend on information systems. Personnel can carry out some processes without using an information system, but increasingly, business processes require information systems in some direct or indirect manner. Examples of these dependencies include the availability of a patient records system in order to admit a patient, the availability of an inventory system in order to identify the location of a replacement part, and the availability of a directory server to perform backups.

Processes may require assets. Processes often depend on one or more assets. For instance, a medical office needs a copier or scanner to make copies of insurance benefit cards and scales to weigh patients. A fueling station needs tanks and pumps. A manufacturing company needs its forklifts, packing machines, machine tools, and assembly lines.

Some organizations list their computer systems under assets rather than under information systems. I won’t get in the middle of that argument: As long as you document the dependencies, you can label them any way you want that makes sense for your organization.

Processes may depend on suppliers or service providers. Most processes require supplies or raw materials, which your business often gets from external suppliers.

The Business Impact Analysis contains all the features in the preceding list about business processes in a high level of detail. Your BIA may contain several worksheets listing the organization’s business processes, one per row (or even one per worksheet), with columns containing the individual items in the preceding list.

Bottom line: The BIA contains a detailed list of all the processes (at least, the important ones) that the organization carries out. It’s a summary of everything the organization does.

Information systems

The BIA contains an inventory of the organization’s information systems. Like the list of processes (which I discuss in the preceding section), the list of information systems probably will be quite detailed.

I deliberately use the rather general term information system, as opposed to more specific terms such as application, server, device, or database system. The term information system includes some or all the components in an IT environment. To some extent, what falls into your information systems group depends on how your organization thinks about its own information systems.

For example, a large medical clinic has a patient information system that manages all the information about its patients. If you think about the patient information system as an application, the system contains not only the application, but the servers it resides on; (potentially) separate database servers; and other elements, such as directory servers, print servers, and file servers. Without all these other elements, the patient information system wouldn’t function. And don’t forget the network (at least a part of it), as well as workstations and other equipment.

Your business model, information systems, application architecture, and even the structure of your org chart (who works for whom, and how responsibilities align with senior managers and executives) may dictate the ways that you slice and dice your complete collection of information assets. You don’t have to worry about a right way or wrong way, as long as the methods for identifying and classifying your information systems work for you.

Structure your inventory of information systems in such a way that you can easily identify dependencies between processes and information systems, as well as assets, personnel, and suppliers.

External or internal: It depends on scope

Many larger organizations perform not only the activities that directly result in the delivery of their primary goods and services, but also many supporting activities. Here are some examples:

An office supply company may have its own fleet of delivery vehicles.

An online travel services provider may operate its own Internet data center.

A fleet of limousines may have its own tow trucks and mechanics.

The scope of a DR project determines whether you consider supporting or adjunct services internal or external. For example, if the scope of the online travel services provider’s DR plan includes only its software applications, the company may consider the Internet data center’s services external, even though they’re performed by the same organization, because the Internet data center isn’t a part of the company’s DR project.

Assets

A Business Impact Analysis contains a list of important assets that the business uses, particularly those assets that are directly or indirectly related to the production of whatever goods or services the business produces.

Your organization’s assets might be any of the following or something entirely different:

Delivery vehicles

Cranes

Printing presses

Regardless of the specific items on your list of assets, a BIA should contain the assets that are related to the organization’s primary activities.

Cross-reference your assets with whichever lists are relevant — processes, information systems, suppliers, and personnel.

Personnel

Every organization has its replaceable personnel, as well as those who aren’t so easily replaced. The point of a list of personnel (if, indeed, you even need such a list) is to identify those people who are critical to the delivery of the organization’s principal goods and services.

To avoid the nearly inevitable political posturing and other unnatural behavior that occurs when personnel try to prove their worth, avoid coming up with a list of critical personnel at all. Instead, identify critical personnel (if and as needed) within the most critical business processes and leave it at that.

You may decide to draw up a list of critical personnel — those people whose unanticipated absence could make the business suffer most. You can use this list to identify any critical paths that you can alleviate by cross-training or redistributing duties. Remember, the purpose of DR planning is to ensure the survival of the organization in a disaster. In serious disasters, key personnel may be killed, injured, or unable to report to work because of transportation disruptions.

Suppliers

As with processes, information systems, assets, and personnel (which you can read about in the preceding sections), your organization probably has several key suppliers, without which its output of goods and/or services would grind to a halt.

Identify key suppliers within each business process. If the organization is highly dependent on external suppliers (which may include other distant parts of the organization that fall outside the scope of the DR project, as the “External or internal: It depends on scope” sidebar, in this chapter, states), the BIA may include a separate list of those suppliers, just so you can see them all in one place.

If you include a separate list of suppliers in the BIA, cross-reference each supplier back to the process(es) that it supports.
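If your process list already names the suppliers for each process, you can derive the separate supplier list and its cross-references mechanically. Here’s a small Python sketch; the process and supplier names are placeholders, and the function name is my own.

```python
from collections import defaultdict

# Placeholder data: each process and the suppliers it depends on.
process_suppliers = {
    "Process A": ["Supplier Y", "Supplier Z"],
    "Process B": ["Supplier Z"],
}


def suppliers_to_processes(mapping):
    """Invert a process-to-suppliers mapping into supplier-to-processes."""
    index = defaultdict(list)
    for process, suppliers in mapping.items():
        for supplier in suppliers:
            index[supplier].append(process)
    return dict(index)


for supplier, processes in suppliers_to_processes(process_suppliers).items():
    print(f"{supplier} supports: {', '.join(processes)}")
```

A supplier that shows up under many processes is a likely candidate for closer attention.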

Statements of impact

In the preceding sections, I describe lists that you need to add to the BIA: processes, information systems, assets, personnel (optionally), and suppliers. Those lists contain a lot of details about each of the processes, information systems, and so on, including dependencies between processes and suppliers.

You need to add something else to those lists — the impact of nonperformance, or the impact of unavailability. In other words, the impact upon the organization as a whole if the particular process, supplier, or asset is disrupted or unavailable for a period of time.

In your BIA report, add statements of impact — words or short phrases that describe the impact if each process (or supplier or asset) is interrupted or unavailable. Examples include inability to process customer deposits, inability to transfer goods from inventory, and inability to access patient medical history.

Each business process is somehow — directly or indirectly — related to the organization’s production of goods and services. What if a disrupting event knocks the process offline (literally or figuratively) for an extended period of time?

Your Business Impact Analysis can also show a cost figure associated with each process. This figure represents the cost to the business per unit time, such as dollars per hour, if the process is unavailable.

Calculating cost impact can be quite complicated, and you should do it only for those processes you rank as most critical. You probably need the expertise of one or more financial people in your organization to help you make these calculations. You can get the heady details in Activity Accounting: An Activity-Based Costing Approach (Wiley), by James A. Brimson, and Activity-Based Cost Management: An Executive’s Guide (Wiley), by Gary Cokins.
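Setting the accounting subtleties aside, the per-unit-time figure is used in the simplest possible way: multiply it by the length of the outage. The sketch below assumes a flat hourly rate, which real costs rarely follow, and the dollar amount is invented purely for illustration.

```python
def downtime_cost(cost_per_hour, hours_down):
    """Estimate outage cost, assuming a constant cost per hour."""
    return cost_per_hour * hours_down


# Invented example: a process that costs $5,000 per hour when unavailable.
print(downtime_cost(5000, 8))
```

In practice, costs often climb faster the longer an outage lasts, which is one reason to involve your financial people.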

Criticality assessment

The BIA report contains, in addition to statements of impact (which you can read about in the preceding section), criticality rankings for each process. You probably also want to include criticality rankings in the other lists, such as information systems and suppliers.

You can code criticality on a scale such as L, M, H, C (for low, medium, high, or critical impact) or a numeric scale rated 1 through 4.

Although you can rate or rank each data point fairly simply, criticality has tremendous impact on the results of the BIA. When you collect all the business processes on a spreadsheet and sort them by criticality, you get a rank-ordered list of the organization’s most critical processes — one of the primary objectives of the Business Impact Analysis.

The criticality ranking is a well-informed estimate of overall impact on continuing business operations if that process is interrupted.
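The sorting step is simple enough to do in any spreadsheet, but here’s what it amounts to as a short Python sketch. The mapping of the L, M, H, C codes onto the numbers 1 through 4 is my assumption, and the process names and ratings are made up.

```python
# Assumed mapping of letter codes to the 1-through-4 numeric scale.
CRITICALITY_ORDER = {"L": 1, "M": 2, "H": 3, "C": 4}

# Invented sample processes with criticality codes.
processes = [
    ("Payroll", "M"),
    ("Order entry", "C"),
    ("Newsletter", "L"),
    ("Customer support", "H"),
]


def rank_by_criticality(entries):
    """Return entries sorted from most critical to least critical."""
    return sorted(entries, key=lambda e: CRITICALITY_ORDER[e[1]], reverse=True)


for name, code in rank_by_criticality(processes):
    print(f"{code}  {name}")
```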

MTD and governments

If the local city or county government can’t perform a critical process past its MTD, is it really going to go out of business? You probably find it hard to imagine a government actually ceasing to function altogether, but a lot of people could end up with really big problems if the critical process that’s not available involves keeping the water or electricity flowing to that government’s citizens.

In organizations such as governments that rarely just stop functioning entirely, the MTD might instead be the point at which the customers (citizens) are likely to revolt and force out the top officials.

Maximum Tolerable Downtime

For each process in the BIA, you need to determine its Maximum Tolerable Downtime (MTD).

Maximum Tolerable Downtime is the time after which the process being unavailable creates irreversible (and often fatal) consequences. Generally, exceeding the MTD leads to severe damage to the viability of the business, including the actual failure of the business. Depending on the process, you can express the MTD in hours, days, or longer.

Arriving at a reasonable MTD for a process is anything but easy. You can’t ask yourself, “Last time this process became unavailable to the organization, how long was it before the organization actually failed?” And such occurrences happen so rarely, even among other organizations similar to yours, that you have very little data to reference when you estimate an MTD. You really have to ask yourself, how long would it take for this organization to go fins-up if this particular process was down for a long time?

Still, you have to put something in that spot. You may need to turn to the expertise of more seasoned senior or executive management, and even then, you can come up with only a somewhat arbitrary figure. You really need to think out the figures for MTD because those figures contribute to the calculation of other figures discussed in the following sections.

Recovery Time Objective

After you determine MTDs for processes (see the preceding section), you can begin setting targets for recovery. One important target is the Recovery Time Objective (RTO).

The Recovery Time Objective is the period of time in which the organization intends to have the interrupted process running again.

Time critical versus mission critical

When you gather information about critical processes, and when you’re estimating Maximum Tolerable Downtime (MTD), Recovery Time Objectives (RTO), and Recovery Point Objectives (RPO), you may notice that

You have time-critical processes (those that must be delivered in a timely fashion).

You have mission-critical processes (those that are vital to the organization’s viability).

Your time-critical processes and your mission-critical processes aren’t necessarily the same.

Your organization may have mission-critical processes that aren’t time critical, and you may have time-critical processes that aren’t mission critical.

The difference between these two kinds of processes becomes important as you begin using the results of your Business Impact Analysis. As you establish Recovery Time Objectives (RTOs) for processes, you need to balance the cost of attaining those RTOs against the value of the processes that they support.

For any given process, the RTO is less than the MTD. By definition, it has to be. If you set a 14-day RTO for a process with a 7-day MTD, your business has failed before you can get the critical process running again. And what’s the point of that?
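Once your RTOs and MTDs sit side by side in a spreadsheet, this rule is easy to check automatically. The following Python sketch flags any process whose RTO isn’t shorter than its MTD; the field names and the second sample process are invented, while the first mirrors the 14-day/7-day example.

```python
# Sample data: the first entry mirrors the 14-day RTO / 7-day MTD example;
# the second is invented for contrast.
processes = [
    {"name": "Process with bad RTO", "mtd_days": 7, "rto_days": 14},
    {"name": "Process with good RTO", "mtd_days": 30, "rto_days": 10},
]


def rto_violations(entries):
    """Return names of processes whose RTO is not less than their MTD."""
    return [p["name"] for p in entries if p["rto_days"] >= p["mtd_days"]]


print(rto_violations(processes))
```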

A process’s RTO forms the basis for any DR planning that you’ll do for that process. For example, if a process has a 30-day RTO, you can get it running again — purchase a new server, install software, and restore backup data — at a leisurely pace. However, a process with a one-hour RTO requires a hot site with a standby server and data replication in near-real time. The costs for these two scenarios vary greatly.

Time is money. Lower RTOs require more investment in standby systems, as well as the possible need for data replication or other potentially costly technologies.

Establishing RTOs and then determining the costs required to reach those objectives can be a repetitive process. As you discover the costs of achieving an ambitious RTO, you may need to compromise and develop a capability that costs less but delivers a longer RTO.

Recovery Point Objective

The Recovery Point Objective (RPO), like the RTO (discussed in the preceding section), is somewhat arbitrary and based on assumptions that people near the top of the org chart (executives and senior managers) make.

The Recovery Point Objective is the maximum amount of data, usually expressed as a span of time, that you can lose if a process is interrupted and later recovered.

Say that an organization wants to establish a four-hour RPO for an order entry system. In order to meet this figure, the organization has to implement a mechanism to back up or replicate transaction data so that it loses no more than four hours of transactions in a disaster scenario.
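One way to reason about that figure: if a disaster strikes just before the next backup or replication cycle, you lose roughly one full interval of data. Under that simplifying assumption (which ignores how long the backup itself takes to complete), a quick Python check looks like this. The function name is hypothetical.

```python
def meets_rpo(backup_interval_hours, rpo_hours):
    """True if the worst-case data loss (one full interval) fits the RPO."""
    return backup_interval_hours <= rpo_hours


# A nightly backup cannot satisfy a four-hour RPO; a four-hourly one can.
print(meets_rpo(24, 4))
print(meets_rpo(4, 4))
```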

Similar to the RTO, setting the RPO determines what sorts of measures you need to take to ensure that you don’t lose information related to any particular business process.

Speed costs. Lower RPOs generally require greater investment in data replication or backup technology.
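To make the link between an RPO and a backup schedule concrete, here’s a small Python sketch. It assumes a simplified world in which backups finish instantly and always restore cleanly, so the worst-case loss equals one full backup interval. The candidate intervals and function names are hypothetical.

```python
from typing import Iterable, Optional


def worst_case_loss_hours(backup_interval_hours: float) -> float:
    """A disaster just before the next backup loses up to one full interval."""
    return backup_interval_hours


def longest_interval_meeting_rpo(candidate_intervals_hours: Iterable[float],
                                 rpo_hours: float) -> Optional[float]:
    """Pick the least frequent (and presumably cheapest) schedule that still meets the RPO."""
    acceptable = [interval for interval in candidate_intervals_hours
                  if worst_case_loss_hours(interval) <= rpo_hours]
    return max(acceptable, default=None)


# A four-hour RPO, checked against some assumed backup schedules.
choice = longest_interval_meeting_rpo([24, 12, 4, 1], rpo_hours=4)
```

Under these assumptions, daily and twice-daily backups can’t meet a four-hour RPO, and a backup every four hours is the least demanding schedule that can.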

Introducing Threat Modeling and Risk Analysis

You need to carry out threat modeling and risk analysis for each critical process that you identify in the BIA. Although they’re somewhat different activities, threat modeling and risk analysis are similar enough that you can think of them as a single integrated activity.

Threat modeling is the process of identifying a full range of potential threats, the probability that they’ll occur, their impact, and mitigation steps.

Risk analysis is the process of identifying and assessing factors that may jeopardize the ongoing operation of a business process.

If you think that threat modeling and risk analysis are similar, you’re right. You perform both processes as a single activity, in which you identify threats and vulnerabilities in business processes and the steps that you can take to mitigate the potential impact of those threats and vulnerabilities.

Mitigation is just a fancy word that means the steps or measures that you need to perform to reduce your risk.

You may need to carry out these activities for each process in the BIA, although in many cases, you can carry out threat modeling on groups of similar processes, rather than each process individually. I mean, a flood is a flood — listing it for every process might be going a little overboard (yes, that pun was intended).

Disaster scenarios

Before you can get to the actual threat and risk analysis, you need to create a relatively complete list of the disasters that are reasonably likely to occur. The following list isn’t meant to be complete — some disasters not listed here might belong in your threat model. But this list should give you a good starting point:

Natural disasters: You know, acts of nature — events that occur without any direct help from people. Here are some examples:

• Fires and explosions

• Earthquakes

• Volcanoes

• Storms (snow, ice, hail, wind, or prolonged rain)

• Floods

• Hurricanes, cyclones, and typhoons

• Tornadoes

• Landslides, mudflows, and avalanches

• Tsunamis

• Pandemics

Man-made disasters: Human-caused events. These disasters include

• War and terrorism

• Riots and other civil disturbances

• Work stoppages

• Cyber attacks

Secondary effects: These effects can result from both man-made and natural disasters. Secondary effects include

• Utility outages: Electric power, natural gas, water, and so on

• Communications outages: Telephone, cable, wireless, television, radio, Internet

• Transportation outages: Roads, highways, airports, railroads, shipping

Your region or locale may be subject to other events that can disrupt business activities to such an extent that you can consider those events disasters.

Identifying potential disasters in your region

Disastrous events are, by their nature, uncommon in many parts of the world. Where disasters occur frequently, people usually either leave or make long-term investments in infrastructure to lessen the effects of natural events, so those events are no longer disastrous when they occur. Still, you should have a good understanding of the types of disasters that can occur in your region. To find information on the types of disaster you may have to face, check out these sources:

National and local weather bureaus

Local civil defense authorities

Local disaster relief agencies, such as the International Red Cross

Local law enforcement

Local newspaper archives

Army Corps of Engineers (for flood plain data in the U.S. only)

Peers and colleagues in local trade organizations

One or more of these sources may lead you to other local sources of useful information about potential disasters.

Performing Threat Modeling and Risk Analysis

Threat modeling and risk analysis can consume a significant portion of the total BIA effort. Entire books have been written on the topic, but because this book has only so much space, I describe these activities only in procedural form.

For each process or group of processes in your BIA, follow these steps:

1. Identify every potential natural disaster that could interrupt the process you’re dealing with.

2. Determine the likelihood of each disaster occurring within a single calendar year.

3. Identify every potential man-made disaster that could interrupt your process.

4. Determine the likelihood of each man-made disaster occurring in a single calendar year.

For both natural and man-made disasters, assign numeric values for low-to-high likelihood something like this: Rare: 1; Infrequent: 10; Possible: 100; Likely: 1,000; Very Likely: 10,000.

5. For each threat that you identify in Step 1 and Step 3, rank the impact of the event (if it actually occurs) on this scale: Lowest: 1; Medium: 100; Highest: 10,000.

6. Determine the risk of each threat.

For each threat, multiply the likelihood figure from Step 2 or Step 4 by the impact figure from Step 5. For example, a threat with infrequent probability (value: 10) and a medium impact (value: 100) has a risk of 10 × 100 = 1,000. Use this equation for each threat.

7. Sort the threats by risk (the figure you establish in Step 6), from highest to lowest.

Pay the most attention to the threats at the top of the list. These threats have the highest combination of likelihood and impact, so they pose the greatest risk to your processes.

After following the preceding steps, you have a simple threat analysis. You know which threats you need to pay the most attention to, and you have an idea how likely those threats are to actually occur (well, an idea that’s only as accurate as the preceding list’s rather unsophisticated scale allows).
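The scoring and sorting in Steps 5 through 7 can be sketched in a few lines of Python. The two scales come straight from the steps; the sample threats and the ratings assigned to them are invented for illustration and say nothing about your region.

```python
# Scales taken from the steps above.
LIKELIHOOD = {"Rare": 1, "Infrequent": 10, "Possible": 100,
              "Likely": 1_000, "Very Likely": 10_000}
IMPACT = {"Lowest": 1, "Medium": 100, "Highest": 10_000}


def rank_threats(threats):
    """Score each (name, likelihood, impact) entry and sort by risk, highest first."""
    scored = [(name, LIKELIHOOD[likelihood] * IMPACT[impact])
              for name, likelihood, impact in threats]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical ratings for one group of processes.
THREATS = [
    ("Earthquake", "Rare", "Highest"),
    ("Utility outage", "Likely", "Medium"),
    ("Flood", "Infrequent", "Medium"),
    ("Work stoppage", "Infrequent", "Lowest"),
]

ranking = rank_threats(THREATS)
for name, risk in ranking:
    print(f"{name}: {risk:,}")
```

Notice that in this made-up example, the utility outage outranks the earthquake: a likely event with medium impact can carry more risk than a rare one with the highest impact.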

I made the threat analysis procedure in the preceding list intentionally simplistic. A real threat analysis should use a broader scale and more realistic probabilities. But hopefully you get the idea of what threat analysis is all about. If you want all the details about threat analysis, you can pick up a copy of Emerging Threat Analysis: From Mischief to Malicious (Syngress), by Michael Gregg. You can also find a great free online resource about risk analysis at the U.S. National Institute of Standards and Technology’s Web site (www.nist.gov): Risk Management Guide for Information Technology Systems, Special Publication 800-30.

You can perform threat modeling and risk analysis at the same time as the information-gathering process. Because threat modeling and risk analysis are so similar, you might consider doing them as a single task.

Identifying Critical Components

You’ve collected basic information from all of the important business processes for your Business Impact Analysis. You’ve identified information systems, personnel, assets, and suppliers that these processes depend on, and you may have created separate lists of these if your business has a lot of them. For instance, you can create separate lists of suppliers by category. You get to decide how you want to organize your information.

Processes and systems

In the list of critical processes that you create in the BIA, you have many important fields that describe the processes, their owners, and so on. Here’s a list of the fields you should include:

Process name

Process owner

Description of the process

Information systems that this process requires

Assets that this process requires

Any critical personnel without whom this process would fail

Suppliers that this process requires

Statement of impact if the process fails or is interrupted

Maximum Tolerable Downtime (MTD)

Recovery Time Objective (RTO)

Recovery Point Objective (RPO)

Cost of downtime

Criticality ranking

Your list will probably have more fields than the preceding list does, but this list gives you the basics.

I want to focus, for now, on the numeric items in the preceding list: the MTD, RTO, RPO, cost of downtime, and criticality. With these fields, you can manipulate your list in various ways to get a bird’s-eye view of which processes are truly important in your organization. You can begin to see which processes have the shortest MTDs, RTOs, and RPOs by sorting the list based on those columns. While sorting based on these fields, you can keep your eye on the criticality rankings to see whether criticality is in line with those objectives. Do you see any correlation between criticality and your RTOs and RPOs? Maybe you do, and maybe you don’t.

If you have a lot of processes (more than can fit on a screen — at least, that’s how I decide whether I have a lot of processes), you might group them into High, Medium, and Low categories, based on ranking. For instance, you might divide the entire rank-ordered list of processes into thirds. Depending on the nature of your business (which includes a great many things, including regulatory, financial, and market conditions), your organization might invest in DR capabilities for only the High and Medium processes, not the Low.
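If your list lives in a spreadsheet, you can do all this sorting and grouping there. For the curious, here’s how the same idea might look as a Python sketch. The record keeps only the numeric fields, the sample processes and their figures are invented, and I’m assuming that a criticality of 1 means most critical.

```python
from dataclasses import dataclass


@dataclass
class ProcessRecord:
    """Numeric BIA fields for one process (a simplified, assumed layout)."""
    name: str
    mtd_hours: float
    rto_hours: float
    rpo_hours: float
    downtime_cost_per_hour: float
    criticality: int  # assumption: 1 is the most critical


def tier_by_rank(records):
    """Split the rank-ordered list into thirds: High, Medium, and Low."""
    ordered = sorted(records, key=lambda r: r.criticality)
    total = len(ordered)
    tiers = {}
    for index, record in enumerate(ordered):
        if index * 3 < total:
            tiers[record.name] = "High"
        elif index * 3 < 2 * total:
            tiers[record.name] = "Medium"
        else:
            tiers[record.name] = "Low"
    return tiers


# Invented sample data.
PROCESSES = [
    ProcessRecord("Order entry", 48, 4, 4, 5_000, 1),
    ProcessRecord("Dispatch", 24, 1, 1, 2_000, 2),
    ProcessRecord("Customer support", 96, 24, 8, 500, 3),
    ProcessRecord("Invoicing", 168, 72, 24, 300, 4),
    ProcessRecord("Payroll", 336, 168, 24, 200, 5),
    ProcessRecord("Internal newsletter", 720, 336, 168, 10, 6),
]

by_rto = sorted(PROCESSES, key=lambda r: r.rto_hours)  # shortest RTO first
tiers = tier_by_rank(PROCESSES)
```

In this sample, sorting by RTO puts Dispatch ahead of Order entry even though Order entry has the higher criticality ranking. That’s exactly the kind of mismatch worth a second look.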

Suppliers

In a large BIA effort (say, more than 20 business processes), you may identify several suppliers and other supply chain partners within your processes. You might decide to pull these critical suppliers and make a separate worksheet for them, in which you can capture additional information about them, including

Company name, address, phone number, Internet URL, and so on

Business contact’s name, address, phone number, e-mail, and so on

Name of business contact in your organization who has the business relationship with the supplier’s business contact

The processes that the supplier supports

The goods and/or services that the supplier provides

You can use this critical supplier information as a jumping-off point when you begin building your DR plans.

Personnel

As you manipulate, slice, and dice your critical processes list (described in the preceding sections), you may begin to notice a few names of personnel who appear frequently in the most critical processes. You may want to take a closer look at those people and consider whether they’re truly critical for so many business processes.

Items in your DR plans that relate to critical personnel may include cross-training or staff augmentation of some sort in order to reduce any possible exposures related to too many processes depending on too few individuals.
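Spotting those frequently appearing names is a simple counting exercise. Here’s a rough Python sketch; the process names, the people, and the threshold of two processes are all hypothetical.

```python
from collections import Counter


def key_person_exposure(process_personnel, threshold=2):
    """Return the people who are critical to at least `threshold` processes."""
    counts = Counter(
        person
        for people in process_personnel.values()
        for person in set(people)  # count each person once per process
    )
    return {person: count for person, count in counts.items() if count >= threshold}


# Invented example: critical personnel listed per process.
CRITICAL_PERSONNEL = {
    "Order entry": ["A. Rivera", "B. Chen"],
    "Payroll": ["B. Chen"],
    "Dispatch": ["B. Chen", "C. Okafor"],
}

exposure = key_person_exposure(CRITICAL_PERSONNEL)
```

Anyone who shows up in the result is a candidate for the cross-training or staff augmentation that your DR plans might call for.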

Determining the Maximum Tolerable Downtime

I discuss Maximum Tolerable Downtime in the section “Maximum Tolerable Downtime,” earlier in this chapter; now, I go into this topic a little deeper.

Maximum Tolerable Downtime (MTD) is the maximum length of time a business process can be interrupted or unavailable without causing the business itself to fail. Here are some examples:

An exclusively online retailer might go under if its online catalog is unavailable for several days.

An airline might go out of business if it can’t book flights for more than 48 hours.

A delivery business might fail if it can’t get dispatch information to its trucks within an hour of loading them.

You can have a really hard time arriving at reasonable MTD figures for your business, or any business. Business failures that occur because of disasters aren’t an everyday occurrence. To my knowledge, no sites on the Internet have statistics on the connection between disasters and failed businesses. With so little data to work with, your MTD figure is probably going to be no better than an educated guess.

Calculating the Recovery Time Objective

The Recovery Time Objective (RTO) is the time period in which the organization should have the interrupted process running again, at or near the same capacity and conditions as before the disaster.

To determine the RTO, you need an idea of your Maximum Tolerable Downtime (MTD) value. Common sense should dictate that you need your RTO to be less than your MTD. In other words, you want your critical process restored and operating well before the point at which its downtime would threaten the very viability of the business. Otherwise, it’s sort of like waiting three and a half minutes to begin administering CPR to a drowning victim.

For example, if the MTD for a critical process is seven days, you might set your RTO to four days.
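A quick sanity check on this rule is easy to automate. The following Python sketch flags any process whose RTO isn’t shorter than its MTD. The process names are invented, and the figures reuse the examples from this chapter.

```python
def rto_violations(processes):
    """Return the names of processes whose RTO is not shorter than their MTD.

    `processes` maps a process name to a (rto_days, mtd_days) pair.
    """
    return [name
            for name, (rto_days, mtd_days) in processes.items()
            if rto_days >= mtd_days]


# Hypothetical process names with the chapter's example figures.
TARGETS = {
    "Order entry": (4, 7),   # 4-day RTO, 7-day MTD: fine
    "Payroll": (14, 7),      # 14-day RTO, 7-day MTD: too slow
}

problems = rto_violations(TARGETS)
```

Any process the check returns needs either a shorter RTO or a second look at its MTD.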

You need to be as realistic as possible about the RTOs you specify for processes. A lower RTO does cost more than a higher RTO. You can’t have it both ways — you either have a fast recovery or a cheap recovery.

If you’ve been reading ahead, or if you’re just a quick study, you might be thinking that you don’t want the cost of achieving a given RTO to exceed the value derived from the business process. For example, it doesn’t make sense to invest $100,000 in equipment to reduce an RTO from four hours to one hour if the cost of downtime is only $1,000 per hour. The shorter RTO avoids three hours of downtime, so you’d be spending $100,000 to save $3,000 per incident, which doesn’t make good sense.
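Here’s that arithmetic as a small Python sketch. It assumes that downtime costs accrue evenly by the hour and that every incident runs for the full RTO, which is a simplification. The function names are hypothetical.

```python
import math


def downtime_avoided(current_rto_hours, improved_rto_hours, cost_per_hour):
    """Downtime cost saved per incident by shortening the RTO."""
    return (current_rto_hours - improved_rto_hours) * cost_per_hour


def incidents_to_break_even(investment, savings_per_incident):
    """Number of incidents needed before the investment pays for itself."""
    return math.ceil(investment / savings_per_incident)


# Cutting the RTO from four hours to one hour at $1,000 per hour of downtime.
savings = downtime_avoided(current_rto_hours=4, improved_rto_hours=1, cost_per_hour=1_000)
break_even = incidents_to_break_even(investment=100_000, savings_per_incident=savings)
```

With these figures, each incident saves $3,000 (three hours at $1,000 per hour), so the $100,000 investment would need 34 incidents to pay for itself.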

You figure out how much you can reasonably spend to improve the RTO and RPO much in the same way you buy auto insurance: You need to figure out how much the premiums cost, what the deductibles are, and what events the insurance covers.

In your BIA and DR plan development, you estimate the cost required to achieve an RTO. I go through this procedure in Chapter 6 through Chapter 8.

Calculating the Recovery Point Objective

A Recovery Point Objective (RPO) is the maximum amount of data, usually measured as a span of time, that you’re prepared to lose for good in a disaster.

For example, say that a company uses an online financial management application to manage its finances. Every day, employees enter invoices, payment requests, journal entries, and receipts. A disaster strikes the data center in which the application’s servers reside. Backups ran once per day, so as much as an entire day’s work is lost. This application’s effective RPO is one day. In other words, the company can recover the application only to a point up to one day prior to the disaster.

Thinking ahead, if the organization wanted to shorten the RPO, it could do so by running backups more often or replicating transactions to another server in another location.
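You can work out how much data a given disaster would actually cost you by comparing the time of the disaster with the time of the last good backup. Here’s a minimal Python sketch. The dates and times are invented, and it assumes that every listed backup completed successfully and can be restored.

```python
from datetime import datetime, timedelta


def data_loss_window(backup_times, disaster_time):
    """Return the time between the last backup before the disaster and the disaster.

    Returns None if no backup predates the disaster.
    """
    usable = [t for t in backup_times if t <= disaster_time]
    if not usable:
        return None
    return disaster_time - max(usable)


# Invented schedule: one backup each night at 1:00 a.m.
NIGHTLY = [datetime(2024, 3, day, 1, 0) for day in (1, 2, 3)]
DISASTER = datetime(2024, 3, 3, 23, 30)

loss = data_loss_window(NIGHTLY, DISASTER)

# The same day with an additional assumed backup every four hours.
FOUR_HOURLY = [datetime(2024, 3, 3, hour, 0) for hour in range(1, 24, 4)]
shorter_loss = data_loss_window(FOUR_HOURLY, DISASTER)
```

With nightly backups, this disaster wipes out 22 and a half hours of work. With a backup every four hours, the last one runs at 9:00 p.m., and the loss shrinks to two and a half hours.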

As with the RTO (see the preceding section), shortening a process’s RPO generally carries a price. Later in the analysis, you can better determine the right balance between the cost of achieving an RPO and the value it provides the organization. Chapter 6 through Chapter 8 can help you strike this balance as you begin to formulate ways to make the various parts of your environment recoverable.