Data: The Key Element in the Overall Methodology for Resolving Problems and Assessing Opportunities
The overall methodology and process for assessing business situations comprises four steps:
1. Data collection
2. Analyses
3. Drawing conclusions
4. Making recommendations
Compromise could happen anywhere along this process. However, it is my experience that the majority of the compromise occurs in the data collection and analyses phases. They are interrelated, and many times it is very difficult to know what was compromised. To fully appreciate where and how compromises occur, it would be helpful to understand how data are collected and used in the analyses.
Data Collection and Analyses
Data collection is the most critical element of the overall methodology for reaching proper conclusions. Everything is based on the data. If the data are compromised, so is everything else that flows from them. Unfortunately, this is also where most of the shortfalls and shortcuts occur. Do this part well, and the probability of reaching better conclusions and recommendations increases measurably.
The most important observation and understanding regarding data is best captured in the adage “Garbage in, garbage out.” This adage emerged when computers began to penetrate the business world. Computers didn’t do any “thinking,” nor were they capable of checking results for “reasonableness.” Numbers were input, manipulated according to specific algorithms, and results were spilled out into reports.
When the data were wrong, the results were wrong. When business managers received reports based on incorrect data, it was obvious to them that the output made no sense. They would blame the computers and the IT department. The IT professionals would tell these managers that they were not at fault.
The managers had no concept of how computers worked. So, for them, it must have been the fault of the computer, because no human would have allowed the dissemination of obviously erroneous reports. It would have been caught on sight and corrected. Thus, endless back-and-forth arguments would ensue.
The IT professionals then coined a memorable phrase that attempted to explain how computers worked and why the erroneous reports were not the fault of the IT professionals or the computers: “Garbage in, garbage out.” End of argument!
So, unequivocally, the highest priority in regard to data is to make sure that the data are not “garbage.” In my experience, most of the failings occur because practitioners view the adequacy of data from a narrow perspective, mostly whether they are accurate. In my mind, data must be viewed from a much broader perspective in order to produce a quality result. Data need to meet four criteria to pass the adequacy test:
• Be most relevant
• Be complete
• Be accurate
• Be organized
Each of the criteria, individually and together, requires sound thinking and critical judgment. In most situations, meeting them is not a straightforward undertaking. The process is prone to mistakes, and indeed, mistakes are made frequently.
To understand how to do it well and how easy it is to err, one needs to look at the various sources of the data. The sources of data (listed below) fall into eleven distinct categories representing various collection and aggregation methods, and each is prone to potential errors and inaccuracies.
1. Internal, elementary, raw data: The “smallest” and most “raw” data elements that are collected at the source that generates them. For example, sales slips for “sales”; expense receipts for “costs”; self-monitored machine diagnostics of all kinds; and so on. These types of data, in general, are reasonably accurate and present few potential pitfalls.
2. Internally aggregated raw data: The raw data are generally aggregated into categories in order to allow management to see the bigger picture and observe trends. The aggregation continues upward, so that each management layer can view it from their higher-level perspective. These data, too, are generally reasonably accurate.
3. Internal, raw data with allocated overhead costs: These data are generated to allow management to get the actual manufacturing costs for the various products, as well as the various business lines. Indirect and overhead manufacturing costs are added to the raw, direct costs of an activity. This is most often done to calculate the real cost of manufacturing, or “cost of goods sold,” and thereby the gross margins (gross profits) of the various products. The same is done to calculate the true total costs of the various business lines by allocating to them the costs of the various support functions like research and development, marketing and sales, finance and accounting, and so on. There are specific methodologies and algorithms that are used to determine how to allocate the different overhead costs properly across the organization and products. These allocation algorithms attempt to distribute the overhead costs so as to represent the actual costs of the various direct activities, or business entities. They are at best an attempt to estimate the real costs and are therefore vulnerable to producing compromised results. The accuracy of the allocations should always be verified whenever important conclusions are drawn from analyses using this category of data. (A minimal sketch of how such an allocation works appears after this list.)
4. External, private-sector aggregated data: These are data aggregated to cover information broader than a single company, and they come with different levels of aggregation. The most common example is an industry association’s data for its industry. These data are generally used for multi-year trend analyses and are collected and reported by the different member companies. Nobody is responsible for the accuracy or consistency of the data, and as such these data should rarely be relied on in making important decisions without verification. They are fine for general, big-picture trend analyses. At Booz Allen, we never relied on this type of data without verification. Every time I attempted to verify such data by studying how they were collected and aggregated, I found deficiencies that gave rise to compromised data and inaccuracies.
5. External, public-sector aggregated data: These are data collected by different governmental agencies, from local agencies all the way up to the federal level. Accuracy varies, as some data are more accurate than others, and these data should rarely be relied on in making important decisions without verification. They are fine for general, big-picture trend analyses.
6. Statistically generated data: These are data based on statistical principles from which broader information/data are extrapolated. Quality depends on the sampling, but if done properly, this type of data should be fine. Always verify the sampling methodology.
7. Survey-generated data: These data are generally collected via questionnaires and other such data collection methods. Surveys may generate more “complete” data, or may be used to generate “sample” data that serve as input for statistically generated data. This category presents many pitfalls. The quality of this type of data depends heavily on the veracity of the questions being asked and the sampling of the respondents. I found in most cases that such data are not thorough. Participants are not asked all of the necessary questions to lead to a complete picture on all of the important variables. The worst data compromise happens when the questions have multiple-choice answers. I rarely found the multiple-choice answers to be specific or accurate enough to allow respondents a clear, unequivocal choice. Most of the answers are, therefore, a best guess of proximity to the truth, and not the truth itself. Check such data thoroughly before relying on the results.
8. Focus group–generated data: These data are collected verbally from groups, with the collection carried out in a methodical way. Focus groups generate much better-quality data than questionnaire surveys. However, here, too, the quality of the results depends heavily on the moderator, the questions asked, how and when they are asked, and the people selected for the focus group. I personally have sat in on focus groups and was seldom impressed. The moderators were often too quickly satisfied with the first level of answers; didn’t consider the gap between how a person articulated an answer and what they meant to say; rarely paid attention to the conviction level of the answer; and mostly ignored group dynamics and their impact on the answers. Check such data thoroughly before relying on the results.
9. Informal customer feedback data: This consists of verbal feedback, collected in an informal, non-methodical way. Its accuracy cannot be relied on. Never rely on such feedback without a more formal methodology to collect the data. Unfortunately, companies many times rely on it heavily, particularly if the feedback comes from important customers.
10. “Expert” interviews: These are generally verbal interviews. They are a very important part of the analysis process. However, no specific data are collected and aggregated, so there is no issue of accuracy. Just be mindful of the 80-20 Rule.
11. Salesforce feedback: This is an important but informal source of constant feedback. Salespeople are out there fighting daily in the trenches. They have firsthand experience with all the competitive dynamics on the ground. They are generally thought of as the eyes and ears of the company. However, feedback from salespeople may be unreliable and needs to be studied in a formal way before any hard conclusions are drawn and changes implemented. (An in-depth case study is presented later that convincingly illustrates this point.) This process is very susceptible to the “squeaky wheel gets the grease” phenomenon. Also, salespeople might have the tendency, for reasons of self-preservation, to blame everything on a product’s price or features. Take the feedback seriously, but do additional due diligence before drawing conclusions.
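To make the allocation mechanics in category 3 concrete, here is a minimal sketch of one common approach: spreading a shared overhead pool across products in proportion to their direct costs and then computing gross margins from the fully loaded cost. The product names and figures are hypothetical, and real allocation schemes (activity-based costing, labor-hour or machine-time drivers, and so on) are considerably more involved.

```python
# Minimal sketch of proportional overhead allocation (hypothetical figures).
# Overhead is spread across products in proportion to their direct costs,
# and gross margin is computed from the resulting "fully loaded" cost.

products = {
    # hypothetical product: (revenue, direct_cost)
    "Product A": (1_000_000, 550_000),
    "Product B": (600_000, 420_000),
}
overhead_pool = 300_000  # shared overhead to be allocated

total_direct = sum(direct for _, direct in products.values())

for name, (revenue, direct) in products.items():
    allocated = overhead_pool * direct / total_direct  # allocation driver: direct cost
    cogs = direct + allocated                           # fully loaded cost of goods sold
    margin = (revenue - cogs) / revenue
    print(f"{name}: overhead {allocated:,.0f}, COGS {cogs:,.0f}, gross margin {margin:.1%}")
```

Note that switching the allocation driver (direct cost, labor hours, machine time) changes each product’s apparent margin even though nothing in the business has changed, which is exactly why allocated figures should be verified before important conclusions rest on them.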
Data Relevancy, Completeness, and Accuracy
While everybody would agree that any data used must be relevant and complete, the problem is that thoroughness is often in the eye of the beholder. As a result, it is easy to believe that the data are relevant and complete when in reality they might not be.
First, let’s discuss relevance. Relevance is a relative measure, and in my way of thinking—and in our case—it has a dual meaning. It is in this dual meaning that the potential for abuse of the data and for compromised results lies. The dual meanings are the two ends of the same stick; one can look at it from one end or the other. They represent two opposite directions of using data and logic. One way is to have relevant data justify the relevancy of a conclusion. The other is to have an irrelevant conclusion derived from what may appear to be relevant data.
Conceptually, data only become relevant because they support some observation or conclusion. Without a specific observation or conclusion, data, by definition, have no relevancy. Do you see the nuance? Any data that support any observation or conclusion automatically become relevant to that conclusion. Yet, the observation or conclusion itself may or may not be important, relative to the issue or challenge at hand. Thus, to decide whether the data are relevant, one needs to look at whether the observation or conclusion they support is relevant or not. Placing judgment on that is not simple.
Oftentimes, people see a subset of the data that they believe leads to a logical and relevant conclusion. Yet there may be other data that could be even more important, and that may even negate the initial observation. This is a classic case of actually not seeing (or looking for) all the trees, yet believing we are clearly seeing the forest. This is why I said that data must also be complete (where you are looking at all the trees before you focus on the forest) to allow for proper analyses and conclusions. When data are not complete, compromised business decisions are likely to be made.
Thus, I define relevancy in a way that acknowledges this potential pitfall: data are relevant relative to the overall quality, importance, and correctness of the observations or conclusions they drive, as opposed to data that drive relatively unimportant or even potentially misleading observations or conclusions. This leads to another logical point: the only way to judge relevance with certainty is to know without doubt the correct observations and conclusions, which is rarely the case. Thus, a judgment call is made, and as such it is rife with potential pitfalls and mistakes. The following is an example of just how easy it is to err with the relevancy and completeness of data.
Example: The Case of Airfone
In 1986, a few years after the breakup of the AT&T monopoly, the founder of Airfone approached Ameritech for a very large investment and a potential partnership agreement. Airfone had been founded a few years earlier by the founder of MCI Communications, Inc.
MCI was the first telecommunications company to compete directly with AT&T for its long-distance telephone business. For more than a decade, MCI fought AT&T in court until finally, in 1980, it won an antitrust lawsuit that subsequently led to the breakup of AT&T. As the first and, at the time, only organized competitor for AT&T’s long-distance business, MCI was able to offer substantially better prices for long-distance calls and grew rapidly to become the second-largest long-distance telephone company, behind AT&T. Shortly thereafter, MCI’s founder retired from the company and founded a number of start-up companies. One of those companies was Airfone, which was based in Chicago.
Airfone’s vision was to become a main player in the newly emerging cellular telephone technology and industry. While the cellular industry was seeing some momentum, calls were expensive at thirty cents to fifty cents a minute for local calls, and double that cost for long-distance calls. Airfone was developing technology for the airline industry, wherein passengers would be able to make telephone calls in-flight. It was an ambitious and costly undertaking. The founder had already invested close to $30 million to develop the necessary technology. He approached Ameritech soon after Airfone had developed a working prototype. He had also secured, on an exclusive basis, service contracts with some of the largest airline companies to install the technology in their airplanes. Initially, the service would be available in the U.S. only, but with time it could expand internationally. Airfone estimated that it would require about $100 million of fresh capital to finish the development, commercialize the technology, and launch the business. Ameritech was their preferred strategic partner. Both were headquartered in Chicago, and Ameritech was the largest telephone company in the Midwest and rich in cash.
I met with the founder to evaluate this opportunity on behalf of Ameritech. Our initial meetings centered on the viability of the technology and the risk involved in finishing the development and commercializing the technology. Because of the substantial cost of installation in the airplane, the initial phase called for between two and six cordless phones per airplane, depending on the size of the plane. The phones would be placed in the galley, adjacent to the kitchen. They would be connected via wireless technology inside the plane, so that the passengers could pick them up and take them from their cradle by the galley to their seat for use. The calls would be transmitted from the airplane via radio communications equipment to a specialized network of receiving towers on the ground. Those towers would then need to be integrated into the existing cellular industry towers to complete the calls.
After extensive due diligence, we were convinced that the technology could indeed be successfully commercialized, albeit at a substantially greater cost than Airfone estimated. We subsequently began to discuss the business potential of the opportunity and negotiate the terms of the partnership.
Airfone presented its business case to us in detail; it was very impressive and looked promising at face value. They presented a logical plan that suggested the business would become successful and grow to billions of dollars in revenues. Based on their assumptions, they placed a pre-money valuation of $500 million on the company. The centerpiece of the data was an extensive survey of thousands of airline passengers from around the country, who were asked if they would make use of a telephone on an airplane. Ninety-five percent of the respondents said they would, and 50 percent suggested that they would make multiple calls. With such demand, the phones would be in constant use. Multiply the large number of airplanes by their average flight time per flight, and then by the per-minute charge for a call, and you easily get revenues into the billions of dollars.
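To see how quickly that multiplication reaches billions, here is a back-of-the-envelope version of the pitch’s arithmetic. Every input below is my own illustrative assumption, not a figure from Airfone’s actual plan.

```python
# Back-of-the-envelope revenue estimate in the spirit of Airfone's pitch.
# All inputs are illustrative assumptions, not Airfone's actual figures.

equipped_aircraft = 3_000          # assumed number of equipped airplanes
flights_per_day = 4                # assumed flights per airplane per day
billable_minutes_per_flight = 60   # assumed minutes of phone use per flight
price_per_minute = 5.00            # assumed price per minute, in dollars

annual_revenue = (equipped_aircraft * flights_per_day *
                  billable_minutes_per_flight * price_per_minute * 365)
print(f"Implied annual revenue: ${annual_revenue:,.0f}")  # about $1.3 billion
```

The arithmetic holds only if the assumed usage holds, and that usage is precisely what the survey failed to test.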
I asked to see the questions used in the surveys and was given a list of fifteen questions. I studied them carefully and then told the founder that we would not have any interest in pursuing partnership discussions with Airfone unless he was willing to accept a value of about $50 million, which was a little more than the cash he had invested in the company up to that point. Of course, I provided the logic of my position, but he strongly protested before leaving. He went to look for another partner and successfully reached an agreement with GTE, which was the largest independent telephone company in the United States, competing in both local and long-distance telephone services.
What I saw, based also on what they had told us in earlier discussions, was a fatal flaw in the survey questions. The questions overlooked a critical dimension. It was hard to accept that such a simple and obvious error would escape scrutiny.
As hard as it is to believe, Airfone never asked the passengers whether they would be willing to pay five dollars per minute for a call. Such a high price was needed to support the heavy cost of providing the service. Just to put it in perspective, five dollars of value in those days would be equivalent to about ten dollars to fifteen dollars in today’s money. Nobody in their right mind would spend so much money per minute on a telephone call, unless it was an emergency. With such a steep price, I didn’t think those phones would be used as often as Airfone thought they would.
This may have been a case of either complete incompetency or passive incompetency, but incompetency, nevertheless. The complete incompetency refers to those who drew up the questions for the survey, who may have been lower-level managers not concerned with price, and top management, who were not involved enough to notice it. The passive incompetency would have been an incompetency out of naivete. The survey was conducted before the technology was fully developed, and perhaps used to justify the investment in developing the technology in the first place. As such, Airfone may have believed, or assumed, that the cost to build its technology would have been comparable to the cost of existing cellular technology, and thus the price of a call would approximate the then current cellular rates. In either case, Airfone should have corrected the survey once it realized that the pricing would be completely out of whack with standard rates.
In retrospect, I was right. GTE subsequently invested in Airfone and lost hundreds of millions of dollars, and the business never took off.
Interestingly enough, there were two other major problems with the initial plans, which proved to be critical as well. Both problems could have been avoided and addressed with a smarter initial design of the technology/product had the data collection been more complete. One of the problems dealt with airline regulations, the other with human behavior.
The first was Airfone’s assumption that the phones would be used throughout the entire flight. However, for the first ten to twenty minutes after takeoff and again prior to landing, passengers are required to be in their seats, reducing the number of potential minutes of use by at least twenty to forty minutes per flight. The second problem was even worse. Passengers would pick up the phone at the galley by the kitchen, go back to their seat, and attempt to make a call. If they got a busy signal, which occurred often because the network coverage was not yet ubiquitous, they would not return the phone to the galley. Rather, they would keep it with them and redial a few minutes later, and so on, until they got through. Thus, each phone’s actual availability for use became very limited. To solve these two problems, Airfone/GTE redesigned the technology to allow a telephone to be installed in each row of seats. It was massively expensive. I learned about these facts because two years later the company needed to secure another $200 million in funding and invited Ameritech to join the partnership. I turned it down again because I wasn’t convinced that there would be any financial returns for many years to come.
As it turned out, the promise of Airfone/GTE never materialized. GTE subsequently merged with Bell Atlantic to form Verizon. Airfone operations lost hundreds of millions of dollars over the years, and service on commercial flights was terminated in 2006. Some quotes from Wikipedia capture well the demise of Airfone:
• “The head of the Airfone division told The New York Times in 2004 that only two to three people use the Airfone service per flight.”
• “On June 23, 2006, Verizon Communications, Airfone’s parent company, announced that they would be discontinuing their Airfone service on all commercial flights by the end of 2006. They cited the declining use of the service.”
• “On December 28, 2007, Airfone announced it would discontinue service effective December 31, 2008, unless it successfully concludes negotiations with LiveTV, an affiliate of JetBlue, to take over the business on that date.”
A conversation on Aviation Stack Exchange, an online question and answer forum for aircraft enthusiasts, captures the essence of the demise:
Question: “Did AirFone ever make money?”

Answer: “I remember when the first AirFones were put into service in the 1990s and marveled at who would be rich enough to pay $7/min for air-to-ground phone service. Even though every seat eventually had an AirFone, I cannot once recall seeing one in use (except in movies) in twenty years of flying (in economy).”

Question: “I know they’re being pulled out now, but did AirFones ever make a profit? Who paid for their installation? Was it the airlines or did GTE, etc., pay for everything?”

Answer: “Despite impressive technological achievements and great expectations all around, the service’s business performance ultimately proved an abject failure. In the ten years before Verizon finally pulled the plug, Airfone generated only 50 million total calls, a fraction of the calls carried daily by cellular companies. While the Airfone service was heavily used when bad weather caused significant delays, the system’s utilization at other times was extremely low. A typical plane was equipped with as many as 60 phones. From these, the average large jet generated fewer than 100 calls per day in about 16 hours of flying time. As a result, the expensive system, with its heavy load of fixed costs, remained idle well over 99% of the time.”
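The “idle well over 99% of the time” figure in that answer is easy to check with rough arithmetic. The phone count, flying hours, and daily call volume come from the quote itself; the average call length below is my own assumption.

```python
# Rough check of the ">99% idle" claim from the quoted answer.
# Phone count, flying hours, and daily calls come from the quote;
# the average call length is an assumption.

phones_per_jet = 60
flying_hours_per_day = 16
calls_per_day = 100          # "fewer than 100 calls per day"
avg_call_minutes = 3         # assumed average call length

capacity_minutes = phones_per_jet * flying_hours_per_day * 60
used_minutes = calls_per_day * avg_call_minutes
utilization = used_minutes / capacity_minutes

print(f"Capacity:    {capacity_minutes:,} phone-minutes per day")
print(f"Used:        {used_minutes:,} phone-minutes per day")
print(f"Utilization: {utilization:.2%} (idle {1 - utilization:.2%} of the time)")
# Roughly 0.5% utilization -- idle well over 99% of the time, consistent with the quote.
```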
Data Organization
Simply collecting data is not all that is required before they can be analyzed. The data must be organized so that one can observe the findings that lead to conclusions. Most data and trends are best portrayed via graphs and charts of all kinds.
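As a small illustration of what “organized” means in practice, the sketch below rolls raw, transaction-level records up into a monthly series that could then feed a trend chart. The records and field names are hypothetical.

```python
# Minimal sketch: organizing raw sales records into a monthly trend.
# The records and field names are hypothetical.
from collections import defaultdict

raw_sales = [
    {"date": "2024-01-15", "amount": 1200.0},
    {"date": "2024-01-28", "amount": 800.0},
    {"date": "2024-02-03", "amount": 950.0},
    {"date": "2024-03-19", "amount": 1500.0},
]

monthly = defaultdict(float)
for record in raw_sales:
    month = record["date"][:7]        # group by "YYYY-MM"
    monthly[month] += record["amount"]

for month in sorted(monthly):         # ordered so the trend is visible
    print(month, f"{monthly[month]:,.0f}")
# These monthly totals are what would feed a trend chart.
```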
We are now ready to proceed to the next steps of the process.