INTRODUCTION
In chapter 1 I highlighted the fundamental relevance of collection for strategic analysis. This chapter introduces different segmentation dimensions of sources for data collection. The first segmentation dimension is the ‘nature’ of data sources. Subsequently I will introduce some other useful dimensions to segment data sources.
Then, in subsequent chapters, I will use ‘People’, ‘Papers’, ‘Pictures’ and ‘Products’ as leading segmentation for the discussion on collection methodologies by data source nature. Prior to that, in this chapter I will discuss other generic segmentation dimensions for data sources, a valuation of source by segment and a methodology to efficiently and effectively execute collection. The latter is commonly known as a ‘collection plan’.
SEGMENTATION
DEFINING AND SEGMENTING SOURCES
As a first step in this section I aim to define what a source is. A source in this book is:
“a book, statement, image, person, object, etc., supplying data for an analysis.”
There are many ways to segment data sources. First, I would like to discuss sources, regardless of their nature, by using some generic segmentation dimensions. The list below is not exhaustive: there may well be other dimensions that are relevant for segmenting data sources. The ones below are, in my experience, relevant in basically all strategic analysis and have therefore made it into this book:
• Open vs. Classified
• Primary vs. Secondary
• Internal vs. External
• Free-of-charge vs. Fee-based
• Small vs. Big
• Good vs. Bad
OPEN VERSUS CLASSIFIED
In strategic business analysis some sources are confidential. Confidentiality usually relates to Non-Disclosure Agreements (NDAs) that have been concluded between the different parties and the firm that employs the strategic analyst. Parties involved in NDAs may include suppliers, investment banks, brokers, competitors, other companies and customers. The use of data obtained from such confidential sources, for instance data to judge the value offered to a firm from a sale in a particular transaction, is often explicitly forbidden for any other purpose.
For example: an auction process is organized to sell a company. In such a case an NDA will first need to be signed by authorized staff in the analyst’s firm. Only when the NDA has been countersigned will the seller be prepared to share detailed, confidential information on the company that is for sale. Such information is contained in what is usually called an Information Memorandum (IM). From a strategic analysis perspective, an IM is a treasure trove. It provides information on a company for sale that will never be available in the public domain. Moreover, IMs have usually been prepared by professional services firms. The latter do not want to end up in litigation because the IM, for whatever reason, contained incorrect information. So, the reliability of the data in an IM is high.
It should, however, be remembered that even when every data point mentioned in the IM is 100% correct and true, the information may still be far from the whole truth (e.g., it may be that essential facts for valuing the company for sale are lacking).
The usage clauses of an IM usually limit its value for strategic analysis. As a rule, an IM should only be used to judge the value of a potential transaction. The candidate buyer has to guarantee in writing that soft and hard copies of the IM are removed from the computers of the staff involved in the transaction – except for one copy that is usually allowed to be held as a record by the prospective buying firm’s legal affairs department. Thus, IMs can only legally be used for a single purpose and afterwards are no longer accessible to the strategic analysis function (or so at least it should be).
Moreover, by implication any analysis will be classified and only available for limited distribution in the context of the purpose for which it has been produced and the use allowed.
In the selling process of a company, even more classified information is made available once a formal due diligence process has started. The due diligence team gets free access to many more and different confidential company facts. What applies to IMs applies even more to due diligence-obtained information: highly useful, but due to confidentiality-related limitations, off-limits for other strategic analysis assignments.
In general, classified sources are therefore a great and essential asset, but only for the purpose for which they are provided. As a result, strategic analysis work tends to be based on open sources, with data collected from sources that are 100% in the public domain. There is never, repeat never, an excuse to break the law (including but not limited to using IMs for purposes other than intended). An analyst who sees no other way to solve his puzzle at hand than through breaking the law is either not competent or lacks integrity. In both cases, such an analyst is not the type of staff any decent firm that embraces fair competition would wish to employ. In summary: open sources are the key source for strategic analysis work.
PRIMARY VERSUS SECONDARY
Primary sources are defined as sources that themselves possess the data that is looked for. Secondary sources may report data but have generally not created the data themselves.
Examples of primary sources are a competitor’s annual report, a press release or an investor presentation that is, or at least should be, in the public domain. Primary sources also include humans, for instance the competitor’s staff or staff employed by a supplier or a customer.
Secondary sources include all other sources that are not primary sources. The longer the distance between the origin of the data and the source that reports the data, the more significant source reliability issues may become.
Let me provide an example to illustrate this. Secondary sources include, amongst others, newspaper articles. Newspaper articles, especially those featured in the general press rather than specialized financial newspapers, are notoriously inaccurate when it comes to figures. Turnover or, even worse, profit is rarely accurately quoted in a newspaper article. Suppose a journalist reports that ‘profits of company ABC have gone up to US$12 million in Q4’. What profits?
• Gross earnings (i.e., net sales minus variable cost)?
• Earnings Before Interest, Tax, Depreciation and Amortization?
• Earnings Before Interest and Tax?
• Earnings Before Tax?
• Net Earnings?
For each of the five profit definitions above, at least two variations exist: before or after exceptional items.
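To see how far the profit definitions listed above can diverge, the sketch below walks one income statement down from gross earnings to net earnings. All figures are hypothetical and invented for illustration (in US$ million), and the function and variable names are assumptions; only the order of deductions follows the definitions in the list.

```python
def profit_ladder(net_sales, variable_cost, fixed_cost, depreciation_amortization,
                  interest, tax):
    """Return each profit definition for one (hypothetical) income statement."""
    gross = net_sales - variable_cost            # gross earnings, as defined in the list above
    ebitda = gross - fixed_cost                  # fixed cost here excludes depreciation/amortization
    ebit = ebitda - depreciation_amortization
    ebt = ebit - interest
    net = ebt - tax
    return {
        "Gross earnings": gross,
        "EBITDA": ebitda,
        "EBIT": ebit,
        "EBT": ebt,
        "Net earnings": net,
    }


# Invented quarterly figures in US$ million; none of these come from a real company.
ladder = profit_ladder(
    net_sales=100.0,
    variable_cost=60.0,
    fixed_cost=22.0,
    depreciation_amortization=6.0,
    interest=3.0,
    tax=2.5,
)

for definition, value in ladder.items():
    print(f"{definition:15s} {value:6.1f}")
```

With these invented figures only one of the definitions equals 12, so a newspaper line about "profits of US$12 million" cannot be interpreted without knowing which definition the journalist used.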
The good news is that all that is reported is 100% true. There is a caveat. All that is reported, especially by a company on its own performance, is said in such a way as to make the company look great. It is only when reading the fine print that the complete reality and thus the full truth might be revealed.
In the absence of a primary source it is courageous (read ‘reckless’) for a strategic analysis department to report on a competitor’s profit based on a vague secondary source.
A simple rule of strategic analysis work is: whenever primary sources are available, secure them. Only use secondary sources when primary sources are not accessible or available, and then with sufficient caution regarding the reliability of the data.
INTERNAL VERSUS EXTERNAL
Internal sources are defined as sources that work in the same firm as the analyst or originate from the same firm that employs the analyst. External sources are not related to the same firm. When working with internal and external sources it is recommended in advance to consider two dimensions:
• Bias
• Accessibility
Internal sources may be deeply biased regarding a topic of strategic analysis. A classic phrase is that it is hard to have a useful meeting with a turkey on Christmas. In other words, don’t automatically expect the firm’s staff to give a neutral opinion on the situation at a competitor or a customer or a supplier. The firm’s staff may feel personally threatened by an issue at hand related to another company. As a result, the source may aim to colour the picture in the way that he perceives to serve his interests best. However, external sources may also not be bias-free. To add to the confusion, the external source bias is probably different from that of the internal source…
Below two generally common biases are discussed to illustrate how analysis can be deceived by missing out on biased sources. The first bias is believing that a human reporting data as a source had the full picture (Friedman, 2010b):
“There is value in sources, but they need to be taken with many grains of salt, not because they necessarily lie but because the highest-placed source may simply be wrong (…). If the purpose of intelligence is to predict what will happen, and it is source-based, then that assumes that the sources know what is going on and how it will play out. But they often don’t.
(…) The purpose of intelligence is obvious: It is to collect as much information as possible, and surely from the most highly placed sources.
But in the end, the most important question to ask is whether the most highly placed source has any clue what is going to happen.”
In this quote, Friedman implicitly suggests that evaluation of the input of the source is ignored. As a result, the source’s view on what will happen by definition becomes the strategic analyst’s view. That should offend any analyst. A good analyst has the checks and balances in place to evaluate sources, prior to communicating any conclusions to decision-makers about the work that she did. This is not to say that Friedman’s warning should not be taken seriously. It apparently is based on experience, unfortunately also my own. Human sources are indeed sometimes indispensable in collecting data. When a human source allows an analyst to turn a mystery into a puzzle, the all-too-human tendency is to overestimate the accuracy of that source’s message. The reliability of the source becomes increasingly questionable when more time has passed between the moment of the event reported and the moment of the report. Footnote 1 to chapter 8 encourages keeping a sober view on the limited accuracy of the human memory. These risks are inevitable when working with human sources.
Below I will provide a second example of a human source bias. This bias could well be summarized as the source that happens to have an opinion. Jonathan Powell, Chief of Staff to UK Prime Minister Tony Blair (1997-2007), provides us with an amusing example of this bias (Powell, 2011b):
“We had two advisers in (Downing Street) Number 10, one madly for the Euro and one madly opposed. I used to think that, if you were a foreign intelligence service and had the first of the two on your payroll, you would be certain Britain was about to join the Euro and be equally sure we were not if you employed the other.”
The advantage of company-internal human sources is that they tend to be more easily accessible than external sources. In a firm where cooperation between colleagues is valued, colleagues generally reply positively upon requests for help on data.
Humans are a strange species. On the one hand, humans kill other humans (i.e., animals of the same species). This is very rare if not absent in the animal world. On the other hand, humans tend to be helpful to other humans, even when knowing there is no reciprocity to be expected (e.g., in charity donations). Intra-company requests to colleagues, even when the purpose of the request, as so often, cannot be fully shared by the strategic analysis department, tend to be answered quickly and correctly. Knowing internally which source knows what therefore is a key asset for a strategic analysis department.
As people tend to work together much more easily with people they know than with people they have never met, the analyst should be known inside his own firm. Those colleagues who benefit from deliverables (e.g., by receiving a regular news and analysis letter) will generally be even more inclined to help. In the best of cases such colleagues will start to send their information proactively to the analyst.
A second benefit of working with internal human sources is that questions can be asked without having to hide the strategic analysis department’s intent. The degree of openness that can be used depends on the confidentiality of the topic at hand. With internal sources, at least direct questions are possible.
In contrast, when working with external sources, it is necessary to ensure that no suspicion is created in the mind of the source. Revealing, even unintentionally, what a strategic analysis department wants to know is not a good idea, given the confidential and/or strategic character of what a strategic analysis department generally works on.
Given the fact that both internal as well as external sources may be biased, with different biases for sure, there is no automatic preference for either of the sources. The big advantage of internal sources is the fact that they, as colleagues, probably collaborate more easily and less suspiciously. Moreover, using internal human sources is usually more time-efficient, not the least because direct questions can be used.
What should be prevented at all times is paying for external sources to discover data that are available within the firm – but only in unknown sources. There are few better options for a strategic analysis department to more quickly and effectively lose its credibility than to pay big-time for data that is already in the possession of the firm.
As a rule, strategic analysis departments should strive to comprehensively explore and approach all internal sources prior to signing off on orders for fee-based external data.
FREE-OF-CHARGE VERSUS FEE-BASED
The difference between free-of-charge and fee-based sources in strategic analysis work limits itself to public domain or open sources. In any decent firm, the code of ethics rightly prevents the firm from bribing third party human sources. It goes without saying that doing so is unacceptable at all times. Stories about the KGB paying its agents in foreign government jobs are great for history books. For some delightful examples of KGB Christmas bonuses to French agents check out the history of the KGB (Andrew, 2001d). Those types of practices, however, have nothing to do with strategic analysis in business and moreover are generally probably less romantic than spy fiction makes them.
When considering whether to use paid sources the question should not be what amount of cost is involved, but whether the fee paid will be rapidly earned back by the improved quality of the decision-making.
Common paid open sources include databases such as but not limited to Dun & Bradstreet, which provides financial statements on companies in an easy-to-read format. Many structured news providers also charge a price per article or a fixed subscription price. Databases and other free and fee-based sources are covered in chapter 6 and in Appendix 2.
Generally, strategic analysis projects and processes will need a mix of fee-based and free-of-charge open sources. Company statements are of course for free and so are many web-based sources. Few strategic analysis projects, with the exceptions of some small errands, can be executed without using fee-based sources. In budgeting for a strategic analysis department, the subscription cost to various paid sources may be amongst the largest items paid for.
SMALL VERSUS BIG
In a figurative sense, small sources deliver small amounts of data and big sources deliver big data. Big data as a term has been around since the last quarter of 2010 (Davenport, 2014d).
Big Data may include both internal company data as well as external data concerning the business environment of a firm. The criterion that determines whether a data set qualifies to be called big is that the amount of data is truly big: think of petabytes. Big Data forms a rapidly developing field of business information, enabling companies to, for example, analyse millions and millions of user acts (e.g., user mouse clicks on a website like LinkedIn) per day in search of ever better, more intuitive offers to their customer or visitor base.
Big Data differs from Small Data not only in quantity but also in collection process. Big data tends to come in continuously and keep coming in – think of mouse clicks on a 24/7 global website. Small Data comes in ‘one-off batches’ – think of monthly market share data – a typical recurring external data point, or monthly sales data by country by product – a typical internal data point.
Some Big Data indeed concerns the external environment of a firm and thus may qualify for strategic analysis (e.g., external visitor interaction with a website), but most focuses on internal data (the firm’s own performance indicators). Due to its continuous flow and the ever-stronger computer processing power available, Big Data may revolutionize decision-making of organizations. Big Data allows for the response time between an anomaly in a signal and a management response to be minimized – with the latter even possibly being the result of an automated script – so without any human interference. However, direct Big Data applications in strategic analysis are still believed to be in their infancy, even though social media sentiment analysis – for example on the public image of brands, which requires big data technology – is starting to take off.
As the discipline of Big Data collection on the business environment is still in its infancy, it is hard to draw conclusions where Big Data will lead strategic analysis to in the future. One thing already seems to be clear: collection isn’t and will not likely be the bottleneck in the world of Big Data. Technologies are available to collect the data, whereas making sense of Big Data is proving to be the challenge today. The data analyst profession may be amongst the most promising in the entire computer sector today.
No matter how exciting Big Data and the opportunities Big Data may offer for strategic analysis are, it is my experience that often a single data point – so the smallest data set possible – may just make all the difference in an analysis. Though it may be considered to be a bit tongue-in-cheek, here I share a quote I personally quite enjoy, taken from the memoirs of the former head of the East German foreign intelligence service, the extraordinarily capable Markus Wolf. Wolf, I think, highlights just how important it is to find the meaningful small data points, even when big data are available (Wolf, 1998):
“Intelligence is essentially a banal trade of sifting through huge amounts of random information in a search for a single enlightening gem or an illuminating link.”
GOOD VERSUS BAD
The good news is that there are no bad sources. For sure, there are sources that are more and that are less reliable. Knowing the reliability of a source in advance of using it is a great asset when considering the use of sources. The North Korean newspaper Nodong Sinmun may, for example, not be the most reliable source on news about South Korea. Meanwhile, it may be one of the few accessible and reliable sources on North Korea, which in some instances, but certainly not all, remarkably frankly reports on topics concerning the North (Mercado, 2009b).
Sources are not intrinsically good or bad. A source is an instrument. For accepting or rejecting a hypothesis, one source may simply deliver a key data point more cost effectively than the other. However, when another hypothesis is concerned, another source may be preferable: horses for courses. In the collection plan, which I will discuss later in this chapter, links between hypotheses, key information needs and the (expected) most reliable and efficient source(s) are summarily worked out, for execution during the actual collection.
SOURCES MAY ALSO BE SEGMENTED BY THEIR NATURE
Where the above segmentation was generic and irrespective of the nature of the source, the nature of the source as such also guides source segmentation. In this book, I present four different natures. To stay close to marketing parlance I will here (again) define the ‘four Ps’: People, Papers, Pictures and Products.
The four Ps, as they are often referred to in marketing, are just a flashy way to present an existing concept, in this case an existing classification – the four Ps link one-to-one to the traditional taxonomy of US intelligence collection (Clark, 2007f):
• people | or human intelligence | or HUMINT | chapters 4 & 5
• papers | or open source intelligence | or OSINT | chapter 6
• pictures | or imagery intelligence | or IMINT | chapter 7
• products | or measurement and signature intelligence | or MASINT | chapter 8
Other collection methodologies exist in military intelligence, for example ELINT (electronic data interception intelligence) or its sister methodology SIGINT (signal intelligence). ELINT is probably illegal in any business application, so by definition out of scope. SIGINT is a euphemism for eavesdropping. As a rule, all SIGINT for business applications is ethically inappropriate in any company that embraces fair business practices – even when some SIGINT methods may be legal. Therefore, SIGINT like ELINT will not be covered in this book.
HUMINT concerns data collected from and through human sources. OSINT relates to the collection of data from sources that are available in the (written) public domain. IMINT focuses on intelligence collection from images. Finally, MASINT is a collection process to get product samples. The latter are usually processed/analysed to obtain insights by deploying back-engineering or other laboratory analysis techniques. The remainder of this chapter focuses on the valuation of sources, then on terminology, and finally on collection as a support process for strategic analysis in general.
VALUATION
What is true in many management disciplines also holds true in strategic analysis: garbage in, garbage out. The lower the reliability of the sources, the lower the reliability of the output of the strategic analysis process. The user of the output is unknowingly exposed to serious risks, particularly when the output is presented without an indication of the (low) reliability.
There are no firm rules for the reliability of sources. Normally, for example, company annual reports that contain a statement of approval by a certified accountant tend to present reliable figures. After Enron, Parmalat and Ahold, to name a few exciting cases of accounting fraud, reported figures may not be what they seem. These, of course, were cases of deliberate fraud.
The message is simple: reliability is never to be taken for granted. On the other hand, paranoia over the reliability of sources is also not an approach that is deemed to be effective.
MASINT TENDS TO BE THE MOST RELIABLE SOURCE
Balancing between naivety and paranoia, diagram 3.1 gives an experience-based review of the reliability of common sources.
MASINT – in other words, the laboratory results of analysis on the competitor’s products, for instance – clearly features among the most reliable sources. Provided the analysed sample was representative, there is little room for bias when laboratory best practices have been used.
Certified OSINT delivers, after MASINT, among the most reliable input data or information that is obtainable. ‘Certified’ means that the data or information has been reviewed by an accountant or a lawyer prior to publication: for example, a firm’s annual report or an Information Memorandum. In normal cases, assuming there is no fraud, both types of documents contain only information that is 100% true. Information that is 100% true is great, but as has been said before, all that is made available that is 100% true does not necessarily make up the full truth, so having flawless puzzle pieces still doesn’t mean that we have a complete puzzle.
External and internal HUMINT sources both rank from highly reliable to not so reliable. Internal HUMINT sources may be biased because, for example, they only want to share part of the story that suits them in their career. They may also unknowingly have had biased or poorly informed sources, which are beyond the strategic analysis department’s control to check.
External HUMINT sources may well be primary sources, but should always be treated with caution. They too may have an interest in sharing only the part of the story that suits them personally. Or, even more commonly, they may themselves simply not have the full picture. There is, or there may be, a big difference between the opinion or speculation of a certain person in a firm on a particular matter in that firm and the official company policy or strategy regarding the matter. As was pointed out by Friedman, the person either may not know the full story or may simply voice his personal opinion. Such a source’s input may have no bearing on the next choices the external firm’s management as a whole may make.
In the worst case, a person in a competitor firm may intentionally provide an incorrect story that is illustrated with incorrect facts, just to deceive the competitor.
IMINT with expert input is normally more reliable than IMINT without expert input. ‘With expert input’ means that a neutral expert has been identified to assist in interpreting the materials that have been obtained.
Validated (catalogued) OSINT – information accessible via databases or libraries – tends to be (much) more reliable than information that is just out there (available on a non-validated website). Specifically, information from social media, online blogs and random clippings is among the least reliable.
For strategic analysis work, the above general reliability assessment leads to a priority of choosing sources. Remember: sources are an instrument, not a goal. The goal remains to efficiently and effectively solve the strategic analysis project’s core problem. Whatever sources are needed to do so should be approached. When, however, choices can be made, the priority that is introduced below may be helpful.
When useful for achieving the objective of a strategic analysis project, MASINT should always be done with priority. Certified OSINT sources should also be used with priority, not the least because they are often accessible 24/7, fast and easy to locate and often cheap. When these sources do not meet all the information needs, think carefully which HUMINT sources could be reliable and (cheaply, rapidly) accessible. Always bear in mind potential biases of the sources to be approached. Think in advance how to avoid getting trapped in a biased story that is great to hear but foolish to believe. Where needed, do take a xenocentric view: it too often matters more who says a thing than what is being said.
When image material is available it should be used. This is especially true when trustworthy experts are around to see what non-experts wouldn’t have seen.
Validated/catalogued OSINT sources are normally a good choice: they tend to enable efficient work. These catalogued databases may, however, consist of newspaper articles which may not be as complete and/or correct with quantitative facts as needed.
When all the above sources do not provide all the identified information needs, the inevitable web search is needed. The search may uncover pearls but generally these have to be found among dozens if not millions of shining but worthless glass beads. The Internet in strategic analysis work, however large it may be, is and remains the data provider of last resort. For clarity purposes, catalogued databases that happen to be offered via the web are not included in the above reasoning. Non-validated internet sources generally should get the lowest reliability score. They also deserve to get the lowest priority in the collection efforts of a professional strategic analysis department.
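The priority order described above can be recorded as a simple lookup, so that candidate sources for a project are approached in a consistent sequence. The sketch below is one possible reading of the text: the numeric ranks, the category labels and the example candidates are all assumptions made for illustration and are not taken from diagram 3.1.

```python
# Collection priority as read from the preceding paragraphs; 1 = approach first.
# The ranks are an illustrative assumption, not values from diagram 3.1.
COLLECTION_PRIORITY = {
    "MASINT": 1,
    "certified OSINT": 2,
    "HUMINT": 3,
    "IMINT": 4,
    "validated OSINT": 5,
    "non-validated internet": 6,
}


def order_candidates(candidates):
    """Sort (source_name, source_type) pairs so higher-priority types come first."""
    return sorted(candidates, key=lambda pair: COLLECTION_PRIORITY[pair[1]])


# Hypothetical candidate sources for one project.
candidates = [
    ("web search", "non-validated internet"),
    ("competitor annual report", "certified OSINT"),
    ("lab analysis of competitor product", "MASINT"),
    ("news database", "validated OSINT"),
]

ordered = order_candidates(candidates)
for name, source_type in ordered:
    print(f"{COLLECTION_PRIORITY[source_type]}. {name} ({source_type})")
```

A fixed ranking like this is only a starting point: as the text stresses, sources are an instrument, and whichever source answers the project's core question should be approached.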
SUBJECT
Particular topics may have associated jargon which can only be understood by insiders. Prior to an analyst starting a search, the preferable route is to locate an insider and get initiated into the jargon, regardless of the methodology of searching and/or the sources.
For example, yoghurt drinks, fermented dairy drinks, dairy drinks, sour milk drinks and drinking yoghurts may all mean the same thing, even though dairy drinks may include (flavoured) milk drinks which technically are different. When exploring the market for the entire ready-to-drink yoghurt category (to use another similar term), it is essential to have a grasp of the jargon to avoid missing out on vital data.
Another business discipline that is rich in synonyms is finance. Profits, earnings, results, yields and returns may all be used as synonyms for financial gains (another synonym). The complications get broader when sources in different languages are to be searched.
The recommendation that is made is first to get familiar with the jargon and only afterwards start the search. The preferred source for the jargon is the insider (a trusted human source). In the absence thereof, encyclopedias – either online, like Wikipedia or hard copies – or dictionaries may get you started.
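Once the jargon is known, the synonyms can be combined into a single search string so that no term is forgotten. The sketch below assumes a search tool that accepts `OR` between terms and double quotes around phrases; that syntax is common but not universal, so it should be checked against the database actually used. The function name is hypothetical.

```python
def build_or_query(terms):
    """Join synonyms into one boolean OR query, quoting multi-word phrases."""
    unique = []
    for term in terms:
        cleaned = term.strip().lower()
        if cleaned and cleaned not in unique:   # drop blanks and duplicates, keep order
            unique.append(cleaned)
    quoted = [f'"{t}"' if " " in t else t for t in unique]
    return " OR ".join(quoted)


# Synonyms taken from the yoghurt-drink example in the text.
yoghurt_terms = [
    "yoghurt drinks",
    "fermented dairy drinks",
    "dairy drinks",
    "sour milk drinks",
    "drinking yoghurts",
]
query = build_or_query(yoghurt_terms)
print(query)
```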
What is true for synonyms is even more relevant for translations of names from different alphabets. Arabic, Chinese or Russian names are easily misspelled in Latin script (and vice versa). Arno Reuser is a great friend of mine. He pointed out that in open sources alone the number of different ways used to spell the name of the former Libyan head of state, Colonel Khadhaffi, in the Latin alphabet went into double digits. When searching for the former Libyan head of state (using a Latin alphabet), checking which spellings are commonly used may assist in not missing out on relevant data. In chapter 6 on Open Source Intelligence we cover the challenge of language, spelling and jargon in more detail.
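One way to avoid missing spelling variants is to search with a pattern instead of a single spelling. The regular expression below is a minimal sketch that covers a handful of common Latin-script transliterations of the name; the pattern is an assumption built for illustration, is certainly not exhaustive, and should be checked against the spellings actually found in the sources at hand.

```python
import re

# Optional "h" after the initial consonant, single or double "d" with optional "h",
# single or double "f", and a final "i" or "y". Illustrative, not exhaustive.
NAME_PATTERN = re.compile(r"\b[GKQ]h?add?h?af{1,2}[iy]\b", re.IGNORECASE)


def find_name_variants(text):
    """Return every spelling in the text that matches the variant pattern."""
    return NAME_PATTERN.findall(text)


sample = "Reports mention Gaddafi, Qadhafi and Khadhaffi, but not Gandalf."
found = find_name_variants(sample)
print(found)
```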
DATA
Most (larger) strategic analysis assignments are run as projects. Projects usually start with a Project Start Up meeting (PSU meeting). In the PSU-meeting, the objective (or ‘what’) of the project is defined, next of course to the ‘how’ which makes up the project plan. The plan also answers questions like ‘who’ is to be involved, ‘when is the deadline’ and ‘where is the work to be executed.’
In strategic analysis projects, data requirements are usually defined in an early phase of the project, if not at the very start during the PSU meeting. Data requirements are the input to what is normally a sub-task within a strategic analysis assignment: the data collection plan.
Data collection plans are normally organized as source-centred plans. Table 3.1 gives an example of the typical position and content of a collection plan in a strategic analysis assignment.
Phase | Activities
PROJECT PLAN | Define data needs
COLLECTION PLAN | - Take holistic view on the data needs
 | - Brainstorm on which sources may have relevant data on the topic
 | - List the potential sources:
 |   HUMINT
 |   - contact A for not-too-narrow sub-topic P
 |   - contact B for sub-topic Q
 |   - contact C for sub-topic R
 |   OSINT
 |   - check database K for data on need X
 |   - check newspaper L for data on need Y
 |   - check handbook M for data on need Z
 |   IMINT
 |   - have image G evaluated by expert D, etc.
COLLECTION EXECUTION | Execute collection by following the so-generated leads, taking into account factors like reliability, cost, time requirement and efficiency
ANALYSIS | - Evaluate results of collection: do more collection where needed
 | - Use or reject obtained data points based on actuality, risk of bias, reliability and/or other factors
Table 3.1 hopefully makes clear that compiling a collection plan is not rocket science. The value of formally writing a collection plan is the discipline that it brings to the data collection. By formally brainstorming on what sources to capitalize on for which data need and recording the outcome, the chances of overlooking high-value, low-cost sources of data are substantially reduced. For this reason alone, even after more than two decades of routine in executing strategic analysis assignments I still wouldn’t start the smallest errand without making a proper list of data needs and the connected data sources.
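The list of data needs and connected sources can be kept in any form, from a sheet of paper to a spreadsheet. As a minimal sketch, the structure below records each lead and flags data needs for which no source has been listed yet. The class and function names are hypothetical, the lead entries reuse the placeholders from Table 3.1, and data need "W" is invented to show an uncovered need.

```python
from dataclasses import dataclass


@dataclass
class Lead:
    source_type: str   # e.g. "HUMINT", "OSINT", "IMINT"
    source: str        # e.g. "contact A", "database K"
    data_need: str     # the data need this lead is expected to cover


def uncovered_needs(data_needs, leads):
    """Return the data needs that no lead in the collection plan addresses."""
    covered = {lead.data_need for lead in leads}
    return [need for need in data_needs if need not in covered]


# Placeholders follow Table 3.1; "W" is a hypothetical extra need without a source.
data_needs = ["P", "Q", "X", "Y", "W"]
leads = [
    Lead("HUMINT", "contact A", "P"),
    Lead("HUMINT", "contact B", "Q"),
    Lead("OSINT", "database K", "X"),
    Lead("OSINT", "newspaper L", "Y"),
]

missing = uncovered_needs(data_needs, leads)
print("Needs without a source:", missing)
```

Running the check before collection starts makes the gap visible while there is still time to brainstorm on an additional source.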