Arvind Sathi (IBM) authored this chapter.
The term “communications service provider” (CSP) refers to a broad category of companies, including telecommunications, wireless, Internet, cable, and satellite service providers. Over the past few years, CSPs have experienced a proliferation of customer data due to a rapid rise in the use of smartphones and similar devices. However, most of the revenues and market valuations have accrued to application providers and device makers. CSPs have seen significant increases in data volumes on their networks without corresponding gains in revenue.
In response to market pressures, CSPs are now in a race to better understand and use this newly available data for sales, marketing, and customer service. Much of this data qualifies as big data. CSPs cannot survive unless they manage, organize, and monetize their big data. Big data governance is, as a result, an emerging imperative for CSPs today.
There are three broad categories of questions emerging in the area of big data governance for CSPs:
This chapter elaborates on these questions and provides partial answers as known today. These questions are fundamental to the competitiveness and survival of CSPs in a market where greenfield companies are enjoying record-breaking valuations for innovative use of big data analytics. Although the market is in the early stages of evolution, CSPs are at the forefront of big data governance. The answers to these questions will also be applicable to other industries.
CSPs have aggregated some of the largest collections of big data, which can be categorized into these types:
During the 1980s and 1990s, CSPs created a series of departmental applications based on the business cases associated with workforce automation. The result was a series of departmental databases containing customer, product, and related data. While the billing and sales views were often overlapping in these applications, it was not easy to map one to the other.
The past ten years have seen a rapid rise in master data management for customer and product data across the enterprise. Analytics applications were the first consumers of master data to create mappings across multiple hierarchies as well as fragmented customer and product identifiers. Master data management then graduated to transactional applications, with much of the focus on business solutions, specifically CRM and billing systems. CSPs can now use big data to build a comprehensive view of customer, network, and external data, as shown in Case Study 20.1.
Jim and Mary Smith have two children, Corey and Karen. The family has four phones, one for each family member. Corey and Karen are in high school and have basic phones for calls and text. Jim has an iPhone and uses it primarily for office calls and emails. Mary has an iPhone and a WiFi-only iPad. She uses her iPad for investment research and to participate in financial blogs.
Jim received a new iPhone from his employer as part of an upgrade program. He decided to give his older iPhone to Karen. Karen decided to sell her basic phone to a friend. Since they were in the last six months of their contract for Karen’s phone, the Smith family decided to keep it on their plan until the end of the period. Karen’s friend paid her for basic phone and messaging services.
The CSP providing phone service to the Smiths had done extensive householding analysis to develop a customer hierarchy of residences. It tagged phones to users and connected all the users to the family account. After the changes mentioned, the CSP’s analytics applications would likely display abnormal calling patterns for the users compared to historical norms. In addition, Jim’s old iPhone would show a number of web transactions that tracked to Jim’s user ID, but exhibited web browsing behaviors characteristic of teenagers.
This case study exhibits the dynamic nature of customer usage and social media data, as well as the complexities of discovering customer hierarchies and householding. The case study would be even more dynamic if Corey were to borrow Jim’s phone during a trip for a day or two. In growth markets, providers of prepaid services see massive churn in their customer base as consumers switch suppliers based on costs. The usage information can be used to identify a subscriber even as he or she switches telephone numbers.
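The abnormal calling patterns described in Case Study 20.1 can be flagged by comparing current usage against each user's historical norm. The following is a minimal sketch and not a description of any CSP's actual analytics: the function name, the three-standard-deviation threshold, and the daily session counts are all hypothetical.

```python
from statistics import mean, stdev


def is_usage_anomalous(history, current, threshold=3.0):
    """Return True when `current` lies more than `threshold` sample
    standard deviations away from the mean of `history`."""
    if len(history) < 2:
        return False  # too little history to establish a norm
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return current != mu  # perfectly flat history: any change is unusual
    return abs(current - mu) / sigma > threshold


# Hypothetical daily web sessions recorded against Jim's user ID.
jim_history = [12, 15, 11, 14, 13, 12, 16]

typical_day = is_usage_anomalous(jim_history, 14)    # close to the norm
after_handover = is_usage_anomalous(jim_history, 60)  # far above the norm
```

A flag of this kind only signals that the device-to-user mapping may be stale. Deciding whether the phone actually changed hands still requires householding analysis.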
Big data has opened up a number of new sources for customer master data. For example, network data such as product usage and trouble records can yield valuable insights when augmented with customer-specific information. Case Study 20.2 uses publicly available information [1] to describe the use of big data at T-Mobile.
T-Mobile has built a 1.2 petabyte data warehouse that processes 17 billion events per day, including phone calls and text messages. In the first phase of the project, T-Mobile used this information to optimize the performance of its network assets. However, once this information was consolidated in one place, it began to be accessed by many users from finance, sales, and marketing. In a subsequent phase, T-Mobile planned to use the data warehouse to personalize interactions with the customer.
This case study demonstrates how CSPs can radically evolve master data with big data analytics. Network data provides the best view of customer usage and trouble information. If this data is harnessed and offered as a strategic asset to others in the organization, it can provide a far more comprehensive understanding of the customer.
How should CSPs safeguard this data? When CSPs integrate location and calling patterns for an individual with his or her personally identifiable information (PII), the data is subject to abuse. Often, the best approach is to store the customer master data without PII.
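One way to store master data without PII is to drop the identifying fields and replace the subscriber identifier with a keyed token. The sketch below uses HMAC-SHA256 from the Python standard library. The record layout, the field names, and the idea of a secret key held outside the analytics environment are assumptions made for illustration.

```python
import hashlib
import hmac

# Hypothetical field names; a real master data model will differ.
PII_FIELDS = ("name", "phone_number", "address")


def pseudonymize(subscriber_id: str, secret_key: bytes) -> str:
    """Derive a stable token from the subscriber ID using a keyed hash."""
    digest = hmac.new(secret_key, subscriber_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def strip_pii(record: dict, secret_key: bytes) -> dict:
    """Return a copy of `record` without PII fields, keyed by a token."""
    cleaned = {
        field: value
        for field, value in record.items()
        if field not in PII_FIELDS and field != "subscriber_id"
    }
    cleaned["subscriber_token"] = pseudonymize(record["subscriber_id"], secret_key)
    return cleaned
```

The token lets usage records for the same subscriber be joined without exposing who the subscriber is. This is pseudonymization rather than full anonymization, because the remaining usage and location attributes may still single out an individual.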
Most subscribers exhibit constant patterns of use and location. At Northeastern University in Boston, network physicists discovered just how predictable people could be by studying the travel routines of 100,000 European mobile-phone users. After analyzing more than 16 million records of call dates, times, and locations, the researchers determined that, taken together, people’s movements appeared to follow a mathematical pattern. The researchers said that, with enough information about past movements, they could forecast someone’s future whereabouts with 93.6 percent accuracy [2]. Unless the CSP is targeting a specific subscriber, knowledge of subscriber behavior at such a high level of accuracy is sufficient for most marketing campaigns.
Even if the CSP drops the PII, the usage and location data provide a wealth of information for marketers. However, how do CSPs know that the data has truly been made anonymous? Calling patterns are sometimes so specific to individuals that, even without PII, marketers can spot the individuals by matching their behaviors. Big data governance needs to monitor how customer master data is assembled and safeguard its use based on privacy policies, which we discuss in the next section.
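A simple way to test whether de-identified data can still single out individuals is to count how many records share each combination of behavioral attributes. The sketch below applies a k-anonymity style check. The attribute names (home_cell, work_cell), the sample records, and the choice of k are hypothetical.

```python
from collections import Counter


def records_at_risk(records, quasi_identifiers, k=5):
    """Return records whose quasi-identifier combination is shared by
    fewer than `k` records, and which could therefore be re-identified."""
    def key(record):
        return tuple(record[q] for q in quasi_identifiers)

    counts = Counter(key(r) for r in records)
    return [r for r in records if counts[key(r)] < k]


# Hypothetical de-identified records: most frequent home and work cells.
sample_records = [
    {"token": "t1", "home_cell": "C1", "work_cell": "C9"},
    {"token": "t2", "home_cell": "C1", "work_cell": "C9"},
    {"token": "t3", "home_cell": "C4", "work_cell": "C2"},
]

# With k=2, only t3 has a pattern that no other record shares.
at_risk = records_at_risk(sample_records, ["home_cell", "work_cell"], k=2)
```

Records returned by such a check are candidates for coarser attributes or suppression before the data is shared.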
Chapter 14, on machine-to-machine data, provides a detailed account of the privacy implications of location data, as well as emerging regulations in the United States and Europe. Use of location data is a double-edged sword. On the one hand, a number of potential applications would be well-received by consumers. For example, the use of location data to locate a lost child in an amusement park is a very positive customer experience. On the other hand, unauthorized use of location data by third parties is a potential bombshell that could lead to a serious customer backlash.
CSPs are actively seeking to monetize their location data, that is, to generate revenue from it by selling it to third parties or by using it to develop new services. However, any viable big data program that uses location data must be sensitive to growing legal and customer concerns, and the associated risks. Because they must protect their core communications businesses from customer and regulatory backlash, many CSPs are erring on the side of conservative policies for the use and sharing of location data. For example, a large CSP has decided that any location data can be monetized, as long as it is at least one day old.
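A policy such as the one-day-old rule can be expressed as a filter applied before any location data leaves the governed environment. This is a minimal sketch: the event layout and the observed_at field name are assumptions, and only the one-day minimum age comes from the example above.

```python
from datetime import datetime, timedelta, timezone

MIN_AGE = timedelta(days=1)  # policy from the example: at least one day old


def eligible_for_monetization(events, now=None, min_age=MIN_AGE):
    """Return only the location events that are at least `min_age` old."""
    if now is None:
        now = datetime.now(timezone.utc)
    return [e for e in events if now - e["observed_at"] >= min_age]


# Hypothetical events evaluated at a fixed reference time.
reference_time = datetime(2012, 6, 2, 12, 0, tzinfo=timezone.utc)
location_events = [
    {"cell": "C1", "observed_at": datetime(2012, 6, 1, 9, 0, tzinfo=timezone.utc)},
    {"cell": "C2", "observed_at": datetime(2012, 6, 2, 8, 0, tzinfo=timezone.utc)},
]

# Only the C1 event is more than one day old at the reference time.
shareable = eligible_for_monetization(location_events, now=reference_time)
```

Keeping the minimum age as a parameter lets the same filter follow the policy as regulations and market customs change.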
Aggregate location data is relatively easy to use. Acceptable location-based services include the use of driver cell phone locations to ascertain traffic congestion, and the use of aggregate attributes of the stadium audience to display electronic advertisements. However, any targeting of advertisements at the individual level gets to the heart of the privacy controversy. Leading CSPs have established a series of use cases that adhere to strict privacy guidelines. Both policies and use cases are in continuous flux in the context of constantly evolving regulations and market customs.
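The traffic congestion use case relies only on counts, not on individual movements. The sketch below counts distinct devices per road segment and suppresses segments with too few devices to be safely reported. The ping layout, the segment names, and the minimum group size are hypothetical.

```python
from collections import defaultdict


def devices_per_segment(pings, min_devices=10):
    """Count distinct devices per road segment, omitting segments with
    fewer than `min_devices` devices."""
    devices = defaultdict(set)
    for ping in pings:
        devices[ping["segment"]].add(ping["device_token"])
    return {
        segment: len(tokens)
        for segment, tokens in devices.items()
        if len(tokens) >= min_devices
    }


# Hypothetical pings; the repeated ping from device "a" is counted once.
sample_pings = [
    {"segment": "I-93 N", "device_token": "a"},
    {"segment": "I-93 N", "device_token": "a"},
    {"segment": "I-93 N", "device_token": "b"},
    {"segment": "Elm St", "device_token": "c"},
]

congestion = devices_per_segment(sample_pings, min_devices=2)
```

Only the aggregate count per segment leaves the function, so the output carries no device-level trail.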
Big data brings new challenges to data quality management. If properly governed and managed, internal data quality can be measured and controlled. Although CSPs have limited control over external data, they must assess its value and quality. The merging of internal and external data should be done carefully, based on an understanding of the quality of the external data and an appreciation for how the merged data will be used. Consider Case Study 20.3, regarding the use of Twitter data.
A CSP launched a new product nationally and gathered data relating to product sales, trouble tickets, network usage, and Twitter. A number of Tweets showed consistently negative sentiments from Twitter users about the product. The marketing team was concerned that the data was an anomaly because product sales were brisk, and there was no significant increase in the number of trouble tickets.
Why did the Twitter data diverge so sharply from the other sources? A closer analysis revealed that older customers were relatively happy with the product and used surveys and trouble tickets to provide feedback. On the other hand, the product was not doing well with younger customers. These customers did not rely on traditional means of feedback and had been using Twitter to discuss the product in a negative way.
Because social media information is mostly self-reported, it is somewhat more prone to biased sampling. The CSP adopted a process to deal with big data quality during data aggregation. The social listening team began to report its overall confidence level in the Twitter data, especially since Twitter users did not represent the entire customer population.
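One way to quantify the sampling bias in Case Study 20.3 is to reweight sentiment by each segment's share of the customer base and to report how much of the base the sample covers. This post-stratification sketch is not the CSP's actual process. It assumes that each Tweet can be assigned to an age segment and that sentiment is scored between -1 and 1, and all figures are hypothetical.

```python
def poststratified_sentiment(sample, population_share):
    """Compare raw mean sentiment with a mean reweighted by each
    segment's share of the customer base.

    sample: list of (segment, score) pairs.
    population_share: mapping of segment to share of the customer base.
    """
    by_segment = {}
    for segment, score in sample:
        by_segment.setdefault(segment, []).append(score)

    covered = {s: population_share[s] for s in by_segment if s in population_share}
    coverage = sum(covered.values())
    if coverage == 0:
        raise ValueError("sample covers no known customer segment")

    weighted = sum(
        share * (sum(by_segment[s]) / len(by_segment[s]))
        for s, share in covered.items()
    ) / coverage
    raw = sum(score for _, score in sample) / len(sample)
    return {"raw": raw, "weighted": weighted, "coverage": coverage}


# Hypothetical data: younger customers dominate the Tweets but not the base.
tweets = [("under_30", -0.5)] * 8 + [("30_plus", 0.5)] * 2
customer_base = {"under_30": 0.3, "30_plus": 0.7}

report = poststratified_sentiment(tweets, customer_base)
```

In this hypothetical data the raw mean is negative while the reweighted mean is positive, which mirrors the gap between Twitter sentiment and brisk sales in the case study.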
Big data can mean big storage, assuming all the data needs to be stored. Contrary to traditional data warehousing and analytics, CSPs can perform big data analytics at the time of data collection. As a result, CSPs might need to maintain a relatively small subset of the data, such as samples, filters, and aggregations, in tier-one storage. Big data also provides its own tier-two storage environment. Large quantities of unstructured data can be placed in Hadoop, where MapReduce jobs can process it later for meaningful insight. A number of query tools are now available for large-scale queries on this data. These topics are discussed in the remaining chapters of this book.
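Keeping only samples and aggregations at collection time can be done in a single pass over the event stream. The sketch below maintains running counts per event type together with a fixed-size uniform sample, using reservoir sampling (Algorithm R). The event layout and the sample size are hypothetical.

```python
import random
from collections import Counter


def summarize_stream(events, sample_size, rng=None):
    """Consume `events` once, returning counts per event type and a
    uniform random sample of at most `sample_size` events."""
    if rng is None:
        rng = random.Random()
    counts = Counter()
    sample = []
    for i, event in enumerate(events):
        counts[event["type"]] += 1
        if i < sample_size:
            sample.append(event)  # fill the reservoir first
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < sample_size:
                sample[j] = event  # replace with probability sample_size / (i + 1)
    return counts, sample


# Hypothetical stream of 1,000 events alternating between calls and texts.
stream = ({"id": n, "type": "call" if n % 2 == 0 else "text"} for n in range(1000))
type_counts, kept = summarize_stream(stream, sample_size=10, rng=random.Random(7))
```

Memory use depends on the sample size and the number of event types, not on the length of the stream, so only the summary needs to reside in tier-one storage.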
The beginning of this chapter listed three issues for which we have provided partial solutions, as summarized below:
This is a new field, and CSPs are breaking new ground in terms of big data governance. CSPs are sure to find new solutions to data quality, master data management, data privacy, and information lifecycle management as they deal with big data governance.
1. Leslie, Alex. “T-Mobile crunching 17 billion transactions a day—what does it do with all that data?” The “connected planet unfiltered” blog, June 29, 2011. http://blog.connectedplanetonline.com/unfiltered/2011/06/29/t-mobile-crunching-17-billion-transactions-a-day-%E2%80%93-what-does-it-do-with-all-that-data/.
2. Hotz, Robert Lee. “The Really Smart Phone.” The Wall Street Journal, April 22, 2011. http://online.wsj.com/article/SB10001424052748704547604576263261679848814.html.