Chapter 10
Best Practice #9
Support Analytics with Enterprise Data Governance
“For data, only two moments really matter: the moment of data creation and the moment of data use. And they both don’t happen in IT.”
Tom Redman
Today, businesses create large and varied amounts of data. If the data ingested into IT systems is not governed, data quality in the enterprise will be poor, ultimately resulting in poor business results. Data governance is a system of decision rights and accountabilities for information-related processes, executed according to agreed-upon models which describe who can take what actions with what information, and when, under what circumstances, using what methods [DGI, 2020]. At a tactical level, data governance is the collection of policies, processes, roles, standards, and KPIs that ensure high-quality data is used across the business organization in a compliant way.
In simple words, data governance is the right people managing the data in the right manner in accordance with the nine key data governance principles or areas listed below.
Why is this a best practice?
Why is governing data a best analytics practice? There are three main reasons for companies to govern data. Firstly, data is created in business today at a rapid pace through different mechanisms and channels. This results in a huge and complex body of data and causes data inconsistencies that need to be identified and addressed.
Secondly, data literacy and self-service analytics (SSA) are creating the need for a common definition and understanding of data across the organization. The democratization of data and analytics is creating an increasing need for a common and standard data model to enable better communication in the enterprise.
Thirdly, businesses need to adhere to compliance requirements like regulatory mandates (such as SOX, GDPR, and HIPAA), industry standards (such as ISO/IEC 38500 and UNSPSC), and even internal business policies. Without effective data governance, the data inconsistencies in different IT systems in the organization might not get resolved. For example, customer names may be listed differently in CRM, ERP, and customer service IT systems. This could complicate data integration efforts and create data integrity issues that affect the quality of the insights derived.
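The customer-name example above can be made concrete with a small check. The following is a minimal sketch, not a prescribed method: the system names, customer IDs, and sample names are hypothetical, and it assumes each system already shares a common customer ID, which in practice is often the harder problem.

```python
from collections import defaultdict

# Hypothetical sample data: (system, customer_id, name as stored in that system).
RECORDS = [
    ("CRM", "C001", "Acme Corp."),
    ("ERP", "C001", "ACME Corporation"),
    ("Service", "C001", "Acme Corp"),
    ("CRM", "C002", "Globex Inc"),
    ("ERP", "C002", "globex  inc"),
]


def normalize(name):
    """Collapse whitespace and ignore case so trivial differences are not flagged."""
    return " ".join(name.split()).casefold()


def find_name_conflicts(records):
    """Return {customer_id: {system: name}} for IDs whose names differ across systems."""
    by_id = defaultdict(dict)
    for system, customer_id, name in records:
        by_id[customer_id][system] = name
    return {
        cid: names
        for cid, names in by_id.items()
        if len({normalize(n) for n in names.values()}) > 1
    }


conflicts = find_name_conflicts(RECORDS)
for cid, names in conflicts.items():
    print(cid, names)
```

Here only the first customer is reported, because its three spellings differ even after normalization, while the second customer differs only in case and spacing. A data steward would then decide which spelling becomes the standard.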
Data governance provides processes, roles, policies, standards, and KPIs that ensure high-quality data in the business. An enterprise data governance program typically results in the development of common data definitions and internal data standards that are applied in all business systems, boosting data consistency for both business and compliance uses. Also, data governance promotes strong compliance with security and privacy, which is achieved by locating critical data, identifying data owners and data users, and assessing and remediating risk to critical data assets.
Realizing the best practice
Effective data governance takes an efficient combination of people, process, and technology to manage how data is generated, stored, used, and maintained across the data lifecycle (DLC). There are three key capabilities in implementing data governance in the enterprise.
Identify the data assets to be governed
As discussed in best practice #3, there are many types of data assets in a business enterprise: reference data on business categories, master data on business entities, and transactional data on business events. The first step in data governance is to identify the specific categories of data objects to govern. While the specific data type to govern depends on the industry sector and the business need, data governance works effectively when the data assets are managed early in the data lifecycle. Specifically, data governance should identify the reference data (on business categories) and master data (on business entities), as these data elements are shared and used enterprise-wide in business transactions like purchase orders, sales orders, and invoices.
Data governance practices should first ensure high quality of reference data and master data in the enterprise. Data quality is typically highest when the data is first created and captured. Once the data is shared across different lines of business (LoB) and IT systems, managing its quality becomes challenging. A simplified example of a data flow diagram in a business enterprise is shown in the figure below. The need for data governance is therefore highest in the transactional or data capture systems, that is, in the initial stages of the DLC. Governing data at this stage results in high data quality.
Identify the data owners, stewards, and custodians of these data assets
Data governance must be led by the data owners and supported by data stewards from the business and data custodians from IT. All three, data owners, data stewards, and data custodians, should jointly take responsibility for the quality of the data in the enterprise. But what exactly is the role of the data owner, data steward, and data custodian?
Data custodians work with data stewards to gain a better understanding of the business and data requirements. In short, data stewards are responsible for the content in the database, while data custodians are responsible for the technical environment of the database. Both the data stewards and the data custodians, who are responsible for data quality, work under the strategic direction of the data owner, who is accountable for the quality of the data object. Below is an example of data governance on store master data in a retail company.
Set up the process and KPIs to govern these data assets
The third capability in effective data governance is setting up the processes to govern data assets. Data quality typically degrades over time because businesses are constantly evolving and changing entities. Hence data should be monitored for quality throughout its lifecycle with appropriate KPIs using baselines, thresholds, and targets. Ensuring conformance to those values and effectively communicating the KPIs to stakeholders will help the business take corrective measures. The data stewards and data custodians should together plan and execute the data governance process that covers the following important aspects:
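To illustrate the KPI monitoring described in this section, here is a minimal sketch of a data quality KPI with a baseline, a threshold, and a target. The class, the completeness measure, the store records, and the numeric values are all hypothetical, and the sketch assumes a KPI where higher values are better.

```python
from dataclasses import dataclass


@dataclass
class QualityKpi:
    """A data quality KPI where a higher measured value is better (assumption)."""
    name: str
    baseline: float   # value measured when monitoring began
    threshold: float  # minimum acceptable value
    target: float     # value the business aims for

    def status(self, measured):
        if measured < self.threshold:
            return "breach"
        if measured < self.target:
            return "below target"
        return "on target"


def completeness(records, field):
    """Share of records that have a non-empty value for the given field."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if r.get(field) not in (None, ""))
    return filled / len(records)


# Hypothetical store master data; one record is missing its postal code.
STORES = [
    {"store_id": "S1", "postal_code": "T2P 1J9"},
    {"store_id": "S2", "postal_code": ""},
    {"store_id": "S3", "postal_code": "M5V 2T6"},
    {"store_id": "S4", "postal_code": "V6B 4Y8"},
]

kpi = QualityKpi("postal code completeness", baseline=0.80, threshold=0.90, target=0.98)
measured = completeness(STORES, "postal_code")
print(f"{kpi.name}: {measured:.2f} ({kpi.status(measured)})")
```

Reporting the status alongside the measured value gives stakeholders a simple signal for when corrective action is needed.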
Conclusion
A good data governance program with consistent processes and responsibilities ensures high data quality. Today, data governance is not optional. It helps mitigate risk, because businesses hold enormous amounts of data about customers, suppliers, prices, products, employees, and more, and that data must comply with laws, regulations, industry standards, internal business processes, and ethics. Hence data governance helps businesses properly and proactively manage data and reduce their financial and compliance liability. The outcome is good-quality data, better analytics models, better insights, better business decisions, and ultimately superior business results.
References