2.5 quintillion bytes! This is the amount of data being generated every single day across the globe. As this number continues to grow, understanding and managing data becomes more complex. Data professionals know that it’s their responsibility to navigate this complexity and ensure effective governance, empowering businesses with the right data, at the right time, and with the right controls.
If you are a data professional, this book will equip you with valuable guidance to conquer data governance complexities with ease. Written by a three-time chief data officer in global Fortune 500 companies, the Data Governance Handbook is an exhaustive guide to understanding data governance, its key components, and how to successfully position solutions in a way that translates into tangible business outcomes.
By the end, you’ll be able to successfully pitch and gain support for your data governance program, demonstrating tangible outcomes that resonate with key stakeholders.
Page count: 661
Year of publication: 2024
Data Governance Handbook
A practical approach to building trust in data
Wendy S. Batchelder
Copyright © 2024 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Group Product Manager: Apeksha Shetty
Publishing Product Manager: Apeksha Shetty
Book Project Manager: Aparna Nair
Senior Editor: Sushma Reddy
Technical Editor: Seemanjay Ameriya
Copy Editor: Safis Editing
Proofreader: Sushma Reddy
Indexer: Rekha Nair
Production Designer: Prashant Ghare
DevRel Marketing Coordinator: Nivedita Singh
First published: May 2024
Production reference: 2130625
Published by Packt Publishing Ltd.
Grosvenor House
11 St Paul’s Square
Birmingham
B3 1RB, UK
ISBN: 978-1-80324-072-5
www.packtpub.com
To my husband, for supporting my weekly writing sessions, encouraging me to keep going and pour my heart and experience into this book, being my life partner, and sharing this beautiful, messy, and incredible life with me.
To my children, who inspire me to stay curious, ask questions, and help others see the beauty in simplicity, love, and inclusion, every day.
To my father, who told me I belonged in tech, even when I was asked if I was in the wrong room on the first day of my first IT class, and who encouraged me to keep showing up and taking up space, no matter what others thought. Your encouragement inspires me every day, which in turn, impacts others, long after your passing.
To my teammates, mentors, mentees, and sponsors – thank you. You have taught me more than I can ever put into words. Thank you for inspiring me.
– Wendy S. Batchelder
Much more than an innovative reference work for data governance professionals alone, this text is a beacon for anyone leading data-driven initiatives. Wendy is an exceptional guide through the complexities of data governance while maintaining a rigorous perspective on the business impact of data. A must-read for those aspiring to transform an enterprise through the power of information.
Dave Mayer, Vice President | Program Director, Gartner Data and Analytics Research Board
As a sales leader, there is nothing more critical than understanding and interpreting data. One of the biggest challenges is assessing integrity, and data governance has a direct impact on integrity. What I love about Wendy’s work is it’s not about data in a vacuum; she provides a level of business acumen and outcome orientation that far exceeds many practitioners in this field.
Melissa Steffen, General Manager Sales and Customer Success, Thomson Reuters
Wendy’s most recent Data Governance Handbook, “A practical guide to building Trust in Data”, for me, proves to be an industry-agnostic roadmap for success; with two of its primary on-ramps being 1) Trust and 2) Data! Wendy offers a pragmatic and reusable framework that can be adapted and adopted by all enterprises and organizations, unlocking meaning for every reader and organizational role from the “Boardroom to the Back Office”. Wendy’s transparent approach in unveiling the journey to building Trust in Data is refreshing, and her methodical principle-based approach, using practical industry-proven techniques, hints, and use cases, undoubtedly equips and empowers data and non-data practitioners, business leaders, and executive sponsors with a roadmap to success!
Stephen Harris, Former Corporate Vice President, Cloud + AI, Microsoft
Wendy Batchelder’s guide to data governance is an essential resource for anyone looking to harness the vast potential of data in a structured and effective manner. With her extensive experience as a Chief Data Officer and her clear, insightful approach, Wendy not only demystifies complex data governance concepts but also connects them directly to tangible business outcomes. This book is a must-have for leaders who are serious about making informed decisions that drive company success. Her practical frameworks and real-world examples equip professionals to launch and sustain impactful data governance initiatives with confidence. A compelling read, highly recommended for those committed to transforming their organizations through data.
Sadie St. Lawrence, Founder and CEO, Women in Data
The relevance of this book expands well beyond just data governance professionals and applies to GTM operations professionals as well, especially as many look to leverage AI capabilities to make every customer engagement more intentional. Wendy simplifies the complexity of data governance, making it easy for multiple functions within a global organization to apply her best practices accordingly. I recommend this book to any GTM or revenue operations leaders.
Ryan Mac Ban, President of Americas, UiPath
Wendy S. Batchelder is a three-time chief data officer with a wide understanding of how to take highly technical aspects of data and analytics and translate them into simple, concise business-valued solutions that are practical and easy to understand. Her background has led her to lead global data and analytics organizations at four Fortune 500 companies, including Wells Fargo, VMware, Salesforce, and now, Centene. She approaches situations with curiosity and humility, which has led to applying innovative data solutions to challenges with increased complexity to deliver value that companies can measure.
A lifelong learner, Wendy graduated from Miami University with a BS in accounting with a minor in information systems, from Drake University with a master’s in accountancy, and from the University of Iowa with an executive MBA, and she has completed ongoing professional education at Harvard Business School.
Wendy resides in West Des Moines, Iowa, with her husband and six children.
Ankur Roy is a solutions architect at Online Partner AB in Stockholm, Sweden. Prior to this, he worked as a software engineer at Genese Solution in Kathmandu, Nepal. His areas of expertise include cloud-based solutions and workloads in a diverse range of fields such as development, DevOps, and security. Ankur is an avid blogger, podcaster, content creator, and contributing member of the Python, DevOps, and cloud computing community. He has completed all the available certifications in Google Cloud and several others in AWS and Azure as well. Moreover, he is an AWS Community Builder.
The world generates ~2.5 quintillion bytes of data every single day (and growing!). As a result, understanding and managing the data created becomes more complex every single day. It’s our job to drive simplicity, understanding, and ease of use to make accessing and using data as easy and understandable as possible.
As a data professional, our role is to ensure we can govern data and empower our businesses with the right data, at the right time, with the right controls. This book is a comprehensive guide on how to better understand what data governance is, its key components, and how to successfully position solutions in a way that translates into real, understandable business results. After reading this book, you will be able to successfully pitch and gain support for a data governance program, with measured outcomes in terms the business will understand and deeply value.
We will move from establishing a Chief Data and Analytics Office and building a business case to successfully implementing the more technical capabilities that any CDAO will need to deliver to drive successful data management. You will notice in this book that I emphasize the “why” behind these capabilities. In my experience, simply explaining what the capabilities are without being able to see how they can impact a business is a recipe for failure.
There are many more technical books available on each of these topics. Where I hope this book differs from what is already available is in its focus on business value. I will spend time explaining the technical capabilities in terms your business stakeholders can understand, with the aim of creating a business-led program.
Ultimately, if you want to get into details about a specific tool or technical implementation, there are loads of other great resources (including books) you can turn to for your implementation, including a wonderful inventory from the publisher of this book, Packt.
This book is for chief data officers, data governance leaders, data stewards, and engineers who want to understand the business value of their work. It is also for IT professionals seeking further understanding of data management. Any business leader who wants to better understand data governance would also benefit from learning the basics, as well as any executive finding themselves managing a chief data and analytics officer who wants to better understand the discipline at a higher level. You should have a basic understanding of working with data and understand the basic needs of a business and how to meet those needs with data solutions. You do not need to have the knowledge or skills needed to sell solutions to executives, nor coding experience.
Chapter 1, What Is Data Governance?, introduces you to data governance. At face value, data governance may seem like a cost center, if not approached with value generation in mind. Many companies start a data governance program without the right support, structure, or funding model. First, you will learn the basics of what data governance is and how it relates to adjacent capabilities. Then, you will learn the components of data governance programs, why each component matters, and finally, why to treat data governance as an enabler for business value.
Chapter 2, How to Build a Coalition of Advocates, explores gaining support for your program, which is arguably the most important part of launching a data governance program that drives impact. First, you will learn why and how to identify and secure the right executive sponsor for the data program, and then how to bring in additional leadership support. Lastly, you will learn how to engage and energize the entire company to collaborate toward value-based outcomes that matter to them.
Chapter 3, Building a High-Performing Team, focuses on establishing a high-performing data governance team, which is a critical and long-term investment in the success of a company’s use of data. First, you will be introduced to the key roles in a successful data governance function, then how they should be optimally structured for results, and finally, how to establish routines and rhythms to support the operations of the team.
Chapter 4, Baseline Your Organization, teaches you the importance of defining a baseline, not only for the organization as a whole but also for individual projects. A key component of measuring success is measuring where you start. You will learn how to capture a baseline and who to communicate it to. Finally, we will discuss how to ensure agreement on the baseline before beginning work.
Chapter 5, Define Success and Align on Outcomes, focuses on the area where many data transformations fall flat – aligning on outcomes that matter to a business. Most data transformations stop with data outcomes and fail to reach the final mile – where the business uses the delivered data capabilities to drive operational efficiency, increased revenues, and better insights. In this chapter, you will learn why defining success beyond data and with the business matters, how to successfully map all relevant stakeholders (including secondary and tertiary stakeholders), and how to translate results into business terms.
Chapter 6, Metadata Management, delves into establishing a high-value, high-return metadata management capability, which is required for any data governance program. The success or failure of a chief data and analytics officer hinges on being able to answer a few fundamental, core questions. Where is my data? Who owns it? How is it classified? Is it safe and secure? Can I leverage it for value? Do I know how to reduce risk? You will learn the answers to these questions and be guided through how to tactically set up a metadata management capability for success.
Chapter 7, Technical Metadata and Data Lineage, explores establishing a high-value, high-return data lineage capability, which is a core capability for any data governance program. Following on from Chapter 6, this chapter focuses on the data supply chain. You will learn the answers to the questions in Chapter 6, with a focus on data lineage, and will be guided through how to set up data lineage for success.
Chapter 8, Data Quality, examines understanding the quality of data and being able to have a defendable stance when it comes to “Can I trust it?”, which is key for any user of data or information that is used to make decisions. Establishing a data quality capability enables the CDO/CDAO and their teams to stand behind their data, being able to defend the quality of the information. This solution can also, when coupled with metadata management and data lineage, lead to a data certification process. You will learn the answers to the questions and be guided through how to tactically set up a data quality capability.
Chapter 9, Data Architecture, delves into data architecture. Designing the patterns and optimal flow of information throughout an organization is sometimes more art than science. With data architecture, you will learn just that. First, you will be grounded in what good data architecture is, when and how it should be applied to an organization, why perfection is not the goal, and when not to involve data architects in a program.
Chapter 10, Primary Data Management, focuses on primary data. One of the core capabilities of any organization is the ability to standardize and conform its most critical information – customer, product, and reference data. By nature, rationalized data provides a solution, whereas data used by multiple divisions for many uses is standardized and cleansed for the benefit of the organization as a whole. First, you will understand what primary data is and is not, clarifying misconceptions. Then, you will be guided through the various types of primary data, how to prioritize, and how to implement a strong and centralized primary data solution that will impact and elevate the power of data into a strategic asset. All the capabilities introduced so far will be woven into this powerful capability to tie them together.
Chapter 11, Data Operations, explores how to run the operations of a data organization, including support for the running of primary data management, data warehouses, data lakes, and other authorized provisioning points managed by the data organization. First, you will learn what data operations are, how to scale effectively, and when to pull in engineering. Lastly, you will learn how to optimize DataOps as a core capability and what opportunities there are to automate.
Chapter 12, Launch Powerfully, examines how to launch a good data governance program, which can quickly lose impact if not launched properly. The importance of the launch cannot be underscored enough. You will learn how to create simple and strong core messaging to engage and clearly articulate to the stakeholder community what and how delivery will be accomplished. Then, you will be led through the creation of a launch plan, a design of feedback loops to ensure continuous improvement, and finally, how to report on an ongoing basis for impact.
Chapter 13, Delivering Quick Wins with Impact, delves into the post-launch period, when the data governance team must quickly begin to deliver results. The time to first value metric should be as short as possible while producing impacts that matter to the business. You will learn how to create momentum through the delivery of quick wins, how to communicate the wins to the business, and how to ensure that the business not only understands the results but also becomes an advocate for the future success of the data governance program.
Chapter 14, Data Automation for Impact and More Powerful Results, focuses on automation, which is a lever that can be pulled to expedite data product deliveries. First, you will be introduced to what automation techniques can be applied to data governance. Second, you will learn how to select the right automation solutions for your transformation. Finally, you will learn how to power your transformation with automation across all solutions.
Chapter 15, Adoption That Drives Business Results, explores business adoption. Now that you have learned what data governance is, how to gather support, design a program, baseline the organization, launch, and deliver against the plan, you need to be able to ensure that your solutions are used by the business. Building an adoption roadmap, you will be able to articulate to your stakeholders how to use the solutions, ensuring a lasting impact of the solutions provided to the organization. Lastly, you will double the impact by ensuring ongoing supports are in place.
Chapter 16, Delivering Trusted Results with Outcomes That Matter, teaches you how to ensure consistency in what was promised to stakeholders versus what was actually delivered. As the implementation of data governance occurs, the chief data/analytics officer and their leadership team must keep all messaging focused on results. You will be guided through how to explain variances in expected delivery versus real results, and how that builds trust. Finally, you will learn how to message back to stakeholders powerfully, during delivery for impact.
Chapter 17, Case Study – Financial Institution, walks you through how to apply the topics covered in this book to an organization with a high degree of regulation (i.e., a financial institution). First, you will find use cases that are unique to this type of entity. Second, you will learn how to pull out the unique requirements and how to adjust messaging, sequencing, and results to accommodate the special needs of a highly regulated organization.
You will learn how to design trust in data governance, starting with a fundamental understanding of what data governance is and the path to align an organization around the need for data governance. You will learn about the subcomponents of data governance, how to implement them, and how to drive adoption within the organization that creates value and ease of use for the business. To get the most out of this book, you should embrace a beginner’s mind, allowing you to relearn your approach to data concepts with fresh perspectives.
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, email us at [email protected] and mention the book title in the subject of your message.
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata and fill in the form.
Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Once you’ve read Data Governance Handbook, we’d love to hear your thoughts! Please click here to go straight to the Amazon review page for this book and share your feedback.
Your review is important to us and the tech community and will help us make sure we’re delivering excellent quality content.
Unlock exclusive free benefits that come with your purchase, thoughtfully crafted to supercharge your learning journey and help you learn without limits.
https://www.packtpub.com/unlock/9781803240725
Note: Have your purchase invoice ready before you begin.
Figure 1.1: Next-Gen Reader, AI Assistant (Beta), and Free PDF access
Enhanced reading experience with our Next-gen Reader:
Multi-device progress sync: Learn from any device with seamless progress sync.
Highlighting and Notetaking: Turn your reading into lasting knowledge.
Bookmarking: Revisit your most important learnings anytime.
Dark mode: Focus with minimal eye strain by switching to dark or sepia modes.
Learn smarter using our AI assistant (Beta):
Summarize it: Summarize key sections or an entire chapter.
AI code explainers: In Packt Reader, click the “Explain” button above each code block for AI-powered code explanations.
Note: AI Assistant is part of next-gen Packt Reader and is still in beta.
Learn anytime, anywhere:
Access your content offline with DRM-free PDF and ePub versions—compatible with your favorite e-readers.
Your copy of this book comes with the following exclusive benefits:
Next-gen Packt Reader
AI assistant (beta)
DRM-free PDF/ePub downloads
Use the following guide to unlock them if you haven’t already. The process takes just a few minutes and needs to be done only once.
Have your purchase invoice for this book ready, as you’ll need it in Step 3. If you received a physical invoice, scan it on your phone and have it ready as either a PDF, JPG, or PNG.
For more help on finding your invoice, visit https://www.packtpub.com/unlock-benefits/help.
Note
Bought this book directly from Packt? You don’t need an invoice. After completing Step 2, you can jump straight to your exclusive content.
Scan the following QR code or visit https://www.packtpub.com/unlock/9781803240725
Sign in to your Packt account or create a new one for free. Once you’re logged in, upload your invoice. It can be in PDF, PNG, or JPG format and must be no larger than 10 MB. Follow the rest of the instructions on the screen to complete the process.
If you get stuck and need help, visit https://www.packtpub.com/unlock-benefits/help for a detailed FAQ on how to find your invoices and more. The following QR code will take you to the help page directly:
Note
If you are still facing issues, reach out to [email protected].
In this part, you will get an overview of how to design a data governance program, starting with the basic definitions, how to design a successful and scalable team, how to gain support, and how to define what success for your team and your company will be when it comes to data governance.
This part contains the following chapters:
Chapter 1, What Is Data Governance?
Chapter 2, How to Build a Coalition of Advocates
Chapter 3, Building a High-Performing Team
Chapter 4, Baseline Your Organization
Chapter 5, Define Success and Align on Outcomes
As a data professional, some of the most frustrating conversations you will have about data governance will be about data programs feeling like a series of constraints versus a strategic enabler and that you are slowing business down vs. enabling excellence. Having led data transformations in three Fortune 500 companies, I have heard my fair share of these same messages. In my humble opinion, this is feedback; feedback that we are speaking in “data speak” and have not created a business case that is centered on value generation from the lens of our stakeholders. Rather, we have delivered a business case that is focused on data needs vs. business needs.
From a stakeholder’s perspective, there are a plethora of forces at stake in driving business: generating revenue through the sales teams, marketing to existing and potential customers, economic factors, and supply chain challenges. Data is a part of all of these critical business components, but it is not the first thing that comes to mind for our stakeholders. It is embedded in how business runs. It is a part of the day-to-day. It does not and should not feel like a standalone function.
Therefore, it’s our job to serve the business and to make it feel seamless to the business stakeholders we enable. When things feel like friction, it’s not necessarily because we’re not supported; it’s because we are one of many problems leaders are facing. Often, this comes in the form of a lack of buy-in or pushback, a seemingly endless number of questions, or simply a lack of engagement. For data professionals, conversations like this often end in frustration and the underfunding of the data governance program. I have seen this scenario over and over again in organizations firsthand and have heard it from data executives in every single industry. Far too often, it ultimately ends in the failure of a chief data & analytics officer to survive in the organization.
The question is, why?
Over the course of the next 17 chapters, I will explain why Chief Data and Analytics Officers fail to establish themselves as strategic business partners in their organizations and how you can overcome these common pitfalls and succeed. I will cover everything you need to know to build a case for data governance, rally your organization to support you, deploy a strong data governance program, leverage core data governance solutions, and apply all of this in a case study for a fictitious financial institution. Let’s dive in.
Throughout this book, I promise to be transparent and direct about my experiences, and we’re going to start strong: governance programs fail because we have failed. We have failed to explain data governance in a way that makes sense to our business stakeholders. We have failed to deeply and intimately understand how our solutions will drive business success. In short, we have failed to explain data governance in terms of business value. Conversely, the most successful data executives I have had the opportunity to work with have been successful because they deeply understand their company. They have spent the time to intimately understand the business, have crafted data solutions that enable business success, and have successfully explained the benefits in terms of business results vs. data results.
As we go deep into these topics, I will not make assumptions about your experience implementing a successful data governance program. I will start with the basics by grounding you in definitions and the foundational capabilities and will build on how to launch a successful and impactful program, complete with the measures for success that will resonate with executive management and, ultimately, the board of directors for your organization. In the end, we will complete a case study to bring it all together. By the end of this book, you will have all you need to launch a program and deliver with excellence in your own organization. No longer will your organization be overwhelmed by data and underwhelmed by insight. We will change the narrative together.
In this chapter, we will ground ourselves in the basics of data governance and how it relates to adjacent capabilities. Then, we will define the components of a data governance program, why each component matters, and why we treat data governance as an enabler for business value. Subsequent chapters will dive deeper into the fundamental capabilities of a data governance program and how to implement them.
We will cover the following main topics:
What is data governance?
What’s driving the increasing need for data governance?
A brief overview of the data governance components
Data governance as a strategic enabler
Building a business case for your company
When and why to launch a data governance program
As I meet with data professionals across industries, it is abundantly clear that data governance is more important than ever. Executives are expecting more from data, but without the proper investment, it is harder than ever to respond at the speed of business.
So why is it increasingly difficult to respond to our executives at the pace of the business? There are a number of key factors, including the continuous rise in the following:
Data volume: We have more data today than yesterday (every day!). In fact, the amount of data doubles every two years. Yet, we cannot expect to double our efforts or double our staffing or technology spend.
Regulation: The regulatory landscape is evolving, increasing expectations for how data is handled. In the United States, at the time of this writing, six states had signed privacy and data protection legislation into law. This increases the complexity of compliance for data handling.
Expectations: Executives’ expectations are rising, but our use of data is not. In a recent Tableau survey, more than 80% of CEOs wanted their organizations to be data-driven, but fewer than 35% of employees felt their data was used in decision-making.
User base: More individuals than ever are engaging with data, wanting it for their own use but needing to trust it. It puts our governance professionals in a position to add tremendous value by providing trusted, well-governed data to our organizations.
We have to become more innovative and more embedded, leveraging more technologies (e.g., automation and AI) than ever before. We talk about what that means for our customers. But what does it mean for us? If it’s difficult to answer key, basic business questions today, how do we expect to do it in two to three years with more data than ever? We must take this sense of urgency and build capabilities that will scale and last as our volume, complexity, expectations, and user base continue to grow at an unprecedented rate.
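To make the scaling pressure concrete, the doubling claim above can be turned into simple arithmetic. The following is a minimal sketch, assuming data volume doubles exactly every two years and grows smoothly in between. The function name `growth_factor` and the smooth-growth assumption are illustrative only and do not come from any survey or tool.

```python
def growth_factor(years: float, doubling_period_years: float = 2.0) -> float:
    """Return how many times larger the data volume is after `years`.

    Assumes smooth exponential growth that doubles every
    `doubling_period_years` (an illustrative simplification).
    """
    if doubling_period_years <= 0:
        raise ValueError("doubling_period_years must be positive")
    return 2.0 ** (years / doubling_period_years)


if __name__ == "__main__":
    # Print the projected multiple of today's volume for a few horizons.
    for horizon in (2, 3, 5):
        print(f"After {horizon} years: {growth_factor(horizon):.2f}x today's volume")
```

Under these assumptions, a three-year horizon means roughly 2.8 times today's volume, while staffing and technology spend are unlikely to grow at the same rate.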
Before we dive in, it’s important that we ground ourselves in basic definitions. During my first role in data management, we made the mistake of assuming that our stakeholders around the organization were aligned on what data we were referring to when we were discussing a particular domain of data. After several months of having difficult conversations on scope (whether a particular data element, report, or system was in scope), we realized that we needed to go back and ground all stakeholders in a few very simple questions.
Data governance is the formal orchestration of people, processes, and technology by which an organization brings together the right data at the right time with the right controls to enable the company to drive efficient and effective business results. This formal orchestration should control, protect, deliver, and further enhance the value of data and create equity for an organization. Data governance is active and is delivered through capabilities, including the following:
Metadata management
Data lineage
Data quality
Data architecture
Mastering data
Data operations
We will explore these core capabilities, among other methods, in detail in subsequent chapters. The capabilities that make up a successful data governance program are defined slightly differently in just about every organization. Therefore, it is important that we define them here for the purposes of this book. Feel free to use the vocabulary in this text within your organization or the common language of your business.
Important note
Take the time to build a quick reference guide that defines the most basic terms used around your data governance program (e.g., data, governance, metadata, and so on). Make it accessible to the whole organization, and add to it as needed.
I want to point out that industry veterans hold passionate views on the use of “data” versus “information” as terminology. Some practitioners are firm in their beliefs that these terms are not the same and should not be used interchangeably. Others use them synonymously without much thought. In my humble opinion, either can be appropriate for your organization. The important point is to distinguish between the two so that your organization understands the definitions and how to use them appropriately. Personally, I do not believe either position is correct or incorrect. It is far more important that you meet your stakeholders where they are and that your organization agrees on the alignment you choose to use. For the purpose of this book, I will use the term “data” primarily, and I will be sure to be specific about what that means.
In my very first data governance position, we launched a robust, multi-million dollar transformation to comply with a regulatory requirement around data management and regulatory reporting. About six months into the effort, we found we were really struggling to define what was “in” vs. “out” of the scope of the program. After several circular and passionate conversations, we learned that we were not able to scope well because, ultimately, our stakeholders had differing views about what constituted “data” vs. “metrics.” We ended up building a full-blown methodology to ground the company and our regulators on how we determined which reports were in scope. We built a full list of all reports and documented whether each one met the criteria or not, and we made this available for credible challenge by any person or group interested. Instead of debating it theoretically, we documented the criteria with specificity and then clearly articulated the justification.
What I learned in this experience was two-fold: you cannot make assumptions regarding what people know or don’t know when scoping a data program, and you must have grounding definitions that can be socialized, agreed to, and documented so that all involved can remain grounded.
I’ll ask us to do the same throughout this book. Please come back to these definitions as needed so we can be aligned.
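The scoping approach from the story above (explicit criteria, a full list of reports, and a documented justification for each one) can be sketched in a few lines of Python. This is only an illustrative sketch: the criteria names, the report names, and the rule that a report must meet every criterion are all assumptions made for the example, not the methodology actually used in that program.

```python
from dataclasses import dataclass

# Hypothetical scoping criteria; a real program would agree on these
# with its stakeholders (and regulators, where relevant).
CRITERIA = {
    "used_in_regulatory_reporting": "Feeds a regulatory report",
    "shared_outside_team": "Distributed beyond the producing team",
}


@dataclass
class Report:
    name: str
    attributes: dict  # criterion key -> True/False


def scope_decision(report, required=tuple(CRITERIA)):
    """Return (in_scope, justification) for one report.

    Assumption for this sketch: a report is in scope only if it
    meets every required criterion.
    """
    met = [key for key in required if report.attributes.get(key, False)]
    missed = [key for key in required if key not in met]
    if not missed:
        return True, "Meets all criteria: " + ", ".join(met)
    return False, "Does not meet: " + ", ".join(missed)


# Hypothetical report inventory.
reports = [
    Report("Quarterly liquidity report",
           {"used_in_regulatory_reporting": True, "shared_outside_team": True}),
    Report("Team stand-up tracker",
           {"used_in_regulatory_reporting": False, "shared_outside_team": False}),
]

# The register records the decision and the reason, so that anyone
# can challenge a specific entry instead of debating scope in theory.
register = {r.name: scope_decision(r) for r in reports}
for name, (in_scope, why) in register.items():
    print(f"{name}: {'IN' if in_scope else 'OUT'} ({why})")
```

The point of the design is that every decision carries its justification with it, which is what makes the list usable for credible challenge.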
Too often, companies have a tendency to blame problems on the data and/or the data team. Data governance (team or program) is not the solution to every problem. Data, like air, is everywhere in an organization, and it truly takes the entire organization to manage it well. Similar to the quality of air when a fire breaks out, poor data moves through an organization like smoke moves from a fire. The strong management of data requires prevention, detection, and correction, and to manage data well requires the entire company to be on board. A single data team cannot unilaterally solve every data problem. It will take the involvement and action of the organization at large to drive change and manage data effectively.
Secondly, data will never be perfect. If you or your executive team is expecting perfection from data governance, I would urge you to adjust your expectations. To ensure we align on what the appropriate expectations and objectives of a successful data governance program are, we must define success. To do that, we must start with the objective of data governance.
To put it simply, companies exist to increase value for stakeholders. When it comes to data, there is one very important objective: to increase equity for stakeholders. Managing data effectively is one of the ways companies can increase value for their organization.
Figure 1.1 – A simple value equation
An asset is something of economic value that is owned by an organization. A liability is an obligation (either current or future) that decreases the overall value of the organization. Thus, when assets minus liabilities results in a positive value, the organization has an increase in value (i.e., has created equity), whereas when assets minus liabilities results in a negative value, the organization has a decrease in value (i.e., has reduced equity).
The same mindset can be applied to data. Data can impact equity in a number of ways. Equity can be created by addressing and minimizing operational risks, sustaining regulatory compliance, avoiding fines and penalties, and increasing or creating revenue. I break this concept down into two key subcomponents to manage data governance more specifically. These two subcomponents (assets and liabilities) are directly influenced by my formal training as an accountant and IT auditor, and this framing tends to resonate well with management when data solutions are translated into measurable value (ideally monetary value, though the time value of employees may also be considered).
Important note
Data is an asset when it creates value for the organization.
A few examples include:
Curated datasets that are used for multiple purposes
Customer health scoring
An authorized provisioning point
A data model used for predictive modeling
Important note
Data is a liability when it creates risk for the organization. Data can be both of these things at once; it is not strictly one or the other (for example, a data solution may create value and reduce risk).
A few examples include:
Non-cataloged data
Data that has not been classified and, therefore, not appropriately secured
Data leaks/breached data
Ideally, organizations should manage the liability of data while maximizing data as a strategic asset, such that data equity is created. Depending on your business and the maturity of your data governance practices, either asset management or liability management may be a bigger priority.
Data governance should create data equity by increasing the value of data as an asset and minimizing data liabilities. I encourage you to come back to this framing as you apply the principles in this book to your own organization. As you pitch data solutions, consider this:
How is this solution increasing the value of my data (increasing the asset) and/or decreasing the liability?
Both are of value. The momentum created by delivery should translate directly to an increase in data equity over time.
An example of a data asset might be a curated dataset that is reliable because it has clear ownership, is of high quality, and can be leveraged for multiple business purposes organization-wide. An example of a data liability might be as simple as an organization not knowing what data it has, where it lives, or what to do with it. This carries a risk to the company from a security perspective, but also, the lack of accountability means that individuals may be using the data for decisions that it is not fit for. This increases the company’s risk of making a poor decision based on data that was never intended to be used for that particular purpose.
The measurement of the value of an asset is unique to each organization, but in short, being able to tie back the impact to the organization is a good guiding principle. The following are a few example questions to consider as you attempt to value the data asset:
Does this asset enable additional revenue? How much?
Does this asset save time? Can you multiply the hours saved by an hourly rate for an individual to calculate the value of the person-hours saved?
Does this asset improve customer satisfaction? Can this satisfaction be translated or calculated into value for the organization in terms of additional spending or increased customer retention?
Figure 1.2 – Data assets, liabilities, and equity formula
Data assets may provide value across these components, and value should be calculated accordingly. The most important part of this valuation exercise is not the calculation itself; rather, it is the alignment and agreement with the business. Once you have calculated the value, it is important to go to the business and ask for their feedback. Do they agree with your assessment? If yes, then you have a fully vetted value for your data asset. If not, work with the business to iterate on your data asset valuation until you reach an agreement. Data teams that skip this important step (vetting the value) are often seen to be overselling their value to the organization, which immediately undermines their credibility. Agreeing on the value with the business supports a strong business relationship and provides credible evidence of past success when seeking future investment in data solutions.
The measure of the liability portion of the equation is of equal importance. Like data assets, the measurement of the liability carried by an organization’s data will vary based on your organization.
Important note
It is not as simple as more data equals more liability.
Rather, the less the data is managed, the higher the liability. When data is unmanaged, the risk to the organization is higher.
A great example is security risk. When an organization does not understand where data is, it cannot effectively or adequately protect it. This comes at a high risk (liability) to the organization and could result in a data leak or, worse, a data breach. Here are a few questions to consider when calculating your organization’s data liability:
Do data liabilities increase the risk to the organization? How much? Are there fines or regulatory penalties we could be subjected to as a result of this liability?
Does this liability drive inefficiencies in our business? Can you multiply the hours incurred by an hourly rate for an individual to calculate the cost of the person-hours lost to the inefficiency (for example, a manual process vs. an automated one)?
Does this liability impact customer satisfaction? Can this impact be translated or calculated into a decrease in value for the organization in terms of reduced spending or increased customer attrition?
Once you have assessed your data asset value and data liability value, you can apply this to calculate data equity. The idea is to increase the equity over time. This initial calculation can serve as your baseline by which to calculate progress over time. Organizations also may like to leverage a data maturity model to measure progress; however, these models can be interpreted widely in an organization and do not take into account the business value associated with data solutions. Instead, they focus on the development of data capabilities, which do not always translate well for executive management. I prefer to measure an organization by the business value it creates rather than against a maturity model.
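The data equity calculation described above (asset value minus liability value, used as a baseline) can be sketched as follows. Every figure, category name, and hourly rate below is hypothetical and exists only to show the arithmetic; the real inputs are whatever values you have vetted with the business.

```python
def hours_value(hours, hourly_rate):
    """Translate person-hours into a monetary value."""
    return hours * hourly_rate


def data_equity(asset_values, liability_values):
    """Data equity = total data asset value - total data liability value."""
    return sum(asset_values.values()) - sum(liability_values.values())


# Hypothetical annual figures, mirroring the three asset questions.
assets = {
    "additional_revenue": 250_000,
    "time_saved": hours_value(1_200, 50),     # 1,200 hours at 50 per hour
    "retention_uplift": 40_000,
}

# Hypothetical annual figures, mirroring the three liability questions.
liabilities = {
    "potential_fines": 100_000,
    "manual_rework": hours_value(2_000, 50),  # 2,000 hours at 50 per hour
    "customer_attrition": 30_000,
}

# 350,000 in assets minus 230,000 in liabilities.
baseline = data_equity(assets, liabilities)
print(f"Baseline data equity: {baseline:,}")
```

Recomputing the same figure each period against this baseline gives a simple measure of whether data equity is increasing over time.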
We will not dive into data monetization efforts in this book. The economics of the monetization of data is expertly described in Doug Laney’s book, Infonomics, and I would highly recommend his book to anyone looking to dive into the monetization of data further.
Now that we have classified data solutions into assets and liabilities and defined how to calculate value, let’s dive into the components in further detail. I prefer to group the components of data governance into building blocks. The reason I prefer this approach and have leveraged this framing in several companies is that it allows the organization to directly tie each building block to specific and straightforward outcomes. The first building block, policy and standards, is relatively basic and can be designed with a small team. This is a great place to get started in developing a data governance program.
The purpose of this building block is to define data ownership and the structures needed to design accountability to manage your organization’s data as an asset. This building block will ensure effective, sustainable, and standardized data governance on which the company can depend. This building block is a prerequisite for future building blocks because it defines what is required to drive effective data governance and who needs to be involved. Additionally, the components of this building block can be created in a simplified way and can be expanded as the company matures in its data journey.
An easy place to start is to draft a simple and straightforward data governance policy. The purpose of a data governance policy is to tell the company what they need to do, why, and who is accountable.
The objectives of a strong data policy include the following:
Establish a single policy and set of standards for data management
Establish the capabilities and data assets that are in scope for the policy and, in turn, for the office of the Chief Data and Analytics Officer
Define the accountability and responsibilities for the implementation of the policy and the operationalization of data management capabilities
Set minimum standards for data management, specifically for governance, quality, and meta- and master data management
Define the procedures and usage requirements for tools to drive the consistent and robust adoption of minimum standards
Enable flexibility where appropriate to allow for ease of implementation where possible
Define what is out of the scope of the policy
As with any policy, it is important to identify the owner of the data governance policy, who will be accountable for managing the policy by refreshing it at least annually, updating the content, and evangelizing it to the company. It is also the owner’s responsibility to ensure buy-in from key stakeholders across the company. Ideally, this owner would be a chief data officer, head of data governance, or similar role. If your company does not have a data leader in the role yet, another option would be a chief information officer, chief information security officer, chief privacy officer, or even a general counsel.
A policy does not need to be lengthy to be effective. Ideally, the policy would set forth the basics and would be supported by more specific and topically focused data standards. This approach often allows the policy to go through a more formalized corporate governance approval process while allowing for slightly easier updates to the data standards as your organization matures. I recommend implementing a data standard for each of the core capabilities addressed in Part 2 of this book, plus any others where your specific business requires additional guidance for data stakeholders. Remember, the policy sets forth the minimum expectations for the company.
To get started in developing your data governance policy, a suggested data governance policy outline may contain the following:
Purpose and scope statement (for example, to transform how the company utilizes data by creating additional revenue streams and simultaneously reducing data risk)
The owner (for example, a Chief Data Officer)
Reviewers/contributors and titles (for example, Head of IT, COO, and data stewards)
Sign-off/approval (for example, CEO, CFO, and so on)
Data governance requirements
Roles and responsibilities for implementation
Feedback loops for improvements and/or additions
Measures of success
Compliance/audit expectations and frequency
Glossary of terms
The following is an example of an enterprise data governance policy:
Owner: Chief Data & Analytics Officer
Last Approval: 12/31/2023
Policy Leader: Head of Data Governance
Contributors:
Head of Information Technology/CIO
Head of Human Resources/CHRO
Head of Marketing/CMO
Head of Sales/CRO
Product/Business Unit Leaders
Product/Business Unit Data Stewards
Purpose and scope
This data management policy applies to all data held or processed by the company, which may include customer data, transactional data, financial information, regulatory and risk reporting, and any other data related to the business of the company. This data may be first-party data, derived data, or data acquired from another company (third-party data). The outcomes of this policy are the following:
Reduce risk
Unlock revenue opportunities
Drive operational efficiencies
Introduction
The company is responsible for ensuring all data is accurate, complete, secure, and accessible only to those who require access to fulfill their job responsibilities. This policy sets forth the requirements for the enterprise to deliver on the outcomes established above.
Data governance
Data governance establishes the requirements and standards for all corporate data deemed “in scope” of this policy in the aforementioned purpose and scope section. The purpose of the data governance capabilities established within this policy is to drive enhanced transparency and accountability for our company’s data and to drive improved consistency, control, and oversight for how data is managed, stored, and used going forward.
Roles and responsibilities
Enterprise Data Committee: An Enterprise Data Committee will be established, chaired by the Chief Data & Analytics Officer, to provide an oversight and prioritization body to manage data and analytics initiatives and issue remediation enterprise-wide. A Data Domain Executive will be required to sit on this committee to ensure appropriate prioritization across all data domains.
The Chief Information Officer will partner with the Chief Data & Analytics Officer to ensure technical requirements and systems are provided in support of the data and analytical needs of the organization, both for the Office of the Chief Data & Analytics Officer and for all functional data domains enterprise-wide.
A Data Domain Executive will be established for each functional area to ensure the appropriate focus, funding, and resourcing are established and maintained to manage data in accordance with both this policy and the needs of the business.
Data Stewards will be assigned by each Data Domain Executive to ensure the day-to-day execution of data requirements is completed in accordance with policy and the needs of the business. Data Stewards will also be required to work with the Office of the Chief Data & Analytics Officer to ensure that transparency of progress and ongoing operational effectiveness is maintained for leadership, regulators, and across domains.
Requirements
This section provides the minimum expectations for compliance.
Data Governance
Each data domain will develop a plan to drive compliance with this policy to operationalize the requirements within their data domain. The Data Domain Executive will ensure appropriate prioritization, whereas the Data Stewards will execute the plan on behalf of the Data Domain Executive. Additionally, Technical Data Stewards will support the delivery of all technical requirements to ensure compliance with this policy and the broader needs of the business. The minimum requirements are the following:
Identify all data assets and systems
Identify all data and technical data stewards for each asset and system
Assign each asset and system to the appropriate data domain
Develop a plan to meet the requirements for each asset and system and maintain compliance going forward
Data Cataloging
The purpose of data cataloging is to centrally manage and publish business and technical metadata across the organization to enable accelerated discovery of the data available across the organization in a clear, transparent manner. As data cataloging is implemented, the Chief Data & Analytics Office will evaluate metadata to determine the best source of truth for a given data asset and identify opportunities to reduce proliferation and redundancy across the company. This will further simplify our data ecosystem over time and reduce the costs of duplicate data handling/management and storage. The minimum requirements to be published in the Enterprise Data Catalog are the following:
Description of the data asset/system
Technical metadata
Description of schemas and tables
Identification of critical data elements (CDEs)
Business definitions for CDEs
Data classification for all data elements within the asset/system in accordance with the company data classification policy
Data Quality
The purpose of data quality is to ensure the data is fit for use. The following requirements have been set forth with the aim to centrally develop data quality rules, provide profiling resources and tooling, and monitor data hygiene to ensure the data can be trusted for analytical and business use and identify issues requiring disclosure and/or remediation. The minimum requirements are the following:
Define the data quality rules for each CDE and enter them into the enterprise data quality tool
Enable CDEs for data quality monitoring
Provide data quality dashboards to transparently report on current quality levels
Identify data quality issues and create plans to address material data quality issues
Policy Management
Feedback Loops