Data Catalogs and the Data Engagement Gap
“Data is a natural resource, but it can be either an asset or a liability,” said Paul Brunet, the Vice President of Marketing for Collibra, during a recent DATAVERSITY® interview. In a new study by CEB (a unit of Gartner), respondents said that 75 percent of marketing functions report only marginal returns on their digital investments, and 70 percent are increasing HR investments in “talent analytics,” yet only 12 percent are getting results. In addition, a top concern of 41 percent of assurance executives is slower decision-making and greater risk aversion caused by multiple versions of “truth,” said Brunet.
“People have been relatively consistent in their ability to get access to data, but the usefulness in driving insights and leveraging those insights has actually been going down over the last six years,” he said. According to a study from MIT Sloan Management Review entitled Using Analytics to Improve Customer Engagement, 2018, in each of the last six years, 70 percent of respondents said they had more access to useful data than they did the prior year, yet only 49 percent reported being able to use the data strategically, compared with 55 percent in the previous year. Brunet said this gap between access and insight is called the “data engagement gap.”
The goal was to provide useful insights from data, Brunet said, but “by just dumping it out there without context, without an understanding of Data Quality, without the ability to ask questions about the data, it’s actually made the situation worse.”
Brunet discussed this gap, along with Data Catalogs, Data Governance, and a range of other topics, during the recent interview and the associated DMRadio webinar “Where’s the Data? Let Your Data Catalog Find It.”
Further complications arise when easy access is coupled with a lack of Data Governance. “If I put private information or partner information out there and others get hold of it, it is a liability to the organization,” he said. Closing the data engagement gap and ensuring access without increasing risk is possible with the right tools and policies, he said. A Data Catalog, along with Data Governance, can bring order to the chaos of uncontrolled access.
The Data Engagement Gap
Brunet said that in the rush to ensure everyone has access to data, the challenge is balancing freedom and control. “I want to take a look at new opportunities for growth, but at the same point in time, I need to make sure that there’s the right governance and the right usage of the data.”
He cited an example of the early use of Data Lakes, when the focus was on taking in as much data as possible in order to make it available to a larger population in the business. There was little emphasis on prioritizing, rating, lineage, or context, and users were so overwhelmed by the volume that they could not understand what data to use or how to use it. Data without context isn’t useful, and often when there is context, the terminology used is not understood by the business side, he said.
Without access to good data, companies are unable to grow and compete effectively, and a lack of structure can lead to redundant or inefficient growth. The flood of readily available data creates risk; at the same time, too much control of data resources threatens the survival of the business, he said.
There are ways to make data more readily available without sacrificing Data Quality and Data Governance, and the ideal solution is a balance of control and freedom so that users can find the data they need, understand the data they have, and trust its validity while limiting risk to the company.
Developing the Data Catalog
A Data Catalog is an automated solution for incrementally building a map of the information landscape across diverse, far-reaching environments: Cloud, on-premises, through partner channels, anywhere the data lives. A Data Catalog allows users to easily find the right data, quickly understand what the data means, trust the data, and maintain appropriate data privacy in a changing regulatory environment.
Brunet said that the first step in closing the engagement gap is to get a clear understanding of who the data citizens are and exactly what they need in the Data Catalog. A business person in the marketing department creating campaigns needs a different set of assets than a Data Scientist does. “Tailor the catalog experience based upon the individual coming in,” he said.
The second step is to keep the specific needs of those users and their experience in mind when determining the key assets to build into the catalog. “We sometimes think about it as ‘data’ and ‘datasets.’ Well, there’s no reason why we shouldn’t think about it from a user perspective instead,” he said. From an analytics perspective, for example, it can be seen as “workbooks,” “reports,” and “visualizations” or, from a Data Science perspective, as “algorithms” and “models.”
“Now it’s no longer about ‘data.’ It’s glossaries, dictionaries, role definitions, metrics, workbooks, and so the list of things that you want to now make available in your catalog becomes much more extensive.” Grouping data and datasets in a more logical and useful fashion based on user perspective adds structure and simplicity to the process of accessing data.
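As a rough illustration of what tailoring the catalog to the person coming in could look like, here is a minimal Python sketch. It is not Collibra's implementation or API. The role names, asset kinds, sample assets, and the catalog_view function are all hypothetical, chosen only to mirror the examples Brunet mentions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    """One catalog entry. 'kind' is the user-facing asset type."""
    name: str
    kind: str  # e.g. "report", "model", "glossary" (hypothetical labels)


# Hypothetical mapping from a user's role to the asset kinds that role
# is assumed to care about. A real catalog would derive this differently.
ROLE_VIEWS = {
    "marketer": {"report", "workbook", "visualization", "glossary", "metric"},
    "data_scientist": {"algorithm", "model", "dataset", "dictionary"},
}

# Made-up sample assets, for illustration only.
ASSETS = [
    Asset("Q3 campaign performance", "report"),
    Asset("Churn prediction", "model"),
    Asset("Customer (agreed definition)", "glossary"),
    Asset("Raw clickstream", "dataset"),
]


def catalog_view(assets, role):
    """Return only the assets whose kind is relevant to the given role.

    Unknown roles get an empty view rather than the full catalog.
    """
    kinds = ROLE_VIEWS.get(role, set())
    return [asset for asset in assets if asset.kind in kinds]


if __name__ == "__main__":
    for role in ROLE_VIEWS:
        names = [asset.name for asset in catalog_view(ASSETS, role)]
        print(f"{role}: {names}")
```

In this sketch a marketer sees the report and the glossary entry, while a Data Scientist sees the model and the dataset. The underlying catalog is the same; only the view changes with the role.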
The third step is measuring the impact of making the data available. “How do I, as CEO, show the progress, the impact that I’m making? What new revenue streams have we found?” For example, because catalog users take less time to access data and are not worrying about its quality, the company is able to put more campaigns in the market, he said. “Or now that I have the right data, I’m able to take a look at missed billing cycles, so I’m able to find new revenue opportunities in that sense.”
Brunet finds it useful to divide the work of Data Governance into “big governance” and “little governance.” “I think a lot of people get scared when they hear the word ‘governance.’ ‘Oh, no, it’s big brother, big overhead.’”
But a simple agreement on a definition of “customer” inside an organization is governance, as is ensuring consistency in the way calculations are done, he said. Creating processes for adding data is governance. For example, before taking everything on-premises and moving it into the Cloud, the approach should be, “First let me make sure that it’s aligned to the business and there’s a purpose behind it so I can also reduce some of that potential liability.”
Some may start with a department-level view of the catalog, but as they try to extend it across the organization into an enterprise-wide catalog, governance becomes a larger element, he said.
Collibra offers Data Catalogs and Data Governance, with a focus on how companies can better leverage their data assets while ensuring the data is secure and private, and is handled the way the business expects.
Collibra found that with a stronger degree of confidence in their data, its customers were able to resolve data issues much more rapidly as they occurred. Closing the engagement gap is really about revenue growth, Brunet said. It’s about creating efficiencies around the quality of the data that individuals use to make their decisions going forward. “So, we’re constantly looking at ways to enhance that,” he said.
Being able to measure the impact of efforts toward data engagement is one of those enhancements. “What we’ve seen in our client survey is that their confidence goes up. They are able to move more quickly because they feel that there’s less risk in the decisions they’re making.”
Brunet said he hears from business people who say they spend 80 percent of their time trying to find the right data. “But what you really want them to think about is, ‘What if I fixed that? If people had 80 percent more time using the data and only 20 percent of their time finding the data, what would be the impact?’”