Understanding the Value Data Lineage Adds to Your Business
July 11, 2023
Written by Frank Morreale, Senior Data Architect – Information Management, Prolifics
Data Lineage and Your Business
Data lineage describes how data enters your organization, how it moves and spreads, what its characteristics are, and how it changes along the way. Knowing your data’s lineage means knowing and trusting its overall quality and usefulness for analytics and decision-making. Specifically, data lineage is very important when it comes to:
- Impact Analysis – When planning system modifications, either adding functionality to an existing system or changing the way an existing system works, it is necessary to determine the scope of the modifications needed to implement these changes. Lineage shows which downstream systems and reports consume the data being changed.
- Data Error Research – When researching data errors, it is necessary to determine the origin of the data and the transformations that were applied to it along the way. A data lineage tool that graphically displays a data element’s path through your data environment can speed up the process of determining the root cause of an error.
- Regulatory Compliance – Compliance often requires showing regulators accurately how reported values were determined and where the data came from.
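As a rough illustration of the three uses above, lineage can be thought of as a directed graph in which each edge records a source, a target and the transformation applied between them. The sketch below is a minimal Python model under that assumption; the system names, the edge list and the helper functions (impact_of, origins_of, trace) are all hypothetical and are not taken from Purview, Manta or any other product.

```python
from collections import defaultdict, deque

# Hypothetical lineage edges: (source, target, transformation applied).
EDGES = [
    ("crm.customers", "staging.customers", "nightly extract"),
    ("staging.customers", "warehouse.dim_customer", "deduplicate and standardize"),
    ("erp.orders", "warehouse.fact_orders", "currency conversion"),
    ("warehouse.dim_customer", "reports.revenue_by_region", "join on customer_id"),
    ("warehouse.fact_orders", "reports.revenue_by_region", "aggregate by region"),
]


def build_index(edges):
    """Index the edge list in both directions for fast traversal."""
    downstream = defaultdict(list)
    upstream = defaultdict(list)
    for src, dst, _ in edges:
        downstream[src].append(dst)
        upstream[dst].append(src)
    return downstream, upstream


def reachable(start, neighbors):
    """Return every node reachable from start, in breadth-first order."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


DOWNSTREAM, UPSTREAM = build_index(EDGES)


def impact_of(node):
    """Impact analysis: everything that consumes this data, directly or not."""
    return reachable(node, DOWNSTREAM)


def origins_of(node):
    """Error research: everything this data was derived from."""
    return reachable(node, UPSTREAM)


def trace(node):
    """Compliance view: every edge, with its transformation, that feeds node."""
    relevant = set(origins_of(node)) | {node}
    return [edge for edge in EDGES if edge[1] in relevant]


if __name__ == "__main__":
    print(impact_of("staging.customers"))
    print(origins_of("reports.revenue_by_region"))
    for src, dst, how in trace("warehouse.dim_customer"):
        print(f"{src} -> {dst}: {how}")
```

A change to staging.customers would call for a review of everything impact_of returns, while a wrong figure in reports.revenue_by_region could be investigated by walking origins_of and the transformations listed by trace. A real lineage tool maintains a graph like this automatically and at far greater scale.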
Compiling Data Lineage
So, how do you go about compiling data lineage? There are two basic methods: compiling it manually or using an automated data lineage solution.
- Manual data lineage compilation is very time-consuming and requires extraordinary resources, both for the initial compilation and for ongoing maintenance. Keeping manually compiled lineage accurate and complete is also a challenge, and the resulting reports of the data flow can be inconsistent and hard to follow.
- Automated source code and data source scanners are available from various vendors. These tools greatly increase the speed and accuracy of data lineage compilation, with the added benefit of providing interactive user interfaces to the lineage graphs that in most cases let the user drill down from a high-level data flow diagram to progressively greater detail.
Microsoft (Azure) Purview has become a popular automated data lineage solution. Its governance portal enables your company to “create an up-to-date map of your entire data estate that includes data classification and end-to-end lineage.” But Purview has its limitations. It’s great if your company is working completely within a Microsoft environment. However, Purview can’t address lineage for data moving to the cloud from a company’s non-Microsoft legacy systems, because it has no way to scan those systems. Companies that depend on robust end-to-end legacy systems can’t afford to leave that lineage behind.
To address these limitations, Prolifics has partnered with Manta to produce the Manta-Prolifics Purview Connector. This allows the Purview user to leverage the vast array of Manta scanners and incorporate their lineage into Purview. Once Manta has scanned, analyzed and exported a data source, the Manta-Prolifics Purview Connector can be invoked to move the Manta-generated data objects and data lineage into Purview. Below is an example of the data lineage the Connector can provide.
For further information or a demonstration of how the Manta-Prolifics Purview Connector can help your organization, please contact solutions@prolifics.com.
Frank Morreale is a Senior Data Architect – Information Management at Prolifics. He is a highly seasoned consultant with more than 30 years of experience in Information Technology. He has delivered strategies for metadata management and data management across multiple industries. Morreale has extensive experience in delivering quality enterprise applications with a focus on data integration and quality. From 2000 to the present day, he has delivered results in defining data governance, data quality, and data integrity standards and methodologies.