
"Forrester Alters Approach Towards Data Catalogs: Key Insights Revealed"

As predicted earlier this year, metadata has become a significant trend in 2022, and it is only heating up. However, this isn't the conventional notion of metadata we've long despised. Instead, we're discussing the technical "data inventories" that require painstaking 18-month setup...

Forrester shifts its perspective on data catalogs, offering key insights you should be aware of.

In the rapidly evolving data landscape, a significant shift is underway from Machine Learning (ML) Data Catalogs to Enterprise Data Catalogs for DataOps. This transformation is driven by the need for improved data management and provisioning, extending beyond just conceptual data understanding.

The Role of Enterprise Data Catalogs (EDCs)

EDCs play a pivotal role in this transition. They create data transparency and enable data engineers to implement DataOps activities that develop, coordinate, and orchestrate the provisioning of data policies and controls and that manage the data and analytics product portfolio. They also offer granular data visibility and governance.

Key components of EDCs include active metadata, which facilitates two-way movement of metadata and powerful programmatic use cases through automation. This active approach to metadata is crucial for modern EDCs to handle the growing use cases of metadata.
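
To make the idea of two-way metadata movement more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not any vendor's API: the `Catalog` class, its `ingest`, `propagate`, and `export` methods, and the asset names are all hypothetical. Metadata flows in from source tools, is enriched automatically (here, by pushing a classification tag along lineage), and flows back out as payloads that other tools could consume.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set


@dataclass
class Asset:
    """A catalog entry (hypothetical model): a name, tags, and upstream lineage."""
    name: str
    tags: Set[str] = field(default_factory=set)
    upstream: List[str] = field(default_factory=list)


class Catalog:
    """Toy in-memory catalog; a real EDC would persist and index this."""

    def __init__(self) -> None:
        self.assets: Dict[str, Asset] = {}

    def ingest(self, name: str, tags: Iterable[str] = (),
               upstream: Iterable[str] = ()) -> Asset:
        """Inbound direction: pull metadata from a source tool into the catalog."""
        asset = self.assets.setdefault(name, Asset(name))
        asset.tags.update(tags)
        asset.upstream = list(upstream)
        return asset

    def propagate(self, tag: str) -> List[str]:
        """Automation step: copy a tag to every asset downstream of a tagged one."""
        updated: List[str] = []
        changed = True
        while changed:  # repeat until no asset picks up the tag anymore
            changed = False
            for asset in self.assets.values():
                if tag in asset.tags:
                    continue
                parents = (self.assets[u] for u in asset.upstream if u in self.assets)
                if any(tag in parent.tags for parent in parents):
                    asset.tags.add(tag)
                    updated.append(asset.name)
                    changed = True
        return updated

    def export(self, names: Iterable[str]) -> List[dict]:
        """Outbound direction: build payloads a downstream tool could consume."""
        return [{"asset": n, "tags": sorted(self.assets[n].tags)} for n in names]


catalog = Catalog()
catalog.ingest("raw.users", tags={"pii"})
catalog.ingest("stg.users", upstream=["raw.users"])
catalog.ingest("mart.signups", upstream=["stg.users"])
catalog.ingest("mart.products")

newly_tagged = catalog.propagate("pii")
payloads = catalog.export(newly_tagged)
```

In this sketch the "pii" tag entered once at the source and ended up on both downstream assets without manual work, while the unrelated `mart.products` asset was left alone. The `payloads` list stands in for whatever a BI tool or warehouse would receive on the return trip.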

Overcoming Past Challenges

In the past, data catalogs were often underutilized due to issues such as rigid data governance teams, complex technology setup, lengthy implementation cycles, and low internal adoption. However, EDCs are designed to overcome these challenges, supercharging the power of metadata for modern businesses.

Best Practices and Automation

A prime example of DataOps best practices in action involves a data team using APIs to monitor pipeline health, automatically track and trigger notifications for schema and classification changes, and file Jira tickets to track and initiate work on issues. EDCs can also automatically trigger a data quality testing suite to ensure that only high-quality, compliant data makes its way to production systems.
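
The workflow above can be sketched roughly as follows. This is a simplified illustration, not a real integration: `diff_schema` and `handle_snapshot` are hypothetical names, and the `notify`, `open_ticket`, and `run_quality_suite` callables stand in for whatever chat, ticketing (for example Jira), and testing tools a team actually uses.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SchemaChange:
    column: str
    kind: str    # "added", "removed", or "retyped"
    detail: str


def diff_schema(old: Dict[str, str], new: Dict[str, str]) -> List[SchemaChange]:
    """Compare two {column: type} snapshots and list the differences."""
    changes: List[SchemaChange] = []
    for col in sorted(old.keys() - new.keys()):
        changes.append(SchemaChange(col, "removed", old[col]))
    for col in sorted(new.keys() - old.keys()):
        changes.append(SchemaChange(col, "added", new[col]))
    for col in sorted(old.keys() & new.keys()):
        if old[col] != new[col]:
            changes.append(SchemaChange(col, "retyped", f"{old[col]} -> {new[col]}"))
    return changes


def handle_snapshot(
    table: str,
    old: Dict[str, str],
    new: Dict[str, str],
    notify: Callable[[str], None],
    open_ticket: Callable[[str], None],
    run_quality_suite: Callable[[str], bool],
) -> bool:
    """Return True if the quality suite passed and the table may be promoted."""
    changes = diff_schema(old, new)
    for change in changes:
        notify(f"{table}: column {change.column} {change.kind} ({change.detail})")

    # Removed or retyped columns can break consumers, so track them as work items.
    if any(change.kind in ("removed", "retyped") for change in changes):
        open_ticket(f"Breaking schema change in {table}")

    # Gate promotion to production on the quality suite.
    passed = run_quality_suite(table)
    if not passed:
        open_ticket(f"Quality checks failed for {table}")
    return passed


messages: List[str] = []
tickets: List[str] = []
promote = handle_snapshot(
    "users",
    old={"id": "int", "email": "string"},
    new={"id": "bigint", "signup_ts": "timestamp"},
    notify=messages.append,
    open_ticket=tickets.append,
    run_quality_suite=lambda table: True,  # stub: pretend all checks pass
)
```

Passing the notification, ticketing, and testing steps in as plain callables keeps the sketch independent of any particular tool, so the same flow could be wired to a real chat webhook or ticketing client.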

The Differences Between EDCs and ML Data Catalogs

EDCs and Machine Learning (ML) Data Catalogs differ primarily in their scope, audience, functionality, and the way they support organizational goals. EDCs offer enterprise-wide integration of metadata from diverse data sources, whereas ML Data Catalogs focus on data specifically intended for ML workflows.

The Shift in Focus

The shift from ML-specific catalogs to broader Enterprise Data Catalogs is driven by several factors: data needs that span the whole enterprise, the dependence of ML on trusted enterprise data, consolidation for efficiency and compliance, and the need for scalability and sustainability.

In summary, Enterprise Data Catalogs for DataOps provide a holistic, integrated, and governed data framework that supports all organizational data workflows, including but not limited to ML, whereas Machine Learning Data Catalogs focus narrowly on AI lifecycle needs. This evolution towards enterprise-wide catalogs reflects the recognition that enterprise data governance and operationalization form the critical foundation needed for successful ML endeavors and other data-driven initiatives.

The Future of Metadata

The data industry is experiencing a shift in how metadata is conceptualized, with new ideas like the metrics layer, modern data catalogs, and active metadata emerging. Forrester has recently updated its Wave report, transitioning from "Machine Learning Data Catalogs" to "Enterprise Data Catalogs for DataOps". As we move forward, EDCs are expected to continue evolving to handle the diversity and granularity of modern data and metadata, acting as a "system of record" to manage all data through the data product lifecycle.
