
Collate, the startup behind the open source project OpenMetadata, has raised $10 million in Series A funding to tackle a growing challenge in enterprise data: managing metadata across an increasingly complex stack.
While much of the industry’s attention is on training larger AI models, Collate is focused on the underlying infrastructure: building tools that help data teams govern, document, and make sense of the systems those models rely on.
Its approach points to a broader shift in how companies are rethinking data readiness, not just in terms of access and storage but in structure, context, and usability. Collate aims to help teams move beyond fragmented documentation and manual processes by offering tools that promote more consistent governance and faster insight across departments.
The company positions its platform as a bridge between today’s disconnected data environments and the operational needs of AI systems, with a particular focus on supporting workflows at what it calls “agentic speed,” the pace at which autonomous systems increasingly make decisions.
At its core, Collate’s product focuses on reducing the operational burden that comes with managing metadata across dozens of tools. Instead of relying on engineers to manually update documentation or enforce access rules, the platform captures metadata automatically as pipelines run and schemas evolve. Policies are stored as code, so access controls and data classifications are checked and enforced before any query runs, whether it’s from a person or a machine.
The company says this model helps organizations keep their data systems more reliable and transparent without slowing down development. Everything from lineage tracking to data quality checks happens in the background, giving teams a real-time view into how data is flowing, who is using it, and whether it complies with internal rules.
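To make the policy-as-code idea concrete, here is a minimal Python sketch of a check that runs before a query, whether the requester is a person or an automated agent. It is not Collate's or OpenMetadata's actual API. Every name in it (`Policy`, `QueryRequest`, `check_query`, the column names, roles, and classifications) is a hypothetical stand-in chosen for illustration.

```python
from dataclasses import dataclass

# All names below are hypothetical; this is not the Collate or OpenMetadata API.

@dataclass(frozen=True)
class Policy:
    """A rule stored as code: which roles may read data with a given classification."""
    classification: str
    allowed_roles: frozenset


@dataclass
class QueryRequest:
    """A query from a person or a machine agent, reduced to the columns it touches."""
    principal: str
    role: str
    columns: list


# Assumed metadata: a classification per column, as a catalog might record it.
COLUMN_CLASSIFICATIONS = {
    "customers.email": "PII",
    "customers.country": "Public",
    "orders.total": "Internal",
}

# Assumed policies, kept alongside code so they can be reviewed and versioned.
POLICIES = [
    Policy("PII", frozenset({"data_steward"})),
    Policy("Internal", frozenset({"data_steward", "analyst", "pricing_agent"})),
    Policy("Public", frozenset({"data_steward", "analyst", "pricing_agent"})),
]


def check_query(request, classifications=COLUMN_CLASSIFICATIONS, policies=POLICIES):
    """Return the columns the requester may not read; an empty list means allowed."""
    rules = {p.classification: p.allowed_roles for p in policies}
    denied = []
    for column in request.columns:
        # Columns with no known classification match no rule and are denied by default.
        classification = classifications.get(column)
        allowed = rules.get(classification, frozenset())
        if request.role not in allowed:
            denied.append(column)
    return denied


if __name__ == "__main__":
    bot = QueryRequest("pricing-bot", "pricing_agent", ["orders.total", "customers.email"])
    print(check_query(bot))
```

In this sketch the check is the same function for humans and agents, and unclassified columns fail closed. A real system would make its own choices on both points.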
Open source is central to that effort. Collate is built on OpenMetadata, a fast-growing project with an active community and wide integration support. Rather than replacing existing infrastructure, the platform connects to existing data warehouses, lakes, dashboards, and machine learning tools. This lets teams enrich what they already use with field-level documentation, usage insights, and governance controls.
The team behind Collate has worked on large-scale data systems for more than a decade. Before launching the company, the founders led infrastructure efforts at Yahoo, Hortonworks, and Uber. They were involved in building some of the most widely used open source tools in the space, including Hadoop, Kafka, and Storm. That background has shaped how they think about scale, flexibility, and automation.
Collate says this experience has helped them design a platform that supports what they call a virtuous cycle. The idea is simple. Better metadata helps teams get more out of AI, and AI can be used to improve metadata in return.
This feedback loop is a core part of how Collate sees its role. The company believes it can give data teams a system that improves over time, without relying on constant manual work.
“Collate’s Series A couldn’t come at a more critical time,” said Suresh Srinivas, CEO of Collate. “We’re in the midst of an AI race, not just for getting data ready for AI, but for how AI itself helps prepare that data. The winners will be organizations with highly functioning data teams augmented by AI.”
“Our agentic approach is uniquely powered by richer metadata context from our knowledge graph and open source core,” continued Srinivas. “This is changing the game for our enterprise customers by solving the last mile of data challenges, helping them innovate faster with AI and data.”
That message is landing with customers working to modernize their data practices without slowing down operations. At Fundcraft, a financial services platform, Collate is now part of the company’s core data workflows. “Collate has been a game changer for strengthening our data culture,” said Victor Martin, CTO of Fundcraft. “The biggest impact we’ve seen is accelerated development speed and improved cycle times, since our teams can now focus quickly on what matters most.”
Mango, the global fashion retailer, is also using the platform across its data organization. According to Collate, the company has seen 3× faster integration and a 20% increase in data team productivity. Improvements in data quality have also contributed to better performance in the company’s ML-driven pricing models. “Collate has proven to be the cornerstone of our data strategy,” said Jordi Orriols Torras, Mango’s Head of Data Governance.
With new funding in place and a growing customer base, Collate is turning its attention to scaling adoption and expanding automation within the metadata layer. Recent updates include Collate AutoPilot, a growing suite of AI agents that assist with documentation, data tiering, quality monitoring, and ingestion.
The platform now also offers enterprise-grade Model Context Protocol (MCP) support, which allows metadata to flow in both directions. This means systems can not only read metadata but also write changes back, closing the loop between insight and action.