Today, we are thrilled to welcome the Fennel team to Databricks. Fennel improves the efficiency and data freshness of feature engineering pipelines for batch, streaming and real-time data by recomputing only the data that has changed. Integrating Fennel’s capabilities into the Databricks Data Intelligence Platform will help customers quickly iterate on features, improve model performance with reliable signals and provide GenAI models with personalized and real-time context, all without the overhead and cost of managing complex infrastructure.
Feature Engineering in the AI Era
Machine learning models are only as good as the data they learn from. That’s why feature engineering is so critical: features capture the underlying domain-specific and behavioral patterns in a format that models can easily interpret. Even in the era of generative AI, where large language models are capable of operating on unstructured data, feature engineering remains essential for providing personalized, aggregated, and real-time context as part of prompts. Despite its importance, feature engineering has historically been difficult and expensive due to the need to maintain complex ETL pipelines for computing fresh and correctly transformed features. Many organizations struggle to handle both batch and real-time data sources and ensure consistency between training and serving environments — not to mention doing this while keeping quality high and costs low.
Fennel + Databricks
Fennel addresses these challenges and simplifies feature engineering by providing a fully managed platform to efficiently create and manage features and feature pipelines. It supports unified batch and real-time data processing, ensuring feature freshness and eliminating training-serving skew. Its Python-native user experience makes authoring complex features fast, easy and accessible for data scientists, who don’t need to learn new languages or rely on data engineering teams to build complex data pipelines. Its incremental computation engine optimizes costs by avoiding redundant work, and its best-in-class data governance tools help maintain data quality. By handling all aspects of feature pipeline management, Fennel helps reduce the complexity and time required to develop and deploy machine learning models. Data scientists can then focus on creating better features to improve model performance rather than managing complicated infrastructure and tools.
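To make the idea of incremental computation concrete, here is a minimal sketch in plain Python. It is not Fennel's API; the class `IncrementalAverage` and the function `full_recompute` are hypothetical names chosen for illustration. The sketch keeps a running total and count per key, so a new batch of events only updates the keys it touches instead of rescanning all history.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class IncrementalAverage:
    """Hypothetical per-key average feature that is updated incrementally."""

    totals: dict = field(default_factory=lambda: defaultdict(float))
    counts: dict = field(default_factory=lambda: defaultdict(int))

    def apply(self, new_events):
        """Fold a batch of (key, amount) events into state; return the keys touched."""
        touched = set()
        for key, amount in new_events:
            self.totals[key] += amount
            self.counts[key] += 1
            touched.add(key)
        return touched

    def feature(self, key):
        """Current average for a key, or 0.0 if the key has no events yet."""
        count = self.counts.get(key, 0)
        return self.totals.get(key, 0.0) / count if count else 0.0


def full_recompute(all_events):
    """Baseline for comparison: rebuild every average from the full history."""
    totals, counts = defaultdict(float), defaultdict(int)
    for key, amount in all_events:
        totals[key] += amount
        counts[key] += 1
    return {key: totals[key] / counts[key] for key in totals}


# Illustrative data only: (user_id, transaction_amount) pairs.
history = [("u1", 10.0), ("u2", 4.0), ("u1", 20.0)]
late_events = [("u2", 6.0)]

state = IncrementalAverage()
state.apply(history)
touched = state.apply(late_events)  # only "u2" is updated
```

Both approaches yield the same feature values, but the incremental version does work proportional to the new events rather than to the full history. A production engine would also have to handle windows, late or deleted data, and persistence, none of which this sketch attempts.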
The incoming Fennel team brings a wealth of experience in modern feature engineering for machine learning applications, with the founding team having led AI infrastructure efforts at Meta and Google Brain. Since its founding in 2022, Fennel has successfully executed on its vision of making it easy for companies and teams of any size to harness real-time machine learning to build delightful products. Customers such as Upwork and Cricut rely on Fennel to build machine learning features for a variety of use cases, including credit risk decisioning, fraud detection, trust and safety, personalized ranking and marketplace recommendations.
The Fennel team will join Databricks’ engineering organization to ensure all customers can access the benefits of real-time feature engineering in the Databricks Data Intelligence Platform. Stay tuned for more updates on the integration, and see Fennel in action at the Data + AI Summit, June 9-12 in San Francisco!