Databricks’ Acquisition of Lilac Enhances Data Quality for Generative AI Applications

Databricks, a leading data intelligence platform, has announced its acquisition of Lilac, a Boston-based applied research startup. Lilac offers tools for data understanding and manipulation, which will enhance the quality of datasets for developing large language model (LLM) applications. The terms of the deal were not disclosed.

The acquisition is part of Databricks’ effort to become a one-stop shop for generative AI. The company recently invested in Mistral, a generative AI startup. By integrating Lilac’s team and technology into its platform, Databricks aims to give users a seamless way to improve data quality and develop production-quality LLM applications.

One of the challenges in AI development is ensuring high-quality data for training and testing models. Lilac addresses this challenge with its scalable open-source solution. The platform offers an intuitive user interface and AI-driven features to analyze, understand, and modify unstructured text data at scale. It allows data scientists and AI researchers to cluster and categorize documents, perform semantic and keyword searches, detect personal information or duplicates, and make necessary edits to tailor the dataset.
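To make these curation steps concrete, here is a minimal Python sketch of three of them: keyword search, duplicate detection, and flagging personal information. It uses only the standard library and does not use Lilac's API. The function names, the single email pattern, and the exact-match hashing approach are all assumptions chosen for illustration; a production tool would need much broader personal-information coverage and near-duplicate matching.

```python
import hashlib
import re

# Hypothetical, deliberately narrow pattern: it only catches email-like strings.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different copies match."""
    return " ".join(text.lower().split())


def find_duplicates(docs: list[str]) -> list[tuple[int, int]]:
    """Return (duplicate_index, first_seen_index) pairs.

    Only exact duplicates after normalization are detected.
    """
    seen: dict[str, int] = {}
    pairs: list[tuple[int, int]] = []
    for i, doc in enumerate(docs):
        digest = hashlib.sha256(normalize(doc).encode("utf-8")).hexdigest()
        if digest in seen:
            pairs.append((i, seen[digest]))
        else:
            seen[digest] = i
    return pairs


def find_emails(docs: list[str]) -> dict[int, list[str]]:
    """Map a document index to the email-like strings found in that document."""
    return {i: hits for i, doc in enumerate(docs) if (hits := EMAIL_RE.findall(doc))}


def keyword_search(docs: list[str], keyword: str) -> list[int]:
    """Return indices of documents containing the keyword, ignoring case."""
    needle = keyword.lower()
    return [i for i, doc in enumerate(docs) if needle in doc.lower()]
```

Hashing normalized text keeps duplicate detection linear in the number of documents, at the cost of missing copies that differ by even one word.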

Databricks executives have praised Lilac’s product for its ability to analyze model outputs for bias or toxicity and prepare data for LLMs. Lilac’s entire tech stack will be integrated into Databricks’ Mosaic AI tooling, giving developers better ways to curate datasets for custom generative AI systems. The integration is intended to make it easier for teams to evaluate and monitor the outputs of their LLMs, and to prepare datasets for retrieval-augmented generation (RAG), fine-tuning, and pre-training.

The acquisition represents a significant step for Databricks in providing end-to-end tooling for developing high-quality generative AI applications using customers’ own data. The platform already offers open models from various players, including Meta, Stability, and Mistral, as well as dedicated Mosaic tools for experimentation and customization. Databricks’ major competitor, Snowflake, is moving in the same direction with Cortex, a fully managed service that helps customers build apps using powerful open models.

With this acquisition, Databricks aims to enable businesses to have more visibility and control over their unstructured data, ultimately unlocking the potential of generative AI for enterprise developers. The integration of Lilac’s technology into the Databricks platform will simplify the data curation experience and empower developers to create customizable AI products with just a few clicks.

As Databricks works to become a leader in generative AI, the Lilac acquisition adds data curation to the set of data intelligence and AI development tools it offers customers.