Databricks, a leading data ecosystem company, recently held its annual Data + AI Summit, where it highlighted the growing importance of AI in the data industry. CEO Ali Ghodsi announced several innovations aimed at helping teams get more from their data assets on the Databricks Data Intelligence Platform.
One of the major announcements at the summit was the open-sourcing of Databricks’ Unity Catalog. This move allows other companies to use the underlying architecture and code to set up their own catalogs supporting data in any format. The catalog interoperates with major cloud platforms and compute engines, giving teams greater flexibility in data management.
Another significant update was the upgrade to Mosaic AI, Databricks’ suite of tools for building AI applications. The new Mosaic AI Model Training product, AI Agent Framework, Evaluation framework, AI Tools Catalog, and AI Gateway are all aimed at building trusted, production-grade compound AI systems. These offerings are now available in public preview, allowing teams to experiment with the latest features.
Databricks also introduced a new text-to-image generative AI model called Shutterstock ImageAI, which generates high-quality images for business use cases. It was pre-trained using Shutterstock’s trusted image collection and can be fine-tuned using the Mosaic AI platform. The model is integrated via API, so organizations can incorporate the generated images into their applications.
To democratize access to analytics and insights, Databricks unveiled Databricks AI/BI, a compound AI system that uses AI agents to answer business questions in natural language and to generate visualizations. Each agent specializes in a specific task, such as SQL generation or visualization. AI/BI is available to all Databricks SQL Pro and Serverless customers; its Dashboards component is generally available, and Genie is in public preview.
In addition to AI/BI, Databricks introduced LakeFlow, a unified experience for data engineering. LakeFlow covers data ingestion, transformation, and orchestration, and it automates pipeline deployment and operation. It also supports CI/CD and quality checks at scale, which reduces the manual work involved in running pipelines.
Databricks also announced partnerships with Nvidia and Gretel. The partnership with Nvidia focuses on adding native support for CUDA-accelerated computing in Databricks’ query engine, Photon, to improve speed and efficiency. The collaboration with Gretel allows Databricks to provide high-quality synthetic datasets for building and customizing machine learning models on its platform.
Overall, the summit’s announcements and partnerships show Databricks’ commitment to advancing AI in the data industry and to giving teams the tools they need to use data and AI effectively.