
Elastic Introduces Search AI Lake for Scalable Data Storage and Querying

Elastic, the company known for its Elasticsearch search technology, is expanding its use cases to include observability, security, and generative AI workflows. In May 2023, it launched the Elasticsearch Relevance Engine (ESRE), which combines vector search with traditional search to improve results. However, storage for Elastic deployments has often been coupled with compute, creating scalability barriers. To address this, Elastic is introducing Search AI Lake, a technology that decouples storage and compute to allow scalable data volumes and fast query performance. The technology supports regular data types as well as vectors for generative AI. Alongside Search AI Lake, Elastic is launching new serverless offerings for enterprise search, observability, and security, all built on top of the Search AI Lake. These offerings provide specialized user interfaces for each use case and are currently in tech preview.

Ash Kulkarni, CEO of Elastic, explained that decoupling storage, ingestion, and querying in the Search AI Lake enables a flexible serverless architecture. With Amazon S3 as the primary storage layer, Elastic can build a cloud-native architecture that scales with data volume. The concept of using a data lake as a repository for a database is not new; multiple vendors offer data lake and data lakehouse architectures. Elastic differentiates the Search AI Lake by bringing search capabilities to the data lake, allowing real-time exploration and querying without predefined schemas. The decoupled architecture also ensures rapid query performance and scalability. Additionally, the Search AI Lake has native support for dense vectors and features like hybrid search, faceted search, and relevance ranking that are crucial for generative AI and Retrieval Augmented Generation (RAG) applications.
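To make the hybrid search idea concrete, here is a minimal sketch of the kind of request body it implies: a lexical match query combined with a dense-vector kNN clause, in the shape Elasticsearch 8.x accepts in its `_search` API. The field names (`content`, `content_embedding`) and the query vector are illustrative placeholders, not details from the article.

```python
def build_hybrid_search(text_query, query_vector, k=10):
    """Build an Elasticsearch _search body mixing lexical and vector retrieval.

    The lexical `match` clause scores documents with BM25, while the top-level
    `knn` clause runs approximate nearest-neighbour search over a dense_vector
    field; Elasticsearch blends both result sets when ranking.
    """
    return {
        "query": {
            "match": {"content": text_query}  # traditional full-text relevance
        },
        "knn": {
            "field": "content_embedding",     # dense_vector field (placeholder name)
            "query_vector": query_vector,
            "k": k,
            "num_candidates": k * 10,         # wider candidate pool improves recall
        },
        "size": k,
    }

body = build_hybrid_search("decoupled storage and compute", [0.1, 0.2, 0.3])
```

In a live deployment this dictionary would be passed to the search endpoint of an index whose mapping declares `content_embedding` as a `dense_vector` field.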

Unlike other data lake and data lakehouse vendors that rely on table formats like Apache Iceberg or Apache Hudi, Elastic's Search AI Lake does not depend on any specific table format. Exploring data in a data lake has always been a challenge, which is why many vendors lean on table formats to facilitate data exploration. Elastic takes a different approach by making everything in the Search AI Lake searchable through Elasticsearch's ad-hoc exploration capabilities. The Search AI Lake uses the Elastic Common Schema (ECS) as its open format for storing data. ECS was recently contributed to the Cloud Native Computing Foundation (CNCF), part of the Linux Foundation, to become an open standard schema for observability and security in the cloud.
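For readers unfamiliar with ECS, the schema defines a common set of field names and nestings for log and security events. The sketch below builds an event using a few core ECS fields; the specific values (and the ECS version string) are illustrative, not taken from the article.

```python
from datetime import datetime, timezone

def make_ecs_event(message, source_ip):
    """Build a log event using a handful of core Elastic Common Schema fields.

    ECS standardises names such as `@timestamp`, `message`, `event.category`,
    and `source.ip`, so tools from different vendors can query the same data
    without per-source mappings.
    """
    return {
        "@timestamp": datetime.now(timezone.utc).isoformat(),
        "message": message,
        "event": {
            "kind": "event",
            "category": ["network"],   # one of the ECS allowed category values
        },
        "source": {"ip": source_ip},
        "ecs": {"version": "8.11"},    # illustrative schema version
    }

event = make_ecs_event("connection opened", "10.0.0.1")
```

Because every producer emits the same field names, an ad-hoc query like `source.ip: 10.0.0.1` works uniformly across firewall logs, application logs, and audit trails.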

The Search AI Lake is designed to enable generative AI and vector search use cases. Elastic has seen significant customer adoption of its platform for RAG applications, adding hundreds of new customers in recent quarters. With the Search AI Lake, Elastic aims to provide the most scalable vector implementation available. By decoupling storage and compute, Elastic's Search AI Lake technology offers a flexible and scalable solution for organizations looking to leverage AI and search capabilities in their data lakes.
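The RAG pattern referenced throughout the article reduces to two steps: retrieve relevant passages, then splice them into the prompt sent to a generative model. The sketch below shows that flow with a toy word-overlap retriever; the functions and corpus are stand-ins for illustration, not Elastic APIs, and a real deployment would use vector search against the lake plus an LLM for generation.

```python
def retrieve(query, corpus, top_k=2):
    """Toy retriever: rank documents by word overlap with the query."""
    q_words = set(query.lower().split())
    scored = sorted(corpus, key=lambda d: -len(q_words & set(d.lower().split())))
    return scored[:top_k]

def augment_prompt(query, passages):
    """Splice retrieved passages into the prompt for the generator model."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Illustrative two-document corpus standing in for an indexed data lake.
corpus = [
    "Search AI Lake decouples storage and compute.",
    "ECS is an open schema for observability data.",
]
prompt = augment_prompt(
    "What does Search AI Lake decouple?",
    retrieve("storage and compute", corpus),
)
```

Grounding the model's answer in retrieved passages, rather than its training data alone, is what makes the quality of the retrieval layer, and hence the vector search underneath it, central to RAG applications.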