Snowflake Launches Polaris Catalog, an Open Data Catalog for Apache Iceberg

Snowflake has kicked off its annual Data Cloud Summit with the launch of Polaris Catalog, a new open data catalog that indexes and organizes data stored in the Apache Iceberg table format. The catalog will be open-sourced in the next 90 days and will interoperate with other query engines, so enterprises can mix and match engines without being locked into a single vendor.

Addressing the Need for Interoperability

As adoption of open table formats such as Delta Lake and Apache Iceberg has grown, so has the need for interoperability among enterprises. They want to pair their data catalogs with whichever engines they choose, run queries against the data, and deliver answers to downstream users. Snowflake’s EVP of Product, Christian Kleinerman, noted that customers have raised concerns about the tight coupling between closed-source catalogs and formats, which can lead to vendor lock-in.

Introducing the Polaris Catalog

To address these concerns and reinforce its commitment to Apache Iceberg, Snowflake has launched the Polaris Catalog. The catalog is based on Iceberg’s open-source REST protocol, which gives users an open standard for accessing and retrieving data from any engine that supports the Iceberg REST API. Supported engines and clients include Apache Flink, Apache Spark, Dremio, Python, Trino, and more.
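
As a rough illustration of what talking to an Iceberg REST catalog involves, the sketch below builds the request URLs a client would use. The route shapes follow the Apache Iceberg REST catalog specification; the host name, prefix, namespace, and table names are placeholders chosen for this example, not real Polaris endpoints, and no network request is made.

```python
from urllib.parse import quote

# In Iceberg REST catalog paths, multi-level namespaces are joined with
# the unit separator character (0x1F) and then URL-encoded.
NAMESPACE_SEPARATOR = "\x1f"


def encode_namespace(levels):
    """Encode namespace levels such as ["sales", "eu"] as one path segment."""
    return quote(NAMESPACE_SEPARATOR.join(levels), safe="")


def config_url(base_url):
    """URL a client typically calls first to fetch catalog configuration."""
    return f"{base_url.rstrip('/')}/v1/config"


def list_tables_url(base_url, prefix, namespace):
    """URL for listing the tables in a namespace."""
    root = f"{base_url.rstrip('/')}/v1/{prefix.strip('/')}"
    return f"{root}/namespaces/{encode_namespace(namespace)}/tables"


def load_table_url(base_url, prefix, namespace, table):
    """URL for loading the metadata of a single table."""
    tables = list_tables_url(base_url, prefix, namespace)
    return f"{tables}/{quote(table, safe='')}"


# Placeholder values (assumptions for illustration only).
BASE = "https://catalog.example.com/api"
print(config_url(BASE))
print(load_table_url(BASE, "my_catalog", ["sales"], "orders"))
```

Because every compliant engine issues requests of this shape, the catalog does not need to know which engine is on the other end.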

Flexibility in Hosting Options

Enterprises can host Polaris on the Snowflake data cloud or self-host it on their own infrastructure using container tooling such as Docker or Kubernetes. In either case the catalog’s backend implementation is open source, so enterprises can swap the hosting infrastructure freely without concerns about vendor lock-in.
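
One way to picture the hosting swap from the engine’s side is the minimal sketch below, which generates the Apache Spark properties that register an Iceberg REST catalog. The property keys are the ones Iceberg’s Spark integration uses for REST catalogs; the catalog name, both endpoint URLs, and the warehouse name are hypothetical, and authentication settings are left out for brevity.

```python
def spark_rest_catalog_conf(catalog_name, uri, warehouse):
    """Return Spark properties that register an Iceberg REST catalog."""
    key = f"spark.sql.catalog.{catalog_name}"
    return {
        key: "org.apache.iceberg.spark.SparkCatalog",
        f"{key}.type": "rest",
        f"{key}.uri": uri,
        f"{key}.warehouse": warehouse,
    }


# Hypothetical endpoints: one vendor-hosted, one self-hosted in a container.
hosted = spark_rest_catalog_conf(
    "polaris", "https://hosted.example.com/api/catalog", "analytics"
)
self_hosted = spark_rest_catalog_conf(
    "polaris", "http://polaris.internal.example:8181/api/catalog", "analytics"
)

# Only the endpoint differs between the two deployments.
changed = sorted(k for k in hosted if hosted[k] != self_hosted[k])
print(changed)  # ['spark.sql.catalog.polaris.uri']
```

Under these assumptions, moving between hosting options is a one-line configuration change for the engine, since the protocol on the wire stays the same.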

Ensuring Security and Permissions Across Engines

Snowflake is actively building out security for the Polaris Catalog. The goal is to enforce the same permissions and security entitlements across different engines, which has been a challenge for many catalog and interoperability efforts. The company is collaborating with partners to align the interface with the community and address these security concerns.

Preview Coming in June

Snowflake plans to make the Polaris Catalog available in preview to its first enterprise customers later in June. Leading enterprises, including Amazon Web Services (AWS), Confluent, Dremio, Google Cloud, Microsoft Azure, and Salesforce, have already expressed support for the effort. These open technologies provide the ecosystem interoperability and choice that customers deserve.

About the Snowflake Data Cloud Summit

The Snowflake Data Cloud Summit runs from June 3 to June 6, 2024. The event brings industry leaders together to discuss the latest advancements in data management and analytics.