Snowflake, a leading data platform, has announced the general availability of cross-region inference for Cortex AI. The feature spares organizations from waiting for a language model to launch in their own region: with cross-region inference enabled, Cortex AI can process requests in another region whenever the requested model is not yet available in the source region. Organizations thereby gain earlier access to the latest large language models (LLMs), a meaningful competitive advantage that lets them innovate more quickly.
Regional availability of LLMs has been a persistent challenge: providers roll models out unevenly, owing to resource constraints, a Western-centric bias in launch priorities, and multilingual barriers. Organizations have had to wait until a model reached their region, potentially falling behind competitors. Cross-region inference removes this obstacle, letting organizations integrate new LLMs as soon as they are released, regardless of where they are hosted.
Developers can enable cross-region inference on Cortex AI with a single line of code. The feature supports data traversal between regions both within the same cloud provider (such as Amazon Web Services) and across different providers. When the source and target regions run on the same cloud provider, data stays on that provider's global network and is automatically encrypted at the physical layer. When the regions sit on different cloud providers, traffic traverses the public internet over mutual TLS (mTLS)-encrypted transport.
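As a minimal sketch of that one-line enabling step, assuming the CORTEX_ENABLED_CROSS_REGION account parameter from Snowflake's documentation, where 'ANY_REGION' allows Cortex AI to route to any region offering the model:

```sql
-- Allow Cortex AI to route inference to any region where the
-- requested model is available (account-level setting).
ALTER ACCOUNT SET CORTEX_ENABLED_CROSS_REGION = 'ANY_REGION';

-- Optional: verify the current setting.
SHOW PARAMETERS LIKE 'CORTEX_ENABLED_CROSS_REGION' IN ACCOUNT;
```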
To keep inference and response generation within the secure Snowflake perimeter, users configure where inference may run by setting an account-level parameter. Cortex AI then automatically selects a target region for processing whenever the requested LLM is unavailable in the source region. At launch, target regions can only be in AWS, so if cross-region inference is enabled on an account running in Azure or Google Cloud, requests will still be processed in AWS.
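A hedged sketch of scoping the target to AWS and then issuing a Cortex request follows; the 'AWS_US' value is one documented option, and the model name is illustrative:

```sql
-- Restrict cross-region processing to AWS US regions
-- ('AWS_US' is one documented value; adjust to your geography).
ALTER ACCOUNT SET CORTEX_ENABLED_CROSS_REGION = 'AWS_US';

-- If the requested model is not available in the source region,
-- this call is processed in a permitted AWS region instead.
SELECT SNOWFLAKE.CORTEX.COMPLETE(
    'llama3.1-405b',  -- illustrative model name
    'Summarize cross-region inference in one sentence.'
);
```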
Arun Agarwal, who leads AI product marketing at Snowflake, emphasizes how simple the feature is to adopt, since it requires only that single line of code. Usage is billed in credits for the LLM as consumed in the source region, not the target region. Cross-region calls do add round-trip latency, but Snowflake expects it to be negligible compared with the latency of LLM inference itself.
Overall, cross-region inference is a notable advancement for Cortex AI, giving organizations faster access to new LLMs as they become available. By removing the regional availability barrier, Snowflake lets organizations put AI technologies to work sooner and keep pace in a rapidly evolving landscape.