
Nvidia and Hugging Face Partner to Bring Inference-as-a-Service to Developers

Introducing Inference-as-a-Service Powered by Nvidia NIM Microservices

During Nvidia CEO Jensen Huang’s talk at the SIGGRAPH computer graphics conference, Hugging Face and Nvidia announced a partnership to offer developers inference-as-a-service powered by Nvidia NIM microservices. The new service aims to bring up to five times better token efficiency with popular AI models to millions of developers, along with immediate access to NIM microservices running on Nvidia DGX Cloud.

Empowering Developers with Easy Access to Nvidia-Accelerated Inference

The Hugging Face platform, with its community of four million developers, now gives those developers easy access to Nvidia-accelerated inference on many of the most popular AI models. The integration lets them prototype with open-source AI models hosted on the Hugging Face Hub and deploy them in production.

Kari Briski, Nvidia’s vice president of generative AI software product management, emphasized the importance of making generative AI more accessible for developers. She noted that developers need simple ways to work with APIs, prototype models, and test how they perform inside their applications. By providing serverless inference with performance optimized through Nvidia NIM, the new inference-as-a-service capabilities address these needs.
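As a rough illustration of what serverless inference on the Hub can look like in practice, the sketch below calls a hosted Llama 3 model through the huggingface_hub Python client. The model ID and token are placeholders, and the assumption that this particular model is served through the new NIM-backed service is ours, not something the announcement confirms.

```python
from huggingface_hub import InferenceClient

# Minimal sketch: a serverless chat completion against a model hosted on the
# Hugging Face Hub. Model ID and token are placeholders; whether a given model
# is backed by the Nvidia NIM service is an assumption for illustration only.
client = InferenceClient(
    model="meta-llama/Meta-Llama-3-70B-Instruct",
    token="hf_xxx",  # your Hugging Face access token
)

response = client.chat_completion(
    messages=[{"role": "user", "content": "Summarize what NIM microservices do."}],
    max_tokens=200,
)
print(response.choices[0].message.content)
```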

Streamlining AI Development with Train on DGX Cloud

Nvidia’s Train on DGX Cloud, an AI training service already available on Hugging Face, complements the new inference-as-a-service offering. Together, these tools give developers a single hub where they can compare open-source models and experiment with cutting-edge ones on Nvidia-accelerated infrastructure. The “Train” and “Deploy” drop-down menus on Hugging Face model cards simplify the process, letting users get started with just a few clicks.

Unlocking Efficiency and Performance with Nvidia NIM

Nvidia NIM is a collection of AI microservices optimized for inference and exposed through industry-standard APIs. The microservices, which include Nvidia AI foundation models and open-source community models, process tokens more efficiently and make better use of the underlying Nvidia DGX Cloud infrastructure, resulting in faster AI applications.
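Because NIM microservices follow industry-standard, OpenAI-compatible request formats for chat and completion workloads, existing client code can usually be pointed at a NIM endpoint with little more than a base-URL change. The sketch below assumes a NIM container serving Llama 3 70B on a local port; the endpoint URL, model name, and API key are placeholders, not details from the announcement.

```python
from openai import OpenAI

# Minimal sketch: calling a NIM microservice through its OpenAI-compatible API.
# The base URL, model name, and API key are placeholders for an assumed local
# deployment, not values confirmed by the partnership announcement.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-used")

completion = client.chat.completions.create(
    model="meta/llama3-70b-instruct",
    messages=[{"role": "user", "content": "Explain token throughput in one sentence."}],
    max_tokens=128,
)
print(completion.choices[0].message.content)
```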

Developers can expect faster and more robust results when accessing an AI model as a NIM compared to other versions of the model. For example, the 70-billion-parameter version of Llama 3 delivers up to five times higher throughput when accessed as a NIM on Nvidia H100 Tensor Core GPU-powered systems.
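Throughput claims like this are typically expressed in generated tokens per second. The sketch below shows one hypothetical way to estimate that figure against an OpenAI-compatible endpoint; serious benchmarks would issue concurrent requests rather than this simple sequential loop, and the endpoint and model name are again placeholders.

```python
import time
from openai import OpenAI

# Hypothetical throughput estimate: generated tokens per second over a small
# batch of sequential requests to an assumed OpenAI-compatible endpoint.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-used")
prompts = ["Write a haiku about GPUs."] * 8

start = time.perf_counter()
total_completion_tokens = 0
for prompt in prompts:
    out = client.chat.completions.create(
        model="meta/llama3-70b-instruct",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=128,
    )
    total_completion_tokens += out.usage.completion_tokens

elapsed = time.perf_counter() - start
print(f"{total_completion_tokens / elapsed:.1f} generated tokens/sec")
```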

Nvidia DGX Cloud: Purpose-Built for Generative AI

The Nvidia DGX Cloud platform is specifically designed for generative AI. It provides developers with easy access to reliable accelerated computing infrastructure, enabling them to bring production-ready applications to market faster. With scalable GPU resources, developers can seamlessly transition from prototype to production without long-term infrastructure commitments.

OpenUSD and the Next Evolution of AI

Nvidia is also bringing generative AI models and NIM microservices to the OpenUSD framework. This integration aims to accelerate developers’ abilities to build highly accurate virtual worlds for the next evolution of AI. By leveraging OpenUSD in metaverse-like industrial applications, Nvidia is pushing the boundaries of AI development.

In conclusion, the partnership between Hugging Face and Nvidia brings exciting opportunities for developers. The inference-as-a-service powered by Nvidia NIM microservices offers easy access to optimized compute resources, enabling developers to experiment with the latest AI models in an enterprise-grade environment. With the combination of Hugging Face’s platform and Nvidia’s DGX Cloud infrastructure, developers can unlock the full potential of generative AI and accelerate their AI development journey.
