Azure previews scalable VMs using Nvidia H100 GPUs

Microsoft Azure on Monday previewed scalable virtual machines designed to speed up generative AI workloads, built on Nvidia H100 Tensor Core GPUs.

The new ND H100 v5 virtual machine will let customers scale on-demand AI work from eight to thousands of Nvidia H100 GPUs, Azure said in a blog post.

The GPUs will be interconnected with Nvidia Quantum-2 InfiniBand networking. Azure said the ND H100 v5 will eventually become a standard offering in its lineup, but did not specify when.
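Since Azure has not said when the VMs will be generally available, one practical first step is checking which VM sizes a subscription can already see. Below is a minimal sketch using the Azure Python SDK (azure-identity and azure-mgmt-compute); the region and the "H100" name filter are assumptions, since Azure's post does not give the exact VM size name.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.compute import ComputeManagementClient

subscription_id = "<your-subscription-id>"  # placeholder

client = ComputeManagementClient(DefaultAzureCredential(), subscription_id)

# List VM sizes in a region and filter for H100-class names; the
# substring match is illustrative, since the article does not give
# the exact size (SKU) name for ND H100 v5.
for size in client.virtual_machine_sizes.list(location="eastus"):
    if "H100" in size.name:
        print(size.name, size.number_of_cores, f"{size.memory_in_mb} MB")
```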

Nvidia GPUs have been behind ChatGPT: roughly 10,000 of them were used to train the model, according to earlier reports, which Microsoft also discussed in a separate blog post on Monday.

Azure said customers will see significantly faster performance for AI models than on the prior-generation ND A100 v4 VMs. In addition to the interconnected Nvidia H100 GPUs, each VM will offer Quantum-2 CX7 InfiniBand networking at 400 Gb/s per GPU, which works out to 3.2 Tb/s per eight-GPU VM, along with 4th Gen Intel Xeon Scalable processors.
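To make those networking numbers concrete, here is a quick back-of-envelope sketch; only the 400 Gb/s per GPU and eight GPUs per VM come from Azure's announcement, and the 2,000-GPU cluster is a hypothetical example.

```python
# Back-of-envelope check of the interconnect figures above: 400 Gb/s
# of InfiniBand per GPU and eight GPUs per VM. The 2,000-GPU cluster
# is a hypothetical example, not a figure from Azure.
GBPS_PER_GPU = 400
GPUS_PER_VM = 8

per_vm_gbps = GBPS_PER_GPU * GPUS_PER_VM           # 3,200 Gb/s = 3.2 Tb/s
print(f"Per VM: {per_vm_gbps / 1000:.1f} Tb/s")

total_gpus = 2000                                  # hypothetical deployment
aggregate_tbps = (total_gpus * GBPS_PER_GPU) / 1000
print(f"{total_gpus} GPUs: {aggregate_tbps:.0f} Tb/s aggregate InfiniBand")
```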

Nvidia has said its H100 Tensor Core GPU offers up to 9x faster AI training on the largest models compared with the A100, and up to 30x faster AI inference.

“AI is rapidly becoming a pervasive component of software and how we interact with it,” Azure said. “For Microsoft and organizations like Inflection, Nvidia and OpenAI that have committed to large-scale deployments, this offering will enable a new class of large-scale AI models.”

RELATED: Update: ChatGPT runs 10K Nvidia training GPUs with potential for thousands more