Microsoft Azure extols virtues of Nvidia A100 chips for AI work

nvidia a100 gpu
Nvidia's A100 GPUs are catching on with high performance computing, and Microsoft Azure announced how it will use them for heavy-duty AI work. (Nvidia)

Microsoft Azure on Wednesday announced its first use of the Nvidia A100 GPU, appearing in a virtual machine series that offers up to 20 times more performance for artificial intelligence supercomputing work.

AI brawn has led to recent innovations, including natural language generation with enough creativity to write poetry.  At Microsoft, natural language models already power various tasks in Bing, Word, Outlook and Dynamics.

In a blog, Microsoft called the new ND A100 v4 VM series “our most powerful and massively scalable AI VM, available on-demand from eight to thousands of interconnected Nvidia GPUs across hundreds of virtual machines.”

Free Daily Newsletter

Interesting read? Subscribe to FierceElectronics!

The electronics industry remains in flux as constant innovation fuels market trends. FierceElectronics subscribers rely on our suite of newsletters as their must-read source for the latest news, developments and predictions impacting their world. Sign up today to get electronics news and updates delivered to your inbox and read on the go.

Google Cloud made A100 available for customers in  early July and more than 50 A100-powered servers were announced in June from ASUS, Cisco, Dell Technologies, HP Enterprise and others.

Not to be outdone, Azure said its new A100 v4 clusters can scale to thousands of GPUs with an “unprecedented 1.6 Tb/s of interconnect bandwidth per virtual machine.” That means that thousands of GPUs can work together as part of a Mellanox Infiniband HDR cluster “to achieve any level of AI ambition,” exclaimed Ian Finder, senior program manager for accelerated HPC Infrastructure at Azure. (Editor’s Note: Any level, really?)

Most customers will see a boost of up to 3x compute performance over the previous system, Finder added. A boost of up to 20x is possible with new A100 features like multi-precision Tensore Cores with sparsity acceleration and Multi-Instance GPU (MIG).

He noted that the advantage of using such large models is that they need to be trained once with massive amounts of data using AI supercomputing, then fine-tuned for different domains with smaller datasets.  Such an approach can be used in natural language generation so that a model has the ability to understand language well enough to summarize a document or answer questions from it when seen for the first time, Finder said.

Azure has been supporting OpenAI its quest to build safe artificial general intelligence. In May, OpenAI showed the ability to use AI in a GPT-3 model to support tasks not specially trained for, such as writing poetry or translation. GPT-3 has even been shown to generate samples of news articles which human evaluators have difficulty distinguishing from articles written by humans.  (NOTE: This article was written by a human.)

OpenAI CEO Sam Altman said the AI quest toward general intelligence requires powerful systems that train more capable models. “The computing capability required was just not possible until recently,” he said in a statement, crediting Azure AI with helping accelerate OpenAI’s progress.

Azure’s ND A100 v4 VM series and clusters are in preview and will become a standard offering, although no timeline was announced.

RELATED: Nvidia names 12 companies to offer servers with A100 GPUs 

Suggested Articles

In June, Su stood out for heartfelt commentary on social injustice after the killing of George Floyd. She challenges industry to do more.

New chip will be incorporated in coffee mug-sized device to cost $199 next year

In a nutshell, Edge ML is a technique by which smart devices process data locally using machine and deep learning algorithms.