Nvidia breaks AI speed records again with shipping chips

Continental is relying on a supercomputer powered by Nvidia's A100 GPUs for driving simulations in the creation of autonomous driving tech. Nvidia set more benchmark records with its A100. (Continental)

Nvidia broke 16 AI performance records with its A100 GPUs and DGX supercomputers in the latest round of MLPerf benchmark tests, the results of which were revealed Wednesday.

On eight different benchmarks, the A100 Tensor Core GPU performed the fastest, while a massive cluster of more than 2,000 A100 systems in Nvidia’s DGX SuperPOD system, connected with HDR InfiniBand, also set eight performance milestones.

Previously, Nvidia set six benchmark records in December 2018 and eight in July 2019. MLPerf, created in 2018, is an industry benchmarking consortium of more than 70 companies and researchers. In this latest round, Nvidia provided the only commercially available products for testing. The full MLPerf spreadsheet of results is available online, along with a separate short press release.

Nvidia has been touting its commercially available AI capabilities in recent months. On Tuesday, auto supplier Continental said that since early 2020 it has been using more than 50 networked Nvidia DGX units for the simulation and deep learning work required to develop assisted, automated and autonomous driving. The simulations reduce the need for actual road tests. Continental called it the “fastest supercomputer in the auto industry” based on the TOP500 supercomputer list.

Independent analyst Karl Freund at Moor Insights & Strategy noted via email that no Nvidia competitor is publishing its benchmark results, at least not yet. “Nobody can compete with Nvidia, even after two to three years of expectations,” he said.

In the MLPerf benchmarks, a supercomputer or GPU (such as Nvidia’s A100) is judged by how fast it can train different models to a set quality metric. Image classification, object detection, translation, recommendation and reinforcement learning are among the training tasks. For reinforcement learning in the latest benchmark round, a full-size MiniGo game was used. Under the rules, AI was used to train software agents to rival humans at playing the game with a 50% win rate.
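The time-to-target idea described above can be sketched in a few lines of Python. This is only an illustration and not MLPerf's actual harness: the function names, the toy "trainer" and the 0.5 quality target are all assumptions made up for the example.

```python
import time
from typing import Callable, Tuple


def time_to_target(train_one_epoch: Callable[[], float],
                   target: float,
                   max_epochs: int = 100) -> Tuple[int, float]:
    """Train until the quality metric reaches `target`.

    Returns (epochs used, elapsed wall-clock seconds).
    """
    start = time.perf_counter()
    for epoch in range(1, max_epochs + 1):
        quality = train_one_epoch()
        if quality >= target:
            return epoch, time.perf_counter() - start
    raise RuntimeError("target quality not reached within max_epochs")


def make_toy_trainer(gain: float = 0.1) -> Callable[[], float]:
    """Hypothetical stand-in for a real training step.

    Each call closes a fixed fraction of the gap to a perfect score of 1.0.
    """
    quality = 0.0

    def train_one_epoch() -> float:
        nonlocal quality
        quality += gain * (1.0 - quality)
        return quality

    return train_one_epoch


epochs, seconds = time_to_target(make_toy_trainer(), target=0.5)
print(f"reached target after {epochs} epochs in {seconds:.6f} s")
```

The score that matters in this kind of test is the elapsed time to the target, not the final quality, so two systems are compared on how quickly they reach the same bar.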

Reinforcement learning requires an AI program both to learn from its experience through inference and to train itself for future moves and games. The AI generates training data through exploration instead of relying on a predetermined data set. The training uses self-play between agents to generate data. The latest MLPerf benchmark used a full-sized Go board, which increased complexity.
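To make the self-play idea concrete, here is a minimal Python sketch in which two agents generate their own training records by playing each other. It is not MiniGo: as an assumption for brevity, it swaps in a tiny stone-taking game (take one to three stones, whoever takes the last stone wins) and a purely random policy, and every name in it is hypothetical.

```python
import random
from typing import List, Tuple

# (stones before the move, stones taken, +1 if the mover went on to win, else -1)
Record = Tuple[int, int, int]


def self_play_game(rng: random.Random, start_stones: int = 10) -> List[Record]:
    """Play one game between two random agents and label every move with the outcome."""
    stones = start_stones
    player = 0
    history = []  # (player, stones before the move, stones taken)
    while stones > 0:
        move = rng.randint(1, min(3, stones))
        history.append((player, stones, move))
        stones -= move
        player = 1 - player
    winner = history[-1][0]  # the player who took the last stone
    return [(s, m, 1 if p == winner else -1) for p, s, m in history]


def generate_dataset(num_games: int, seed: int = 0) -> List[Record]:
    """Build a training set from self-play alone, with no pre-existing data."""
    rng = random.Random(seed)
    data: List[Record] = []
    for _ in range(num_games):
        data.extend(self_play_game(rng))
    return data


dataset = generate_dataset(num_games=100)
print(f"{len(dataset)} labeled moves generated from self-play")
```

In a real system the random policy would be replaced by the current model, and the labeled moves would be fed back into training, which is why inference and training alternate so heavily in this kind of workload.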

Using reinforcement learning, engineers can create a broad array of applications for robotics and optimization tasks. In one example, an industrial robot can be trained to work alongside humans in a factory or other setting. In such a case, there is no pre-existing data set on which to train an AI robot, so the robot learns from its various encounters, perhaps even training from a video simulation of future work. In similar fashion, reinforcement learning can be applied to optimizing dozens or hundreds of traffic signals working as a system to lessen congestion.

“With reinforcement learning with MiniGo, there was a lot of inference and training going on back and forth between the two,” said Paresh Kharya, senior director of product management at Nvidia, during an online interview with reporters. He wrote a separate blog post describing the MLPerf results.

Kharya said Nvidia was able to complete every benchmark in under 18 minutes, and that its MiniGo result was 3.5 times faster than before. Nvidia’s A100 chips can be used for both inference and training work, and in the MiniGo example one group of A100s was used to handle training tasks while another group of the chips handled inference.

“Delivering exceptional performance on AI is really hard,” Kharya said. Successful AI work requires not only custom silicon but also a broad ecosystem of software and components. “Nvidia has been investing billions and working on this for almost a decade.”

In the latest benchmark, no other companies submitted commercially available chips. Other chips in the round included Huawei Ascend and Google TPUv3. The latest MLPerf round added two new tests, for recommendation systems and for conversational AI using BERT, a neural network model.

Nvidia used the MLPerf results as an opportunity to mention customer use cases, including Alibaba, which used Nvidia GPUs to support its recommendation system for $38 billion in sales on Singles Day last November.

Only nine companies submitted results in this MLPerf round, and seven used Nvidia GPUs, including Alibaba Cloud, Google Cloud, Tencent Cloud and server makers Dell, Fujitsu and Inspur. Separately, Google Cloud, Intel and the Shenzhen Institutes of Advanced Technology offered submissions.

Nvidia’s performance on the MLPerf benchmarks is important, said Freund, the senior analyst at Moor Insights & Strategy. “These benchmarks demonstrate to all buyers that Nvidia is the fastest and also the fastest at every popular AI task, which should help shorten purchase cycle time,” he said. “Cost is important, but developers need to train neural networks in hours, not days. As Facebook once said, there are three key buying criteria for AI accelerators: performance, performance and performance.”
