Cisco, Dell, HPE, Lenovo and at least eight other vendors will begin shipping more than 50 versions of servers with the new Nvidia A100 GPU this summer and fall, Nvidia announced Monday.
Adoption of the A100 is outpacing previous Nvidia GPUs used in servers, according to Ian Buck, general manager of accelerated computing at Nvidia.
Nvidia pitches the A100’s high performance and low total cost of ownership as ideal for AI, data science and scientific computing. The A100 is based on Nvidia’s Ampere architecture, introduced in May, and is designed to improve server performance by as much as 20 times over its predecessor.
Other vendors building servers on A100 include ASUS, Atos, Fujitsu, Gigabyte, Inspur, One Stop Systems, Quanta/QCT and Supermicro, Nvidia said.
A single A100 can be partitioned into as many as seven separate GPUs to handle jobs, and several GPUs can be ganged together to act as one big GPU.
Nvidia’s success with server vendors is an extension of the company’s vision of bringing accelerated computing to supercomputers and then of taking supercomputing outside the box.
“Supercomputing now includes connectivity to the edge, cloud, AI systems and other areas… that get tied together with the network,” said Zeus Kerravala, an analyst at ZK Research. “In such an environment, the network essentially becomes the backplane of a distributed computer and needs to be fast, ultra low latency, lossless and takes on the characteristics of what was once directly connected. This is why Nvidia bought Mellanox.”
Kerravala added, “This combination of networking plus computing creates an Nvidia system that can deliver the compute power to do things we could never do before.”
Karl Freund, an analyst at Moor Insights & Strategy, said the A100 “has extended Nvidia’s enormous lead” in mainstream datacenter accelerated computing. “A lot of people have been betting billions of venture cash on startups that are supposed to take Nvidia down, but it’s just not going to happen,” Freund said. Graphcore and Cerebras have created unique technology that will compete, but Nvidia’s lead with the V100 was never surpassed, except by Nvidia’s own A100, he added.
Nvidia also announced Monday that a PCIe form factor of the A100 will be available in addition to the four- and eight-way HGX A100 configurations launched in May. That means server makers can offer their customers a single A100 GPU system or a server with 10 or more GPUs.
HPE will support A100 PCIe GPUs in its ProLiant DL380 Gen10 Server, while Lenovo will support A100 PCIe GPUs in its ThinkSystem SR670 AI-ready server, according to Nvidia.
Nvidia made the announcement at the start of ISC 2020 Digital, a high performance computing event normally held in Frankfurt, Germany, that is being held online this year. Nvidia accelerates a majority of the world’s fastest computers, Paresh Kharya, director of accelerated computing for Nvidia, said in a call with reporters.
Nvidia also unveiled an AI platform with newly acquired Mellanox that is designed to detect data center security threats and operational problems and predict network failures.
Called Nvidia Mellanox UFM Cyber-AI, it is an extension of the UFM product used to manage InfiniBand systems for a decade. It relies on AI to learn how a data center workload functions, then to detect performance irregularities and make corrections.
Nvidia also touted its big data analytics, saying it was able to run a benchmark called TPCx-BB in just 14.5 minutes on 16 Nvidia DGX A100 systems, compared with the previous leading result of 4.7 hours on a CPU system. The DGX A100 systems had a combined 128 Nvidia A100 GPUs and used Nvidia Mellanox networking.
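For readers who want to check the arithmetic behind those benchmark figures, the short Python sketch below works out the implied speedup and the number of GPUs per system. It uses only the numbers reported above; the variable names are illustrative, and the comparison is between different hardware setups, so the ratio is a rough indicator rather than a like-for-like measurement.

```python
# Figures as reported by Nvidia for the TPCx-BB benchmark run.
gpu_run_minutes = 14.5   # 16 DGX A100 systems
cpu_run_hours = 4.7      # previous leading result on a CPU system
dgx_systems = 16
total_gpus = 128

# Convert the CPU result to minutes and compare run times.
speedup = (cpu_run_hours * 60) / gpu_run_minutes

# GPUs in each DGX A100 system, implied by the reported totals.
gpus_per_system = total_gpus // dgx_systems

print(f"Implied speedup: {speedup:.1f}x")
print(f"GPUs per DGX A100 system: {gpus_per_system}")
```

On the reported figures, the GPU run finishes roughly 19 times faster, with eight A100 GPUs in each DGX A100 system.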
In a blog post, Kharya described ways Nvidia’s scientific computing platform is being used to fight COVID-19, ranging from data analytics, simulation and visualization to AI and edge processing.
In one example of the use of Nvidia’s scientific computing platform, Kiwi is building wheeled robots to deliver medical supplies autonomously. Whiteboard Coordinator has also built an AI system to automatically measure and screen body temperatures at the rate of more than 2,000 healthcare workers an hour.
Whiteboard had the system installed in 22 hospitals as of mid-May after first starting at Northwestern Memorial in Lake Forest, Illinois.
Also in May, Nvidia announced new Clara AI models to study chest CT scans to measure infections.