Intel touts Gaudi 2 for AI as it confronts Nvidia GPUs

By Matt Hamblen Mar 12, 2024 1:43pm

As generative AI training and inference tasks consume the attention of data centers and cloud providers globally, Intel continues to take on Nvidia’s hefty leadership role in AI chips while keeping an eye on AMD.

Just days ahead of Nvidia’s GTC conference next week, Intel on Tuesday named three customers for its Gaudi 2 accelerator chip, some of which have seen better performance in generative AI workloads on Gaudi 2 than on the Nvidia GPUs they use. The comparisons build on Intel’s results in MLPerf benchmarks last year against Nvidia A100 and H100 GPUs.

The three customers – Stability.AI, AI Sweden and Prediction Guard – have been deploying Gaudi 2 accelerators, Intel said. Overall, Intel said Gaudi 2’s AI price-performance is up to 50% better than H100 GPUs on GPT-3 training tasks.

Stability.AI CEO Emad Mostaque said in a recorded testimonial presented to reporters that Gaudi 2 offered 3x better performance and a better total cost of ownership (TCO) than Nvidia A100. Also, it took the company less than a day to port its Multimodal Diffusion Transformer architecture model code from A100 and H100 to Gaudi 2. Intel said its message to developers is holistic: its Gaudi hardware and software are cost effective and offer better performance, but are also simpler to use.

Stability.AI, which provides open-access AI models to customers, posted its comparisons in a blog post, noting: “Our findings underscore the need for alternatives like the Gaudi 2, which not only offers superior performance to other 7nm chips, but also addresses critical market needs such as affordability, reduced lead times and superior price-performance ratios. Ultimately, the opportunity for choice in computing options broadens participation and innovation, thereby making advanced AI technologies more accessible to all.” https://stability.ai/news/putting-the-ai-supercomputer-to-work

Intel also named Prediction Guard as a Gaudi 2 customer, noting that its enterprise-focused service is designed to reduce the risks of generative AI and LLMs, particularly hallucinations. AI Sweden is also using Gaudi 2 to experiment with a variety of apps, including a Swedish language version of ChatGPT, Intel said. The government is experimenting with generative AI for document generation, email replies, evaluations, end user reviews and a chatbot for interactions with constituents.

While many of Intel’s marketing materials claim its Gaudi 2 is a “fraction of the cost” of Nvidia GPUs, Intel attempted to clarify. “A Gaudi 2 server is in the A100 ballpark, [although] there’s always variability,” said Eitan Medina, chief operating officer at Habana Labs, an Intel company. Gaudi 2 will provide three times the throughput and reduced capex and opex, compared to A100, he said.

One reason this improvement is possible is that Gaudi 2, unlike an Nvidia H100 GPU, has no graphics rendering capability, which allows “much better use of computer memory in the chip,” Medina said. “Designing from the ground up gives power performance advantage of Gaudi 2 compared to others.”

While customers might focus on the advertised speeds and feeds of the latest chips, Medina said they are mostly interested in the basics when porting to Gaudi 2. “The most important thing is to exist, [with] a mature software stack,” Medina said. “The idea is to have a credible alternative that is mature. We’re very encouraged from customers that it’s easy to use and port over.”

Gaudi 2, based on a 7nm node, will be followed later in 2024 by Gaudi 3, built on a 5nm node, Intel said. Gaudi 3 is expected to offer 4 times the compute performance of Gaudi 2, with twice the network bandwidth and 1.5 times the high bandwidth memory. Intel is already in the design phase of Gaudi 3’s successor, code-named Falcon Shores. Since Gaudi 2 was launched in 2022, Intel said it has seen an 8x increase in its customer pipeline.

While the competition with Nvidia and AMD focuses heavily on hardware designed to handle AI workloads, Intel argued it takes a system approach across software and hardware combined. “We will drive growth by designing purpose-built systems,” said Jeni Barovian, vice president of data center AI solutions strategy and product management.
