AMD targets generative AI, LLMs, and Nvidia with new accelerator

AMD for months has been teasing the market with hints as to how it would counter Nvidia’s growing dominance in AI acceleration, and at this week’s AMD data center event those hints became reality as the company announced a power-packed new GPU to handle the increasing demands of generative AI and large language model (LLM) workloads.

In unveiling the Instinct MI300X accelerator chip, a GPU-only product rather than an integrated CPU/GPU part, AMD CEO Lisa Su described it as “the world’s most advanced accelerator for generative AI.” That remains to be seen, but the GPU is built on the company’s CDNA 3 accelerator architecture and supports up to 192 GB of HBM3 memory to tame increasingly massive LLM workloads.

With the large memory of the Instinct MI300X, customers can run AI models of up to 80 billion parameters, Su said. To prove it, AMD demonstrated the Falcon-40B LLM, a 40-billion-parameter model, running on a single MI300X accelerator.

“What’s special about this demo is it’s the first time a large language model of this size has run entirely in memory on a single GPU. We’ve run a number of even larger models as well,” she said.

Explaining the broader implications, she added, “With more memory, less GPU is needed. What this means for cloud providers as well as enterprise users is we can run more inference jobs per GPU than you could before. And what that enables is for us to deploy the MI300X at scale to power next-gen LLMs with lower total cost of ownership than you see today, really making the technology much, much more accessible to the broader ecosystem.”
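
To put those memory figures in context, here is a rough back-of-the-envelope sketch (an illustration, not an AMD-supplied calculation) of why models at that scale can fit on one GPU, assuming 16-bit weights:

```python
# Rough estimate of the GPU memory needed just to hold model weights.
# Illustrative only: real deployments also need room for the KV cache,
# activations, and runtime overhead.

def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate weight footprint in GB; bytes_per_param=2 assumes FP16/BF16 weights."""
    return num_params * bytes_per_param / 1e9

for params in (40e9, 80e9):  # Falcon-40B scale and an ~80-billion-parameter model
    print(f"{params / 1e9:.0f}B params -> ~{weight_memory_gb(params):.0f} GB of weights")
```

By this estimate, a 40-billion-parameter model needs roughly 80 GB for its weights and an 80-billion-parameter model roughly 160 GB, both of which fall within the MI300X’s 192 GB of HBM3, which is why a single accelerator can hold such models without splitting them across GPUs.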

Furthermore, eight MI300X accelerators can be combined into what AMD called its new Instinct platform for high-powered AI inference and training. The MI300X is sampling to key customers starting in the third quarter, Su said. AMD also announced that the AMD Instinct MI300A APU Accelerator for HPC and AI workloads is now sampling to customers. “We expect both of these products to roll out in the fourth quarter of this year,” Su said.

In another swipe at Nvidia, AMD announced deeper integration between PyTorch and AMD’s ROCm software stack for data center accelerators, part of its push for a more open AI software ecosystem. This provides what AMD and the PyTorch Foundation called “day zero” support for PyTorch 2.0 with ROCm release 5.4.2 on all AMD Instinct accelerators. The integration gives developers a broad catalog of PyTorch-based AI models that run “out of the box” on AMD accelerators.
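
In practice, that “out of the box” claim means PyTorch code written for Nvidia GPUs runs unchanged on ROCm builds, which expose AMD devices through the familiar torch.cuda interface. A minimal sketch of verifying such a setup (a generic check, not an AMD-published procedure) might look like this:

```python
# Minimal check that a ROCm-enabled PyTorch build sees an AMD Instinct GPU.
import torch

print(torch.__version__)           # e.g. a 2.0.x build
print(torch.version.hip)           # populated on ROCm builds, None on CUDA builds
print(torch.cuda.is_available())   # True if an AMD GPU is visible to PyTorch

# Existing CUDA-style model code runs unchanged on the ROCm backend.
device = torch.device("cuda")
model = torch.nn.Linear(1024, 1024).to(device)
x = torch.randn(8, 1024, device=device)
print(model(x).shape)              # torch.Size([8, 1024])
```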

AMD’s AI maneuvers position the company to take advantage of what Su said will be a $150 billion total addressable market for data center AI accelerators by 2027. To those who think AMD is embarking on a late chase to catch Nvidia, Su noted that generative AI has reshaped data center demand over the last six months, adding, “We are still very, very early in the lifecycle of AI.”

Su’s announcement of the MI300X came near the end of a 90-minute presentation that earlier included videos and testimonials from customers and partners such as AWS, Microsoft Azure, Citadel Securities, and PyTorch. During that portion of the event, AMD drew back the curtain on a series of updates to its fourth-generation EPYC data center CPU family, saying that AWS will include the next-gen Genoa processors in its upcoming Amazon Elastic Compute Cloud (Amazon EC2) M7a instances. Separately, Oracle announced plans to offer new Oracle Cloud Infrastructure (OCI) E5 instances with fourth-generation EPYC processors.

AMD also introduced the fourth-generation AMD EPYC 97X4 processors, formerly codenamed “Bergamo.” They offer 128 “Zen 4c” cores per socket for greater vCPU density, performance, and energy efficiency.
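
As a rough illustration of what that core count means for vCPU density (an illustrative calculation assuming two hardware threads per core and a dual-socket server, not an AMD-published figure):

```python
# Hypothetical vCPU-density arithmetic for a Bergamo-based server,
# assuming SMT-2 (two hardware threads per core) and a dual-socket configuration.
cores_per_socket = 128        # "Zen 4c" cores per EPYC 97X4 socket
threads_per_core = 2          # SMT-2 assumption
sockets_per_server = 2        # dual-socket assumption

vcpus = cores_per_socket * threads_per_core * sockets_per_server
print(f"{vcpus} vCPUs per dual-socket server")  # 512 under these assumptions
```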

AMD was joined by Meta, which discussed how the processors suit its mainstay applications such as Instagram and WhatsApp, and how it is seeing impressive performance gains with the latest generation compared with third-generation AMD EPYC parts across various workloads, along with substantial TCO improvements.