AI

The GPU market’s primed for diversification, openness

Nvidia has become synonymous with GPUs thanks to its dominance in the current artificial intelligence (AI) boom, but there are other players who see the market broadening as AI evolves and becomes more application specific.

Mahesh Balasubramanian, director of data center GPU product marketing at AMD, told Fierce Electronics that the company sees AI as the biggest change to the technology industry in 50 years, which means rapid innovation to respond to what customers want. “We see opportunities for innovation across silicon, packaging, system architecture, networking and software that will drive performance and capabilities for AI and high performance computing (HPC) workloads,” he said.

Balasubramanian expects GPUs will continue to be the predominant engine for generative AI for the foreseeable future. “They provide a flexible, powerful and easy-to-program environment for serving the many functions in data-intensive AI applications.”

But GPUs aren’t the solution to every problem that needs high performance compute capability – GPUs, CPUs, FPGAs, and DPUs all have their place and can be optimized around the design architecture of a given system, Balasubramanian said. “There is no one-size-fits-all when it comes to computing. Picking the right compute engine for the right workload is critical,” he said.

GPUs are too much for a web server, and a CPU isn’t up to training a multi-billion-parameter LLM, Balasubramanian said. “The most important thing to do is assess the workload you need to accomplish, what your environment can support and then move forward with the compute engine that gives you the best results for what you need.”

Balasubramanian said AMD sees a path forward where software becomes more specific and targeted, and a chip customized for that application-specific AI use case becomes an option to support performance and cost efficiencies. “We view the market broadening to meet various application and AI model needs as the demand for AI computing overall continues to grow.”

The road to AI began with gaming graphics

GPUs have evolved quite a bit from their early beginnings, when AI was the stuff of science fiction stories. They were first developed to handle 2D consumer graphics in the mid-1970s – think game consoles such as the Atari 2600.

Graphics applications for workstations would follow, and 1993 would see the emergence of new players including Nvidia, which by 2000 was competing primarily with ATI in the desktop PC graphics market. By 2013, Intel, AMD and Nvidia were the primary GPU players, but it wasn’t until 2020 that AI began to emerge as a high-demand application for GPU technology.

Joseph Byrne, analyst with Xampata, said AMD and Nvidia are currently the only viable PC-class discrete GPU suppliers, with Nvidia being the overwhelming leader, but there are many other GPU makers that target specific workloads in part because GPUs predate AI.

Some of those applications are for smartphones, noted Byrne, with companies like Qualcomm and Apple making their own GPUs, while MediaTek uses an Arm GPU. Smartphones are a segment where you won’t find Nvidia, he added, and other companies dominate the embedded market, which includes automotive applications.

When it comes to data center AI, Nvidia was in the right place at the right time, Byrne said, with companies like Intel falling behind.

He said GPUs becoming programmable was the starting point that got them to where they are now in the AI world, beginning with adoption for HPC applications in national labs and smaller systems in industries like oil and gas. “That led to the development of the data center class GPU,” Byrne said.

The ability to do matrix math is what led to the adoption of GPUs for AI applications, and it was Nvidia’s decision to invest in its own CUDA programming platform that put the company in a position to capitalize on the AI boom in a way that AMD and Intel couldn’t, Byrne said. Buying Mellanox gave Nvidia networking capabilities that allow it to architect entire systems. “That's really propelled them forward.”
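
To make the matrix-math point concrete, here is a minimal CUDA sketch of the kind of computation involved – a naive matrix multiply in which every output element gets its own GPU thread. It is an illustrative toy, not vendor code; a production library such as cuBLAS uses far more sophisticated tiling, but the parallel structure is the same one neural networks lean on.

```cuda
#include <cuda_runtime.h>
#include <stdio.h>
#include <stdlib.h>

// Naive matrix multiply: C = A * B for square N x N matrices.
// Each GPU thread computes exactly one element of C, so an
// N = 1024 multiply puts over a million threads in flight --
// the parallelism that makes GPUs a natural fit for AI math.
__global__ void matmul(const float *A, const float *B, float *C, int N) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < N && col < N) {
        float sum = 0.0f;
        for (int k = 0; k < N; ++k)
            sum += A[row * N + k] * B[k * N + col];
        C[row * N + col] = sum;
    }
}

int main(void) {
    const int N = 1024;
    const size_t bytes = (size_t)N * N * sizeof(float);

    // Host-side buffers with known values so the result is checkable.
    float *hA = (float *)malloc(bytes);
    float *hB = (float *)malloc(bytes);
    float *hC = (float *)malloc(bytes);
    for (int i = 0; i < N * N; ++i) { hA[i] = 1.0f; hB[i] = 2.0f; }

    // Copy inputs to the GPU, run the kernel, copy the result back.
    float *dA, *dB, *dC;
    cudaMalloc((void **)&dA, bytes);
    cudaMalloc((void **)&dB, bytes);
    cudaMalloc((void **)&dC, bytes);
    cudaMemcpy(dA, hA, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hB, bytes, cudaMemcpyHostToDevice);

    dim3 block(16, 16);
    dim3 grid((N + block.x - 1) / block.x, (N + block.y - 1) / block.y);
    matmul<<<grid, block>>>(dA, dB, dC, N);
    cudaMemcpy(hC, dC, bytes, cudaMemcpyDeviceToHost);

    // Every element should be 1.0 * 2.0 summed N times = 2N.
    printf("C[0] = %.1f (expected %.1f)\n", hC[0], 2.0f * N);

    cudaFree(dA); cudaFree(dB); cudaFree(dC);
    free(hA); free(hB); free(hC);
    return 0;
}
```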

Software, infrastructure give Nvidia advantages

Nvidia has particularly excelled at AI training because it has developed the necessary software infrastructure, Byrne said. “They've really built a wall in the market that protects them. They've been at the forefront in terms of performance.”

AMD is Nvidia’s closest challenger, Byrne said, with its GPUs used for HPC applications and, on the AI side, primarily for inference – a workload more insulated from that software infrastructure. “The barrier for challengers is less on the inference side.”

GPUs aren’t always necessary for doing inference. Intel hasn’t kept pace in the GPU race, but it does have capabilities with its Xeon processors to run some inference tasks, Byrne said. “They'll find it very difficult to challenge Nvidia on the training side and on the infrastructure inference side,” he said. “Most of the big data centers offload pretty much all of their inferencing to some kind of specialized AI accelerator.”

Intel does have its Gaudi 3 accelerators, based on technology from its Habana acquisition, which have made some inroads for inference and training, including support by IBM for its Watsonx AI platform. However, Gaudi is expected to be replaced by Intel’s Falcon Shores GPU, a fusion of Intel’s Xe graphics DNA with Habana’s technology, and the roadmap beyond is unclear.

Byrne said there’s always going to be a certain class of problems where it makes sense to offload from the CPU. “It's principally a resource for more general tasks,” he said. “It just can't be as efficient.”

GPUs aren’t solo acts, either. There are multiple GPUs across many racks of servers to scale out AI capabilities, and that requires connectivity. While big players like Nvidia can provide an entire platform, companies like Astera step in to connect GPUs to other elements in the infrastructure, such as storage, Thad Omura, Astera’s chief business officer, told Fierce Electronics. “This is your so-called scale out fabric portion,” he said. “There’s always a so-called AI head node or something that's controlling the whole platform.”

Connectivity is critical for GPU scale-out

Connectivity is essential for taking advantage of the capabilities of a GPU or an AI accelerator, Omura said. “The reason why it's scaling so fast is because the workloads can be parallelized.” The goal is to make the GPUs work on the same problem as efficiently as possible, he said, which requires an ecosystem that enables them.
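
A rough sense of what “parallelizing the workload” means in practice: the sketch below – illustrative CUDA, not any vendor’s production code – splits one large array across every GPU visible to a single host and lets each device work on its own slice concurrently. Real AI clusters layer collective-communication libraries and the network fabrics Omura describes on top of this basic pattern, so that GPUs in different servers, not just one box, can share the same problem.

```cuda
#include <cuda_runtime.h>
#include <stdio.h>
#include <stdlib.h>

// Each thread scales one element of its device's slice of the array.
__global__ void scale(float *x, float a, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= a;
}

int main(void) {
    int ngpu = 0;
    cudaGetDeviceCount(&ngpu);
    if (ngpu == 0) { fprintf(stderr, "no CUDA GPUs found\n"); return 1; }

    const int N = 1 << 24;        // total elements
    const int chunk = N / ngpu;   // assumes ngpu divides N evenly, for brevity
    float *host = (float *)malloc((size_t)N * sizeof(float));
    for (int i = 0; i < N; ++i) host[i] = 1.0f;

    float **dev = (float **)malloc(ngpu * sizeof(float *));

    // Hand each device its share. Kernel launches and async copies return
    // immediately, so all GPUs work on their slices at the same time.
    for (int g = 0; g < ngpu; ++g) {
        cudaSetDevice(g);
        cudaMalloc((void **)&dev[g], (size_t)chunk * sizeof(float));
        cudaMemcpyAsync(dev[g], host + (size_t)g * chunk,
                        (size_t)chunk * sizeof(float), cudaMemcpyHostToDevice);
        scale<<<(chunk + 255) / 256, 256>>>(dev[g], 3.0f, chunk);
        cudaMemcpyAsync(host + (size_t)g * chunk, dev[g],
                        (size_t)chunk * sizeof(float), cudaMemcpyDeviceToHost);
    }

    // Wait for every device to finish before trusting the results.
    for (int g = 0; g < ngpu; ++g) {
        cudaSetDevice(g);
        cudaDeviceSynchronize();
        cudaFree(dev[g]);
    }

    printf("host[0] = %.1f, computed across %d GPU(s)\n", host[0], ngpu);
    free(host);
    free(dev);
    return 0;
}
```

The hard part, as Omura notes, is no longer the per-device math but moving data between slices fast enough – which is where the interconnect comes in.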

Nvidia relies heavily on its own components, Omura noted, but the GPUs and accelerators coming to market are diversifying, including parts from AMD and Intel, while hyperscalers are developing their own silicon, such as Microsoft’s Azure Maia custom chip. Even the Chinese hyperscalers have internally developed GPUs, he added. “We're seeing this area continue to diversify at a very fast pace.”

Makers of the various GPUs and AI accelerators are adopting standard technologies such as PCI Express and Ethernet to leverage an ecosystem of interconnected devices and provide the level of connectivity necessary to scale all of these GPUs working together, Omura said.

Interoperability is less important on the AI scale-up portion, he said, because it's a homogeneous environment with the same type of GPUs talking to each other. “That's an area where you're going to see increasingly more customization and platform-specific protocols being used. That's where people tend to try to optimize performance or try to optimize reliability or do something special.”

Scaling out is different, however. “That's where interoperability and standards remain absolutely critical,” Omura said.

Aside from connectivity, Omura said everyone is trying to optimize cooling and power density, which means looking at liquid cooling systems and spreading a GPU cluster across more racks to dilute the power density. The latter affects connectivity, which puts Astera’s offerings in a good position to serve a wide range of hyperscalers and AI platform providers. “The connectivity's key to making all that happen now.”

AMD’s Balasubramanian said the company has excellent relationships with the hyperscalers for its AMD Instinct accelerators and AMD EPYC CPUs. “In a market that’s evolving as fast as AI, the most important thing is to drive an open ecosystem of partners, customers and end-users,” he said. “The future of AI is not a closed ecosystem, but one that’s open and thriving with numerous parties involved.”