AI

Nvidia announces Blackwell Ultra, Rubin roadmap plans at GTC

The hype around AI has lost some of its shine, but Nvidia CEO Jensen Huang isn’t about to let our interest fade. As energetic and leather jacket-clad as ever, Huang arrived at Nvidia’s Spring GTC event to reassure us that AI computing, particularly in the imminent age of agentic AI and physical AI, is going to require far more computing resources than anyone expects (DeepSeek be damned), and to introduce the Nvidia GPU system roadmap that will support this need.

In his keynote speech at the San Jose, California, event, Huang reiterated that the company’s Blackwell GPU system “is in full production,” and that the Blackwell Ultra GB300 superchip, which pairs two Blackwell Ultra GPUs with one Grace CPU and delivers a 1.5x performance improvement over Blackwell, will arrive in the second half of this year. At a system level, the Blackwell Ultra family will also include the GB300 NVL72, connecting 72 Blackwell Ultra GPUs and 36 Arm Neoverse-based Nvidia Grace CPUs in a rack design, and the HGX B300 NVL16 system.

The GB300 NVL72 delivers 1.5x more AI performance than the earlier model GB200 NVL72, Huang said.

The HGX B300 NVL16 will provide 11x faster inference on large language models, 7x more compute, and 4x larger memory than the Hopper generation on AI reasoning and other workloads.

Server partners planning to support the Blackwell Ultra rollout with products include Cisco, Dell Technologies, Hewlett Packard Enterprise, Lenovo, Supermicro, Aivres, ASRock Rack, ASUS, Eviden, Foxconn, GIGABYTE, Inventec, Pegatron, Quanta Cloud Technology (QCT), Wistron, and Wiwynn. Cloud service providers Amazon Web Services, Google Cloud, Microsoft Azure and Oracle Cloud Infrastructure and GPU cloud providers CoreWeave, Crusoe, Lambda, Nebius, Nscale, Yotta and YTL will be among the first to offer Blackwell Ultra-powered instances.

But there’s more on the roadmap heading into 2026 and 2027. “We’re talking about an extreme scale-up” that will be needed to support the next era of AI computing, Huang said. In the second half of 2026, Nvidia will introduce the Rubin GPU, with 2x the performance of its predecessor, paired with the Vera CPU in the Vera Rubin NVL144 system, which offers 3.3x the training and inference performance of the GB300 NVL72, along with HBM4 memory and more. And in the second half of 2027, we’ll see the Rubin Ultra NVL576 systems, with more than 14x the inference and training performance of this year’s GB300 NVL72.

Huang said this rapid ramp-up of computing power will be absolutely necessary to support “the age of AI reasoning.” 

He explained: “AI now understands the context, understands what we're asking, understands the meaning of our request, and generates what it knows... Rather than retrieving data, it now generates answers, which fundamentally changed how computing is done.” He added that agentic AI “can reason about how to answer or how to solve a problem. It can plan and take action. It can use tools because it now understands multi-modality information. It can go to a website and look at the format of the website, words and videos, and play a video, learn from that website, understand what it learns, and come back and use that information, use that newfound knowledge, to do its job.” 

Meanwhile, physical AI will be able “to understand the physical world, things like friction and inertia, cause and effect, object permanence” to enable a new generation of intelligent robotics, self-driving vehicles, and more, he added. Nvidia announced at GTC that it is partnering with GM to apply physical AI to the automaker’s self-driving vehicle strategy.

These two paths of AI evolution will need to be built on a new understanding of the computing resources AI will need to scale, something Huang said “almost the entire world got wrong,” in a clear reference to the lessons many drew from the arrival of the DeepSeek AI model, which was understood to require fewer GPUs and other training resources than other large models.

Echoing similar statements Huang made during Nvidia’s most recent earnings call, he said, “The computation requirement, the scaling law of AI, is more resilient and, in fact, hyper-accelerated the amount of computation we need at this point. As a result of agentic AI, as a result of its reasoning process, the need is easily 100 times more [compute] than we thought we needed this time last year.”