Google Cloud, Nvidia showcase joint AI plans

Nvidia’s dominance of the AI market could run into challenges in the years ahead, and one of those challengers could be Google, which continues to develop custom processors for Google Cloud. That could have made things a little awkward this week when the CEOs of Nvidia and Google Cloud met on stage at Google Cloud Next, but no tension was evident. In fact, the execs came together to announce an expansion of their ongoing partnership with generative AI in mind, along with further talk of how they have already begun to deepen their collaboration.

At the event, after being called to the stage by Google Cloud CEO Thomas Kurian, Nvidia co-founder and CEO Jensen Huang jogged out, shook hands with Kurian, and described the partners’ intentions as no less than a “reengineering of the entire stack, from the processors to the systems to the networks and all of the software, and all of this to accelerate AI and to create software and infrastructure for the world's AI researchers and developers.”

That proclamation comes almost a year after Huang said Google was among the companies getting its hands on Nvidia’s H100 Tensor Core GPU. In the months since, Google unveiled plans for its A3 supercomputer featuring H100 GPUs, which the web giant said would be aimed at the rising challenges posed by generative AI and large language model training. Along with announcing the expanded partnership this week, Google Cloud marked the culmination of the past year of work by announcing that its new A3 instances, powered by those Nvidia H100 GPUs, will reach general availability next month, offering 3x faster training and significantly improved networking bandwidth compared with the previous generation.

These announcements also came on the heels of Nvidia being named Google Cloud’s Generative AI Partner of the Year, which at this point probably should not come as a surprise.

Regarding the expanded joint work, Google officials said the company is using both H100 and A100 GPUs for internal research and inference at DeepMind and other divisions, while Huang pointed to the deeper collaboration that enabled Nvidia GPU acceleration for PaxML, the JAX-based machine learning framework for creating massive LLMs. Google has used PaxML to build internal models, including at DeepMind, as well as for research projects, and will continue to run it on Nvidia GPUs. The partners also announced that PaxML is available immediately on the NVIDIA NGC container registry.

Huang described the PaxML work as creating frameworks that allow Google Cloud and Nvidia “to push the frontiers of large language models distributed across giant infrastructures, so that we could save time for the AI researchers, scale up to gigantic next-generation models, save money, and save energy, and all of that requires cutting-edge computer science.”
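
PaxML itself is built on JAX, which spreads a training step across accelerators with primitives like pmap. The sketch below is a minimal, hypothetical illustration of that data-parallel pattern, not PaxML’s actual API; the toy model, loss_fn, and learning rate are assumptions for illustration only.

    from functools import partial

    import jax
    import jax.numpy as jnp

    def loss_fn(params, x, y):
        # Toy linear model with a mean-squared-error loss (illustrative only).
        pred = x @ params["w"] + params["b"]
        return jnp.mean((pred - y) ** 2)

    @partial(jax.pmap, axis_name="devices")  # replicate the step across local GPUs
    def train_step(params, x, y):
        grads = jax.grad(loss_fn)(params, x, y)
        # Average gradients across devices, the core of data-parallel training.
        grads = jax.lax.pmean(grads, axis_name="devices")
        return jax.tree_util.tree_map(lambda p, g: p - 0.01 * g, params, grads)

    n = jax.local_device_count()  # e.g., 8 on an A3 VM with eight H100s
    params = jax.device_put_replicated(
        {"w": jnp.zeros((4, 1)), "b": jnp.zeros((1,))}, jax.local_devices()
    )
    x, y = jnp.ones((n, 32, 4)), jnp.ones((n, 32, 1))  # one batch shard per device
    params = train_step(params, x, y)

Frameworks like PaxML layer model and pipeline parallelism on top of this kind of data parallelism to reach the scale Huang describes.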

In addition, the partners announced at Google Cloud Next that:

  • Nvidia’s H100 GPUs will power Google Cloud’s Vertex AI platform. H100 GPUs are expected to be generally available on Vertex AI in the coming weeks, enabling customers to quickly develop generative AI LLMs (see the deployment sketch after this list).

  • Google Cloud will be one of the first companies to gain access to Nvidia’s DGX GH200, the new AI supercomputing system that Nvidia has been talking up for most of the last three months, to explore its capabilities for generative AI workloads.

  • The Nvidia DGX Cloud, which the company announced at its Spring GTC event, is coming to Google Cloud, making AI supercomputing and software available to customers directly from their web browsers, with the speed and scale needed for advanced training workloads.

  • Nvidia AI Enterprise software will be available on Google Cloud Marketplace.

  • Google Cloud is the first cloud provider to offer Nvidia L4 GPUs, which Nvidia announced in March as among its first GPUs aimed at AI inference tasks, as opposed to the AI training workloads that have been the source of the semiconductor giant’s early strength in AI.
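
On the Vertex AI point, serving a model on H100-backed hardware should look much like any other GPU deployment once the instances reach general availability. Here is a hedged sketch using the Vertex AI Python SDK; the project, region, model ID, and the specific machine and accelerator identifiers are assumptions for illustration, not confirmed configuration names.

    from google.cloud import aiplatform

    # Assumed project/region and an already-uploaded model (hypothetical IDs).
    aiplatform.init(project="my-project", location="us-central1")
    model = aiplatform.Model("projects/my-project/locations/us-central1/models/123")

    # Deploy to an endpoint; the A3/H100 identifiers below are assumptions.
    endpoint = model.deploy(
        machine_type="a3-highgpu-8g",
        accelerator_type="NVIDIA_H100_80GB",
        accelerator_count=8,
    )
    print(endpoint.resource_name)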

So, at a time when Google has been seen working on and investing in more of its own processors, like its Tensor Processing Unit (TPU), it appears intent on working more closely with Nvidia, not less, at least for now. Kurian said, “For us at Google, it's a natural evolution of the AI market. Many people ask me, ‘What's the relationship between the TPU and GPUs? How do you think about that?’ Very simply put, as AI evolves, the needs of the hardware architecture and software stack evolve from training to inferencing, to new capabilities like embedding, and we want to offer customers the broadest, most optimized choices… We're actually offering 13 different types of accelerators in GCP. Secondly, we're also a platform company… at the heart of it, and we want to attract all those developers and customers who love Nvidia GPU technology or software to our platform.”