Customizable generative AI? Nvidia's heading in that direction

The SIGGRAPH 2023 computer graphics conference is not until August, but Nvidia appears to be packed and ready to go, announcing this week that it will present around 20 research papers at the event (Aug. 6-10 in Los Angeles) that advance the state of generative AI and neural graphics, making the technology easier and faster for developers and enterprises to work with.

The papers, several of which were produced in collaboration with more than a dozen universities in the U.S., Europe and Israel, cover topics including generative AI models that turn text into personalized images; inverse rendering tools that transform still images into 3D objects; neural physics models that use AI to realistically simulate complex 3D elements; and neural rendering models that unlock new capabilities for generating real-time, AI-powered visual details.

For example, two papers from researchers at Nvidia and Tel Aviv University discuss how generative AI can enable users to provide image examples that an AI model can quickly learn from to create outputs that address specific, contextual needs. 

An Nvidia blog post stated, “One paper describes a technique that needs a single example image to customize its output, accelerating the personalization process from minutes to roughly 11 seconds on a single NVIDIA A100 Tensor Core GPU, more than 60x faster than previous personalization approaches… A second paper introduces a highly compact model called Perfusion, which takes a handful of concept images to allow users to combine multiple personalized elements… into a single AI-generated visual.”

This kind of work will be important in helping commercial organizations figure out what value generative AI can bring to their specific needs and how to work with the technology. These are issues many may be struggling with as generally available generative AI solutions grow more popular while still carrying an aura of mystery, risk and controversy.

With a recent series of announcements, including last week's launch of NeMo Guardrails, software that keeps large language models on track and away from potential privacy and accuracy breaches, Nvidia appears to be working to demystify generative AI and show how it can be practical and useful.

Innovations presented in such papers from the Nvidia Research organization are regularly shared with developers on GitHub and incorporated into Nvidia products, like the Omniverse platform for creating metaverse applications and virtual worlds, and the Picasso foundry for custom generative AI models for visual design. The latter was unveiled at Nvidia's own GTC conference back in March.

The blog post stated that developers and enterprises will be able to leverage these innovations to rapidly generate synthetic data to populate virtual worlds for robotics and autonomous vehicle training. They’ll also enable creators in art, architecture, graphic design, game development and film to more quickly produce high-quality visuals for storyboarding, previsualization and even production.