Intel, Blockade Labs leverage generative AI in new 3D model

The competitive battles between semiconductor giants like Intel and Nvidia are being fought on many fronts, and lately the most active front has been generative AI.

Thus far, Nvidia has made more noise than most companies when it comes to leveraging generative AI technology, but Intel Labs countered those efforts this week by unveiling, with Blockade Labs, a novel diffusion model that uses generative AI to create realistic 3D visual content for applications like digital twins and other metaverse environments (another area where Nvidia has been making a lot of noise).

The new model from the partners is called Latent Diffusion Model for 3D (LDM3D). Intel says it uses the diffusion process to generate a depth map alongside each image, creating vivid, immersive 3D scenes with 360-degree views, and claims LDM3D has the potential to revolutionize content creation for metaverse applications, digital twins, and other digital experiences across a variety of industries.

To test the model’s value in such applications, Intel internally created a digital 3D replica of its own AI supercomputing cluster, which consists of Intel Xeon CPUs and Habana Gaudi accelerators, the same kind of cluster the LDM3D model itself was trained on, according to an email statement Intel sent to Fierce Electronics. “This kind of digital twin of real-world facilities can be created using just the still 2D images of the facility and LDM3D’s image-to-image pipeline,” the statement read.

Intel said the new model addresses a typical limitation of today’s generative AI models, most of which can generate only 2D images.

“Generative AI technology aims to further augment and enhance human creativity and save time,” said Vasudev Lal, AI/ML research scientist, Intel Labs. “However, most of today’s generative AI models are limited to generating 2D images and only very few can generate 3D images from text prompts. Unlike existing latent stable diffusion models, LDM3D allows users to generate an image and a depth map from a given text prompt using almost the same number of parameters. It provides more accurate relative depth for each pixel in an image compared to standard post-processing methods for depth estimation and saves developers significant time to develop scenes.”
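Intel has not detailed the programming interface in its announcement, but the text-to-image-and-depth workflow Lal describes can be sketched roughly as below. This is a minimal, hedged example that assumes the model is exposed through Hugging Face’s diffusers library as a StableDiffusionLDM3DPipeline class with an "Intel/ldm3d" checkpoint; the actual class and checkpoint names Intel publishes may differ.

```python
# Minimal sketch: text prompt -> RGB image + depth map with LDM3D.
# Assumes diffusers exposes LDM3D as StableDiffusionLDM3DPipeline and
# that an "Intel/ldm3d" checkpoint exists on HuggingFace; both names
# are assumptions, not confirmed by Intel's announcement.
import torch
from diffusers import StableDiffusionLDM3DPipeline

pipe = StableDiffusionLDM3DPipeline.from_pretrained(
    "Intel/ldm3d", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")  # fall back to CPU if no GPU is available

prompt = "a 360-degree view of a server room with racks of accelerators"
output = pipe(prompt)

# One diffusion pass yields both modalities from nearly the same
# parameters: the RGB image and a per-pixel depth estimate.
rgb_image = output.rgb[0]
depth_image = output.depth[0]

rgb_image.save("scene_rgb.png")
depth_image.save("scene_depth.png")
```

Generating both outputs in a single pass, rather than running a separate depth-estimation model as a post-processing step, is what underpins the parameter-count and development-time claims in the quote above.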

To demonstrate the potential of LDM3D, Intel and Blockade researchers developed DepthFusion, an application that leverages standard 2D RGB photos and depth maps to create immersive, interactive 360-degree viewing experiences. Because LDM3D is a single model that creates both an RGB image and its depth map, it reduces memory footprint and improves latency.
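Intel has not published DepthFusion’s internals, but the underlying idea of turning an RGB panorama plus a depth map into navigable 3D geometry can be illustrated by back-projecting each pixel into 3D space. The sketch below is a generic, hypothetical reconstruction step assuming an equirectangular (360-degree) RGB image and a matching depth map, not Intel’s actual code.

```python
# Hypothetical sketch: back-project an equirectangular RGB-D panorama
# into a colored 3D point cloud. A generic technique for illustration,
# not Intel's DepthFusion implementation.
import numpy as np
from PIL import Image

rgb = np.asarray(Image.open("scene_rgb.png").convert("RGB"))          # (H, W, 3)
depth = np.asarray(Image.open("scene_depth.png")).astype(np.float32)  # (H, W)
h, w = depth.shape

# Each pixel in an equirectangular panorama maps to a (longitude,
# latitude) direction on the unit sphere.
lon = (np.arange(w) / w) * 2 * np.pi - np.pi   # -pi .. pi across the width
lat = (0.5 - np.arange(h) / h) * np.pi         # pi/2 .. -pi/2 down the height
lon, lat = np.meshgrid(lon, lat)

# Scale each unit direction by its depth to place the point in 3D.
x = depth * np.cos(lat) * np.sin(lon)
y = depth * np.sin(lat)
z = depth * np.cos(lat) * np.cos(lon)

points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
colors = rgb.reshape(-1, 3)
print(points.shape, colors.shape)  # one colored 3D point per pixel
```

A viewer can then render geometry like this interactively, which is how a single 2D generation becomes the kind of explorable 360-degree scene Intel describes.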

The introduction of LDM3D and DepthFusion paves the way for further advancements in multi-view generative AI and computer vision, according to Intel. LDM3D is being open-sourced through HuggingFace, which will allow AI researchers and practitioners to improve the system further and fine-tune it for custom applications, the company said.