Intel and nimble AI: smaller models can run on an AI PC

Gadi Singer has been at Intel for 40 years. One of his first projects was the 386 processor, back in 1985. He’s about as close as it gets to the proverbial man-who-has-seen-it-all, at least at Intel.

Now, as vice president and director of emergent AI research at Intel Labs, he has a few things to say about AI.

“AI is the future…. The state of the art of AI defines large multidimensional arrays [but] there is also a current emergence of nimble models,” he tells Fierce Electronics in an interview.

Still, he continues, the number of parameters in AI models is growing at an incredible pace of 10x a year. “If everything continues, you would assume broad deployment of AI in small, medium and large models on very large machines…I’m driving that 2023 is a transition year, that we will have giant models of 100 billion to 1 trillion parameters.”

Despite this trend, smaller can be better.

“Smaller can be more targeted. We don’t need all the functionality…You can bring AI to the way you run your business. You can do smaller models, retrieval-based and targeted, and you can run them on an AI PC.”

Generally speaking, many AI tasks can be handled by models of roughly 15 billion parameters, a point he makes in a slide presentation at Intel Innovation.

One slide shows LLaMA at 7 billion parameters, along with Alpaca at 7 billion, Vicuna at 13 billion, MPT at 7 billion, Falcon at 1.3 billion, ORCA at 13 billion, and LLaMA 2 at 7 billion.
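
Here is a minimal sketch of what running one of these small open-weight models locally might look like, using the Hugging Face transformers library. The model ID, prompt and generation settings are illustrative choices, not anything from Singer’s slides, and it assumes the transformers, torch and accelerate packages are installed with enough local memory for a roughly 7-billion-parameter model.

```python
# Minimal sketch: load a small open-weight model and generate text locally.
# The model ID and prompt are placeholders, not from the article.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"  # any ~7B model from the slide
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Summarize this quarter's support tickets in three bullet points:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```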

Part of what makes these nimble models effective is that AI-generated information is retrieved from a traceable source outside the model, instead of from the model’s memory.

He argues that retrieval-based traceability has several benefits: transparency and verifiability; higher accuracy; the ability to address copyrights associated with a source; reduction of model bias; a broader scope of private data; and reliance on selective and secured data sources.
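
The retrieval pattern he describes is easy to sketch. Below is a toy illustration: a hypothetical two-document corpus and a naive keyword-overlap score stand in for a real vector search, but the point survives the simplification — the answer is grounded in, and can cite, a source outside the model’s weights.

```python
# Toy sketch of the retrieval step in retrieval-based generation.
# The corpus and scoring function are illustrative placeholders;
# real systems use embeddings and a vector index instead.
from collections import Counter

corpus = {
    "refund-policy.txt": "Refunds are issued within 30 days of purchase.",
    "support-hours.txt": "Support is available weekdays 9am to 5pm.",
}

def overlap(query: str, text: str) -> int:
    # Naive keyword overlap standing in for vector similarity.
    q, t = Counter(query.lower().split()), Counter(text.lower().split())
    return sum((q & t).values())

def retrieve(query: str) -> tuple[str, str]:
    # Return the best-matching document and its name, so the final
    # answer can point back to a traceable source.
    return max(corpus.items(), key=lambda item: overlap(query, item[1]))

source, context = retrieve("When are refunds issued?")
prompt = f"Answer using only this source ({source}): {context}"
# `prompt` would then go to a small local model for the generation step.
print(prompt)
```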

In addition, running smaller models can shrink the carbon footprint by orders of magnitude and be more cost-effective.

“Security is a big thing,” he adds. “We can run nimble models in a trusted environment where you don’t want to send queries out to the cloud.”

Singer assures me he’s not arguing to replace the cloud, which is where the large models will continue to run. “I think we need both [cloud and edge],” he says. “Huge traction started with consumers, where the cloud model was the most appropriate in most cases. For companies that have cloud, there will be a lot of uses on the cloud, some remote, and for a spectrum of uses, it will be just about how much you care about security.

“The thing is, AI is coming to your business. We at Intel can bring AI to the way you run your business. Through the cloud you can do generative AI. If you can do smaller models, retrieval-based and targeted, you run them on an AI PC.”

AI PC at the center of the edge

At Intel Innovation on Sept. 19, CEO Pat Gelsinger introduced the chip behind the AI PC, the Intel Core Ultra, code-named Meteor Lake. It features a CPU, GPU and NPU (Neural Processing Unit), where the AI functionality will be located. It will ship Dec. 14 and be used in the new Acer Swift laptop.
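
One plausible way a developer would target that NPU is through Intel’s OpenVINO runtime, which exposes it as an inference device alongside the CPU and GPU. The sketch below assumes OpenVINO 2023.2 or later with the NPU plugin present; the model file and input shape are placeholders.

```python
# Sketch: dispatch inference to the NPU via Intel's OpenVINO runtime.
# Assumes OpenVINO 2023.2+ with an NPU plugin; file and shape are placeholders.
import numpy as np
import openvino as ov

core = ov.Core()
print(core.available_devices)  # expect something like ['CPU', 'GPU', 'NPU']

model = core.read_model("model.xml")         # an OpenVINO IR-format model
compiled = core.compile_model(model, "NPU")  # swap in "CPU" as a fallback

data = np.random.rand(1, 3, 224, 224).astype(np.float32)
result = compiled(data)  # synchronous inference on the NPU
```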

Intel promotes the Core Ultra as the “largest client architectural shift in 40 years” and “an inflection point in Intel’s client processor roadmap.” It is the first processor manufactured on the new Intel 4 process node using its 3D high-performance hybrid architecture, and the first client tile-based design enabled by Foveros packaging technology.

In addition to powering the AI PC, it will become the basis of client computing for edge devices, some used in rugged environments such as industrial settings, an Intel official said.

RELATED: Intel to expand AI Meteor Lake chip to edge, beyond the AI PC