Recent AI chip announcements from Google and NTT offer a glimpse of how large companies, including hyperscalers, telecom operators, and other industrial giants, are getting more involved in developing AI chips tailored to their own application, cost, and efficiency requirements.
That was already true in the early stages of the AI training era, and the trend is continuing as these companies transition to AI inference and reasoning throughout their infrastructures.
Google's newest TPU
For Google, designing custom data center AI chips is nothing new, as the company this week unveiled the seventh generation of its Tensor Processing Unit (TPU), called Ironwood. While Google has worked with partners such as Broadcom and TSMC on past TPUs focused on training needs, Ironwood is “the first designed specifically for inference,” according to a blog post from Amin Vahdat, vice president/general manager for ML, Systems & Cloud AI at Google Cloud. “Ironwood is our most powerful, capable and energy efficient TPU yet. And it's purpose-built to power thinking, inferential AI models at scale.”
The TPU, announced at Google Cloud Next 25 in Las Vegas, was reportedly developed in collaboration with MediaTek, rather than Broadcom, this time around. Its performance per watt is roughly twice that of Google’s sixth-generation Trillium chip, and it comes in 256-chip or 9,216-chip configurations, depending on workload. The latter configuration delivers 42.5 Exaflops, which Vahdat stated is “more than 24x the compute power of the world’s largest supercomputer – El Capitan – which offers just 1.7 Exaflops per pod.”
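As a rough sanity check on the figures quoted above (assuming they are directly comparable, since the post does not spell out the number formats behind them):

```latex
\frac{42.5\ \text{Exaflops}}{1.7\ \text{Exaflops}} \approx 25\times
\qquad
\frac{42.5\ \text{Exaflops}}{9{,}216\ \text{chips}} \approx 4.6\ \text{Petaflops per chip}
```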
Ironwood also features SparseCore, “a specialized accelerator for processing ultra-large embeddings common in advanced ranking and recommendation workloads,” and is supported by Google’s Pathways ML runtime to “enable efficient distributed computing across multiple TPU chips.”
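Google’s Pathways runtime is internal, but the general idea it points to, spreading one computation across many TPU chips, can be sketched with the open-source JAX sharding APIs from the same software stack. This is a minimal sketch; the mesh layout, array sizes, and function name are illustrative assumptions, not anything Google has published:

```python
# Illustrative sketch only: distributing one computation across several TPU chips
# with the open-source JAX sharding APIs. Google's Pathways runtime is internal;
# this just shows the general idea of sharding work over a mesh of accelerators.
import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

devices = jax.devices()                               # the accelerator chips visible to this host
mesh = Mesh(np.array(devices), axis_names=("data",))  # 1-D mesh; real pods use larger meshes

# Shard the batch dimension of the activations across chips; replicate the weights.
x = jax.device_put(jnp.ones((len(devices) * 128, 1024)), NamedSharding(mesh, P("data", None)))
w = jax.device_put(jnp.ones((1024, 4096)), NamedSharding(mesh, P(None, None)))

@jax.jit
def forward(x, w):
    # Each chip computes its own slice of the batch; XLA inserts any
    # cross-chip communication the computation needs.
    return jnp.dot(x, w)

y = forward(x, w)
print(y.shape, y.sharding)
```

On an actual pod the mesh would span far more chips and multiple named axes, but the programming model stays the same.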
These capabilities and features represent just some of the ways that Google is applying what it has been learning about AI processing and its Gemini AI models over the years to its custom chip program.
Jack Gold, president and principal analyst at J. Gold Associates, told Fierce Electronics, “Given all that Google has learned about its products and compute needs, it's logical for them to focus on building an optimized processing infrastructure that is geared towards its AI software and cloud needs. Optimized chips that can also run more efficiently in their infrastructure in particular is a way for them to also provide a more efficient capability that, when you spread it over tens of thousands of chips, can make a big difference in what they can charge competitively for their compute.”
Gold added, “Many industries do this ‘specialized’ chip thing, like networking, telco, medical, industrial devices, etc. General purpose processors are great at doing lots of things, but if you want the most optimized (meaning runs best at lowest cost of compute) then you build out custom, or if cost is no object you do that as well. Given the huge volumes of cloud needs by Google and others (e.g., AWS, Azure), it's economically feasible to design and build your own chips, where smaller scale systems wouldn’t necessarily offer the economics in a cost competitive nature.”
NTT's new LSI
Some of the same reasoning applies to NTT’s new AI inference chip, which the company unveiled and demonstrated at NTT Research’s Upgrade 2025 event, describing it as a “large-scale integration (LSI) for the real-time AI inference processing of ultra-high-definition video up to 4K-resolution and 30 frames per second (fps).”
An NTT researcher affiliated with the project told Fierce Electronics that NTT Research designed the LSI chip itself, and that it is manufactured by TSMC. It includes an AI inference engine that “reduces computational complexity while ensuring detection accuracy, improving computing efficiency using interframe correlation and dynamic bit-precision control,” according to an NTT statement. “Executing the object detection algorithm You Only Look Once (YOLOv3) using this LSI is possible with a power consumption of less than 20 watts.”
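NTT has not published the algorithm itself, but the interframe-correlation idea in that description can be illustrated in a few lines: compare consecutive frames and only re-run the expensive detection compute on tiles whose content has actually changed. This is a minimal sketch, with the tile size, threshold, and function name as assumptions:

```python
# Illustrative sketch only: the "interframe correlation" idea in NTT's description,
# i.e. skip recomputation for image tiles that barely change between consecutive
# frames. NTT's actual LSI algorithm is not public; the tile size, threshold, and
# function name here are assumptions.
import jax.numpy as jnp

TILE = 120        # assumed tile size in pixels (divides 3840 x 2160 evenly)
THRESHOLD = 2.0   # assumed mean-absolute-difference threshold per tile

def changed_tiles(prev_frame, curr_frame):
    """Mark tiles whose content changed enough to warrant re-running detection."""
    h, w = curr_frame.shape
    diff = jnp.abs(curr_frame - prev_frame)
    # Average the per-pixel difference within each TILE x TILE block.
    per_tile = diff.reshape(h // TILE, TILE, w // TILE, TILE).mean(axis=(1, 3))
    return per_tile > THRESHOLD

# Example: a 4K grayscale frame pair where only one small region changed.
prev = jnp.zeros((2160, 3840), dtype=jnp.float32)
curr = prev.at[512:640, 1024:1152].set(255.0)
mask = changed_tiles(prev, curr)
print(f"re-run detection on {int(mask.sum())} of {mask.size} tiles")
```

Per NTT’s statement, dynamic bit-precision control plays a complementary role, trading numeric precision for efficiency wherever detection accuracy allows.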
That sub-20-watt figure fits a common power profile for edge AI. As demand for edge AI processing grows, particularly for video and image data, being power-efficient means aiming for tens of watts, compared with the hundreds or thousands of watts consumed by data center CPUs and GPUs.
“Inference needs for large processing at low cost is a driver in this space as well,” Gold observed.
NTT offered an example application, stating that when the LSI chip is installed on a drone, “the drone can detect individuals or objects from up to 150 meters (492 feet) above the ground, the legal maximum altitude of drone flight in Japan, whereas conventional real-time AI video inferencing technology would limit that drone’s operations to about 30 meters (98 feet). One use case includes advancing drone-based infrastructure inspection for operations beyond an operator’s visual line of sight, reducing labor and costs.”
Kazu Gomi, president and CEO of NTT Research, added, “The combination of low-power AI inferencing with ultra-high-definition video holds an enormous amount of potential, from infrastructure inspection to public safety to live sporting events. NTT’s LSI, which we believe to be the first of its kind to achieve such results, represents an important step forward in enabling AI inference at the edge and for power-constrained terminals.”
As with Google, NTT’s development could be seen as a company applying years of experience with AI to develop its own chip, with efficiency trade-offs that directly address its own needs. The company said it plans to commercialize the LSI within its 2025 fiscal year, and its researchers are considering how to apply it to the data-centric infrastructure of the Innovative Optical and Wireless Network (IOWN) Initiative led by NTT and the IOWN Global Forum.
But, unlike Google, which uses its TPUs only in its own cloud, NTT may have broader aspirations. The NTT researcher told Fierce Electronics, “We are considering making it available to a broader market, not just for IOWN.”