Chip designers must rethink low power for AI

With the scaling of process geometries and the growing demand for lower-power devices, power challenges have intensified, bringing low-power design into the spotlight as use cases evolve.

While companies continue to add new features and functionality to portable, handheld devices, they're also focused on minimizing power consumption to extend battery life (an important differentiator for consumers). These mobile design challenges are well-known thanks to the ubiquity of smartphones.

Power efficiency is also increasingly important for plug-in products because it can affect the overall cost of building the systems as well as operating them.

AI presents unprecedented challenges

The latest challenge in the low-power space that designers must now confront is the AI chip, especially those intended for high-performance computing (HPC) applications. While these chips don't face the same constraints as traditional mobile devices, such as battery life and portability, they introduce new power challenges due to the physics of smaller, denser, and more novel architectures and manufacturing processes. The traditional holy grail of performance, power, and area (PPA) is still led by the need for the highest possible performance, but that performance is now limited by power. It is becoming extremely hard to deliver power reliably to every part of the chip without the dissipated heat degrading the chip's reliability and triggering thermal runaway.
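
To make the feedback loop behind thermal runaway concrete, here is a minimal Python sketch. All of the coefficients (the leakage at 25°C, the temperature interval over which leakage doubles, the thermal resistance, the power budgets) are invented for illustration and do not describe any real process or package; the point is only that leakage grows with temperature while temperature grows with total power, so beyond some power budget the loop fails to converge.

```python
# Illustrative model of the leakage/temperature feedback loop behind thermal
# runaway. All constants are invented for demonstration purposes only.

def leakage_power(temp_c, leak_at_25c=10.0, doubling_every_c=20.0):
    """Leakage power (W), growing roughly exponentially with junction temperature."""
    return leak_at_25c * 2 ** ((temp_c - 25.0) / doubling_every_c)


def junction_temp(total_power_w, ambient_c=45.0, theta_ja=0.2):
    """Steady-state junction temperature (C) for an assumed thermal resistance (C/W)."""
    return ambient_c + theta_ja * total_power_w


def simulate(dynamic_power_w, iterations=50):
    """Iterate power -> temperature -> leakage until it settles or runs away."""
    temp = 45.0
    for _ in range(iterations):
        total = dynamic_power_w + leakage_power(temp)
        new_temp = junction_temp(total)
        if new_temp > 150.0:
            return f"thermal runaway (exceeds 150 C at {total:.0f} W total)"
        if abs(new_temp - temp) < 0.01:
            return f"settles at {new_temp:.1f} C, {total:.0f} W total"
        temp = new_temp
    return f"still drifting after {iterations} iterations ({temp:.1f} C)"


if __name__ == "__main__":
    for dyn in (100.0, 200.0, 300.0):
        print(f"dynamic budget {dyn:>5.0f} W -> {simulate(dyn)}")
```

With these made-up numbers, the lowest budget settles at a stable temperature while the higher budgets diverge, which is the behavior designers have to keep the chip away from.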

The power ramifications of advanced AI chips can have a significant impact on overall functionality, manufacturability, cost, and reliability. As a result, design teams must begin using even more power-smart methodologies, along with sophisticated power analysis techniques and tools.

Leakage power is a constant challenge

Low-power design is all about reducing the overall dynamic and static power consumption of an integrated circuit (IC). Dynamic power comprises switching and short-circuit power, while static power is leakage: the current that flows through transistors even when the device is inactive.
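
As a reminder of the first-order relationships involved, the sketch below evaluates the textbook breakdown: switching power approximated by alpha * C * V^2 * f, short-circuit power taken as a fraction of switching power, and a fixed leakage term. Every value in it (switched capacitance, activity factor, frequency, the short-circuit fraction) is a placeholder chosen only to show how strongly supply voltage and activity drive the dynamic component, not data from any particular process.

```python
# First-order IC power breakdown: P_total = P_switching + P_short_circuit + P_leakage.
# The numbers below are placeholders chosen only to illustrate the relationships.

def switching_power(activity, switched_cap_f, vdd, freq_hz):
    """Classic alpha * C * V^2 * f approximation of switching power (W)."""
    return activity * switched_cap_f * vdd ** 2 * freq_hz


def total_power(activity, switched_cap_f, vdd, freq_hz,
                short_circuit_fraction=0.1, leakage_w=5.0):
    """Dynamic power (switching + short-circuit) plus static leakage."""
    p_switch = switching_power(activity, switched_cap_f, vdd, freq_hz)
    p_short = short_circuit_fraction * p_switch  # short-circuit modeled as a fraction of switching
    return p_switch + p_short + leakage_w


# The same design at two supply voltages: the dynamic component scales with V^2.
for vdd in (0.9, 0.7):
    watts = total_power(activity=0.15, switched_cap_f=200e-9, vdd=vdd, freq_hz=1.5e9)
    print(f"Vdd = {vdd} V -> roughly {watts:.0f} W")
```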

In the 90nm-to-16nm range of process geometries, leakage power was the primary concern for design teams because dynamic power was a small fraction of the total (just 10-15%) compared to leakage (85-95%). Once the industry shifted to 16nm and 14nm, dynamic power became more dominant than leakage power.

Now, however, as we move to process nodes like 7nm, 5nm, and 3nm and to gate-all-around (GAA) style architectures, leakage is again becoming an issue. Today, design teams are revisiting options that were set aside in past designs in order to squeeze as much power and performance out of a design as possible. The need to reduce margin at advanced nodes has been discussed for a while, but the ability to actually do something about it was spread across different parts of the design process. While the techniques and technologies to address today's issues are familiar, we are only beginning to understand the precision with which they can be applied.

Emulation is critical

The most critical component for dynamic power analysis and optimization is the quality of the vectors, where quality means how closely the activity matches what the SoC sees when it is working in a real system. The traditional power analysis process involved checking with the SoC architect to identify which vectors to use for power analysis and optimization. This was a hit-or-miss activity that didn't always cover all aspects and scenarios.
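
To illustrate why vector quality matters so much, consider a hypothetical comparison: because dynamic power scales with switching activity, a mostly idle directed test and a window of a realistic workload can imply very different power for the same netlist. The per-cycle toggle counts, the energy per toggle, and the clock frequency below are all made up for this example.

```python
# Why vector quality matters: dynamic power scales with switching activity,
# so an unrepresentative vector mispredicts power. The traces below are
# hypothetical per-cycle toggle counts, purely for illustration.

ENERGY_PER_TOGGLE_J = 2e-12   # assumed effective energy per net toggle
CLOCK_HZ = 1.0e9              # assumed clock frequency

# A mostly idle directed test vs. a window captured from a live application run.
idle_vector     = [1_000, 1_200, 900, 1_100, 1_000]
workload_vector = [45_000, 60_000, 52_000, 70_000, 48_000]


def average_dynamic_power(toggles_per_cycle):
    """Average dynamic power implied by a toggle-activity trace."""
    avg_toggles = sum(toggles_per_cycle) / len(toggles_per_cycle)
    return avg_toggles * ENERGY_PER_TOGGLE_J * CLOCK_HZ


print(f"idle-style vector:  {average_dynamic_power(idle_vector):.1f} W")
print(f"realistic workload: {average_dynamic_power(workload_vector):.1f} W")
```

Optimizing against the first trace would badly underestimate what the chip does under the second, which is why realistic activity is the starting point for everything that follows.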

To accurately predict how much power an SoC is going to consume, designers need to exercise the design with a testbench that is true to how it will actually be used. The best platform for running live applications on a design before silicon is emulation.

The sheer amount of data involved in running power analysis for an AI chip requires high-powered tools. Even when running an application for a few seconds on an emulator, the resulting data is massive: hundreds of gigabytes spanning billions or trillions of clock cycles. To help solve this problem, power profiling within an emulation system identifies the windows of interest for power analysis, pruning the candidate cycles from billions down to millions or even thousands, which makes power analysis from an emulation system much more practical.
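
The sketch below gives a simplified picture of the general idea, using a synthetic activity trace, a fixed window size, and a top-N selection rule as stand-ins: compute a cheap per-window activity average as a proxy for power, then hand only the highest-ranking windows to detailed analysis. It is not a description of any particular emulator's profiling feature.

```python
# Coarse power profiling: reduce a huge activity trace to a handful of
# windows of interest for detailed power analysis. The trace is random
# placeholder data standing in for per-cycle activity from an emulation run.

import random

random.seed(0)
CYCLES = 1_000_000          # stand-in for billions of emulated cycles
WINDOW = 10_000             # cycles per profiling window

# Synthetic per-cycle activity with a few bursts, mimicking workload phases.
activity = [random.randint(1_000, 5_000) for _ in range(CYCLES)]
for burst_start in (200_000, 650_000):
    for c in range(burst_start, burst_start + 20_000):
        activity[c] += 40_000


def window_profile(trace, window):
    """Average activity per window: a cheap proxy for average power."""
    return [sum(trace[i:i + window]) / window for i in range(0, len(trace), window)]


profile = window_profile(activity, WINDOW)

# Keep only the highest-activity windows for detailed, expensive analysis.
top = sorted(range(len(profile)), key=lambda i: profile[i], reverse=True)[:5]
print(f"{len(profile)} windows profiled; candidates for detailed analysis:")
for i in sorted(top):
    print(f"  cycles {i * WINDOW:>9,d}-{(i + 1) * WINDOW - 1:,d}: avg activity {profile[i]:,.0f}")
```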

Additionally, a third dimension comes into the picture when designing AI chips that isn't dominant in mobile chip design: temperature. Generating a heat map at an early stage via emulation becomes much more important for the entire design process.
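
As a rough illustration, an early heat map can be as simple as binning per-block power estimates onto a grid of the die and smoothing to approximate lateral heat spreading. The floorplan, block names, power numbers, and smoothing factor in the sketch below are all invented for demonstration.

```python
# Toy early-stage heat map: bin per-block power onto a die grid, then apply a
# few smoothing passes to mimic lateral heat spreading. All numbers invented.

GRID = 8  # 8x8 tiles across the die

# Hypothetical per-block power (W) and the tile each block occupies (row, col).
blocks = {
    "ai_mac_array": (120.0, (2, 2)),
    "hbm_phy":      (35.0,  (0, 6)),
    "noc_router":   (20.0,  (4, 4)),
    "cpu_cluster":  (45.0,  (6, 1)),
}

power = [[0.0] * GRID for _ in range(GRID)]
for watts, (row, col) in blocks.values():
    power[row][col] += watts


def smooth(grid, passes=3, spread=0.2):
    """Blend each tile with its neighbours to approximate heat spreading."""
    for _ in range(passes):
        new = [[0.0] * GRID for _ in range(GRID)]
        for r in range(GRID):
            for c in range(GRID):
                neighbours = [grid[rr][cc]
                              for rr in (r - 1, r, r + 1)
                              for cc in (c - 1, c, c + 1)
                              if 0 <= rr < GRID and 0 <= cc < GRID]
                new[r][c] = (1 - spread) * grid[r][c] + spread * sum(neighbours) / len(neighbours)
        grid = new
    return grid


for row in smooth(power):
    print(" ".join(f"{tile:5.1f}" for tile in row))
```

Even a crude map like this highlights which hot spot (here, the hypothetical MAC array) will dominate the thermal picture long before detailed sign-off analysis is available.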

When it comes to low-power design for AI chips, adopting new methodologies and tools is critical to building a tightly interwoven team of design professionals who come from different disciplines.

Godwin Maben is Low Power Architect and Synopsys Fellow at Synopsys Design Group.