Perceive looks to gain edge in busy edge AI market

The demand for edge AI processing has been ramping up over the last couple of years, and with it a throng of start-ups has surged into the market, aiming to address a set of edge device requirements that may not seem like a natural fit for larger-footprint, more power-hungry chips.

One of those companies, Perceive, recently stepped forward at CES 2023 with an updated version of its Ergo processor – the Ergo 2 AI processor – which is designed to address not just the need for edge intelligence but also the evolution toward ever larger and more complex neural networks inside edge devices. Examples of these larger neural architectures include transformer networks for language and imaging, but, true to edge deployment profiles, they may draw less than 100 milliwatts of compute power.

David McIntyre, Perceive's VP of Marketing, told Fierce Electronics via email that the original Ergo chip may still be the best option for product developers for whom power-efficiency “is really critical, because they’re constrained to battery power, or USB power, for example,” yet they still require AI performance. 

But he said a new segment of the market is emerging: developers who need to run larger neural networks, process video at higher frames per second, run multiple networks at once, or use transformer networks for video or language processing applications. This might include enterprise-grade cameras for security or retail analytics applications, or possibly more complex features in consumer devices like tablets and laptops.

In both scenarios, device developers do not want to sacrifice any single attribute (performance, power efficiency, cost, size, or temperature) for another.

The Ergo 2 addresses increasingly complex applications by running up to four times faster than Perceive’s first-generation Ergo chip and delivering more processing power than typical chips designed for tiny ML edge applications.

Ergo 2 can run multiple heterogeneous networks simultaneously, enabling intelligent video and audio features for devices such as enterprise-grade cameras for security, access control, thermal imaging, or retail video analytics; for industrial use cases including visual inspection; or for integration into consumer products such as laptops, tablets, and advanced wearables.  

Perceive claimed Ergo 2 processing, which is done on-chip and without external memory for better power efficiency, privacy, and security, is able to achieve:

  • 1,106 inferences per second running MobileNet V2

  • 979 inferences per second running ResNet-50 

  • 115 inferences per second running YoloV5-S 
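For a rough sense of scale, the claimed throughput figures can be turned into an average time per inference and into the number of 30-frames-per-second video streams they would cover. The short Python sketch below does that arithmetic. The function names and the 30 fps figure are illustrative assumptions, not anything Perceive published, and the reciprocal of throughput is only an average: on a pipelined chip the actual latency of a single inference may be longer.

```python
# Throughput figures as claimed by Perceive for Ergo 2 (inferences per second).
CLAIMED_THROUGHPUT = {
    "MobileNet V2": 1106,
    "ResNet-50": 979,
    "YoloV5-S": 115,
}


def ms_per_inference(inferences_per_second: float) -> float:
    """Average milliseconds per inference implied by a throughput figure."""
    return 1000.0 / inferences_per_second


def streams_at_fps(inferences_per_second: float, fps: float) -> int:
    """Whole video streams covered if every frame needs one inference.

    Assumes (hypothetically) that throughput divides evenly across streams.
    """
    return int(inferences_per_second // fps)


if __name__ == "__main__":
    assumed_fps = 30  # illustrative camera frame rate, not from the article
    for network, ips in CLAIMED_THROUGHPUT.items():
        print(
            f"{network}: {ms_per_inference(ips):.2f} ms per inference on average, "
            f"about {streams_at_fps(ips, assumed_fps)} stream(s) at {assumed_fps} fps"
        )
```

Under those assumptions, the heaviest of the three networks, YoloV5-S, would still leave room for a few full-rate camera streams on a single chip.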

To provide the performance enhancements needed to run these larger networks, the Ergo 2 chip has been designed with a pipelined architecture and unified memory, which improve its flexibility and overall operating efficiency. As a result, Ergo 2 can support higher-resolution sensors and a wider range of applications, including: 

  • Language processing applications such as speech-to-text and sentence completion  

  • Audio applications such as acoustic echo cancellation and richer audio event detection 

  • Demanding video processing tasks such as video super resolution and pose detection. 

The processor has a 7 mm by 7 mm footprint, and is manufactured by GlobalFoundries using the 22FDX platform. It is designed to operate without requiring external DRAM. Its low power draw also means it doesn’t need cooling. 

Perceive does not perceive the increasingly crowded market, which in addition to Perceive includes edge AI processor specialists like Quadric, Hailo Technologies, Esperanto Technologies and more, as a problem.

“We see competition from a couple of different sides,” McIntyre said. “On the one end, we see customers who are evaluating us vs. an SoC with NPU that may not really be able to run the neural network they want or can’t meet their power budget but is considered easy to implement and is from a known vendor. On the other end, we see competitors with really powerful chips, but they’re too large, too power-hungry, generate too much heat, and/or cost too much to be practical. One way this becomes evident is in looking at the power requirements. There are a lot of solutions for speech and audio processing that draw less than one milliwatt, but are limited to this basic functionality – a small set of voice commands, for example. And there are plenty of chips that will run larger networks for video processing, but they draw over a Watt of power. There’s actually quite a lot of space in between those positions, so while it seems like the market is crowded, we don’t see a lot of solutions occupying this particular space in between. Ergo and Ergo 2 are well-suited to meet the needs in that gap.”