What is a neural network accelerator?

To describe a neural network accelerator, also known as an “AI accelerator,” we first need to define a neural network. Simply put, a “neural network” is just a fancy name for a set of algorithms that performs the clustering and classification of data in machine learning applications. It got that name because early researchers drew an analogy to the way neurons in the human brain work.
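To make that definition concrete, here is a minimal sketch of the building block those algorithms are made of: a single artificial “neuron” that learns to classify points. All of the names and the toy data below are illustrative assumptions, not part of any real framework.

```python
# A toy illustration of the "network of neurons" idea: one artificial
# neuron (a perceptron) that learns to classify 2-D points.

def neuron(x, weights, bias):
    """Weighted sum of the inputs followed by a step activation."""
    total = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if total > 0 else 0

def train(points, labels, epochs=20, lr=0.1):
    """Classic perceptron learning rule: nudge weights toward mistakes."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, label in zip(points, labels):
            error = label - neuron(x, weights, bias)
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

# Classify points by whether y > x (label 1) or not (label 0).
points = [(0, 1), (1, 2), (2, 0), (3, 1)]
labels = [1, 1, 0, 0]
w, b = train(points, labels)
print([neuron(p, w, b) for p in points])  # → [1, 1, 0, 0]
```

A real neural network chains thousands or millions of these weighted-sum operations together, which is exactly the kind of repetitive arithmetic an accelerator is built to churn through.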

A neural network accelerator is a processor that is optimized specifically to handle neural network workloads. As the name implies, it is very efficient in doing its job of taking data and clustering and classifying it at a very fast rate.

The central processing unit (CPU) of a computer is a general-purpose workhorse: flexible, but comparatively slow and power hungry for specialized workloads. Engineers figured out that a highly tuned processor optimized to do only a specific set of tasks can run extremely quickly on low power and help make certain algorithms run much faster.

Examples include graphics processing units (GPUs), which got their start in gaming and graphics, and digital signal processors (DSPs), which are optimized to do signal processing quickly and efficiently.

The theory and the technology behind neural network accelerators have been around for decades, but only within the last five years have the economics allowed the technology to be commercialized. The migration of processing from the cloud to the edge and into embedded devices, which translates to higher demand for heavy compute power at the local level, is also driving market growth.

When it comes to neural network accelerators, there is no one-size-fits-all solution. Many devices are custom-built for specific verticals and applications and may combine several types of accelerators, such as DSPs, GPUs, and a neural network accelerator, all on the same chip.

Chip makers also have to take into account the peripherals—the means by which data gets into and out of the chip—in order to avoid a potential bottleneck or delay in processing. Peripherals will typically vary by vertical.

In the end, why does processing speed matter so much when it comes to neural networks?

Take, for example, the “pipeline of processing” for object detection, classification, and tracking—one of the most challenging tasks in collision avoidance for an autonomous vehicle. The data is collected through camera sensors, GPS, LiDAR, inertial sensors, and so on. That information will likely pass through vision processing algorithms to be enhanced and filtered, and then be sent to the neural network, where the necessary computations must be done quickly enough to avoid a potential collision—all while the vehicle is traveling at 60 miles per hour.
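The stages of that pipeline can be sketched as a chain of functions. The function names and the simple dictionary “frame” format below are assumptions made for illustration, not a real autonomous-driving API; the point is that every stage adds latency, and at 60 mph the car covers roughly 27 meters every second, so the whole chain has a hard time budget.

```python
# A schematic sketch of the pipeline described above:
# sensor data -> vision pre-processing -> neural network inference.
# All names and the dict "frame" format are illustrative assumptions.

def collect_sensor_data():
    """Stand-in for camera, GPS, LiDAR, and inertial sensor input."""
    return {"camera": [0.2, 0.8, 0.5], "speed_mph": 60}

def preprocess(frame):
    """Enhance and filter the raw input (here: normalize pixel values)."""
    peak = max(frame["camera"])
    frame["camera"] = [v / peak for v in frame["camera"]]
    return frame

def infer(frame):
    """Stand-in for the accelerated neural network: classify the frame."""
    score = sum(frame["camera"]) / len(frame["camera"])
    return "obstacle" if score > 0.5 else "clear"

frame = preprocess(collect_sensor_data())
print(infer(frame))

# Why latency matters: distance traveled during inference at 60 mph.
speed_m_per_s = 60 * 1609.344 / 3600  # about 26.8 m/s
for latency_ms in (10, 50, 100):
    distance_m = speed_m_per_s * latency_ms / 1000
    print(f"{latency_ms} ms of latency -> {distance_m:.2f} m traveled")
```

The back-of-the-envelope loop at the end shows that even 100 ms of processing delay means the vehicle travels nearly three meters before it can react, which is why every stage is pushed onto specialized hardware.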

For an autonomous vehicle using machine learning to be considered safe, it needs to “experience” thousands of different scenarios to “train” the software well enough to recognize the multitude of situations it will encounter while driving. While this can be sped up using simulation and modeling, much of what is required is literally operating the vehicle in traffic and letting the machine learning improve over time as it experiences more and more.

RELATED: Intel unveils array of chip research focused on edge data processing