AI Inference Processor Cranks Up Performance Levels

Habana Labs describes its Goya HL-1000 as the world's highest-performance artificial intelligence (AI) inference processor. A PCIe card based on the HL-1000 delivers 15,000 images/second of throughput on the ResNet-50 inference benchmark, with 1.3 ms latency, while consuming 100 W of power.
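
For context, the quoted figures can be combined into two derived metrics: energy efficiency in images per joule, and (via Little's law) the number of images in flight at any instant. The quick check below assumes sustained, steady-state operation; the variable names are illustrative, not Habana's.

    # Back-of-the-envelope derivations from the quoted ResNet-50 figures.
    # Assumes sustained, steady-state throughput at the stated latency.
    throughput_ips = 15_000  # images per second (quoted)
    latency_s = 1.3e-3       # seconds of latency (quoted)
    power_w = 100            # watts (quoted)

    images_per_joule = throughput_ips / power_w  # = 150 images per joule
    in_flight = throughput_ips * latency_s       # Little's law: ~19.5 images in the pipeline

    print(f"{images_per_joule:.0f} images/J, ~{in_flight:.1f} images in flight")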

According to the company, its AI processors offer one to three orders of magnitude better performance than the solutions commonly deployed in data centers today. The Goya platform was designed from the ground up for deep-learning inference, targeting workloads such as image recognition, neural machine translation, sentiment analysis, and recommender systems. It incorporates a fully programmable Tensor Processing Core (TPC™) along with development tools, libraries, and a compiler that together form a comprehensive, high-performance, power-efficient platform.

Habana Labs' SynapseAI software stack analyzes a trained model and optimizes it for efficient inference on the Goya processor. The software includes a rich kernel library, and the toolchain is open so customers can add their own proprietary kernels. It interfaces with popular deep-learning frameworks and model-exchange formats such as TensorFlow and ONNX.
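
As an illustration of the kind of artifact SynapseAI ingests, the sketch below exports a trained ResNet-50 (the network behind the benchmark above) to ONNX. This is a minimal sketch, not Habana's workflow: it uses PyTorch and torchvision, which the article does not mention, and it stops at the ONNX file because the article does not describe SynapseAI's compile step or API.

    # Illustrative pre-deployment step: export a trained ResNet-50 to ONNX,
    # one of the input formats the article says SynapseAI accepts.
    # PyTorch/torchvision are an assumption here, chosen for brevity.
    import torch
    import torchvision.models as models

    model = models.resnet50(weights="IMAGENET1K_V1")  # pretrained ImageNet weights
    model.eval()                                      # switch to inference mode

    dummy = torch.randn(1, 3, 224, 224)  # NCHW input shape ResNet-50 expects

    torch.onnx.export(
        model, dummy, "resnet50.onnx",
        input_names=["input"], output_names=["logits"],
    )
    # resnet50.onnx would then be handed to the vendor toolchain, which
    # compiles and optimizes the graph for the Goya processor; that step
    # is omitted because its API is not described in the article.

For more information, visit Habana Labs.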