Inspur and Altera Launch Speech Recognition FPGA Solution with OpenCL

AUSTIN, TX -- Server vendor Inspur Group and FPGA chipmaker Altera have launched a speech recognition acceleration solution based on Altera's Arria® 10 FPGAs and a DNN algorithm from iFLYTEK, an intelligent speech technology provider in China, at the SC15 conference in Austin, Texas. With the launch, Inspur becomes an HPC systems vendor with heterogeneous computing application capabilities across GPU, MIC and FPGA.

The deep learning speech recognition acceleration solution combines an Altera Arria 10 FPGA, iFLYTEK's deep neural network (DNN) recognition algorithms and Inspur's FPGA-based DNN parallel design, migration and optimization with OpenCL. The hardware platform is a heterogeneous CPU+Arria 10 FPGA architecture, and the software uses the high-level OpenCL programming model to ease migration of code from CPUs to FPGAs.

"Software algorithms for deep learning models need to be fine-tuned and optimized continuously. Server accelerators with fixed functionalities will have increasingly low efficiency over time and waste space and electricity," said Yu Zhenhua, director of technology, iFLYTEK Co., Ltd. "In contrast, FPGAs are flexible, customizable and power-efficient. This is also an important reason that iFLYTEK decided to migrate DNN algorithms to an FPGA platform."

Field-Programmable Gate Arrays (FPGAs), which combine characteristics of an application-specific integrated circuit (ASIC) and a general-purpose chip, can perform data-parallel and task-parallel computing simultaneously, which makes them more efficient for specific applications. FPGAs are currently used in logic control, signal processing and image processing, and more recently in online recognition systems.
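The difference between data-parallel and task-parallel computing can be sketched in a few lines. The example below is a plain Python illustration only, not OpenCL and not the Inspur or iFLYTEK implementation; the function names `data_parallel` and `pipeline` and the toy stages `scale`, `offset` and `relu` are hypothetical. Python runs everything sequentially here, so the sketch shows only the structure of the two models: one kernel applied independently to every element, versus different stages chained so that items stream through them.

```python
from typing import Callable, Iterable, Iterator, List

Stage = Callable[[float], float]


def data_parallel(kernel: Stage, data: List[float]) -> List[float]:
    """Apply one kernel to every element independently.

    No call depends on any other, so parallel hardware would be free
    to execute all of them at the same time.
    """
    return [kernel(x) for x in data]


def pipeline(stages: List[Stage], stream: Iterable[float]) -> Iterator[float]:
    """Chain different stages so each item flows through them in order.

    map() is lazy, so an item passes through every stage before the
    next item is pulled, which mirrors how a hardware pipeline streams
    data from one stage to the next.
    """
    items: Iterator[float] = iter(stream)
    for stage in stages:
        items = map(stage, items)
    return items


# Hypothetical toy stages, chosen only for illustration.
def scale(x: float) -> float:
    return 2.0 * x


def offset(x: float) -> float:
    return x + 1.0


def relu(x: float) -> float:
    return max(0.0, x)


samples = [-1.0, 0.5, 2.0]
same_kernel_everywhere = data_parallel(relu, samples)
streamed_through_stages = list(pipeline([scale, offset, relu], samples))
print(same_kernel_everywhere, streamed_through_stages)
```

On an FPGA, both patterns can be laid out in hardware at once, which is the flexibility the paragraph above refers to.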

"Inspur's Arria 10 FPGA-based deep learning speech recognition solution further demonstrates the performance-per-watt advantages that FPGA accelerators provide," said David Gamba, general manager of the computer & storage business unit at Altera. "This success in solution development will become an important reference for FPGAs in the deep learning field."

Meanwhile, Inspur is also expanding its software cooperation on the speech recognition system, designing OpenCL programming frameworks combined with iFLYTEK's applications, to increase the efficiency of application programming. With these efforts, Inspur can enable the migration of more applications to FPGA-based platforms and foster an FPGA ecosystem, which includes FPGA software, hardware and an applied algorithms library.

Speaking about further cooperation, Hu Leijun, vice president of Inspur, said that Inspur is committed to providing clients with computing solutions that best suit their needs. Given the great performance-per-watt advantages of FPGA-based solutions, Inspur will expand its software cooperation with iFLYTEK and Altera on FPGA-based deep learning online speech recognition applications. Moreover, Inspur will develop FPGA-based system solutions, covering full cabinet computing, Internet and storage solutions, with the aim of making these solutions available for applications and clients in other fields.

In the future, CPU+FPGA will probably become the new heterogeneous computing model for HPC, and more and more HPC applications, data center applications and Internet deep learning applications will use CPU+FPGA solutions.

Highlights of the solution include:

•High performance: When processing 100 bounds data, the DNN running time on two Intel Xeon E5-2650 V2 CPUs (16 cores in total) is 242.027s, while the DNN running time on Altera's Arria 10 FPGA is 84.312s, a speedup of 2.871 times.
•Low power consumption: Altera's Arria 10 FPGA and the two Intel Xeon E5-2650 V2 CPUs consume 30W and 190W respectively, so the FPGA-based system draws 15.7 percent of the power of the CPU system. In an actual test of the DNN algorithms, an FPGA-based system achieved high performance per watt, up to 30 GFLOPS/W, greatly reducing application power costs.
•Easy to program: It took software engineers only four man-months to develop the FPGA-based DNN parallel program with the OpenCL programming model. With traditional low-level hardware description languages, such as Verilog and VHDL, similar development would take at least 12 man-months and would require collaboration between software engineers and hardware engineers.
•High adaptability: FPGAs can execute data-parallel computing with an NDRange model or task-parallel computing with a Pipeline model, which lets them address more applications and bring overall performance improvements to more applications and software.
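The performance and power figures above can be cross-checked with a few lines of arithmetic. The sketch below uses only the four numbers quoted in the list; the variable names are illustrative. The energy comparison at the end is an added assumption, not a claim from the release: it supposes each device draws its quoted power for the full duration of its run. Note that 30W divided by 190W works out to roughly 15.8 percent, which the release appears to state truncated as 15.7 percent.

```python
# Figures quoted in the release.
CPU_SECONDS = 242.027   # two Xeon E5-2650 V2 CPUs
FPGA_SECONDS = 84.312   # one Arria 10 FPGA
CPU_WATTS = 190.0
FPGA_WATTS = 30.0

# Speedup: how many times faster the FPGA run finishes.
speedup = CPU_SECONDS / FPGA_SECONDS

# Power ratio: FPGA draw as a fraction of CPU draw.
power_ratio = FPGA_WATTS / CPU_WATTS

# ASSUMPTION (not stated in the release): each device draws its quoted
# power for the entire run, so energy = power * time.
cpu_energy_joules = CPU_WATTS * CPU_SECONDS
fpga_energy_joules = FPGA_WATTS * FPGA_SECONDS
energy_ratio = cpu_energy_joules / fpga_energy_joules

print(f"speedup:      {speedup:.3f}x")
print(f"power ratio:  {power_ratio:.1%}")
print(f"energy ratio: {energy_ratio:.1f}x (under the constant-power assumption)")
```

Under that constant-power assumption, the energy advantage is the product of the speedup and the inverse power ratio, which is why it comes out larger than either figure alone.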

For more info, go to http://en.inspur.com/inspur/494735/index.html