Voice-Capture Devices Grace IoT and Consumer Designs

XMOS will demonstrate its portfolio of VocalFusion true-stereo voice-capture solutions for far-field, voice-enabled smart TVs, soundbars, and set-top boxes at Computex 2018 in June. The technology captures voice commands for use with Alexa, Google, DuerOS, and other voice-enabled artificial intelligence (AI) and internet of things (IoT) systems. The company will also preview its next-generation VocalSorcery blind-source separation technology.

 

The VocalFusion Stereo Dev Kit accurately captures voice commands from across the room, even in complex, noisy environments and while the same audio appliance is playing content at high volume. The development kit is described as the first far-field linear microphone array solution with stereo acoustic echo cancellation (AEC). It also supports configurable AEC latency: the AEC reference signals can be calibrated and the latency adjusted, which enables after-market far-field voice accessories for existing consumer electronics products. The kit is available as an Amazon Alexa Voice Service qualified solution and can be used with Google, Baidu, or other voice-enabled AI systems.
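
To illustrate why a configurable AEC latency matters, here is a minimal single-channel sketch of an echo canceller built on a normalized least-mean-squares (NLMS) adaptive filter. It is not the XMOS implementation, which is stereo and not described in detail here; the function name `nlms_echo_cancel` and all of its parameters are hypothetical. The `ref_delay_samples` argument stands in for the latency calibration: it aligns the playback reference with the echo that actually reaches the microphone, so a short filter can model it.

```python
def nlms_echo_cancel(mic, reference, ref_delay_samples=0,
                     num_taps=32, step=0.5, eps=1e-8):
    """Subtract an adaptive estimate of the playback echo from the mic signal.

    mic and reference are equal-rate sample sequences (lists of floats).
    ref_delay_samples is an assumed calibration value: how many samples the
    reference must be delayed to line up with the echo in the mic signal.
    Returns the residual (echo-reduced) signal as a list of floats.
    """
    weights = [0.0] * num_taps   # adaptive estimate of the echo path
    history = [0.0] * num_taps   # delayed reference samples, newest first
    residual = []
    for n in range(len(mic)):
        idx = n - ref_delay_samples
        x = reference[idx] if 0 <= idx < len(reference) else 0.0
        history = [x] + history[:-1]

        # Predicted echo and the error left after removing it.
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        err = mic[n] - echo_estimate

        # NLMS update: step size normalized by the reference energy.
        energy = sum(h * h for h in history) + eps
        weights = [w + step * err * h / energy
                   for w, h in zip(weights, history)]
        residual.append(err)
    return residual
```

If the reference delay is calibrated correctly, the echo falls inside the span of the filter taps and the residual shrinks as the filter adapts. If it is off by more than the filter length, the filter has nothing to correlate against and the echo passes through, which is the situation an after-market accessory would face without latency adjustment.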

VocalFusion voice processors deliver voice digital signal processing (DSP), including a full-duplex AEC with barge-in capability that lets users interrupt or pause a device that is playing music, and an adaptive beamformer that follows a speaker. Additional dereverberation, automatic gain control, and noise suppression provide clear voice interaction even in noisy environments. The processor interfaces directly to four PDM microphones in a linear array with 33.33-mm inter-mic spacing, making it viable for integration into flat screens and products placed at the edge of a room.
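
The array geometry above can be made concrete with a short sketch. The code below uses a fixed delay-and-sum beamformer, which is a simpler technique than the adaptive beamformer the article mentions and is shown only to illustrate how a four-microphone linear array with 33.33-mm spacing can be steered. The speed of sound (343 m/s), the function names, and the broadside angle convention are assumptions, not details from XMOS.

```python
import math

SPEED_OF_SOUND_M_S = 343.0   # assumed: air at roughly 20 degrees C
MIC_SPACING_M = 0.03333      # 33.33 mm, from the article
NUM_MICS = 4                 # from the article


def steering_delays(angle_deg, spacing_m=MIC_SPACING_M,
                    num_mics=NUM_MICS, c=SPEED_OF_SOUND_M_S):
    """Arrival delay (seconds) at each mic relative to mic 0.

    Assumes a far-field plane wave arriving at angle_deg measured from
    broadside (0 degrees is straight ahead of the array).
    """
    theta = math.radians(angle_deg)
    return [m * spacing_m * math.sin(theta) / c for m in range(num_mics)]


def spatial_aliasing_limit_hz(spacing_m=MIC_SPACING_M, c=SPEED_OF_SOUND_M_S):
    """Frequency above which half a wavelength is shorter than the spacing."""
    return c / (2.0 * spacing_m)


def delay_and_sum(channels, delays_s, sample_rate_hz):
    """Align channels using whole-sample shifts, then average them.

    channels is a list of equal-length sample lists, one per microphone.
    Samples shifted past either end of a channel are treated as zero.
    """
    shifts = [round(d * sample_rate_hz) for d in delays_s]
    length = len(channels[0])
    out = []
    for n in range(length):
        total = 0.0
        for ch, shift in zip(channels, shifts):
            idx = n + shift
            if 0 <= idx < length:
                total += ch[idx]
        out.append(total / len(channels))
    return out
```

Under the assumed speed of sound, `spatial_aliasing_limit_hz()` comes out at about 5.1 kHz, which covers most of the energy in speech. Signals from the steered direction add coherently after alignment, while sound from other directions is partially averaged out.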

 

VocalSorcery is a blind source separation technology that spatially identifies individual speakers or conversations within a crowded, noisy audio environment to optimize voice capture and input into speech recognition systems. The technology addresses the cocktail-party problem and opens up a wide variety of applications, from video and conference calls to automotive.

 

For more information, check out the VocalFusion data page and visit XMOS.
