Sensory offers speech recognition model that understands children's voices

Coming off a school year during which even the youngest students saw their teachers only through the windows of laptops or tablets, we are more aware of how children engage with technology. As voice and AI increasingly become the interaction points with devices and applications, it makes sense that the next generation of access tools needs to account for the unique needs of children.

That’s what Sensory has done with its new custom-trained speech recognition models for children’s voices, now available for a range of voice-activated devices that process North American English language commands. 

While many speech recognition models and technologies are available, most of them don’t give special attention to how a child’s voice differs from an adult’s in terms of clarity, word patterns, understanding and audibility. Sensory AI and Ireland’s Soapbox Labs are among the few that have made the extra effort.

A spokesman for Santa Clara, California-based Sensory told Fierce Electronics via email that the “children’s speech recognition model was over a decade in the making, with a combination of Sensory-led voice data collection and publicly available speech corpora.” He added that collecting and incorporating children’s voice samples resulted in recognition models that “understand the unique linguistic patterns associated with children’s speech,” and have shown up to a 33% reduction in word error rate compared to speech recognition models designed around adult voices. 

Sensory’s new AI-driven recognizer supports both the company’s TrulyHandsfree phrase spotting technology and its TrulyNatural large vocabulary continuous speech recognizer, and leverages its AI-on-the-edge architecture. This will allow the new models to be used in the development of a range of devices and applications, including children’s toys, wearables designed for kids, and education technology.

For those concerned about privacy, on-device processing at the edge also means that a child’s voice doesn’t have to be sent to the cloud for processing, the company spokesman noted.

Taiwan-based Generalplus Technology, a supplier of integrated circuits for speech and toys, is already using Sensory’s new children’s speech recognition models, and other companies are in the midst of evaluating kid-focused models, Sensory said.

“Sensory’s new kid’s speech recognition has already been integrated into several of our ICs and we are seeing strong interest from our customers,” said Jacky Chen, marketing manager at Generalplus. 

Sensory already has a long list of customers for its adult-based speech recognition, voice control and wake word detection solutions that it may be able to tap into as it markets the kid-voice models. This list includes Samsung, LG, Amazon, Spotify, Snap, Garmin and Honda. Sensory offers a VoiceHub developer portal that eases use of its speech models by allowing developers to directly export to many supported DSP and microcontroller formats, including the newly added Generalplus ICs.
