New Speech Recognition Chip Saves 99% Power
Researchers at MIT’s Microsystems Technology Laboratories have built a low-power chip specialized for automatic speech recognition. With power savings of 90 to 99 percent, it could make voice control practical for relatively simple electronic devices. Image: Jose-Luis Olivares/MIT

Speech recognition is one of the fastest-growing and most commercially promising applications of natural language technology. People with disabilities that prevent them from typing have also adopted speech-recognition systems. The drawback is that the technology consumes a lot of battery power. To address this, MIT scientists have developed a low-power, special-purpose speech recognition chip that sharply reduces the power consumption of electronic devices.

The chip requires only between 0.2 and 10 milliwatts, compared with the roughly 1-watt requirement of speech recognition software running on a smartphone. More interestingly, it saves between 90 and 99 percent of battery power. That means speech recognition could be introduced into many more devices going forward.
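The reported savings can be sanity-checked with simple arithmetic, taking the 1-watt smartphone figure cited above as the baseline:

```python
# Back-of-the-envelope check of the reported power savings against
# a ~1 W smartphone baseline for software speech recognition.
baseline_mw = 1000.0            # ~1 W baseline, expressed in milliwatts

for chip_mw in (0.2, 10.0):     # the chip's reported operating range
    savings = 100.0 * (1 - chip_mw / baseline_mw)
    print(f"{chip_mw} mW -> {savings:.2f}% savings")
```

Against that baseline, the chip's full 0.2–10 mW range works out to savings of 99 percent or more; the 90 percent lower bound quoted by the researchers presumably allows for less favorable baselines.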

Nowadays, speech recognition technology is finding its way into many devices, such as the recently introduced Android Wear 2.0 watches with Google Assistant. According to the scientists, this chip could be a real game-changer.

Research lead Anantha Chandrakasan said, “Speech input will become a natural interface for many wearable applications and intelligent devices. The miniaturization of these devices will require a different interface than touch or keyboard. It will be critical to embed the speech functionality locally to save system energy consumption compared to performing this operation in the cloud.”

The chip saves power through an efficient hardware implementation of speech-recognition networks. It also has a “voice activity detection” circuit that monitors ambient sound and judges whether it might be speech. Only when this circuit gives the green signal does the full recognizer power up and get into action.
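To illustrate the idea (this is not the chip's actual circuit), a crude energy-based voice activity detector can be sketched in a few lines. The frame length and threshold below are made-up illustrative values:

```python
import math

def is_speech(frame, threshold_db=-40.0):
    """Crude energy-based voice activity detection: flag a frame as
    possible speech when its RMS level exceeds a dB threshold.
    The -40 dB threshold is illustrative, not from the MIT chip."""
    rms = math.sqrt(sum(s * s for s in frame) / len(frame))
    level_db = 20 * math.log10(max(rms, 1e-12))  # avoid log(0)
    return level_db > threshold_db

# A near-silent 10 ms frame vs. a 440 Hz tone, both at 16 kHz sampling
silence = [0.0001] * 160
speech = [0.1 * math.sin(2 * math.pi * 440 * n / 16000) for n in range(160)]
print(is_speech(silence), is_speech(speech))  # False True
```

A real always-on detector would be implemented in low-power analog or digital hardware and tuned far more carefully, but the gating principle is the same: the expensive recognizer only wakes up when this cheap check passes.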

Michael Price, a graduate student, said, “The chip that we demonstrated includes a continuous speech recognizer based on hidden Markov models (HMMs). It transcribes an arbitrary-length audio input into a sentence. The transition model is a weighted finite-state transducer (WFST), and the acoustic model is a feed-forward neural network. The same general techniques are used in some software speech recognizers.

We trained models for this recognizer using Kaldi, an open-source toolkit. We used a few different speech datasets for training and testing. The largest recognizer we tested had a vocabulary of 145,000 words and required 7.78 mW for real-time operation. The smallest was a digit recognizer (11 words, including ‘oh’ for zero) which required 172 µW.”
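For readers unfamiliar with HMM decoding, the core operation is the Viterbi algorithm, which finds the most likely state sequence for a series of observations. The toy two-state model below is purely illustrative (the real recognizer uses a WFST transition model and neural-network acoustic scores, and Kaldi works with far larger models):

```python
def viterbi(obs_probs, trans, init):
    """Viterbi decoding over a small HMM.
    obs_probs[t][s] = P(observation at time t | state s)
    trans[i][j]     = P(next state j | current state i)
    init[s]         = P(starting in state s)
    Returns the most likely state sequence."""
    n = len(init)
    score = [init[s] * obs_probs[0][s] for s in range(n)]
    backptrs = []
    for t in range(1, len(obs_probs)):
        new_score, ptrs = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: score[i] * trans[i][j])
            ptrs.append(best_i)
            new_score.append(score[best_i] * trans[best_i][j] * obs_probs[t][j])
        score = new_score
        backptrs.append(ptrs)
    # Backtrack from the best final state
    path = [max(range(n), key=lambda s: score[s])]
    for ptrs in reversed(backptrs):
        path.append(ptrs[path[-1]])
    return path[::-1]

# Toy two-state HMM, e.g. state 0 = "silence", state 1 = "speech"
trans = [[0.7, 0.3], [0.4, 0.6]]
init = [0.6, 0.4]
obs = [[0.9, 0.2], [0.1, 0.8], [0.1, 0.8]]
print(viterbi(obs, trans, init))  # [0, 1, 1]
```

The chip implements this kind of search in dedicated hardware rather than software, which is where much of its power advantage comes from.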

In recent years, demand for IoT devices has grown rapidly; devices like Amazon Alexa and Google Home, for example, have seen considerable success. According to the scientists, this new speech recognition chip might end up setting a new milestone in the industry.

Characteristics of this new speech recognition chip:

  • Includes a circuit to separate ambient noise from speech
  • Requires between 0.2 and 10 milliwatts of power
  • Can potentially improve battery life on devices considerably