
How to Reduce Power Consumption in Always-On Voice Interface Designs

By Stephen Evanczuk

Contributed By Digi-Key's North American Editors

Voice-controlled interfaces such as Amazon Alexa, Apple Siri, and Google Home rely on always-on voice capture to detect the wake word or phrase that initiates complex speech processing algorithms, which often require cloud-based resources. As voice-based control moves into battery-powered devices and adds to the standby power budget of other consumer devices such as TVs, this always-on capability represents a significant power drain and an added design challenge. Using a few low-power devices, however, developers can more easily implement voice-controlled interfaces without compromising the power budget.

This article shows how developers can use sound-activated microelectromechanical systems (MEMS) microphones in combination with low-power processors or codecs to create ultra-low-power, always-on voice-activated designs. By way of example, it will introduce and describe the usage of the VM1010 MEMS microphone from Vesper Technologies and the Apollo3 AMA3B1KK microcontroller from Ambiq Micro in this application.

Moving from low-power to ultra-low-power microphones

Designing sound-activated circuits is not a significant challenge in itself. Using a microphone and little more than an operational amplifier, engineers can easily create circuits able to detect when ambient sound exceeds some preset threshold (for example, see "Make a Switch Sound Activated Using LaunchPad"). The challenge arises in applying these simple methods for always-on detection in designs where low power consumption is a key requirement. For these devices, even the relatively modest current requirements of an active microphone and amplifier can exceed power budgets, particularly in battery-powered designs and in consumer devices with standby power that must comply with Energy Star guidelines.

Designers have for years taken advantage of miniature, low-power electret condenser microphones in low-power electronic devices. For example, subminiature electret microphones such as the Knowles FG series consume at most 50 microamperes (µA) with a 1.3 volt supply.

The emergence of MEMS technologies has enabled manufacturers to create ultra-low-power single-ended output microphones that combine buffer amplifiers and other support circuitry in the same package (Figure 1).

Figure 1: Manufacturers have integrated a MEMS transducer, a buffer amplifier, and a voltage regulator to provide a complete microphone that generates a single-ended voltage output, Vout. (Image source: Vesper Technologies)

The result is an integrated MEMS microphone device that helps reduce the overall cost, complexity, and power draw of audio front-end designs. Yet the need to maintain full power to these robust, power-thrifty microphones means that even the most power efficient sound-activated products continually drain current and do so unproductively during extended periods of quiet.

This issue has been addressed using a type of specialized sound-activated MEMS microphone, such as the VM1010 from Vesper Technologies. Using these devices, developers can further reduce power consumption during inactive periods. Further, by using this microphone along with ultra-low-power microcontrollers or codecs, developers can design sophisticated always-on voice-activated speech interfaces in products with strict requirements for low power consumption.

Sound-activated microphone

In its normal, full-power mode of operation, the VM1010 microphone operates as a conventional high performance single-ended microphone. It consumes 85 µA while providing the captured sound signal on its analog output pin, Vout. In this mode, the microphone converts sound across its full 20 hertz (Hz) to 20 kilohertz (kHz) frequency range with a sensitivity of -38 decibels relative to 1 volt (dBV) at a sound pressure level (SPL) of 94 dB. Unlike some earlier MEMS microphones, the Vesper VM1010 needs only a few milliseconds to recover from blasts of sound at very high SPLs.
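As a rough illustration of what these figures mean for the signal arriving at an ADC, the following Python sketch converts the stated sensitivity into an output voltage. It assumes the microphone output scales linearly with sound pressure. The function names are illustrative and not part of any Vesper tool.

```python
def dbv_to_volts(dbv: float) -> float:
    """Convert a level in dBV (dB relative to 1 V RMS) to volts RMS."""
    return 10 ** (dbv / 20)


def spl_to_pascals(spl_db: float) -> float:
    """Convert sound pressure level (dB relative to 20 uPa) to pascals RMS."""
    return 20e-6 * 10 ** (spl_db / 20)


def output_volts(spl_db: float,
                 sensitivity_dbv: float = -38.0,
                 ref_spl_db: float = 94.0) -> float:
    """Estimate Vout (V RMS) at a given SPL.

    Assumption: the response is linear, so every dB of SPL above or
    below the 94 dB reference shifts the output by the same number of dB.
    """
    return dbv_to_volts(sensitivity_dbv + (spl_db - ref_spl_db))


if __name__ == "__main__":
    print(f"94 dB SPL is about {spl_to_pascals(94.0):.2f} Pa")
    print(f"Vout at 94 dB SPL: {output_volts(94.0) * 1e3:.1f} mV RMS")
    print(f"Vout at 74 dB SPL: {output_volts(74.0) * 1e3:.2f} mV RMS")
```

At the 94 dB SPL reference, which corresponds to about 1 pascal, the stated -38 dBV sensitivity works out to roughly 12.6 mV RMS.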

Unlike conventional MEMS microphones, the VM1010 provides a second, low-power wake-on-sound mode of operation. In this mode, the VM1010 takes advantage of the Vesper ZeroPower Listening subsystem (Figure 2). This is Vesper's unique extension to the conventional MEMS microphone architecture.

Figure 2: The Vesper Technologies VM1010 MEMS microphone extends the conventional MEMS microphone architecture with a specialized ZeroPower Listening subsystem that monitors transducer output and generates a signal on its Dout digital output when the detected sound exceeds a configurable threshold. (Image source: Vesper Technologies)

In its wake-on-sound mode, the VM1010 consumes only 10 µA, which is typically less than the self-discharge current of batteries in portable speakers, or even of those in smartwatches or fitness wearables. In this mode, the device operates with a more limited frequency response from 250 Hz to 6 kHz. Working with this reduced range, the VM1010 can more reliably capture the dominant frequencies of the human vocal range while reducing false positives from different sources of noise in the environment.
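To see how the two operating modes affect the microphone's share of the power budget, the sketch below computes a time-weighted average current from the 85 µA and 10 µA figures quoted above. The fraction of time spent in full-power mode is a hypothetical input that depends entirely on the application.

```python
def mic_average_current_ua(active_fraction: float,
                           active_ua: float = 85.0,
                           wake_on_sound_ua: float = 10.0) -> float:
    """Time-weighted average microphone current in microamperes.

    active_fraction is the share of time (0.0 to 1.0) spent in
    full-power mode; the remainder is spent in wake-on-sound mode.
    """
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must be between 0 and 1")
    return (active_fraction * active_ua
            + (1.0 - active_fraction) * wake_on_sound_ua)


if __name__ == "__main__":
    # Hypothetical usage profiles, for illustration only.
    for fraction in (0.0, 0.05, 0.25, 1.0):
        avg = mic_average_current_ua(fraction)
        print(f"{fraction:>5.0%} active -> {avg:.2f} uA average")
```

With the microphone in full-power mode 5% of the time, for example, this estimate gives an average of 13.75 µA.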

The VM1010 disables its Vout analog output in wake-on-sound mode, but the microphone continues to monitor the external environment for sounds. As sounds occur, the MEMS microphone's piezoelectric element deflects in response, generating a small voltage level internally. When sound pressure increases this internal voltage level above a configurable threshold, the device's integrated comparator circuit responds by setting the VM1010 Dout output to a digital “high” level.

Developers can adjust the VM1010's wake-up sound threshold. If the VM1010's GA1 and GA2 wake-on-sound acoustic threshold pins are left open, the device operates with a maximum acoustic threshold of 89 dB SPL. By connecting a resistor between GA1 and GA2, however, the threshold can be adjusted down to a minimum 65 dB SPL (Figure 3).

Figure 3: Developers can reduce the VM1010's default sound-detection threshold by placing a resistor across the device's GA1 and GA2 pins. (Image source: Vesper Technologies)
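The threshold behavior can be modeled in a few lines of Python. This is a behavioral sketch only: it treats the comparator as an ideal threshold on SPL and deliberately does not attempt to map resistor values to thresholds, which has to come from the VM1010 datasheet. The function names are illustrative.

```python
MAX_THRESHOLD_DB = 89.0  # GA1 and GA2 left open
MIN_THRESHOLD_DB = 65.0  # lowest threshold reachable with a resistor


def dout_is_high(spl_db: float, threshold_db: float = MAX_THRESHOLD_DB) -> bool:
    """Return True if sound at spl_db would trip an ideal comparator."""
    if not MIN_THRESHOLD_DB <= threshold_db <= MAX_THRESHOLD_DB:
        raise ValueError("threshold must be between 65 and 89 dB SPL")
    return spl_db >= threshold_db


def pressure_ratio(high_db: float, low_db: float) -> float:
    """Ratio of sound pressures for two SPL values given in dB."""
    return 10 ** ((high_db - low_db) / 20)


if __name__ == "__main__":
    print(dout_is_high(70.0))                     # default 89 dB threshold
    print(dout_is_high(70.0, threshold_db=65.0))  # lowered threshold
    ratio = pressure_ratio(MAX_THRESHOLD_DB, MIN_THRESHOLD_DB)
    print(f"89 dB vs 65 dB SPL: {ratio:.1f}x sound pressure")
```

The 24 dB adjustment range corresponds to a factor of roughly 16 in sound pressure, so the choice of threshold has a large effect on how often the system wakes.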

The VM1010 microphone's ability to signal its detection of a sound above a preset threshold provides the foundation for reducing power consumption in the complete voice processing chain of a larger design.

Wake-up signal

When designing with the VM1010, developers can use the device as a conventional single-ended analog microphone, connecting the VM1010’s Vout analog signal output to a processor's internal analog-to-digital converter (ADC). In a typical power efficient design, however, developers would make two additional connections between the VM1010's digital pins and the processor's GPIO pins (Figure 4).

Figure 4: Developers can implement the VM1010 MEMS microphone's wake-on-sound feature using a simple interface that requires only two additional digital connections besides the usual analog connection between the microphone’s analog output and an MCU’s ADC. (Image source: Digi-Key, from Vesper Technologies source material)

To use the VM1010's wake-on-sound detection capabilities, its Dout digital output needs to be connected to an interrupt-enabled GPIO, and its mode input port needs to be connected to a separate GPIO.

Using this simple interface, developers can significantly reduce overall system power consumption by limiting the time spent in a fully active state. To switch the VM1010 to its low-power wake-on-sound mode, the processor GPIO is used to set the mode pin to a high level. Typically, the switch to wake-on-sound mode is performed in conjunction with a transition to a low-power sleep state in the processor.

When the VM1010 detects sound and sets Dout high, the resulting signal transition wakes the processor from its sleep state. As part of the return to active mode, the GPIO connected to the mode pin is set to a low level, causing the VM1010 to return to its full-power mode within 200 µs—well within the time needed to capture the input audio waveform. After processing the received audio, the processor can return the VM1010 to wake-on-sound mode and return to a low-power sleep state until the VM1010 issues the next wake-up signal.
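The sequence described above can be sketched as a small simulation. The classes below are hypothetical stand-ins for the microphone and the processor firmware, not a driver or vendor API. The model also assumes that Dout is cleared when the mode pin returns low, which should be checked against the VM1010 datasheet.

```python
from dataclasses import dataclass, field

MODE_SWITCH_US = 200  # time for the microphone to return to full-power mode


@dataclass
class Vm1010Model:
    """Behavioral model of the microphone's two digital pins."""
    threshold_db: float = 89.0
    mode_high: bool = False  # True means wake-on-sound mode
    dout: bool = False

    def set_mode(self, high: bool) -> None:
        self.mode_high = high
        if not high:
            self.dout = False  # assumption: Dout clears in full-power mode

    def hear(self, spl_db: float) -> None:
        if self.mode_high and spl_db >= self.threshold_db:
            self.dout = True


@dataclass
class Controller:
    """Stand-in for firmware driving the mode pin and watching Dout."""
    mic: Vm1010Model
    asleep: bool = False
    log: list = field(default_factory=list)

    def enter_standby(self) -> None:
        self.mic.set_mode(True)  # mode pin high: wake-on-sound
        self.asleep = True
        self.log.append("sleep")

    def on_sound(self, spl_db: float) -> None:
        self.mic.hear(spl_db)
        if self.asleep and self.mic.dout:  # Dout edge acts as the interrupt
            self.asleep = False
            self.mic.set_mode(False)  # mode pin low: full-power mode
            self.log.append(f"wake, wait {MODE_SWITCH_US} us")
            self.log.append("capture")
            self.enter_standby()


if __name__ == "__main__":
    ctrl = Controller(Vm1010Model(threshold_db=65.0))
    ctrl.enter_standby()
    ctrl.on_sound(50.0)  # below threshold: stays asleep
    ctrl.on_sound(70.0)  # above threshold: wake, capture, sleep again
    print(ctrl.log)
```

In real firmware the wait and capture steps would be replaced by a timer delay and ADC sampling, but the ordering of the mode pin changes relative to the sleep and wake transitions would stay the same.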

In a system designed to detect a wake word, developers can simply extend this wake-on-sound sequence with a series of tests performed by the processor. After the VM1010 detects sound and uses its Dout signal to wake the processor, the processor first tests for signs of voice activity. Here, the processor might execute code looking for indicators such as frequency range and duration of the sound. If these indicators suggest a voice signal, the processor would then engage the more substantial processing sequence required to detect the wake word.

Upon detecting the wake word, the processor would initiate the high-level speech processing sequence by communicating with a mobile host over a Bluetooth Low Energy (BLE) or other wireless interface to engage cloud resources for full speech recognition. As each stage of this sequence completes (or fails for some reason), developers can optimize power consumption by returning the VM1010 to wake-on-sound mode and placing the processor in a low-power sleep state (Figure 5).

Figure 5: Developers can reduce power in a wake-word detection design by returning the VM1010 MEMS microphone and processor to low-power states following each operational stage in a wake-word detection sequence. (Image source: Vesper Technologies)
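One way to express this staged sequence in code is as an ordered list of checks that stops at the first failure, so that the more expensive stages run only when the cheaper ones pass. The sketch below is a minimal illustration. The stage names and the placeholder checks are hypothetical and stand in for real voice-activity and wake-word algorithms.

```python
from typing import Callable, List, Sequence, Tuple

Stage = Tuple[str, Callable[[Sequence[float]], bool]]


def run_stages(audio: Sequence[float], stages: Sequence[Stage]) -> List[str]:
    """Run each stage in order and stop at the first one that rejects.

    Returns the names of the stages that passed. The caller would
    return the microphone and processor to their low-power states
    as soon as this function returns.
    """
    passed: List[str] = []
    for name, check in stages:
        if not check(audio):
            break
        passed.append(name)
    return passed


if __name__ == "__main__":
    # Placeholder checks for illustration only, not real detectors.
    stages = [
        ("voice_activity", lambda a: len(a) >= 3),  # long enough to be speech
        ("wake_word", lambda a: sum(a) > 1.0),      # stand-in for a classifier
    ]
    print(run_stages([0.1, 0.1], stages))       # rejected immediately
    print(run_stages([0.1, 0.1, 0.1], stages))  # voice-like, no wake word
    print(run_stages([0.5, 0.5, 0.5], stages))  # both stages pass
```

Ordering the stages from cheapest to most expensive keeps the average active time short, because most sounds are rejected before the costly processing starts.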

Reducing system power consumption

Although a wake-on-sound capability can reduce system power during extended quiet periods, overall system power consumption typically depends on the power-saving characteristics of the processor. The emergence of ultra-low-power processors with integrated connectivity options provides developers with efficient solutions. For example, the Ambiq Micro Apollo3 AMA3B1KK microcontroller consumes as little as 6 μA/megahertz (MHz) in active mode while executing from either flash or random-access memory (RAM) off a 3.3 volt supply rail.

Using the Apollo3, developers can easily implement the kind of power conservation sequences described earlier. For example, after using Apollo3 GPIOs to set the VM1010 in wake-on-sound mode, developers can issue a wait-for-interrupt (WFI) instruction to put the microcontroller into a deep sleep mode. In its deep sleep mode, the microcontroller consumes as little as 3 µA while retaining 384 kilobytes (Kbytes) of static RAM, or less than 1 µA with no static RAM retention (all off a 1.8 volt supply). Radio operations, typically a dominant source of power consumption in wireless designs, require only about 3 milliamperes (mA) for receive (Rx) and transmit (Tx) with the Apollo3's integrated radio subsystem.
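For a first-order feel for the combined budget, the sketch below adds the microphone and microcontroller figures quoted in this article into a single average current. It is a rough estimate under stated assumptions: the clock frequency, active fraction, and battery capacity are hypothetical inputs, radio current is ignored, and the quoted figures apply at different supply voltages, so the result is indicative only.

```python
def mcu_active_ua(clock_mhz: float, ua_per_mhz: float = 6.0) -> float:
    """Active-mode MCU current from the quoted uA/MHz figure."""
    return clock_mhz * ua_per_mhz


def system_average_ua(active_fraction: float,
                      clock_mhz: float,
                      mcu_sleep_ua: float = 3.0,   # deep sleep, RAM retained
                      mic_active_ua: float = 85.0,
                      mic_standby_ua: float = 10.0) -> float:
    """Time-weighted average of microphone plus MCU current (uA)."""
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must be between 0 and 1")
    active = mcu_active_ua(clock_mhz) + mic_active_ua
    standby = mcu_sleep_ua + mic_standby_ua
    return active_fraction * active + (1.0 - active_fraction) * standby


def battery_life_hours(capacity_mah: float, average_ua: float) -> float:
    """Ideal battery life, ignoring self-discharge and converter losses."""
    return capacity_mah * 1000.0 / average_ua


if __name__ == "__main__":
    # Hypothetical inputs: 48 MHz clock, 2% active time, 200 mAh cell.
    avg = system_average_ua(active_fraction=0.02, clock_mhz=48.0)
    print(f"average current: {avg:.1f} uA")
    print(f"ideal battery life: {battery_life_hours(200.0, avg):.0f} hours")
```

Even with these simplifications, the estimate shows why the standby term tends to dominate: at low active fractions, most of the average comes from the sleep and wake-on-sound currents.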

Just as important, the extensive set of capabilities integrated in advanced microcontrollers such as the Apollo3 helps simplify design, reduce footprint size, and shorten the bill of materials. For example, along with its integrated radio subsystem, the Apollo3 microcontroller includes a highly integrated power management unit (PMU) comprising multiple low-dropout (LDO) regulators and buck converters.

Using these integrated capabilities, developers can typically create a complete design with the Vesper VM1010 MEMS microphone, Ambiq Micro Apollo3 microcontroller, and a minimal set of external passive components. In fact, developers can use the Apollo3’s integrated PMU in an LDO mode that even eliminates the external capacitors and inductors needed for the buck converters.

The Apollo3's integrated 14-bit ADC subsystem further enhances both design efficiency and power optimization. For example, it’s possible to reduce ADC power consumption by placing the ADC subsystem in different low-power modes. In the ADC's low-power mode 1, for example, the ADC controller powers off its clocks and buffers while retaining ADC calibration data. When operating in this mode, the Apollo3 ADC takes less than 70 µs to initiate signal conversion from an analog source such as the VM1010 microphone. It’s worth noting that the ADC can complete conversions in this mode while the processor core remains in a deep sleep state.

Identification of the wake word will usually require a fully active processor, as noted earlier. Consequently, it will be necessary to transition the microcontroller from deep sleep to run mode. For this transition, the Apollo3's integrated PMU works with the device's Wake-Up Interrupt Controller to restore power and system state.

The wake-up time from deep sleep is only 15 µs. Combined with the 200 µs needed to transition the VM1010 to full-power mode, the overall time to transition to active mode is still within the time required to capture the audio input signal using the VM1010 microphone.
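The timing arithmetic can be checked with a short calculation. The sketch below adds the two quoted delays and expresses the result in audio sample periods. The 16 kHz sample rate is an assumption chosen for illustration, as is the option of treating the two delays as overlapping rather than sequential.

```python
CORE_WAKE_US = 15.0         # microcontroller wake-up from deep sleep
MIC_MODE_SWITCH_US = 200.0  # VM1010 return to full-power mode


def wake_latency_us(core_wake_us: float = CORE_WAKE_US,
                    mic_switch_us: float = MIC_MODE_SWITCH_US,
                    sequential: bool = True) -> float:
    """Total latency before audio capture can start.

    sequential=True adds the delays (worst case). sequential=False
    assumes they overlap fully, so only the longer one counts.
    """
    if sequential:
        return core_wake_us + mic_switch_us
    return max(core_wake_us, mic_switch_us)


def sample_periods(latency_us: float, sample_rate_hz: float) -> float:
    """Number of sample periods that elapse during the latency."""
    return latency_us * 1e-6 * sample_rate_hz


if __name__ == "__main__":
    total = wake_latency_us()
    print(f"worst-case latency: {total:.0f} us")
    # Assumed sample rate of 16 kHz, for illustration only.
    print(f"sample periods at 16 kHz: {sample_periods(total, 16_000):.2f}")
```

The worst-case total of 215 µs spans fewer than four sample periods at the assumed 16 kHz rate, a small fraction of the duration of a spoken syllable.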

Enhanced audio designs

For applications with more complex audio processing requirements, designers can use the VM1010 in combination with digital MEMS microphones such as the Vesper VM3000. This device uses an integrated ADC to generate a pulse-density modulation (PDM) representation of the captured signal (Figure 6).

Figure 6: Vesper’s VM3000 MEMS microphone integrates an ADC to provide a clock-enabled digital data output stream. (Image source: Vesper Technologies)

Using the VM3000's clock-enabled output, developers can multiplex a pair of these microphones on a single data line to more easily implement beamforming arrays used to enhance voice pickup sensitivity. During quiet periods, developers can conserve power by turning off the clocks to the VM3000, thereby placing the device in sleep mode where current draw falls to 2.5 µA.
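To illustrate how two microphones can share one data line, the sketch below separates an interleaved PDM bitstream into two channels and estimates each channel's signal level from its bit density. It assumes the common PDM convention in which one microphone drives the line on one clock edge and the second on the other, so the bits alternate. The VM3000 datasheet should be consulted for the actual timing. A real design would use a decimation filter rather than a simple average.

```python
from typing import List, Sequence, Tuple


def demux_pdm(bits: Sequence[int]) -> Tuple[List[int], List[int]]:
    """Split an interleaved bitstream into two microphone channels.

    Assumption: even-indexed bits come from the first microphone and
    odd-indexed bits from the second.
    """
    return list(bits[0::2]), list(bits[1::2])


def pdm_level(bits: Sequence[int]) -> float:
    """Estimate the signal level in [-1, 1] from the density of ones."""
    if not bits:
        raise ValueError("bits must not be empty")
    return 2.0 * sum(bits) / len(bits) - 1.0


if __name__ == "__main__":
    # Made-up bit pattern for illustration only.
    line = [1, 0, 1, 0, 1, 1, 1, 0]
    first, second = demux_pdm(line)
    print(first, pdm_level(first))
    print(second, pdm_level(second))
```

Once the two channels are separated and filtered, they can be fed to a beamforming algorithm of the kind mentioned above.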

An enhanced audio design would extend the basic design comprising the VM1010 and processor. Using processor GPIOs, developers can combine the VM1010 and VM3000 microphones using the former to wake the system and an array of the latter for voice recording.

Getting a voice-activated design off the ground

To start experimenting with voice-activated designs, developers can use evaluation boards from Vesper and Ambiq Micro. The Vesper S-VM1010-C and S-VM3000-C omnidirectional MEMS microphone audio evaluation boards bring out the pins from the VM1010 and VM3000, respectively. To prototype audio processing designs based on the VM1010 wake-on-sound capability, it’s possible to use a card-edge connector such as the CW Industries CWR-170-10-0000 to breadboard the MEMS microphone evaluation boards with the Ambiq Micro AMA3BEVB Apollo3 evaluation board.

Conclusion

Always-on voice interfaces offer the convenience of using speech to control products or interact with cloud-based services. Using a specialized MEMS microphone in combination with an ultra-low-power processor, developers can implement always-on capabilities while meeting power consumption requirements.


About this author

Stephen Evanczuk

Stephen Evanczuk has more than 20 years of experience writing for and about the electronics industry on a wide range of topics including hardware, software, systems, and applications including the IoT. He received his Ph.D. in neuroscience on neuronal networks and worked in the aerospace industry on massively distributed secure systems and algorithm acceleration methods. Currently, when he's not writing articles on technology and engineering, he's working on applications of deep learning to recognition and recommendation systems.
