SpectroMic – AIStorm

SpectroMic KWS Solution

AIS240A Based Key Word Spotting Solution

SpectroMic's KWS

AIStorm’s SpectroMic KWS is a key word spotting solution built around the AIS240A SpectroMic. It combines a MEMS microphone, a smart voice activity detector (VAD), a charge-domain spectral engine, and AI model libraries compatible with popular microcontrollers, allowing rapid deployment of KWS solutions in IoT AI applications.

Key advantages of SpectroMic™ KWS

Integrated solution: MEMS microphone, smart VAD, and charge-domain spectral engine in a 5.5×5.5 mm microphone package
18 µA always-on current: more than 10× lower standby power than conventional digital mics
Smart VAD: maintains always-on operation, reduces false triggers, can adapt to noisy environments, and does not lose words when emerging from low-power mode
Charge-domain spectral processing: offloads heavy signal processing from the MCU, often eliminating the need for a separate DSP, and provides a continuous digital stream from the integrated spectral engine over SPI
Spectral rolling buffer: stores only the required spectral components, reducing rolling-buffer power by 10× and memory by 8× (patents pending); digital time-domain data is made available for verification by online systems only once words are recognized
Configurable downloadable models: supports multiple recognition libraries for popular Cortex-M33 and RP2350 MCUs

How It Works

Traditional analog MEMS microphones stream a continuous analog signal that an always-awake MCU must digitize. The alternative is a digital mic, which consumes hundreds of µW and adds cost. Even when these legacy mics offer a voice-activity detector (VAD), background noise often pushes them into high-power mode, and their slow recovery from VAD can miss the first syllables of a word, or entire words. SpectroMic addresses these problems: its charge-domain spectral engine turns incoming sound into a compact spectral image and makes it available digitally over the SPI bus, while a smart VAD can wake the MCU and adapt to ambient noise. For smart-speaker-enabled devices, only the required spectra need to be stored in the rolling buffer, minimizing power and memory while still providing, when necessary, the restored digital time-domain information required by online branded smart-speaker verification systems.
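To make the idea of a compact spectral image concrete, here is a minimal software sketch of the general technique. It is not AIStorm's charge-domain implementation, which performs this work in hardware. The 32 bins match the Key Features list and 16 kHz is the ADC rate quoted for conventional systems; the 256-sample frame length, the equal-width bins, and all function names are illustrative assumptions.

```python
import cmath
import math

SAMPLE_RATE_HZ = 16_000  # assumed; the text quotes 16 kHz for a conventional ADC path
FRAME_SAMPLES = 256      # assumed frame length (16 ms at 16 kHz)
NUM_BINS = 32            # matches the 32 definable spectral bins in Key Features


def frame_to_bins(frame):
    """Collapse one audio frame into NUM_BINS band energies using a plain DFT."""
    n = len(frame)
    half = n // 2
    # Power at each positive-frequency DFT point 0 .. n/2 - 1.
    power = []
    for k in range(half):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        power.append(abs(acc) ** 2)
    per_bin = half // NUM_BINS  # DFT points summed into each output bin
    return [sum(power[b * per_bin:(b + 1) * per_bin]) for b in range(NUM_BINS)]


def spectral_image(samples):
    """Stack per-frame bin vectors into a time-by-frequency image."""
    starts = range(0, len(samples) - FRAME_SAMPLES + 1, FRAME_SAMPLES)
    return [frame_to_bins(samples[i:i + FRAME_SAMPLES]) for i in starts]


# Example: two frames of a 1 kHz tone.
tone = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE_HZ)
        for t in range(2 * FRAME_SAMPLES)]
image = spectral_image(tone)
loudest_bin = max(range(NUM_BINS), key=lambda b: image[0][b])
print(len(image), "frames x", len(image[0]), "bins; loudest bin:", loudest_bin)
```

With these assumed settings each bin spans 250 Hz, so the 1 kHz tone falls in bin 4 (1000 to 1250 Hz). A host MCU would receive frames like these over SPI instead of computing them itself.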

Background Noise Adaptation

In the accompanying video, SpectroMic adapts to background noise in a bar. At first the two LEDs show that SpectroMic is being triggered almost continuously. After a short time, however, SpectroMic has adapted to the background noise and the LEDs go dark, indicating that there is no spectral content of interest. After adapting, SpectroMic is back to drawing its 18 µA input current until spectra of interest are found.

Despite the background noise, SpectroMic can still process words of interest spoken over the noise of the bar, as the LED activity in response to the spoken words shows. The LED colors indicate which word was recognized.
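One common way to get the behavior described above is to track a slowly moving noise floor and trigger only when a frame rises well above it. The sketch below illustrates that general technique; it is not AIStorm's VAD algorithm. The smoothing factor, trigger margin, starting floor, and energy values are all assumptions chosen for the example.

```python
def adaptive_triggers(energies, alpha=0.05, margin=4.0, initial_floor=1.0):
    """Return one True/False trigger per frame energy.

    A frame triggers when its energy exceeds `margin` times the current
    noise floor. The floor is an exponential moving average of every
    frame, so sustained background noise gradually raises it.
    """
    floor = initial_floor
    triggers = []
    for energy in energies:
        triggers.append(energy > margin * floor)
        floor += alpha * (energy - floor)  # slow adaptation toward the ambient level
    return triggers


# 30 frames of steady bar noise (energy 10), then 3 frames of speech (energy 100).
energies = [10.0] * 30 + [100.0] * 3
triggers = adaptive_triggers(energies)
print("triggers during noise:", sum(triggers[:30]))
print("triggers during speech:", sum(triggers[30:]))
```

In this example the first four noise frames trigger, the remaining noise frames do not, and all three speech frames trigger, mirroring the LEDs going dark and then responding to spoken words.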

Google 10 DataSet Example

In the video below, SpectroMic KWS recognizes words from the Google 10 dataset. Note how quickly the words are recognized (the red box shows the recognized words). In this example the Google 10 dataset is implemented with only 23.7k parameters on a very low-cost Raspberry Pi microcontroller (RP2040), with an inference time of 261 ms and an accuracy of 90.16%. This model is available for download, and more complex models are also available for Raspberry Pi and other microcontrollers. The implementation demonstrates how SpectroMic’s spectral engine takes the burden off the microcontroller, so that even a low-end, low-cost microcontroller, or a portion of the resources of a host microcontroller doing other things, can implement voice interaction.
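A rough back-of-the-envelope check shows why a model of this size suits a small microcontroller. The parameter count and inference time come from the text above, and 264 kB is the RP2040's on-chip SRAM. The weight storage formats are assumptions, since the text does not say how the weights are stored, and the estimate covers weights only, not activations or code.

```python
PARAMETERS = 23_700             # from the text: 23.7k parameters
INFERENCE_SECONDS = 0.261       # from the text: 261 ms per inference
RP2040_SRAM_BYTES = 264 * 1024  # RP2040 on-chip SRAM

# Assumed storage formats; the text does not state which one is used.
BYTES_PER_WEIGHT = {"int8": 1, "float32": 4}

footprint = {fmt: PARAMETERS * width for fmt, width in BYTES_PER_WEIGHT.items()}
for fmt, weight_bytes in footprint.items():
    share = 100 * weight_bytes / RP2040_SRAM_BYTES
    print(f"{fmt}: {weight_bytes} bytes of weights, {share:.1f}% of SRAM")

inferences_per_second = 1 / INFERENCE_SECONDS
print(f"about {inferences_per_second:.1f} inferences per second")
```

Under either assumed format the weights occupy well under half of the RP2040's SRAM.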

Spectral Rolling Buffer (Charge Domain)

In smart speaker applications such as Alexa(TM), Bixby(TM) or Siri(TM), the edge device must maintain a rolling buffer so that online systems can verify a word before it is accepted. For example, it might be necessary to continuously store the last 2 seconds of acoustic information; once the local edge system believes it has identified a word, that 2 s of information and the candidate word are sent to the cloud for verification. Normally a 16 kHz ADC stores this information continuously, requiring 32 kB of memory and a lot of power. SpectroMic KWS can reduce this power by 10× and the memory by 8× by instead storing only the necessary components of the digital output of the spectral engine. More specifically, these online systems do not require all the spectra that make sound pleasurable to our ears; they need only the minimum for their AI algorithms to identify a word. This information is a subset of the overall bandwidth stored in a standard system, and it is further compacted by its conversion to AIStorm’s spectral format.

*Alexa, Bixby and Siri are Trademarks of Amazon, Samsung and Apple respectively.
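The rolling-buffer figures in this section can be reproduced with simple arithmetic, and the rolling behavior can be sketched with a fixed-length queue. This is an illustrative sketch only. The one-byte sample width is an assumption chosen because it reproduces the quoted 32 kB, and the frame rate and bytes per frame are hypothetical values picked so that the spectral buffer comes out 8× smaller; none of them is taken from AIStorm documentation.

```python
from collections import deque

SAMPLE_RATE_HZ = 16_000  # from the text
WINDOW_SECONDS = 2       # from the text
BYTES_PER_SAMPLE = 1     # assumption: reproduces the quoted 32 kB

raw_bytes = SAMPLE_RATE_HZ * WINDOW_SECONDS * BYTES_PER_SAMPLE  # 32,000 bytes
spectral_bytes = raw_bytes // 8                                 # 8x reduction from the text

# Hypothetical framing that yields the 8x smaller buffer:
# 100 frames/s * 2 s * 20 bytes/frame = 4,000 bytes.
FRAMES_PER_SECOND = 100
BYTES_PER_FRAME = 20
buffer = deque(maxlen=FRAMES_PER_SECOND * WINDOW_SECONDS)


def push_frame(frame: bytes) -> None:
    """Append the newest spectral frame; the oldest is dropped once full."""
    buffer.append(frame)


# Simulate 5 seconds of frames; only the most recent 2 seconds are retained.
for i in range(500):
    push_frame(bytes([i % 256]) * BYTES_PER_FRAME)

buffered_bytes = sum(len(frame) for frame in buffer)
print(raw_bytes, spectral_bytes, len(buffer), buffered_bytes)
```

When the local model flags a candidate word, the contents of the buffer are what would be handed on for verification.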

Package

SpectroMic is packaged in a standard 5.5×5.5 mm microphone “can” assembly. Few external components are required, which minimizes board area.

Key Features

Integrated MEMS Microphone

Activity Detector – Frequency and Amplitude

1.2V Input or 1.8V Input LDO

Digitally Programmable AGC

Front End for Key Word Spotting (KWS)

Charge Domain Spectral Rolling Buffer

SPI Interface

Efficient Charge Pump & MEMS Interface

5.5×5.5mm Microphone Package

Generates Digital Spectral Image

Works with External AI Microcontrollers or GPUs

On Board Filters

Sound to Spectral Converter with 32 Definable Spectral Bins

18 µA Always-On Supply Current

Applications

Key Word Spotting (KWS) · Sound Spotting (e.g., Glass Break, Gunshot) · Heart Rate Variability (HRV) · Vibration Monitoring · Audio Monitoring

Copyright AIStorm 2025.
All Rights Reserved.
