from Hacker News

Ask HN: Best method to classify short audio events in real time?

by desertraven on 12/5/22, 6:54 AM with 5 comments

I don't have too much experience with statistics (or ML), but a lot of the articles I've found are quite complicated for something I expected to be simple.

There are four distinct sounds I need to detect in real time with an embedded device. Think of a clap sensor, but with four different-sounding claps.

How might I go about this? How much training data (if any) do I need to collect? Is there an off-the-shelf method to classify a few different audio events with a high degree of accuracy, and then deploy that on a microcontroller (or even a computer at this point)?

Thanks!

by tkanarsky on 12/5/22, 9:18 AM
Edge Impulse does everything you described. It has a really nice web UI that lets you collect and annotate data, extract features, train a model, and bake it into a microcontroller image for inference. It supports a good chunk of microcontrollers and SBCs out there.
https://docs.edgeimpulse.com/docs/development-platforms/full...
by t0mas88 on 12/5/22, 10:00 AM
This book has a good example using TensorFlow Lite on a microcontroller for speech recognition of a few commands, which would probably work for different sounds: https://www.amazon.com/TinyML-Learning-TensorFlow-Ultra-Low-...
(And it's a nice book overall, very easy to read, with examples that are easy to follow.)
by simne on 12/5/22, 4:46 PM
I think this task is a lot more about DSP and very little about ML, just Bayes classification (I will write on it later).
Best book I know on DSP:
The Scientist and Engineer's Guide to Digital Signal Processing By Steven W. Smith, Ph.D.
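
For anyone wondering what "mostly DSP plus Bayes classification" could look like in practice, here is a minimal sketch in plain Python. It is not the commenter's method, just one plausible reading of it: the Goertzel algorithm measures signal power at a few probe frequencies per frame, and a Gaussian naive Bayes classifier picks the most likely class from those log-power features. The sample rate, frame length, probe frequencies, and the synthetic decaying tones that stand in for recorded sounds are all assumptions made for illustration; real claps would need real recordings and probably different features.

```python
import math
import random

SAMPLE_RATE = 8000                  # Hz (assumed)
FRAME_LEN = 256                     # samples per analysis frame (assumed)
BANDS_HZ = [500, 1000, 2000, 3000]  # hypothetical probe frequencies


def goertzel_power(frame, freq_hz, sample_rate=SAMPLE_RATE):
    """Signal power at a single frequency via the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / sample_rate)
    s_prev, s_prev2 = 0.0, 0.0
    for x in frame:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def features(frame):
    """Feature vector: log power at each probe frequency."""
    return [math.log(goertzel_power(frame, f) + 1e-9) for f in BANDS_HZ]


class GaussianNaiveBayes:
    """Per-class, per-feature Gaussian model with class priors."""

    def fit(self, X, y):
        self.stats = {}
        for label in set(y):
            rows = [x for x, lab in zip(X, y) if lab == label]
            n = len(rows)
            cols = list(zip(*rows))
            means = [sum(c) / n for c in cols]
            # Small constant keeps the variance strictly positive.
            variances = [sum((v - m) ** 2 for v in c) / n + 1e-6
                         for c, m in zip(cols, means)]
            self.stats[label] = (math.log(n / len(X)), means, variances)
        return self

    def predict(self, x):
        def log_posterior(label):
            log_prior, means, variances = self.stats[label]
            return log_prior + sum(
                -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
                for v, m, var in zip(x, means, variances))
        return max(self.stats, key=log_posterior)


def synthetic_event(freq_hz, rng):
    """Decaying noisy tone: a stand-in for a recorded sound, not real data."""
    return [math.exp(-4.0 * n / FRAME_LEN)
            * math.sin(2.0 * math.pi * freq_hz * n / SAMPLE_RATE)
            + rng.gauss(0.0, 0.05)
            for n in range(FRAME_LEN)]


# Train on 20 synthetic examples per class (class i <-> BANDS_HZ[i]).
rng = random.Random(0)
train_X, train_y = [], []
for label, freq in enumerate(BANDS_HZ):
    for _ in range(20):
        train_X.append(features(synthetic_event(freq, rng)))
        train_y.append(label)

model = GaussianNaiveBayes().fit(train_X, train_y)

# Classify one fresh frame, as a real-time loop would do per incoming frame.
print(model.predict(features(synthetic_event(2000, random.Random(42)))))
```

Both the Goertzel loop and the classifier need only a handful of multiplies and adds per sample, so this kind of pipeline is plausible to port to C on a small microcontroller. Whether it is accurate enough depends entirely on how separable the four real sounds are in the chosen features.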