Researchers use population monitoring to understand how native birds react to changes in the environment and to conservation efforts. But many of these endangered birds are isolated in difficult-to-access, high-elevation habitats. With physical monitoring difficult, scientists have turned to sound recordings. Known as bioacoustic monitoring, this approach could provide a passive, low-labor, and cost-effective strategy for studying endangered bird populations.
SPRESENSE™ is a low-power development board equipped with a GPS receiver and support for High-Resolution Audio codecs. It is versatile enough for a wide range of IoT applications, such as a drone that uses the GPS receiver and high-performance processor, or a smart speaker that uses High-Resolution Audio recording and playback along with the built-in full-digital amplifier; it is this audio capability that interests me here.
We can interface the Sony Spresense board with Edge Impulse. The first steps to follow are described at https://docs.edgeimpulse.com/docs/development-boards/sony-spresense. Please follow the instructions carefully to avoid failures: install the CLI, update the bootloader and firmware, set your keys, and verify that the device is connected.
If you want more information about how to use microphones with the Spresense board, follow this tutorial (a minimal recording sketch is also shown below): https://developer.sony.com/develop/spresense/tutorials-sample-projects/spresense-tutorials/using-multiple-microphone-inputs-with-spresense
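For quick reference, here is a minimal WAV-recording sketch for the Spresense Arduino environment, condensed from Sony's recorder examples. It assumes an SD card with the audio DSP binaries installed in /mnt/sd0/BIN; the file name, 16 kHz sample rate, and 5-second duration are illustrative choices, not requirements.

#include <SDHCI.h>
#include <Audio.h>

SDClass theSD;
AudioClass *theAudio;
File recFile;
uint32_t startMs;

void setup() {
  theSD.begin();
  theAudio = AudioClass::getInstance();
  theAudio->begin();
  // Use the microphone input and record 16 kHz mono WAV;
  // the codec DSP is loaded from /mnt/sd0/BIN on the SD card
  theAudio->setRecorderMode(AS_SETRECDR_STS_INPUTDEVICE_MIC);
  theAudio->initRecorder(AS_CODECTYPE_WAV, "/mnt/sd0/BIN",
                         AS_SAMPLINGRATE_16000, AS_CHANNEL_MONO);
  recFile = theSD.open("bird.wav", FILE_WRITE);
  theAudio->writeWavHeader(recFile);
  theAudio->startRecorder();
  startMs = millis();
}

void loop() {
  // Drain the captured frames to the SD card
  theAudio->readFrames(recFile);
  // Stop after roughly 5 seconds of audio
  if (millis() - startMs > 5000) {
    theAudio->stopRecorder();
    theAudio->closeOutputFile(recFile);
    recFile.close();
    theAudio->setReadyMode();
    theAudio->end();
    while (true) {}
  }
}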
3. Building a machine learning model
We need a lot of bird audio, and it is impractical to collect data that is both high in quantity and high in quality myself, so I took recordings of specific birds from a Kaggle community dataset. Kaggle allows users to find and publish datasets, explore and build models in a web-based data-science environment, and work with other data scientists and machine learning engineers. Initially, I picked three endangered birds that are commonly found in North America.
- Gadwall (Mareca strepera)
- Western Meadowlark (Sturnella neglecta)
- Pectoral Sandpiper (Calidris melanotos)
In this project I will try to develop a model that recognizes these three endangered North American birds.
4. Data acquisition
I downloaded around 120 audio files for each bird and preprocessed them with the Audacity software, then augmented the data and reduced noise. This produced 386 files of roughly 5 seconds each for training and 97 files of roughly 5 seconds each for testing, across all four of the labels mentioned above (the three birds plus a noise label). The result is a balanced dataset with 32 minutes and 25 seconds of training data (80%) and 8 minutes and 9 seconds of test data (20%).
You can create an impulse once the training data is in place. An impulse takes raw data, slices it into smaller windows, extracts features with signal processing blocks, and then classifies new data with a learning block. Signal processing blocks make raw data easier to work with by always returning the same values for the same input, whereas learning blocks learn from previous experience. After some experimenting, I settled on the following parameters:
- Window Size = 2500 ms
- Window Increase = 500 ms
We can build overlapping windows by setting a window increase that is smaller than the window size. Each overlapping window is a distinct training example that carries the sample's label, even though neighbouring windows contain similar data, so overlapping windows let us make the most of the training data. For example, each 5-second clip with a 2500 ms window and a 500 ms increase yields roughly (5000 - 2500) / 500 + 1 = 6 windows. Now, onto selecting the blocks:
- Processing Block - Audio (MFE)
- Learning Block - Neural Network (Keras)
After exploring a few variations of the Mel-filterbank energy (MFE) feature parameters, I found that the default values worked best.
The default values of the Neural Network model in the Edge Impulse studio gave the best results. In the end I got 92.6% accuracy and 0.30 loss.
To check that the model performs as well on new, previously unseen data as it does on the training data, we can use the studio's 'Live classification' option.
Click on Live classification in the left-hand menu. Your device should show up in the 'Classify new data' panel. Capture 5 seconds of audio data.
I performed live classification on bird audio samples played back on a tape recorder, using both my smartphone and the Sony Spresense board to capture the sound.
9. Deploying the model to the device
In your Edge Impulse project, go to Deployment.
Create the full library, which contains the impulse and all required external libraries: select C++ library and click Build to create it.
Download and extract the .zip file. The exported library runs the full signal processing pipeline on raw data and then classifies the output.
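Under the hood, application code feeds a buffer of raw audio samples into the library, which runs the MFE block and the neural network. A rough sketch of that call sequence, based on Edge Impulse's standalone inferencing examples (the raw_features buffer and its contents are placeholders):

#include <string.h>
#include "edge-impulse-sdk/classifier/ei_run_classifier.h"

// One window of raw audio samples (placeholder values; in practice this is
// filled from the microphone or pasted from the studio's 'Raw features')
static const float raw_features[] = { 0 /* ... */ };

static int raw_feature_get_data(size_t offset, size_t length, float *out_ptr) {
    memcpy(out_ptr, raw_features + offset, length * sizeof(float));
    return 0;
}

void classify_window(void) {
    // Wrap the buffer in a signal the classifier can read from
    signal_t signal;
    signal.total_length = sizeof(raw_features) / sizeof(raw_features[0]);
    signal.get_data = &raw_feature_get_data;

    // Run the DSP (MFE) block and the neural network
    ei_impulse_result_t result = { 0 };
    if (run_classifier(&signal, &result, false) != EI_IMPULSE_OK) {
        return;
    }

    // One score per label: the three birds plus noise
    for (size_t i = 0; i < EI_CLASSIFIER_LABEL_COUNT; i++) {
        ei_printf("%s: %d%%\n", result.classification[i].label,
                  (int)(result.classification[i].value * 100));
    }
}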
10. Simple model testing
Clone the base repository. This is a small application for Sony's Spresense that takes the raw features as its input and prints out the final classification. Import the repository using Git:
$ git clone https://github.com/edgeimpulse/example-standalone-inferencing-spresense
Extract the directories from the step 9 deployment .zip into the example-standalone-inferencing-spresense/edge_impulse/ folder. Your final folder structure should look like this:
example-standalone-inferencing-spresense/
|_ edge_impulse/
| |_ edge-impulse-sdk/
| |_ model-parameters/
| |_ tflite-model/
| |_ README.md
|_ mkspk/
|_ spresense-exported-sdk/
|_ stdlib/
|_ tools/
|_ .gitignore
|_ Dockerfile
|_ LICENSE
|_ Makefile
|_ README.md
|_ ei_main.cpp
|_ main.cpp
Running the impulse: Head back to the studio and click on Live classification. Then load a validation sample, and click on a row under 'Detailed result'.
Then click on the Copy to clipboard button next to 'Raw features'. This will copy the raw values from this validation file, before any signal processing or inferencing happened.
Open the ei_main.cpp file and paste the raw features inside the static const float features[] definition, as in the example below:
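The pasted definition then looks roughly like this (the numbers below are placeholders, not real feature values; replace the whole list with the comma-separated values copied from the studio):

// In ei_main.cpp: one full window of raw audio samples from 'Raw features'
static const float features[] = {
    -412.0f, -389.0f, -356.0f, -310.0f /* ...paste the rest of the copied values here... */
};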
In the Makefile, set the arm-none-eabi toolchain path, e.g.:
CROSS_COMPILE ?= C:\Program Files (x86)\GNU Tools ARM Embedded\8 2018-q4-major\bin\arm-none-eabi-
Build the application by calling make in the root directory of the project:
$ make -j
Connect the board to your computer using USB and flash the board:
$ make flash
Once the board has been flashed, you can see the classification result for the sample you pasted earlier.
Below is an image of the Sony Spresense board mounted in a box for protection.
For more details, follow the instructions in this guide: https://docs.edgeimpulse.com/docs/tutorials/running-your-impulse-locally/running-your-impulse-spresense
11. Demo Video
For streaming audio, Edge Impulse currently only runs my model on a mobile phone or computer. Below is a test with my smartphone, with the audio played back through Audacity on my computer.
12. Conclusion
- As you can see, building the model is quite laborious: you have to gather enough audio, process the signal, and make adjustments to increase the accuracy. I am satisfied that this model works, and you can try it! I leave the links in the download section.
- If you want to do live streaming tests, for now you can use the Edge Impulse platform in the "Live Classification" section.
- If you want to run your Impulse directly, then in the "Deployment" section you can use the Computer and mobile options.
- Some issues I faced: I reported to the Edge Impulse technical staff a problem I had with running the binary file on my Sony Spresense board, and they solved it. Here is the link: https://forum.edgeimpulse.com/t/cant-deploy-the-library-for-sony-spresense-with-audio-data/4187
- If you want to do a live audio stream with your Sony Spresense board: I reported to the Edge Impulse technical staff that the downloaded C++ library did not work for my live-stream tests, and they gave me the information needed to try to solve this issue. Here is the link: https://forum.edgeimpulse.com/t/my-model-doesnt-work-properly-with-audio-data-on-sony-spresense/4287