The winners of the TensorFlow Lite for Microcontrollers Challenge were announced on October 18, 2021. This project was among the five worldwide winners and was featured on the Experiments with Google website.
The Concept
A good practice when riding in traffic is to warn other road users of the direction you are going to take before turning or changing lanes. This habit contributes to a smoother traffic flow and reduces sudden moves from unaware drivers. In fact, cars, motorbikes, trucks, buses and most other vehicles you can think of incorporate some kind of turn signalling device.
Although bicycles are among the most vulnerable vehicles on the road, they do not usually have such built-in signalling devices. Bicycle riders either do not warn of a turn at all or, if they do, they need to release one hand from the handlebar in order to signal to other drivers. This move reduces the rider's stability and might not be properly understood by everyone.
It is possible to find some add-on turn signal lights for bicycles on the market, but they usually require the rider to push a button to activate them and push it again to switch them off, much like on motorbikes. And it is really easy to forget to switch them off. If you think twice about it, this is something worth improving.
This is how VoiceTurn was born: a voice-controlled turn signal lighting concept initially conceived for bicycles, which could be extended to other vehicles as well. The aim of this project is to use a Machine Learning algorithm to teach a tiny microcontroller to understand the words left! and right! and act accordingly by switching the corresponding turn signal light on.
The Board
The microcontroller board to be used is an Arduino Nano 33 BLE Sense: an affordable board featuring a 32-bit ARM® Cortex™-M4 CPU running at 64 MHz, a bunch of built-in sensors, including a digital microphone, and Bluetooth Low Energy (BLE) connectivity.
However, what makes this board an excellent candidate for this project is the possibility of running Edge Computing applications on it using Tiny Machine Learning (TinyML). In short, after creating Machine Learning models with TensorFlow Lite, you can easily upload them to the board using the Arduino Integrated Development Environment (IDE).
Speech recognition at the Edge does not require streaming the voice to a Cloud server for processing, thus eliminating network latency. Additionally, it runs offline, so you can be certain that your turn signal lights won't stop working when passing through a tunnel. Last but not least, it preserves user privacy, since your voice is not stored or sent anywhere.
Train the Machine Learning Model
The word recognition model has been created using Edge Impulse, a development platform for embedded Machine Learning focused on providing an amazing User Experience (UX), awesome documentation and open-source Software Development Kits (SDKs). Their website states:
Edge Impulse was designed for software developers, engineers and domain experts to solve real problems using machine learning on edge devices without a PhD in machine learning.
This means that you can have little to no knowledge of Machine Learning and still develop your applications successfully.
You can use this tutorial as a starting point for audio classification with Edge Impulse. The following steps will describe how this project has been tailored to fulfill the particular needs of VoiceTurn.
The first thing you need to do is to sign up on Edge Impulse to create a free developer subscription. After the account confirmation step, log in and create a project. You will be prompted with a wizard asking about the kind of project you wish to create. Click on Audio:
In the next step, you can choose between three options. The first is to build a custom audio dataset yourself by connecting a microphone-enabled development board. This process requires recording a large amount of audio data in order to obtain acceptable results, so we will ignore it for now. The second option is to upload an existing audio dataset, and the third is to follow a tutorial. Click on Go to the uploader, within the Import existing data choice, to keep going with VoiceTurn.
The audio dataset that we are going to use is the Google Speech Commands Dataset, which consists of 65,000 one-second-long utterances of 30 short words, spoken by thousands of different people. You can download version 2 of this dataset from this link.
You might think that only the audio recording subsets corresponding to the words left and right would be needed to train the model. However, as Pete Warden, from Google Brain, states in the article describing the methods used to collect and evaluate the dataset:
A key requirement for keyword spotting in real products is distinguishing between audio that contains speech, and clips that contain none.
Therefore, we will also use the subset of audio recordings contained in the _background_noise_ folder in order to enrich our model with some background noise. In addition, we will complement the noise database with the audio files from the noise folder of the Keyword spotting pre-built dataset available as part of the Edge Impulse documentation.
Having a dataset containing the words left and right and some background noise is not enough, since we also need to provide the model with additional words. This way, if another word is heard, it will not be classified as left or right, but will go into a separate category. To do that, we can select a random collection of audio recordings from the first dataset we downloaded. Just note that the total amount of additional audio recordings should be similar to the total amount of recordings of each word of interest.
Once the audio recordings are gathered, upload them into your Edge Impulse project. Note that you will need to upload 4 different datasets, each corresponding to a different category. Browse your files, enter the labels manually (Left, Right, noise and other, respectively), and make sure Automatically split between training and testing is selected. This will leave aside about 20% of the samples to be used for testing the model afterwards. Click on Begin upload.
As you can see, all the audio samples you uploaded are available to check and listen to. Make sure that all of them are one second long. If they are longer, click on the dots on the right of the audio sample row, click on Split sample, and set a segment length of 1000 ms.
As a result, you should now see the total duration of your training and testing data, respectively, as well as your data split into four categories:
You can also double-check that the duration of your testing data is roughly 20% of the total dataset duration.
The next step is to design an impulse, which is the whole set of operations performed on the input voice data until the words are classified. Click on Create Impulse on the left-hand menu. Our impulse will consist of an input block to slice the data, a processing block to pre-process them and a learning block to classify them into one of the four labels previously defined. Click on Add an input block and add a Time series data block, setting the window size to 1000 ms. Then, click on Add a processing block and add an Audio Mel Frequency Cepstral Coefficients (MFCC) block, which is suitable for human speech data. This block produces a simplified form of the input data that is easier for the next block to process. Finally, click on Add a learning block and add a Classification (Keras) block, which is the Neural Network (NN) performing the classification and providing an output.
It is possible to configure both the processing and learning blocks. Click on MFCC on the left-hand menu and you will see all the parameters related to the signal processing stage. In this project we will keep things simple and trust the default parameters of this block. Click on Generate features at the top and then click on the Generate features button to generate the MFCC blocks corresponding to the audio windows. After the job is finished you will be prompted with the Feature explorer, which is a 3D representation of your dataset. This tool is useful for quickly checking whether your samples separate nicely into the categories you defined before, so that your dataset is suitable for Machine Learning. On this page you can also see an estimation of the time the Digital Signal Processing (DSP) stage will take to process your data, as well as the RAM usage when running on a microcontroller.
Now you can click on NN Classifier on the left-hand menu and start training your neural network, a set of algorithms able to recognize patterns in its training data. Watch this video for a quick overview of the working principle of neural networks and some of their applications. We will leave most of the default Neural Network settings unchanged, but we will slightly increase the Minimum confidence rating to 0.7. This means that only predictions with a confidence above 70% will be considered valid. Enable Data augmentation and set Add noise to High to make our neural network more robust in real-life scenarios. Click on Start training at the bottom of the page. When training is finished you will see the accuracy of the model, calculated using a subset of 20% of the training data allocated for validation. You can also check the Confusion matrix, a table showing the balance of correctly versus incorrectly classified words, and the estimation of on-device performance.
You can now test the model you just trained with new data. It is possible to connect the Arduino board to Edge Impulse to perform live classification of data. However, we will test the model using the test data we left aside during the Data acquisition step. Click on Model testing on the left-hand menu and then on Classify all test data. You will receive feedback regarding the performance of your model. Additionally, the Feature explorer will allow you to check what happened with the samples that were not correctly classified, so you can re-label them if needed or move them back to training to refine your model.
Finally, you can build a library containing your model, ready to be deployed on a microcontroller. Click on Deployment on the left-hand menu, choose to create an Arduino library and go to the bottom of the page. Here it is possible to enable the EON™ Compiler, which reduces the model's RAM and flash usage. Since the memory usage is already low enough for the Arduino Nano 33 BLE Sense, we can leave this option disabled. Finally, leave the Quantized (int8) option selected and click on the Build button to download the .zip file containing your library.
The VoiceTurn Edge Impulse project is publicly available, so you can directly clone it and work on it if you wish.
Check how words are classified
You can use the Arduino IDE to deploy the library built with Edge Impulse to your board. If you have not installed it yet, download the latest version from the Arduino software page. After that, you will need to add the boards package supporting the Arduino Nano 33 BLE Sense board. Open the Arduino IDE and click on Tools > Board > Boards Manager...
Write nano 33 ble sense in the Search box and install the Arduino Mbed OS Nano Boards package.
Now your IDE is ready to work with your board. Connect your Arduino Nano 33 BLE Sense to your computer and apply the following settings:
- Click on Tools > Board > Arduino Mbed OS Nano Boards and select Arduino Nano 33 BLE as your board.
- Click on Tools > Port and choose the serial port your board is connected to. This will vary depending on your OS and your particular port.
To add the VoiceTurn library to your Arduino IDE, click on Sketch > Include Library > Add .ZIP Library... and go to the path where you saved your library .zip file. After that, click on File > Examples > VoiceTurn_inferencing > nano_ble33_sense_microphone_continuous to open the voice inferencing program provided by Edge Impulse.
You can now compile and upload this program to your board by clicking on the two icons located in the top left corner of the Arduino IDE. Once uploaded, click on the Serial Monitor icon in the top right corner and you will see the output of the previously trained Machine Learning classifier. You can test the accuracy of the program by saying the words left! and right! and checking the probability, calculated by the program, of the word belonging to one group or another. You can also try saying other words and check whether they are properly classified into the other group.
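For reference, the continuous inferencing example periodically prints the probabilities computed for each category over the Serial port. The exact figures, timings and formatting depend on your model and board, but the output should look roughly like this (the values shown here are illustrative only):
Predictions (DSP: 5 ms., Classification: 1 ms., Anomaly: 0 ms.):
    Left: 0.91406
    Right: 0.01172
    noise: 0.05078
    other: 0.02344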
If you want to learn more about continuous audio sampling, check this tutorial from the Edge Impulse documentation.
TensorFlow Lite for Microcontrollers
At this point you might be wondering: didn't you mention you would be using TensorFlow? Well, as mentioned in this interesting article from the TensorFlow Blog, TensorFlow is indeed used by Edge Impulse under the hood:
Edge Impulse makes use of the TensorFlow ecosystem for training, optimizing, and deploying deep learning models to embedded devices.
If we look back to the previous steps of this project, we trained a model for word classification, then performed a Quantized (int8) optimization and finally built an Arduino library for deployment on our board. The article also states:
Developers can choose to export a library that uses the TensorFlow Lite for Microcontrollers interpreter to run the model.
In short, the VoiceTurn_inferencing library previously developed with Edge Impulse uses the TensorFlow Lite for Microcontrollers interpreter to run the word classifier Machine Learning model.
It is indeed quite easy to verify for yourself that this library uses TensorFlow Lite for Microcontrollers to run the classifier. From your Arduino IDE, click on File > Preferences and check the Sketchbook location. From the file explorer, go to this location and open VoiceTurn_inferencing/src/VoiceTurn_inferencing.h using a text editor. On line 41, you will find:
#include "edge-impulse-sdk/classifier/ei_run_classifier.h"
Now go back to the src folder and open edge-impulse-sdk/classifier/ei_run_classifier.h, again using a text editor. On line 45 you will find:
#include "edge-impulse-sdk/tensorflow/lite/micro/micro_interpreter.h"
This refers to the TensorFlow Lite for Microcontrollers library. In fact, the TensorFlow Lite library, including its Micro C++ sub-library, is shipped as part of the Edge Impulse SDK.
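To see where that interpreter comes into play, here is a minimal, simplified sketch of how an Edge Impulse Arduino library is typically invoked. The get_audio_data() helper is a placeholder standing in for the microphone buffer handling done by the real example program; run_classifier() is the SDK call that extracts the MFCC features and hands them to the TensorFlow Lite for Microcontrollers interpreter.
#include <VoiceTurn_inferencing.h>

// Placeholder callback: in the real example this copies audio samples
// recorded by the microphone into out_ptr. Here it just returns silence.
static int get_audio_data(size_t offset, size_t length, float *out_ptr) {
    for (size_t i = 0; i < length; i++) {
        out_ptr[i] = 0.0f;
    }
    return 0;
}

void classify_once() {
    // Describe one window (one second) of audio to the SDK
    signal_t signal;
    signal.total_length = EI_CLASSIFIER_RAW_SAMPLE_COUNT;
    signal.get_data = &get_audio_data;

    // run_classifier() computes the MFCC features and runs the neural
    // network through the TensorFlow Lite for Microcontrollers interpreter
    ei_impulse_result_t result = { 0 };
    if (run_classifier(&signal, &result, false) == EI_IMPULSE_OK) {
        // result.classification[i].label / .value hold the per-category probabilities
    }
}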
Build the Hardware
The turn signal lights consist of two LED strips: one on the left and the other on the right side of the bicycle. I have used an addressable RGB LED strip composed of WS2813 LEDs. Other LED strips could be used if you pay attention to the wiring and adapt the code afterwards. The first thing to do is to cut the LED strips to the desired length. In this case, I have used a 10 cm piece for each side, which corresponds to 6 LEDs. Be careful to cut the LED strips along the dotted line so as not to damage the electrical contacts.
As an optional step, drill two holes in the ruler or the flat surface of your choice. If you want to strictly follow the project dimensions, use a 30-cm-long (roughly 12 inch) surface and drill the holes at the 11.5 cm and 18.5 cm positions, respectively. Use a rotary tool and place the flat surface between two objects of similar height, so that the drill bit has some free space and does not damage your table.
To be able to connect the LED strips to the rest of the setup, we need to add hook-up wires to their pins. First, remove a small section of the jelly-like material covering the electrical contacts using scissors. Then, use the soldering iron together with solder flux to solder three hook-up wires to the pins of each LED strip. I recommend following the usual color code, so that a red wire is soldered to the 5V pin and a black one to the GND pin. For the data pins, I have used a yellow wire for the right side and a green wire for the left side. Although the WS2813 LED strip has four pins, DI (data input) and BI (back-up input) can be soldered together for the sake of simplicity. Once this step is done, remove the tape behind the LED strips and stick them to the flat surface at the 0-10 cm and 20-30 cm positions, respectively. If you drilled the holes in the surface in the previous step, pass the wires through them. If not, just route them around the edge.
The overall setup is divided into two parts, which connect to each other using 3.5 mm jack/plug connectors. Grab the corresponding cables and cut each of them at roughly 20% of its length, so that you end up with one long (about 80%) piece of cable ending in the jack and another long piece ending in the plug. Remove a small portion of the external coating and you will see that each cable consists of 4 conductors: three wires and the shield. Twist all the shield strands together so that they are easier to solder.
Grab the long piece of cable containing the 3.5 mm jack and solder the wires as follows (if the color code is different, adapt it to your needs):
- The red wire of the cable to both red wires of the two LED strips (5V).
- The green wire of the cable to the green wire of the left LED strip (DI-BI).
- The yellow wire of the cable to the yellow wire of the right LED strip (DI-BI).
- The shield from the cable to both the black wires of the two LED strips (GND).
As a result, you should now have the first part of the setup finished, consisting of the flat surface holding the turn signal lights and a piece of cable ending in a 3.5 mm jack. Prevent the electrical contacts from touching each other by individually covering them with tape or heat shrink tubing.
The other part of the setup is simpler to build. Grab the long piece of cable containing the 3.5 mm plug and solder the wires to the Arduino board following the same color code as in the previous step and the Arduino pinout:
- The red wire to the 3V3 pin of the Arduino.
- The green wire to the D4 pin of the Arduino.
- The yellow wire to the D7 pin of the Arduino.
- The black wire to one of the GND pins of the Arduino.
You can find a wiring diagram in the Schematics section.
We can manage with such a simple setup because we are using short LED strips containing just a few LEDs. If you use longer LED strips, you might need to power them with an external source instead of the 3V3 pin of the Arduino board, so as not to exceed the maximum current the latter can supply. If you do so, remember to solder a 1000 μF capacitor between the 5V and GND connections of your LED strips to ensure the stability of the supply voltage.
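As a rough sanity check, assuming the commonly quoted figure of up to about 60 mA per WS281x LED at full white, a back-of-the-envelope estimate of the worst-case current draw looks like this (the real draw is lower, since the animation lights one strip at a time, in orange, rather than both strips in white):
// Rough worst-case current estimate (assumption: ~60 mA per WS281x LED at
// full white; brightness scaling is approximately linear)
constexpr float MA_PER_LED_FULL_WHITE = 60.0f;
constexpr int   LEDS_PER_STRIP        = 6;
constexpr float BRIGHTNESS_FRACTION   = 100.0f / 255.0f;  // setBrightness(100), set later in the sketch

// Both strips fully lit in white at the configured brightness:
constexpr float WORST_CASE_MA =
    2 * LEDS_PER_STRIP * MA_PER_LED_FULL_WHITE * BRIGHTNESS_FRACTION;  // about 282 mA
With longer strips this figure grows quickly, which is why an external 5 V supply (together with the capacitor mentioned above) becomes necessary.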
Add functionality to the code
We previously checked in the Arduino Serial Monitor how the words we say get classified into each of the four predefined groups. Now that the hardware is built, it is time to add the required functionality to switch the corresponding turn signal light on after saying the word for that side.
First, you will need to install a library required by VoiceTurn to control the addressable LED strips that we will use for turn signalling. Click on Sketch > Include Library > Manage Libraries...
Write NeoPixel in the Search box and install the Adafruit NeoPixel library.
We will work on the nano_ble33_sense_microphone_continuous example program provided by the VoiceTurn_inferencing library built and imported in previous steps.
One LED strip is required for each direction, left or right, and each LED strip has 6 LEDs in total. Additionally, the built-in RGB LED of the Arduino board will be used so that the rider can quickly tell whether the board is ready to listen for a word or is still driving the turn signal lights. Define the pinout at the beginning of the program, using the pins you chose for the green and yellow wires when you built the hardware:
/* LED strip pinout */
#define LED_PIN_LEFT 4
#define LED_PIN_RIGHT 7
#define LED_COUNT 6
/* Built-in RGB LED pinout */
#define RGB_GREEN 22
#define RGB_RED 23
#define RGB_BLUE 24
Include the NeoPixel library next to the other libraries already included:
#include <Adafruit_NeoPixel.h>
Declare the two LED strips by defining their number of LEDs, the data pin they are connected to and the strip type. Here, we will use NEO_GRB + NEO_KHZ800 for the strip type, as this is the one corresponding to WS2812 LED strips and similar. You can modify this with the help of the library documentation if you are using a different LED strip type.
/* LED strips for left and right signals */
Adafruit_NeoPixel left(LED_COUNT, LED_PIN_LEFT, NEO_GRB + NEO_KHZ800);
Adafruit_NeoPixel right(LED_COUNT, LED_PIN_RIGHT, NEO_GRB + NEO_KHZ800);
Declare two constants that will be useful to tune the duration of the light signalling. The period is roughly the waiting time between one LED being switched on and the next one being switched on: the higher this value, the slower the animation. The cycles variable is the number of times the animation repeats each time the left or right word is detected.
static int period = 100;
static int cycles = 3;
Now, in the setup() function, add the following lines of code to initialize the two LED strips and set their brightness level to about 40% (100 out of 255). This acts as a software-driven current limit, so that we make sure not to burn the LEDs or exceed the maximum current supported by the Arduino board, even though no resistor is soldered to the strip pins.
// Initialization of LED strips:
left.begin();
right.begin();
left.show();
right.show();
// Set BRIGHTNESS to about 2/5 (max = 255)
left.setBrightness(100);
right.setBrightness(100);
Initialize the built-in RGB LED pins as OUTPUT.
pinMode(RGB_RED, OUTPUT);
pinMode(RGB_GREEN, OUTPUT);
pinMode(RGB_BLUE, OUTPUT);
As you may know, the code inside the loop() function contains the program instructions that are executed over and over. Add the following line at the beginning of the function to indicate to the bicycle rider that the board is listening. The rgb_green() function will be created afterwards.
rgb_green(); // Shows the board is READY
After that, word classification is carried out and the program output is stored in the result.classification variable. This variable is indexed following the labeling order we used when training the Machine Learning model with Edge Impulse. Remember we used the labels Left, Right, noise and other, in that order. Therefore, result.classification[0] contains the information corresponding to the Left group and result.classification[0].value is the probability that the spoken word belongs to the Left group. Similarly, result.classification[1] contains the information corresponding to the Right group and result.classification[1].value is the probability that the spoken word belongs to the Right group. Indices 2 and 3 correspond to the noise and other groups, respectively, although we will not use them in this project.
We want to activate the corresponding LED strip if a word is detected with a probability above a certain threshold. For this project, the threshold is set to 80% for the left word and to 85% for the right word, to avoid false positives as far as possible. The thresholds are different because they are tuned according to the output of the model testing stage performed with Edge Impulse. Add this piece of code at the end of the loop() function, after all the Machine Learning processing is finished.
// 0 -> LEFT; 1 -> RIGHT
if (result.classification[0].value >= 0.80) {
turn(left);
}
if (result.classification[1].value >= 0.85) {
turn(right);
}
The only things left to add to the program are the functions in charge of activating the LED strips and controlling the built-in RGB LED. The turn() function receives the LED strip to activate as an input parameter and switches on the LEDs one by one in orange, following an animation similar to the ones in modern cars. Once the predefined number of cycles has finished, the LED strip is switched off. Additionally, the bicycle rider is warned of the state of the program: the built-in RGB LED is set to red while the LED strips are being used and switched off at the end of the function. Add this function at the end of your program, after the loop() function finishes.
static void turn(Adafruit_NeoPixel& strip) {
rgb_red(); // Shows the board is BUSY
for (int i = 0; i < cycles; i++) {
for (int j = 0; j < strip.numPixels(); j++) {
strip.setPixelColor(j, strip.Color(255, 104, 0)); // Color: Orange
strip.show();
delay(period);
}
strip.clear();
strip.show();
delay(2 * period);
}
rgb_off(); // The board has FINISHED lighting the LED strip
}
And finally add three simple functions controlling the built-in RGB LED.
void rgb_red() {
digitalWrite(RGB_RED, HIGH);
digitalWrite(RGB_GREEN, LOW);
digitalWrite(RGB_BLUE, LOW);
}
void rgb_green() {
digitalWrite(RGB_RED, LOW);
digitalWrite(RGB_GREEN, HIGH);
digitalWrite(RGB_BLUE, LOW);
}
void rgb_off() {
digitalWrite(RGB_RED, LOW);
digitalWrite(RGB_GREEN, LOW);
digitalWrite(RGB_BLUE, LOW);
}
If you compile the program, upload it to the Arduino board and connect the hardware to your board, you will be able to test the VoiceTurn setup by yourself. After saying the word left!, the left-side turn signal light should be activated and after saying right! the same should happen with the right-side turn signal light.
If your tests are not as successful as expected, you might need to adjust the sensitivity of the word classification process to your needs. You can do that by tuning two parameters, as shown in the sketch after this list:
- The word detection probabilities, initially set to 80% (left) and 85% (right): by lowering these numbers, you will increase how often the device responds to your voice, but you will also increase the amount of false positives.
- The EI_CLASSIFIER_SLICES_PER_MODEL_WINDOW constant defined at the beginning of the program: this is the number of pieces into which the model window is subdivided. You can find further information about this parameter in this link.
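As a reference, here is a minimal sketch of the two adjustments. The concrete values are examples only, chosen to illustrate the direction of the change, and should be tuned against your own model testing results.
/* Example only: evaluate the one-second window in more, smaller slices.
   As in the original example program, this #define must appear before the
   library include at the top of the sketch. */
#define EI_CLASSIFIER_SLICES_PER_MODEL_WINDOW 4

// Example only: lower thresholds make the lights trigger more easily,
// at the cost of more false positives.
if (result.classification[0].value >= 0.70) {   // Left
    turn(left);
}
if (result.classification[1].value >= 0.75) {   // Right
    turn(right);
}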
I have included the link to the GitHub repository containing the project's code in the Code section, so check it out if you got lost at some point or if you want to directly try my code.
Set up VoiceTurn on a bicycle
The proof of concept of VoiceTurn has been carried out on a bicycle. Plastic cable ties have been used to attach all the components, so that the setup can be easily installed and removed without any permanent modification to the bicycle. In case you want the setup to be permanently installed, you can use other means.
The first part consists of the Arduino board and the two cables attached to it: the one containing the wires soldered to the Arduino pins and ending in the 3.5 mm plug, and the micro-USB to USB cable connecting the board to the power bank. A couple of cable ties have been placed to attach these two cables to the rear brake cable. Both cables are rigid enough for the Arduino board to remain in a floating and stable position, with the microphone facing the rider.
We need to fix the two cables to the bicycle so that the 3.5 mm plug can reach the rear part of the bicycle and the USB plug can reach the seat, where the battery will be accommodated. The cables have been fastened along the top side of the bicycle frame.
Now place the bicycle watch mount around the seat post, so that we obtain a flat surface on which to hold the ruler containing the turn signal lights. Use a couple of cable ties in a cross arrangement to fix the other part of the setup to the seat post at the watch mount location. Fix the remaining wires to the seat post as well.
Connect the 3.5 mm jack to its corresponding plug and attach the remaining cable along the bicycle frame so that you do not pull on it unintentionally while riding.
Finally, place the power bank in the gap under the seat, hold it tight with cable ties and connect the USB cable to it, so that VoiceTurn is ready to use.
Now it is time to build and try VoiceTurn yourself. You can even take it as a base project and keep improving it: reduce the wiring by using the Arduino's built-in BLE connectivity, add a wake word such as Hey Bike!, extend the word set adding more functionality... there are lots of possibilities.
I hope you enjoy this project and do not forget to leave your comments and feedback!