Intro
Competing with TensorFlow Lite for Microcontrollers
Procedure
Conclusion

Published July 22, 2022

Making Famous Magic Wand 33x Faster

Creating a faster, smaller, and more accurate magic wand than the famous experiment provided in Arduino Sketch, given the same hardware.

IntermediateFull instructions provided6,170

Things used in this project

Hardware components

Arduino Nano 33 BLE Sense

Wooden Magic Wand

Software apps and online services

Neuton Tiny ML Neuton

Arduino IDE

Story

Intro

This case is a remake of a well-known “magic wand” experiment. Last year, Pete Warden, the famous creator of the TensorFlow Lite for Microcontrollers (TFLM) framework, ran a workshop on building a digital ML-based magic wand at the tinyML Summit. Today, it’s one of the most popular gesture recognition experiments as only on Hackster.io, dozens of “magic wand” tutorials can be found.

Can I build a model better than Pete Warden? It turned out that today everyone can do it. Only a year ago, the solution offered by Pete Warden seemed to be a breakthrough, but now it’s no longer small, fast, and accurate enough. Thanks to the new technology, now I can show how you can build a similar model automatically, and with impressive results.

For my experiment, I used an original TFLM model built into the Arduino SDK, and also collected my own dataset to build a new model on Neuton.AI. As the hardware component, I used Arduino Nano 33 BLE Sense, as in the original experiment.

Competing with TensorFlow Lite for Microcontrollers

Using the public magic wand dataset

For my first experiment, I used an official TensorFlow Lite MagicWand example.

The trained TensorFlow Lite model is built into the Arduino SDK and is described in the official Google documentation.
The dataset is provided in the official TensorFlow repository.

For benchmarking, I used the trained TFLM model from the Arduino SDK and trained a Neuton model using the dataset in the mentioned repository.

Both models are validated on the same hold-out dataset and were tested on the same MCU (Arduino Nano 33 BLE Sense).

Please find the model codes attached:

Neuton model on TensorFlow example data (Arduino sketch)
TensorFlow trained model (Arduino sketch)

In the table below, you can see the resulting benchmarks. Models built with Neuton have advantages in several if not all aspects: accuracy, size, inference time, and most importantly, they are built automatically, hence also accelerating the time to market. Additionally, TensorFlow Lite for Microcontrollers (TFLM) and Neuton solutions were embedded into Arduino Nano 33 BLE Sense (CortexM4).

RESULTING BENCHMARKS

TensorFlow’s model was capable of recognizing only 3 classes: Wing, Ring, and Random, but it was not capable of recognizing Slope. On the other hand, Neuton successfully recognized all 4 classes. Below you can see the results of the experiments.

NUMBER OF CORRECTLY RECOGNIZED CLASSES

Collecting my own magic wand dataset

I decided to reproduce this experiment from the very beginning, namely from the data collection stage since in the previous experiment with the TF dataset, I could not control the data collection. The aim was to prove that the real-time accuracy on the device could be improved. I have used the Arduino Nano 33 BLE Sense for this experiment as well.

After training the Neuton model on the data I collected, I recorded the following real-time outcomes which showed great improvement in accuracy:

NUMBER OF CORRECTLY RECOGNIZED CLASSES IN MY OWN MAGIC WAND DATASET

Procedure

The onboard accelerometer is the main sensor for this use case. The board is attached to the end of a stick/wand with a USB facing downwards, towards the place where you hold it so that the cable runs down the handle. Use sticky tape, a velcro strap, or some easy-to-remove method to attach the board and cable to the wand.

Step 1: Data Collection

The data collection included four classes: Wing, Slope, Ring, and Random class. Sensors reading were collected at 100 Hz frequency, with a sample duration of 2 seconds. So you get 200 rows of accelerometer sensor data for each sample. Check the instruction on how to perform the motion activity for each gesture.

Any other gestures apart from the above are treated as a random class. The instruction for data collection, codes and datasets can be found here:

Data collection explanatory document
Codes for data collection
Data that I have collected

Below is the summary of the collected training dataset.

TRAINING DATASET

In a similar way, collect validation dataset as well.

Step 2: Model Training

Upload your dataset on the Neuton platform and enable the following training settings:

Input data type: FLOAT32

Normalization type: Unified Scale for All Features

Window size: 2000ms

Frequency: 100 Hz

Feature Extraction Method: Digital Signal Preprocessing Neuton pipeline (enabled by default) which creates statistical features for each of the original features processing a window of 2000ms.

Feature Selection Method:

Disable option Remove Lag Features
Select recursive Feature Elimination with Cross-validation

Bit depth: 8-bit

Once training is complete, you can check the metrics on the console. In my case, I got Accuracy: 0.977028.

From the confusion table, I can see that the model works really well across all target classes.

I got a model that weighs about 1 kb!

Please find the Neuton model on the data I have collected.

The total footprint of the solution embedded into the MCU is highlighted in the table below:

Step 3: Model Embedding and Inference

In this final step, I am going to embed the model to the device. Download your model as a Zip archive from the Neuton platform. Extract all the downloaded files and put them inside a folder named “src” in your project directory. It should look as shown below:

C:.
|   .DS_Store
|   MagicWandTestPP.ino
|
\---src
    |   .DS_Store
    |   neuton.c
    |   neuton.h
    |   README.md
    |
    +---converted_models
    |       README.md
    |
    +---fe
    |   \---statistical
    |       |   Common.c
    |       |   Common.h
    |       |   DSP.h
    |       |   DSPF32.c
    |       |   DSPF32.h
    |       |
    |       \---fht
    |               fhtf32.c
    |               fhtf32.h
    |
    +---model
    |       dsp_config.h
    |       model.h
    |
    +---postprocessing
    |   \---blocks
    |       \---moving_average
    |               moving_average.c
    |               moving_average.h
    |
    +---preprocessing
    |   |   .DS_Store
    |   |
    |   \---blocks
    |       |   .DS_Store
    |       |
    |       +---normalize
    |       |       normalize.c
    |       |       normalize.h
    |       |
    |       \---timeseries
    |               timeseries.c
    |               timeseries.h
    |
    \---target_values
            multi_target_dict_csv.csv

To understand how the model will be functioning on the device, we will have a look at some important functions in the Arduino codes.

In the function, void OnDataReady(void *ctx, void *data), we pass the data buffer which contains the accelerometer sensor values for the defined window size using the callback neuton_model_window_size(). In this experiment, the window size is 2 seconds which means 200 rows of accelerometer data or 600 values of x, y and z acceleration axis to be fed into the neuton_model_set_inputs(inputs).To understand how the model will be functioning on the device, we will have a look at some important functions in the Arduino codes.

void OnDataReady(void *ctx, void *data)
{
input_t *inputs = (input_t*)data;
size_t inputsCount = neuton_model_inputs_count();
size_t windowSize = neuton_model_window_size();
for (size_t i = 0; i < windowSize; i++)
{
#ifndef NO_CALC
if (neuton_model_set_inputs(inputs) == 0)
{
float* outputs;
uint16_t index;
uint64_t start_time = micros();
if (neuton_model_run_inference(&index, &outputs) == 0)
{

uint64_t stop_time = micros();

After the neuton_model_run_inference(&index, &outputs) callback is invoked, the model returns the output predictions. We use a custom function called NeutonPostprocessingBlockMovingAverageProcess(&avg, outputs, &outputs, &index) to accept probabilities from the neural network, average these probabilities, and return averaged probabilities if one class reaches a threshold (we have set a threshold to 0.95f). Also, if no gesture is recognized (index 3), we cancel suppression.

if (NeutonPostprocessingBlockMovingAverageProcess(&avg, outputs, &outputs, &index) == 0)
{
if (index == 0)
{
Serial.print(" Ring: O      ");
}
if (index == 1)
{
Serial.print("Slope: /_     ");
}
if (index == 2)
{
Serial.print(" Wing: \\/\\/   ");
}
if (index == 3)
{
//             Serial.print("Ooops! Nothing");
avg.suppressionCurrentCounter = 0;
continue;
}
Serial.print("     score: ");
Serial.println(outputs[index], 2);
}
}
}
#endif

inputs += inputsCount;
}

Conclusion

This remake of a popular case vividly demonstrates that a magic wand can be several times faster, smaller and more accurate than in the original experiment. The use of Neuton empowers the model to have better results on such ML applications. It's amazing how quickly the TinyML technology develops. A year ago, such experiments seemed pure magic, but now any schoolboy can produce similar solutions automatically.

Credits

Alex Miller

8 projects • 77 followers

Director of Data Science and Engineering at Neuton.AI

Contact

Comments

Please log in or sign up to comment.

Embed the widget on your own site

Making Famous Magic Wand 33x Faster

Making Famous Magic Wand 33x Faster

Things used in this project

Hardware components

Software apps and online services

Story

Intro

Competing with TensorFlow Lite for Microcontrollers

Procedure

Conclusion

Code

Neuton model on the data I have collected

TensorFlow trained model (Arduino sketch)

Neuton model on TensorFlow example data (Arduino sketch)

Data that I have collected

Data collection explanation document

Codes for data collection

Credits

Alex Miller

Comments

Related channels and tags