It recognizes three different gestures from accelerometer input.
The gestures are "wing" (draw a W), "ring" (draw a circle), and "slope" (move diagonally from upper right to lower left, then horizontally to the right).
A script for training new gesture models has also been published recently. I tried this magic wand demo using the M5Stack Fire's built-in accelerometer. The original sample targets the Arduino Nano 33 BLE Sense and the SparkFun Edge, but there are also ports for the Adafruit EdgeBadge and Particle (STM32-based) modules on the net.
https://learn.adafruit.com/tensorflow-lite-for-edgebadge-kit-quickstart?view=all
https://blog.particle.io/machine-learning-102/
With these as references, we will port the magic wand demo to the M5Stack Fire.
https://github.com/boochow/TFLite_Micro_MagicWand_M5Stack
Inference processing using TensorFlow
First, let me briefly explain what TensorFlow does here. For TensorFlow itself, see the official documentation.
TensorFlow Lite for Microcontrollers (abbreviated TFLM below) does not support training, only inference.
Inference means giving input to a model, a neural network trained on data in advance, and obtaining its output.
A model is a set of "weights" connecting neurons and the "operations" performed in the neurons.
Inference with a model is done through the interpreter that the TFLM library provides.
The model's input and output are tensors (multidimensional arrays). Shaping the input data to suit the model and interpreting the output data are the program's job.
Since the magic wand demo comes with a trained model, the contents of the model can be treated as a black box.
However, because the processing itself runs in software, memory usage and the nature of the processing cannot be ignored.
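To make the flow concrete, here is a condensed sketch of what the demo's main_functions.cc does with the TFLM interpreter (class names and header paths vary slightly between TFLM versions, so treat this as an outline rather than exact code):

#include "tensorflow/lite/micro/kernels/all_ops_resolver.h"
#include "tensorflow/lite/micro/micro_error_reporter.h"
#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/schema/schema_generated.h"
#include "magic_wand_model_data.h"

// Working memory from which the interpreter allocates all tensors
constexpr int kTensorArenaSize = 60 * 1024;
uint8_t tensor_arena[kTensorArenaSize];

void RunInferenceOnce() {
  static tflite::MicroErrorReporter micro_error_reporter;
  // The model is a flatbuffer compiled into the binary as a byte array
  const tflite::Model* model = tflite::GetModel(g_magic_wand_model_data);
  static tflite::ops::micro::AllOpsResolver resolver;
  static tflite::MicroInterpreter interpreter(
      model, resolver, tensor_arena, kTensorArenaSize, &micro_error_reporter);
  interpreter.AllocateTensors();

  // Input tensor: shape (1, 128, 3, 1); the program must fill data.f[0..383]
  TfLiteTensor* input = interpreter.input(0);
  // ... copy 128 x 3 acceleration values into input->data.f here ...

  interpreter.Invoke();

  // Output tensor: four softmax scores (wing, ring, slope, no gesture)
  TfLiteTensor* output = interpreter.output(0);
  float wing_score = output->data.f[0];
  (void)wing_score;
}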
The model for this demo is defined in the training script for new gestures mentioned above.
import tensorflow as tf

seq_length = 128

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(
        8, (4, 3),
        padding="same",
        activation="relu",
        input_shape=(seq_length, 3, 1)),  # output_shape=(batch, 128, 3, 8)
    tf.keras.layers.MaxPool2D((3, 3)),  # (batch, 42, 1, 8)
    tf.keras.layers.Dropout(0.1),  # (batch, 42, 1, 8)
    tf.keras.layers.Conv2D(16, (4, 1), padding="same",
                           activation="relu"),  # (batch, 42, 1, 16)
    tf.keras.layers.MaxPool2D((3, 1), padding="same"),  # (batch, 14, 1, 16)
    tf.keras.layers.Dropout(0.1),  # (batch, 14, 1, 16)
    tf.keras.layers.Flatten(),  # (batch, 224)
    tf.keras.layers.Dense(16, activation="relu"),  # (batch, 16)
    tf.keras.layers.Dropout(0.1),  # (batch, 16)
    tf.keras.layers.Dense(4, activation="softmax")  # (batch, 4)
])
The output of model.summary():
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_2 (Conv2D) (None, 128, 3, 8) 104
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 42, 1, 8) 0
_________________________________________________________________
dropout_3 (Dropout) (None, 42, 1, 8) 0
_________________________________________________________________
conv2d_3 (Conv2D) (None, 42, 1, 16) 528
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 14, 1, 16) 0
_________________________________________________________________
dropout_4 (Dropout) (None, 14, 1, 16) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 224) 0
_________________________________________________________________
dense_2 (Dense) (None, 16) 3600
_________________________________________________________________
dropout_5 (Dropout) (None, 16) 0
_________________________________________________________________
dense_3 (Dense) (None, 4) 68
=================================================================
Total params: 4,300
Trainable params: 4,300
Non-trainable params: 0
___________________________________________________________
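As a sanity check, the parameter counts follow directly from the shapes: the first Conv2D has 4 × 3 × 1 × 8 = 96 weights plus 8 biases = 104 parameters; the second has 4 × 1 × 8 × 16 = 512 + 16 = 528; the Dense layers have 224 × 16 + 16 = 3600 and 16 × 4 + 4 = 68; pooling and dropout contribute nothing, giving 104 + 528 + 3600 + 68 = 4,300 in total. At 4 bytes per float32 parameter that is roughly 17 KB of weights, which is why this model fits comfortably on a microcontroller.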
Code preparation
First, generate the ESP32 project that forms the basis of the magic wand port from a locally cloned TensorFlow repository.
As of this writing, the TFLM directory has been moved from tensorflow/lite/experimental to directly under tensorflow/lite.
Generate the source tree with make, then zip it up so it is easier to bring into the M5Stack development environment.
$ make -f tensorflow/lite/micro/tools/make/Makefile TARGET=esp generate_magic_wand_esp_project
$ cd tensorflow/lite/micro/tools/make/gen/esp_xtensa-esp32/prj/magic_wand/esp-idf/
$ ls -CF
CMakeLists.txt LICENSE README_ESP.md components/ main/
$ zip -r esp32mw.zip components main
The M5Stack development environment is PlatformIO, with Arduino as the framework.
The procedure for using the M5Stack with PlatformIO was covered previously.
Extract the zip file, place the components/tfmicro folder under lib, and move the contents of the main folder into src.
In PlatformIO's Arduino framework, setup() and loop() go in main.cpp, so delete the TFLM side's main.cc and main_functions.h, rename main_functions.cc to main.cpp, and remove the #include "main_functions.h" line from it.
The resulting file structure is as follows.
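If the files have been moved as described, the project should look roughly like this (a sketch; the exact file names follow the generated project):

lib/
  tfmicro/                      <- the TFLM library and its third_party headers
src/
  main.cpp                      <- renamed from main_functions.cc
  accelerometer_handler.cc/.h
  gesture_predictor.cc/.h
  magic_wand_model_data.cc/.h
  output_handler.cc/.h
  constants.h
platformio.ini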
To add the include paths for the third-party headers under lib/tfmicro, edit platformio.ini and add the include settings as shown below.
[env:m5stack-fire]
platform = espressif32
board = m5stack-fire
framework = arduino
build_flags = -Ilib/tfmicro/third_party/gemmlowp -Ilib/tfmicro/third_party/flatbuffers/include
Building at this point produces some errors, so let's fix them.
tfmicro/third_party/flatbuffers/include/flatbuffers/base.h, lines 35-39:
#if defined(ARDUINO) && !defined(ARDUINOSTL_M_H)
#include <utility.h>
#else
#include <utility>
#endif
An error occurs here, but since we want the #else clause to be used, edit platformio.ini and add -DARDUINOSTL_M_H to the compile options. Also set the serial port to 115200 bps.
[env:m5stack-fire]
platform = espressif32
board = m5stack-fire
framework = arduino
monitor_speed = 115200
build_flags = -DARDUINOSTL_M_H -Ilib/tfmicro/third_party/gemmlowp -Ilib/tfmicro/third_party/flatbuffers/include
tfmicro/tensorflow/lite/kernels/internal/reference/concatenation.h, lines 125-126:
static_cast(std::round(input_ptr[j] * scale + bias)) + output_zeropoint;
std::round() gives an "undefined" error.
This seems to be a problem with arduino-esp32, but for now, change the code to use Arduino's round() macro instead (std::round() → round()).
Porting to M5Stack Fire
Now that the build passes, let's add the input and output code for the M5Stack.
The demo's code is structured so that reading the accelerometer is implemented in accelerometer_handler.cc and handling of the detection result in output_handler.cc. In the initial state, the former is dummy code, and the latter outputs the result as text to the serial port.
accelerometer_handler.cc implements the following API:
extern TfLiteStatus SetupAccelerometer(tflite::ErrorReporter* error_reporter);
extern bool ReadAccelerometer(tflite::ErrorReporter* error_reporter,
                              float* input, int length, bool reset_buffer);
SetupAccelerometer() performs initialization.
ReadAccelerometer() samples the acceleration along the X, Y, and Z axes at 25 Hz and records the results in time series in an internal ring buffer, in mg units (1000 times the measured value) for each of X, Y, and Z. The current position in the ring buffer is held in begin_index.
It then copies the most recent length values from the ring buffer into the array input.
The TFLM sample includes an implementation example for Arduino, so we will implement ours while referring to it.
One thing to keep in mind is the coordinate system: the gesture cannot be recognized correctly unless it matches the coordinate system the model was trained in.
Regarding this, there is the following comment in the Adafruit source code:
/* this is a little annoying to figure out, as a tip - when
* holding the board straight, output should be (0, 0, 1)
* tiling the board 90* left, output should be (0, 1, 0)
* tilting the board 90* forward, output should be (1, 0, 0);
*/
In other words, (x, y, z) should read (0, 0, 1) when the M5Stack's display stands upright, (0, 1, 0) when it is tilted 90 degrees to the left, and (1, 0, 0) when it is laid flat on the desk with the display facing up.
For the M5Stack, this condition can be satisfied by arranging the accelerometer data in the order (z, -x, -y).
As for the time axis, the sampling rate for the acceleration measurement is defined in constants.h:
const float kTargetHz = 25;
so sampling at 25 Hz is expected.
The model's input is 128 consecutive samples of acceleration data for the 3 axes, 384 values in total.
Since the sampling rate is 25 Hz, there will not be enough data for about the first 5 seconds after measurement starts.
Also, the internal buffer must be able to hold at least 128 × 3 values.
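In concrete numbers: 128 samples ÷ 25 Hz ≈ 5.1 seconds of history go into each inference, and the implementation below sizes its ring buffer at 600 floats, i.e. 200 samples × 3 axes (8 seconds of data), comfortably more than the 384 values the model consumes.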
Based on the above, accelerometer_handler.cc is implemented as follows.
#include "accelerometer_handler.h"
#include "constants.h"
#include <Arduino.h>
#include <M5Stack.h>
#include "utility/MPU9250.h"
MPU9250 IMU;
float save_data[600] = {0.0};
int begin_index = 0;
bool pending_initial_data = true;
long last_sample_millis = 0;
TfLiteStatus SetupAccelerometer(tflite::ErrorReporter* error_reporter) {
IMU.calibrateMPU9250(IMU.gyroBias, IMU.accelBias);
IMU.initMPU9250();
error_reporter->Report("Magic starts!");
return kTfLiteOk;
}
static bool UpdateData() {
bool new_data = false;
if ((millis() - last_sample_millis) < 40){
return false;
}
last_sample_millis = millis();
IMU.readAccelData(IMU.accelCount);
IMU.getAres();
IMU.ax = (float)IMU.accelCount[0] * IMU.aRes;
IMU.ay = (float)IMU.accelCount[1] * IMU.aRes;
IMU.az = (float)IMU.accelCount[2] * IMU.aRes;
save_data[begin_index++] = 1000 * IMU.az;
save_data[begin_index++] = -1000 * IMU.ax;
save_data[begin_index++] = -1000 * IMU.ay;
if (begin_index >= 600) {
begin_index = 0;
}
new_data = true;
return new_data;
}
bool ReadAccelerometer(tflite::ErrorReporter* error_reporter, float* input,
int length, bool reset_buffer) {
if (reset_buffer) {
memset(save_data, 0, 600 * sizeof(float));
begin_index = 0;
pending_initial_data = true;
}
if (!UpdateData()) {
return false;
}
if (pending_initial_data && begin_index >= 200) {
pending_initial_data = false;
M5.Lcd.fillScreen(BLACK);
}
if (pending_initial_data) {
return false;
}
for (int i = 0; i < length; ++i) {
int ring_array_index = begin_index + i - length;
if (ring_array_index < 0) {
ring_array_index += 600;
}
input[i] = save_data[ring_array_index];
}
return true;
}
The M5Stack's acceleration sensor can be read using the M5Stack library for PlatformIO.
Implementations for the other platforms use the LIS3DH accelerometer, while the M5Stack uses the MPU9250. The LIS3DH has a built-in FIFO and apparently stores samples into it automatically and periodically.
The M5Stack library does not appear to support the MPU9250's FIFO, so we take just one sample per call.
Also, since a 25 Hz sampling rate is expected, I decided to skip the measurement when 40 ms has not yet elapsed since the previous sample.
Immediately after the internal buffer is initialized, there is not yet enough data for inference. This state is represented by the pending_initial_data flag: while it is true, acceleration is still measured, but inference does not start.
The internal buffer is initialized at startup and also immediately after any gesture is recognized (otherwise the same gesture would be recognized over and over).
As described later, the recognized gesture is displayed on the M5Stack's LCD, so I chose to clear the LCD once enough data has accumulated in the freshly initialized buffer, that is, when pending_initial_data becomes false.
Next, initialize the M5Stack, the serial port, and the I2C bus; the I2C bus is set to run at 400 kHz.
Add the following lines to the beginning of setup():
M5.begin();
Serial.begin(115200);
Wire.begin();
Wire.setClock(400000);
Add the following include files after the TensorFlow-related include files:
#include <Arduino.h>
#include <M5Stack.h>
Arduino.h must be included after the TensorFlow-related headers: it defines macros such as min and max, which cause compile errors if they are in effect before the TensorFlow headers are processed.
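For example, the top of main.cpp ends up ordered roughly like this (a sketch; the TensorFlow include list is whatever is already in the file):

#include "tensorflow/lite/micro/micro_error_reporter.h"
#include "tensorflow/lite/micro/micro_interpreter.h"
// ... the remaining TensorFlow and demo headers ...

#include <Arduino.h>  // after the TensorFlow headers: Arduino.h defines
#include <M5Stack.h>  // min/max macros that would break them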
The main loop is as follows.
void loop() {
  // Attempt to read new data from the accelerometer
  bool got_data = ReadAccelerometer(error_reporter, model_input->data.f,
                                    input_length, should_clear_buffer);
  // Don't try to clear the buffer again
  should_clear_buffer = false;
  // If there was no new data, wait until next time
  if (!got_data) return;
  // Run inference, and report any error
  TfLiteStatus invoke_status = interpreter->Invoke();
  if (invoke_status != kTfLiteOk) {
    error_reporter->Report("Invoke failed on index: %d\n", begin_index);
    return;
  }
  // Debug output: the latest acceleration sample and the three gesture scores
  char s[64];
  float *f = model_input->data.f;
  float *p = interpreter->output(0)->data.f;
  sprintf(s, "%+6.0f : %+6.0f : %+6.0f || W %3.2f : R %3.2f : S %3.2f",
          f[381], f[382], f[383], p[0], p[1], p[2]);
  error_reporter->Report(s);
  // Analyze the results to obtain a prediction
  int gesture_index = PredictGesture(interpreter->output(0)->data.f);
  // Clear the buffer next time we read data
  should_clear_buffer = gesture_index < 3;
  // Produce an output
  HandleOutput(error_reporter, gesture_index);
}
The debug output (the block from char s[64] through error_reporter->Report(s)) is my addition to the original code.
model_input->data.f[381..383] hold the most recent acceleration sample.
interpreter->output(0)->data.f[0..2] hold the inference results, each between 0.0 and 1.0, indicating how likely each of the three gestures is (together with the fourth "no gesture" output, the four values sum to 1.0).
output_handler.cc performs processing according to the recognized gesture.
By default, the result is output to the serial port; I added code to also display the result on the M5Stack's LCD.
The whole file is below; I don't think it needs much explanation.
#include "output_handler.h"
#include <Arduino.h>
#include <M5Stack.h>
void DrawWing() {
int x = 60;
int y = 20;
int w = 200;
int h = 200;
int t = 20;
int k;
k = (w-t)/4;
M5.Lcd.fillTriangle(x, y, x+t, y, x + k, y+h, GREEN);
M5.Lcd.fillTriangle(x+t, y, x+k, y+h, x+k+t, y+h, GREEN);
M5.Lcd.fillTriangle(x+2*k, y, x+2*k+t, y, x+k, y+h, GREEN);
M5.Lcd.fillTriangle(x+2*k+t, y, x+k, y+h, x+k+t, y+h, GREEN);
x += 2*k;
M5.Lcd.fillTriangle(x, y, x+t, y, x + k, y+h, GREEN);
M5.Lcd.fillTriangle(x+t, y, x+k, y+h, x+k+t, y+h, GREEN);
M5.Lcd.fillTriangle(x+2*k, y, x+2*k+t, y, x+k, y+h, GREEN);
M5.Lcd.fillTriangle(x+2*k+t, y, x+k, y+h, x+k+t, y+h, GREEN);
}
void DrawRing() {
int x = 60;
int y = 20;
int w = 200;
int h = 200;
int t = 20;
M5.Lcd.fillEllipse(x+w/2, y+h/2, w/2, h/2, RED);
M5.Lcd.fillEllipse(x+w/2, y+h/2, w/2-t, h/2-t, BLACK);
}
void DrawSlope() {
int x = 60;
int y = 20;
int w = 200;
int h = 200;
int t = 20;
M5.Lcd.fillTriangle(x+w-t*1.5, y, x+w, y, x, y+w, BLUE);
M5.Lcd.fillTriangle(x+w, y, x, y+w, x+t*1.5, y+w, BLUE);
M5.Lcd.fillRect(x+t, y+h-t, w-t, t, BLUE);
}
void HandleOutput(tflite::ErrorReporter* error_reporter, int kind) {
// light (red: wing, blue: ring, green: slope)
if (kind == 0) {
error_reporter->Report(
"WING:\n\r* * *\n\r * * * "
"*\n\r * * * *\n\r * * * *\n\r * * "
"* *\n\r * *\n\r");
M5.Lcd.fillScreen(BLACK);
DrawWing();
} else if (kind == 1) {
error_reporter->Report(
"RING:\n\r *\n\r * *\n\r * *\n\r "
" * *\n\r * *\n\r * *\n\r "
" *\n\r");
M5.Lcd.fillScreen(BLACK);
DrawRing();
} else if (kind == 2) {
error_reporter->Report(
"SLOPE:\n\r *\n\r *\n\r *\n\r *\n\r "
"*\n\r *\n\r *\n\r * * * * * * * *\n\r");
M5.Lcd.fillScreen(BLACK);
DrawSlope();
}
}
Setting gesture recognition threshold
Since gestures are time-series signals, something that looks like a gesture is recognized as one only after it has persisted for a certain amount of time.
As an analogy, you can think of it as the following kind of image recognition:
- The accelerations of the three axes are each represented by the brightness of one pixel, with the three pixels lined up horizontally.
- A new line is added every 40 ms (25 Hz), so the image grows vertically.
- When the height exceeds 128 lines, the oldest line scrolls off.
- A gesture is found as a pattern in this 3 × 128 pixel image.
Because false positives can occur, a gesture is actually accepted only after it has been detected several times in a row, exceeding a threshold.
The thresholds are listed in constants.h; the higher the value, the harder the gesture is to recognize. A different threshold can be set for each gesture.
In the original file, the thresholds for Wing/Ring/Slope were {15, 12, 10}.
In the Arduino version's constants.h, they are lowered somewhat to {8, 5, 4}.
This time, after some trial and error, I settled on {9, 7, 6}.
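For reference, this consecutive-detection logic lives in gesture_predictor.cc and works roughly like the following sketch (paraphrased, not a verbatim copy of the demo source; the 0.8 score cutoff is an assumption based on the demo):

#include "constants.h"

// How many consecutive matches each gesture needs (the thresholds above)
extern const int kConsecutiveInferenceThresholds[3];

int PredictGesture(float* output) {
  static int continuous_count = 0;  // current streak length
  static int last_predict = -1;     // gesture matched last time
  // Pick whichever gesture's score exceeds the cutoff (the scores sum to 1)
  int this_predict = -1;
  for (int i = 0; i < 3; i++) {
    if (output[i] > 0.8) this_predict = i;
  }
  if (this_predict == -1) {
    // Nothing scored high enough: reset the streak, report "no gesture" (3)
    continuous_count = 0;
    last_predict = -1;
    return 3;
  }
  // Extend or restart the streak depending on whether the gesture repeated
  if (last_predict == this_predict) {
    continuous_count += 1;
  } else {
    continuous_count = 0;
  }
  last_predict = this_predict;
  // Only report the gesture once the streak reaches its threshold
  if (continuous_count < kConsecutiveInferenceThresholds[this_predict]) {
    return 3;
  }
  continuous_count = 0;
  last_predict = -1;
  return this_predict;
}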
Model replacement
After getting it running, the detection accuracy was not very good (with many false detections). When I looked into it, I found that the model data (magic_wand_model_data.cc) had been updated recently.
It seems the new data was released together with the training script for new gestures mentioned at the beginning.
When I brought in the old model data from the r2.1 branch and tried it, the detection accuracy seemed higher.
Personal habits and the characteristics of the device may play a role here.
@boochowp