The term "drone" usually refers to any unpiloted aircraft. Sometimes called "Unmanned Aerial Vehicles" (UAVs), these craft can carry out an impressive range of tasks, from military operations to package delivery. Drones can be as large as a full-sized aircraft or as small as the palm of your hand. Originally developed for the military and aerospace industries, drones have found their way into the mainstream because of the enhanced levels of safety and efficiency they bring. These robotic UAVs operate without a pilot on board and with different levels of autonomy. A drone's autonomy can range from remotely piloted (a human controls its movements) to advanced autonomy, where it relies on a system of sensors and detectors to calculate its movements.
Because drones can be controlled remotely and can be flown at varying distances and heights, they make the perfect candidates for taking on some of the toughest jobs in the world. They can be found assisting in a search for survivors after a hurricane, giving law enforcement and the military an eye-in-the-sky during terrorist situations, and advancing scientific research in some of the most extreme climates on the planet. Drones have even made their way into our homes and serve as entertainment for hobbyists and a vital tool for photographers.
Drones are used for various purposes:
- Military
- Delivery
- Emergency Rescue
- Outer Space
- Wildlife and Historical Conservation
- Medicine
- Photography, etc.
====================================================================
Motivation
The main motivation behind this project is my curiosity to explore the various control schemes for small-scale drones. The paper "Design and Development of Voice Control System for Micro Unmanned Aerial Vehicles" discusses various drone control methodologies such as Radio, GCS, Gesture, Voice, Joystick, PC, FPV, and Autonomous. In the paper "Design and Development of an Android Application for Voice Control of Micro Unmanned Aerial Vehicles", it is observed that situational awareness is at a medium level for the Radio and Gesture control methods, whereas it is high for the Voice control method. In this project, we will work on IMU-sensor-based gesture control, and later move on to voice control and other advanced control methods.
The motivation for this project also arose from the need to implement these different control methods on a low-cost, portable, and scalable embedded platform with computation at the edge, without relying on external resources for its operation.
====================================================================
Methodology
====================================================================
DJI Tello Drone
The DJI Tello is a small drone that combines powerful technology from DJI and Intel into a very tiny package. It is a lightweight, fun, and easy-to-use drone, and a perfect tool for learning the ropes of drone piloting before investing in a more expensive option. The Tello has a 14-core processor from Intel that includes an onboard Movidius Myriad 2 VPU (Video Processing Unit) for advanced imaging and vision processing. It is equipped with a high-quality image processor for shooting photos and videos: the camera captures 5MP (2592x1936) photos and HD720 video. The drone has a maximum flight time of 13 minutes. This incredibly small drone fits in your palm and weighs only approximately 80 g (propellers and battery included). You can control the Tello directly via the Tello app or with a supported Bluetooth remote controller connected to the Tello app. The drone is programmable via Python, C++, Scratch, and DroneBlocks.
Specs
- Weight: Approximately 80 g (with propellers and battery)
- Dimensions: 98 mm × 92.5 mm × 41 mm
- Propellers: 3-inch
- Built-In Functions: Range Finder, Barometer, LED, Vision System, Wi-Fi 802.11n 2.4 GHz, 720p Live View
- Port: Micro USB Charging Port
- Max Flight Distance: 100 m
- Max Speed: 8 m/s
- Max Flight Time: 13 min
- Detachable Battery: 1.1 Ah / 3.8 V
- Photo: 5 MP (2592 × 1936)
- FOV: 82.6°
- Video: HD 720p at 30 fps
- Format: JPG (Photo); MP4 (Video)
- Electronic Image Stabilization: Yes
Preparing Tello Drone for the project
The Tello SDK provides ample information on how to program the drone via Tello text commands, but it is somewhat limited in features. The SDK connects to the aircraft through a Wi-Fi UDP port, allowing users to control the aircraft with text commands; we use Wi-Fi to establish a connection between the Tello and the M5Stack module. Once powered on, the Tello acts as a Wi-Fi soft AP (192.168.10.1) and accepts commands on UDP port 8889 (a minimal raw-UDP example follows the command list below).
The Tello SDK includes three basic command types.
- Control commands (xxx): return "ok" if the command was successful, or "error" / an informational result code if the command failed.
- Set commands (xxx a): set new sub-parameter values; return "ok" if the command was successful, or "error" / an informational result code if the command failed.
- Read commands (xxx?): return the current value of the sub-parameter.
Even though the Tello is quite maneuverable, with a number of different axes on which we can control the drone, in this project we will use only the following commands.
- takeoff : Auto takeoff.
- land : Auto landing.
- up x : Ascend by "x" cm.
- down x : Descend by "x" cm.
- left x : Fly left "x" cm.
- right x : Fly right "x" cm.
- forward x : Fly forward "x" cm.
- back x : Fly backward "x" cm.
Please refer to the SDK for a full set of commands.
As a safety feature, if there is no command for 15 seconds, the Tello will land automatically.
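To see what the library used later will do under the hood, here is a minimal sketch, assuming the standard ESP32 Arduino WiFi and WiFiUDP APIs and that the placeholder SSID is replaced with your own Tello's network name; it joins the drone's access point, enters SDK mode, and sends a short takeoff/land sequence:

```cpp
// Minimal raw-UDP Tello test (assumptions: ESP32 Arduino core; the Tello's open
// soft-AP SSID has been filled in below).
#include <WiFi.h>
#include <WiFiUdp.h>

const char* TELLO_SSID = "TELLO-XXXXXX";     // replace with your Tello's SSID
const IPAddress TELLO_IP(192, 168, 10, 1);   // Tello soft-AP address
const uint16_t TELLO_PORT = 8889;            // Tello command port

WiFiUDP udp;

// Send one SDK text command as a single UDP packet
void sendTelloCommand(const char* cmd) {
  udp.beginPacket(TELLO_IP, TELLO_PORT);
  udp.print(cmd);
  udp.endPacket();
  Serial.printf("Sent: %s\n", cmd);
}

void setup() {
  Serial.begin(115200);
  WiFi.begin(TELLO_SSID);                    // Tello AP is open (no password)
  while (WiFi.status() != WL_CONNECTED) {
    delay(500);
    Serial.print(".");
  }
  udp.begin(TELLO_PORT);                     // local port for replies ("ok"/"error")

  sendTelloCommand("command");               // enter SDK mode
  delay(1000);
  sendTelloCommand("takeoff");
  delay(8000);
  sendTelloCommand("land");
}

void loop() {
  // Print any reply from the Tello ("ok", "error", ...)
  int len = udp.parsePacket();
  if (len > 0) {
    char buf[64] = {0};
    udp.read(buf, sizeof(buf) - 1);
    Serial.printf("Tello: %s\n", buf);
  }
}
```

The same pattern, one text command per UDP packet with the reply read back as "ok" or "error", is what the library described next wraps for us.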
Tello API
As we are using Arduino as the platform, we need an API that translates our commands into UDP packets sent from the Arduino program. telloArduino is an Arduino library for controlling the DJI Tello through an ESP32 module. The library controls the Tello by sending commands via UDP, as described in the SDK documentation.
- Click the "
DOWNLOAD ZIP
" button. - Place the "tello" folder in your Arduino sketch folder/libraries/ folder. Now Restart the IDE.
- In your Arduino IDE, go to Sketch > Include Library > choose "tello" to include this library in your sketch.
====================================================================
M5Stack Fire Module
The M5Stack FIRE is one of the M5Stack development kits, providing a 9-axis IMU sensor (6-axis posture/acceleration measurement + 3-axis magnetic measurement), 16M Flash + 4M PSRAM, an enhanced base, a larger battery, etc. With the IMU posture sensor, there are many situations to which you can apply this kit, like detecting acceleration, angle, and trajectory; you can build products like sports data collectors, 3D remote gesture controllers, etc. It is a modular, stackable, scalable, and portable device powered by an ESP32 core, which makes it open-source, low-cost, full-featured, and easy for developers to use for new product development at all stages, including circuit design, PCB design, software, mold design, and production.
M5Stack Fire comes with three separable parts. The top part has the processors, chips, sockets, 2.4G antenna, ESP32, power management IC, an LCD screen, and some other interface components.
The middle part is called the M5GO base which provides a lithium battery, M-BUS socket, LED bar, and three more GROVE Ports. The bottom part is a charge table, which can be connected to the M5GO base via POGO pins.
M5Stack development boards are highly efficient, come in an industrial-grade case, and are based on the ESP32. The board integrates Wi-Fi and Bluetooth modules and contains a dual-core processor and 16 MB of SPI flash. Together with 30+ stackable M5Stack modules, 40+ extendable units, and different levels of programming language, you can create and verify your IoT product in a very short time.
It supports programming in Arduino, the Blockly language with UIFlow, and MicroPython.
====================================================================
Preparing M5StackFIRE for the project
Download Arduino IDE
- Open up your browser, and visit Arduino's official website.
- Download and install the version according to your operating system.
Install ESP32 Boards Manager
- Open up the Arduino IDE, and navigate to File -> Preferences -> Settings
- Add the following ESP32 Boards Manager URL to Additional Boards Manager URLs: https://dl.espressif.com/dl/package_esp32_index.json
- Hit OK
- Navigate to Tools -> Board: -> Boards Manager
- Search ESP32 in the Boards Manager window, and click Install
Install M5Stack Library
- Open Arduino IDE, then Select Sketch->Include Library->Manage Libraries
- Search M5Stack and install it
For Windows machines, an additional USB-to-serial driver needs to be installed.
Arduino port Configuration
- Choose the correct board, baud rate, and serial port.
- Once set up, you can try an example sketch to verify that everything is working (a minimal example follows this list).
- Click Upload to flash the code to the device.
- Once successfully flashed, the M5Stack module will show the corresponding output on its display as well as on the Arduino Serial Monitor.
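As a quick sanity check, a minimal sketch along these lines, assuming only the M5Stack library installed above, confirms that flashing, the display, and the buttons all work:

```cpp
// Minimal "hello" sketch to verify the toolchain (assumes the M5Stack Arduino library).
#include <M5Stack.h>

void setup() {
  M5.begin();                 // initialize LCD, buttons, serial, etc.
  M5.Lcd.setTextSize(2);
  M5.Lcd.println("M5Stack FIRE ready");
  Serial.println("M5Stack FIRE ready");
}

void loop() {
  M5.update();                // refresh button states
  if (M5.BtnA.wasPressed()) {
    M5.Lcd.println("Button A pressed");
  }
}
```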
====================================================================
Gesture Control Method
Gesture Commands
In order to control our Tello drone using the M5Stack module, we will use gesture detection. Six basic gestures are considered for control: idle, takeoff/land, forward, back, left, and right.
Idle
No commands are issued when the module is not moving.
Takeoff/Land
A Takeoff or Land command is issued by moving the module up and down as shown in the figure.
Forward
A Forward command is issued by moving and tilting the module to the front as shown in the figure.
Back
A Backward command is issued by moving and tilting the module to the back as shown in the figure.
Left
A Left command is issued by moving and tilting the module to the left as shown in the figure.
Right
A Right command is issued by moving and tilting the module to the right as shown in the figure.
====================================================================
Using the accelerometer to recognize various gestures
An accelerometer looks like a simple circuit inside larger electronic devices such as our smartphones. Despite its humble appearance, the accelerometer consists of many different parts and can work in several ways, two of which are the piezoelectric effect and capacitive sensing.
Piezoelectric accelerometers are the most common type and use microscopic crystal structures that become stressed by accelerative forces. These crystals generate a voltage under stress, and the accelerometer interprets this voltage to determine acceleration and orientation.
The capacitance accelerometer senses changes in capacitance between microstructures located next to the device. If an accelerative force moves one of these structures, the capacitance will change and the accelerometer will translate that capacitance to voltage for interpretation.
Typical accelerometers measure along multiple axes: two axes are enough for most two-dimensional movement, with an optional third for 3D positioning. Most smartphones use three-axis models, whereas cars may use only a two-axis model to determine the moment of impact. The sensitivity of these devices is quite high, as they are intended to measure even very minute shifts in acceleration; the more sensitive the accelerometer, the more easily it can measure acceleration.
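To see these raw signals on the M5Stack FIRE itself, the onboard IMU can be read directly from Arduino. The following is a minimal sketch, assuming a recent M5Stack library release that exposes the IMU as M5.IMU (older releases use a separate MPU9250 helper class), that streams the three acceleration axes at roughly the 62.5 Hz rate used later for training:

```cpp
// Read the 3-axis accelerometer on the M5Stack FIRE
// (assumption: a recent M5Stack library where M5.IMU wraps the onboard MPU6886/MPU9250).
#include <M5Stack.h>

float accX = 0.0F, accY = 0.0F, accZ = 0.0F;

void setup() {
  M5.begin();
  M5.Power.begin();
  M5.IMU.Init();              // initialize the IMU
  M5.Lcd.setTextSize(2);
}

void loop() {
  M5.IMU.getAccelData(&accX, &accY, &accZ);   // values in g
  M5.Lcd.setCursor(0, 0);
  M5.Lcd.printf("X: %6.2f g\nY: %6.2f g\nZ: %6.2f g\n", accX, accY, accZ);
  Serial.printf("%.3f,%.3f,%.3f\n", accX, accY, accZ);
  delay(16);                  // ~62.5 Hz, matching the training data rate
}
```

Streaming the values over serial also makes it easy to eyeball what each gesture looks like in the raw data.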
====================================================================
Gesture Recognition using Edge Impulse
We will use machine learning to build a gesture recognition system that runs on a microcontroller, with the help of Edge Impulse Studio.
Preparing Edge Impulse Studio for the project
- Log in to https://www.edgeimpulse.com/
- Click Create Project.
- Give Project name and click Create.
- Head over to the "Devices" tab from the left menu and choose "Connect a new device".
- You will be greeted with a variety of device options.
- To make things simple, let's connect our smartphone device. Since all modern smartphones have onboard accelerometers, it will be easy-peasy.
- Next, you will be given a QR code and a link to allow the collection of data from your smartphone.
- Scan this QR code or open the link via your smartphone device.
- Once the link is opened via your smartphone, the smartphone will show up in the "Devices" section.
====================================================================
Data collection
For collecting the data for our machine learning model, we will use the 3-axis accelerometer sensor present onboard our smartphone.
- Once the smartphone is connected to Edge Impulse, head over to the "Data Acquisition" tab.
- Select our phone as the device, give a label (let's start with 'idle'), set a Sample length of 10000 (10 s), the Sensor as accelerometer, and a Frequency of 62.5 Hz.
- Click on "
Start Sampling
" to begin sampling for the set sample length.
- Once the sample length has elapsed, the device should complete sampling and upload the file back to Edge Impulse, where it will appear under Data acquisition.
- You will see a new line appear under 'Collected data' in the studio.
- When you click it, you will see the raw data graphed out. As the accelerometer has three axes, you'll notice three different lines, one for each axis.
- Repeat this process to collect as many samples as you can.
- Repeat for the other labels takeoff/land, forward, back, left, and right.
- Make sure to perform variations on the motions. E.g. do both slow and fast movements and slightly vary the orientation of the board. You'll never know how your user will use the device.
- Once sufficient data is collected, they will be shown under the same tab.
- Click on each data row to view their raw value graph for 10s sample length.
(Figures: raw accelerometer graphs for the labels idle, takeoff/land, forward, back, left, and right.)
- Now that we have sufficient data, we need to split it into a training dataset and a test dataset.
- Don't worry, Edge Impulse Studio makes that easy for us too.
- Head over to the "Dashboard" section and scroll down to the "Danger Zone".
- Click "Rebalance dataset" to automatically split the dataset into training and test sets with a ratio of 80/20.
- Now we have acquired and set up our training data for further processing.
====================================================================
Gesture Model Training
Since we have acquired all the data, it's time to train a gesture model on the dataset, and Edge Impulse makes it very easy to generate a model without writing a single line of code.
With the training set in place, we can design an impulse. An impulse takes the raw data, slices it up in smaller windows, uses signal processing blocks to extract features, and then uses a learning block to classify new data. Signal processing blocks always return the same values for the same input and are used to make raw data easier to process, while learning blocks learn from past experiences.
- Head over to the "
Impulse Design
" tab. - We will already have the
Time series data
section populated for us.
- Select a
window size
of2000
(2s) and awindow increase
of80ms
. - Now click
Add a processing block
and selectSpectral Analysis
. - The parameters will be auto-populated for us.
- This block applies a filter, performs spectral analysis on the signal, and extracts frequency and spectral power data.
- Now click Add a learning block and select Neural Network (Keras).
- The parameters will be auto-populated for us.
- This block takes these spectral features and learns to distinguish between the six (idle, takeoff, forward, back, left, right) classes.
- The
Output features
block will have all the labels that we have acquired. - Now click on
Save Impulse
to save the configuration.
- Head over to the Spectral Features tab.
- This will show the raw data at the top of the screen (you can select other files via the drop-down menu), and the results of the signal processing as graphs on the right.
- For the spectral features block you'll see the following graphs:
- After filter: the signal after applying a low-pass filter. This removes noise.
- Frequency domain: the frequencies at which the signal repeats (e.g. making one wave movement per second will show a peak at 1 Hz).
- Spectral power: the amount of power that went into the signal at each frequency.
- A good signal processing block will yield similar results for similar data. If you move the sliding window (on the raw data graph) around, the graphs should remain similar.
- Also, when we switch to another file with the same label, you should see similar graphs, even if the orientation of the device was different.
- Click Save parameters. This will take you to the Feature generation screen.
- Here all the raw data is split up into windows (based on the window size and window increase selected in the impulse design step) and the spectral features block is applied to each of these windows.
- Click Generate features.
- The Feature explorer will load. This is a plot of all the extracted features against all the generated windows.
- You can use this graph to compare your complete dataset, e.g. by plotting the height of the first peak on the X-axis against the spectral power between 0.5 Hz and 1 Hz on the Y-axis.
- A good rule of thumb is that if you can visually separate the data on a number of axes, then the machine learning model will be able to do so as well.
- For our dataset, the feature data are more or less separated, which is a good sign. If your features overlap, it is better to acquire more data.
- The page also shows the expected on-device performance with processing time and peak RAM usage for calculating features.
With all data processed it's time to start training a neural network. Neural networks are a set of algorithms, modeled loosely after the human brain, that are designed to recognize patterns. The network that we're training here will take the signal processing data as input and try to map it to one of the six classes.
So how does a neural network know what to predict? A neural network consists of layers of neurons, all interconnected, and each connection has a weight. One such neuron in the input layer would be the height of the first peak on the X-axis (from the signal processing block), and one such neuron in the output layer would be takeoff (one of the classes). When defining the neural network all these connections are initialized randomly, and thus the neural network will make random predictions. During training, we then take all the raw data, ask the network to make a prediction, and then make tiny alterations to the weights depending on the outcome (this is why labeling raw data is important).
This way, after a lot of iterations, the neural network learns; and will eventually become much better at predicting new data.
- Head over to the NN Classifier tab.
- Set Number of training cycles to 80, Learning rate to 0.0005, and Minimum confidence rating to 0.60. You can play around with these values to adjust the accuracy of the trained model.
- Leave the other parameters at their defaults for now and click Start training.
- Now the Training Output section gets populated.
- It displays the accuracy of the network and a confusion matrix. This matrix shows when the network made correct and incorrect decisions. You can see that idle and takeoff are relatively easy to predict.
- It also shows the expected on-device performance for this model.
- Now that we have generated the model, we need to test it.
====================================================================
Gesture Model Testing
- Head over to the Model Testing tab.
- We can see our test dataset here. Click Classify all.
- This will generate the model validation outcome using the test data, which was unknown to the model. We can see that our trained model was able to classify with an accuracy of 74.70%, which is quite good considering the small amount of training data fed to the model in the training section.
- It also shows which labels were incorrectly predicted.
- By checking these results in the Feature explorer, we can understand which labels were misclassified and use more training data to re-train our model for better classification of those data.
- You can also do a live classification of data from the smartphone from the Live classification tab. Your device should show as online under Classify new data. Set the 'Sample length' to 2000 (2 seconds), click Start sampling, and start doing movements.
- Afterward, you'll get a full report on what the network thought you did.
- Now that we have trained and tested our model, let's deploy it in our M5Stack module.
====================================================================
Gesture Model Deployment
With the impulse designed, trained, and verified, you can deploy this model back to your device. This makes the model run without an internet connection, minimizes latency, and runs with minimum power consumption. Edge Impulse can package up the complete impulse, including the signal processing code, neural network weights, and classification code, in a single C++ library that you can include in your embedded software (a hedged usage sketch follows the deployment steps below).
- Head over to the Deployment tab.
- Select Arduino library.
- If you need the build for specific Edge Impulse supported hardware, select your development board under Build firmware.
- Click Build. This will export the impulse and build a library that will run on the development board in a single step.
- We will see a pop-up with text and video instructions on how to deploy the model to our device.
- After the build is completed, you'll be prompted to download the library zip file.
- Save the zip file to our project directory.
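After adding the downloaded zip via Sketch > Include Library > Add .ZIP Library, the exported library can be called from a normal Arduino sketch. The following is only a sketch of the calling pattern, not the project's final code: the header name gesture_control_inferencing.h is a placeholder for whatever your export is actually named, it assumes the M5.IMU accelerometer access shown earlier, and the M5Stack IMU reports acceleration in g while the smartphone training data is in m/s², so a scale factor may be needed:

```cpp
// Hypothetical inference loop using the exported Edge Impulse Arduino library.
// Assumption: the project export is named "gesture_control"; replace the header
// name with the one produced by your own build.
#include <M5Stack.h>
#include <gesture_control_inferencing.h>

static float features[EI_CLASSIFIER_DSP_INPUT_FRAME_SIZE];

void setup() {
  M5.begin();
  M5.IMU.Init();
  Serial.begin(115200);
}

void loop() {
  // Fill one window (2 s at 62.5 Hz -> 125 samples x 3 axes) with accelerometer data.
  // Note: M5.IMU returns g; the phone training data is in m/s^2, so scaling
  // (e.g. multiplying by 9.81) may be required to match.
  for (size_t ix = 0; ix < EI_CLASSIFIER_DSP_INPUT_FRAME_SIZE; ix += 3) {
    M5.IMU.getAccelData(&features[ix], &features[ix + 1], &features[ix + 2]);
    delay((int)EI_CLASSIFIER_INTERVAL_MS);        // sampling interval from the impulse
  }

  // Wrap the buffer in a signal and run the classifier
  signal_t signal;
  if (numpy::signal_from_buffer(features, EI_CLASSIFIER_DSP_INPUT_FRAME_SIZE, &signal) != 0) {
    return;
  }
  ei_impulse_result_t result = { 0 };
  if (run_classifier(&signal, &result, false) != EI_IMPULSE_OK) {
    return;
  }

  // Print the confidence for each gesture class
  for (size_t ix = 0; ix < EI_CLASSIFIER_LABEL_COUNT; ix++) {
    Serial.printf("%s: %.2f\n", result.classification[ix].label, result.classification[ix].value);
  }
}
```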
====================================================================
Interfacing
Now that we have prepared our drone, the M5Stack module, and the gesture model, let's interface everything together in code.
The complete interfacing code is provided in the Code section of this project tutorial.
Flash the code into the M5Stack module.
For user interaction:
- We use the TFT screen on the M5Stack to show the Wi-Fi connection status and the gesture detection status.
- Long-pressing push-button A on the M5Stack module enables gesture control, and long-pressing push-button B disables it (a minimal sketch of this enable/disable logic follows the list below).
- Move the M5Stack module to make various gestures.
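For reference, the enable/disable and status-display part of this interaction might look roughly like the sketch below; it assumes the M5Stack Button API (pressedFor) and a sendTelloCommand() helper like the one in the earlier UDP example, and it is not the full interfacing code from the Code section:

```cpp
// Sketch of the user-interaction logic only (see the Code section for the full project).
#include <M5Stack.h>

bool gestureControlEnabled = false;

void showStatus(const char* msg) {
  M5.Lcd.fillScreen(BLACK);
  M5.Lcd.setCursor(0, 0);
  M5.Lcd.setTextSize(2);
  M5.Lcd.println(msg);
}

void setup() {
  M5.begin();
  showStatus("Gesture control: OFF");
}

void loop() {
  M5.update();

  // Long-press A (>= 1 s) enables gesture control, long-press B disables it
  if (M5.BtnA.pressedFor(1000) && !gestureControlEnabled) {
    gestureControlEnabled = true;
    showStatus("Gesture control: ON");
  }
  if (M5.BtnB.pressedFor(1000) && gestureControlEnabled) {
    gestureControlEnabled = false;
    showStatus("Gesture control: OFF");
  }

  if (gestureControlEnabled) {
    // ... sample the IMU, run the classifier, and send the matching Tello
    // command (e.g. sendTelloCommand("forward 50")) as in the previous sketches
  }
}
```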
====================================================================
Testing
Let us now test the gesture control and see how well it works.
====================================================================
Conclusion
Although the inference engine was not able to distinguish between left and right accurately, the overall performance was satisfying. There were also some cases where a gesture command had to be repeated multiple times before it was detected and classified.
We believe these issues can be addressed by adding more training data and making the model more flexible.
====================================================================
What next!!
- Training the model with more training data for more accurate classification.
- Audio feedback depicting the gesture detected.
====================================================================
References
https://arc.aiaa.org/doi/10.2514/6.2018-4231
https://arc.aiaa.org/doi/10.2514/6.2019-3363
https://dl-cdn.ryzerobotics.com/downloads/Tello/Tello%20SDK%202.0%20User%20Guide.pdf
https://store.dji.com/product/tello
https://github.com/akshayvernekar/telloArduino
https://docs.m5stack.com/en/quick_start/m5core/m5stack_core_get_started_Arduino_Windows
https://www.arduino.cc/reference/en/libraries/m5stack/
https://shop.m5stack.com/products/fire-iot-development-kit?variant=16804798169178
https://docs.edgeimpulse.com/docs/continuous-motion-recognition