Quadro is a DJI Tello drone retrofitted with autonomous technology and designed for companionship. Quadro can keep you company as you walk around, and can start conversations by noticing your facial emotions.
I've always been fascinated by computers, robots, humans, and the way all three interact and will shape the 21st century. I get excited thinking about how we humans will interact with robots in a social context, and how there is ample opportunity for symbiosis - as we work to improve robots, robots will work to improve us. Of course, we have decades of experience with "dumb" robots in heavy industry - those robots have helped us greatly in terms of capital production. What's more interesting, in my opinion, is how the next generation of robots will improve our mental and emotional abilities and our understanding of ourselves - this space will be truly transformative. Many startups have already begun this quest, which is a good sign for how quickly the tech will improve in the next decade. Imagine having an omniscient, unbiased therapist that is your best friend, knows everything about you, and knows how to help you achieve your goals and keep you away from trouble. (Hopefully they don't cost as much as a real therapist :0)

Lofty wanderings aside, there is a big mental health problem in the world: many people live alone, marriage and birth rates are going down (in the West), pet ownership is going up, and many people don't understand the underlying mental struggles and rough patches they might have. I think this is a worthwhile endeavor to pursue, and I chose to create Quadro, the Drone Companion, as a start in that direction.
The Project

This project can be split up into three essential parts. Note: I'll be using drone/Tello/Quadro interchangeably.
- Part 1 - Soft-hacking the Tello drone so that it can talk to a computer. The Tello streams 5MP video over WiFi and accepts flight command primitives over the same network. Nodes communicate over ROS (Robot Operating System).
- Part 2 - ROS nodes for following behavior. Essentially, Quadro's video stream is ingested by an object detector, which looks for nearby humans and marks a point near the sternum to be followed. A PID controller issues commands to move the drone to the desired position; y, z, and yaw are controlled.
- Part 3 - Quadro's video feed is also ingested by an emotion detection algorithm that classifies the user's emotional state from their facial expression. This classification is then fed to a smart 'emotion_manager' that responds to the user appropriately.
At a high level, Quadro operates as a client to a more powerful server: an NVIDIA GPU-powered laptop. The laptop runs Ubuntu 16.04 and ROS Kinetic, and has a Bluetooth dongle for an Xbox 360 controller, which joins the ROS network via the xboxdrv Linux driver and the joy ROS package. The follow-me code and emotion detector run as separate ROS nodes that take in the video feed and output either movement commands or speech through the computer's speakers.
Getting the drone to do what we want sounds simple, right? Well... it's not. The DJI Tello is made to work only with the Tello mobile app: you install the app, switch your phone's WiFi to the drone's access point, and control it that way.
Interestingly, Tello even offers an official SDK that lets developers build simple APIs to control limited parts of the drone. Unfortunately, video streaming is not supported (streaming and socket programming can get complicated, so that's understandable). On GitHub there are several attempts at getting this SDK working - and they're all plagued with issues. One GitHub user reverse-engineered parts of the Tello protocol to unlock more functionality, including streaming. Unfortunately, my computer could not get the h.264 and PyAV interfaces working... However, after days of trying, I finally figured out a workaround that doesn't require PyAV - "hooray", I screamed!
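For context, the SDK's control side is just short text commands over UDP. Here's a minimal sketch of that interface - the address, port, and command strings come from the official SDK docs, but the wrapper itself is illustrative, not my actual driver code:

```python
# Minimal sketch of the Tello SDK's UDP command interface.
# The address/port and text commands come from the official SDK docs;
# the wrapper is illustrative, not my driver code.
import socket
import time

TELLO_ADDR = ('192.168.10.1', 8889)  # the drone's AP address and command port

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(('', 8889))  # the Tello replies to the sending port

def send(cmd):
    """Send one text command and print the drone's 'ok'/'error' reply."""
    sock.sendto(cmd.encode('utf-8'), TELLO_ADDR)
    sock.settimeout(5.0)
    try:
        reply, _ = sock.recvfrom(1024)
        print(cmd, '->', reply.decode('utf-8', errors='ignore'))
    except socket.timeout:
        print(cmd, '-> no reply')

send('command')  # enter SDK mode
send('takeoff')
time.sleep(5)
send('land')
```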
Now that I had a basic ROS interface to the drone, I added support for the 360 controller, as well as custom commands for toggling video, taking pictures, and toggling following behavior. NOTE: In the demo videos below, I am holding the controller for safety and not pressing any buttons except takeoff/land.
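As a rough illustration of the controller hookup, a rospy node can subscribe to the joy topic and map buttons to drone commands. The button indices and the '/tello/cmd' topic below are hypothetical placeholders, not my exact node:

```python
# Sketch of mapping the Xbox 360 controller (via the joy package) to
# drone commands. Button indices and the '/tello/cmd' topic are
# hypothetical placeholders.
import rospy
from sensor_msgs.msg import Joy
from std_msgs.msg import String

BTN_A, BTN_B, BTN_X, BTN_Y = 0, 1, 2, 3  # typical xboxdrv button layout

class TelloTeleop(object):
    def __init__(self):
        self.pub = rospy.Publisher('/tello/cmd', String, queue_size=1)
        rospy.Subscriber('/joy', Joy, self.joy_cb)

    def joy_cb(self, msg):
        if msg.buttons[BTN_A]:
            self.pub.publish('takeoff')
        elif msg.buttons[BTN_B]:
            self.pub.publish('land')
        elif msg.buttons[BTN_X]:
            self.pub.publish('toggle_video')
        elif msg.buttons[BTN_Y]:
            self.pub.publish('toggle_follow')

if __name__ == '__main__':
    rospy.init_node('tello_teleop')
    TelloTeleop()
    rospy.spin()
```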
Part 2 - Follow-Me Behavior

I'm allergic to cats and dogs (fur and dander make my eyes swell...) but have always liked the idea of having a companion pet at my side. So I wanted the next best thing: a drone that acts as a companion :)
To get Quadro to follow me, I first had to tell it exactly where I am. To do this, the drone's video stream is ingested by an object detector. I chose ssd_mobilenet_v1_coco as the model, which is thankfully pre-trained to detect humans among its 90 COCO classes. The network outputs a bounding box for each detected object, along with its center in pixel (u, v) coordinates.
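To make that concrete, here's a hedged sketch of turning the detector's output into a pixel-space target. It assumes the TensorFlow Object Detection API's output format (normalized [ymin, xmin, ymax, xmax] boxes, COCO class id 1 for 'person'); the helper names are mine:

```python
# Sketch: pick the most confident 'person' detection and compute its
# center in pixel (u, v) coordinates. Assumes the TF Object Detection
# API output format; the helper names are mine, not the project's.
import numpy as np

PERSON_CLASS_ID = 1  # 'person' in the COCO label map

def person_target(boxes, scores, classes, img_w, img_h, min_score=0.5):
    """boxes: Nx4 normalized [ymin, xmin, ymax, xmax] arrays."""
    best = None
    for box, score, cls in zip(boxes, scores, classes):
        if int(cls) == PERSON_CLASS_ID and score >= min_score:
            if best is None or score > best[1]:
                best = (box, score)
    if best is None:
        return None  # no human in frame -> caller should run find_human()
    ymin, xmin, ymax, xmax = best[0]
    u = int((xmin + xmax) / 2.0 * img_w)   # center column
    v = int((ymin + ymax) / 2.0 * img_h)   # center row
    bound_width = (xmax - xmin) * img_w    # box width, a proxy for distance
    return u, v, bound_width
```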
Knowing the center of the human in the image, it makes sense to align the drone so that this center stays close to the center of the image (at least horizontally; one might want the vertical target a bit higher). To do this, we must first make sure that a human is actually in the image. If not, a find_human() function is run that turns the drone around in an attempt to find one. On first boot the drone always turns right; after that, it remembers the direction in which it last saw you, so if you move to the drone's right and out of its FOV, it knows to yaw clockwise.
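A sketch of that search-and-align logic might look like this - the gains, constants, and names are illustrative, but the "remember which side you were last seen on" trick is the one described above:

```python
# Sketch of the search / yaw-alignment logic described above.
# Gains, constants, and names are illustrative, not the project's code.
YAW_KP = 0.005          # proportional gain on horizontal pixel error
SEARCH_YAW_SPEED = 0.4  # fixed turn speed while searching

class Follower(object):
    def __init__(self, img_w):
        self.img_w = img_w
        self.last_seen_dir = 1  # +1 = right/clockwise (default on first boot)

    def yaw_command(self, u):
        """Yaw toward the human at column u, or search if u is None."""
        if u is None:
            # find_human(): spin toward where the human was last seen
            return self.last_seen_dir * SEARCH_YAW_SPEED
        error = u - self.img_w / 2.0        # pixels off-center
        self.last_seen_dir = 1 if error > 0 else -1
        return YAW_KP * error               # turn toward the human
```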
For forward and backward movement, I employ a simple PID controller (only the proportional term is needed here): the farther the drone is from you, the faster it moves to catch up, which makes for a nice exponential decay in speed as it closes in. An example of the implementation is below. Note that UPPERCASE_VARS are defined elsewhere.
```python
# from follow_human.py
...
# lx is linear x -> move closer to or farther from the human
# locked on, move towards or away from the human
if bound_width < (DISTANCE_KEEP_AWAY - DISTANCE_KEEPING_RANGE):
    # bounding box too small: human is too far away, so move forward
    error = abs(bound_width - (DISTANCE_KEEP_AWAY - DISTANCE_KEEPING_RANGE))
    ucontrol = KP * error
    lx = ucontrol
    self.cmd = 'move forward ' + str(ucontrol)
elif bound_width > (DISTANCE_KEEP_AWAY + DISTANCE_KEEPING_RANGE):
    # bounding box too large: human is too close, so move backward
    error = abs(bound_width - (DISTANCE_KEEP_AWAY + DISTANCE_KEEPING_RANGE))
    ucontrol = KP * error
    lx = -ucontrol
    self.cmd = 'move backward ' + str(ucontrol)
```
The image above is a debug output frame showing a) the detected human (in green), b) the center of the human (small white rectangle), c) the acceptable bounding box for drone yaw, and d) the command being given to the drone ("move forward 0.605m" - we're too far away!).
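Overlays like that are straightforward to draw with OpenCV; a rough equivalent (exact colors and sizes are guesses at my debug output) looks like:

```python
# Sketch of the debug overlay: detected box, human center, and the
# current command. Colors and sizes are guesses, not the exact output.
import cv2

def draw_debug(frame, box, center, cmd):
    x1, y1, x2, y2 = box
    cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)  # human (green)
    u, v = center
    cv2.rectangle(frame, (u - 3, v - 3), (u + 3, v + 3),
                  (255, 255, 255), -1)                         # center marker
    cv2.putText(frame, cmd, (10, 30), cv2.FONT_HERSHEY_SIMPLEX,
                0.8, (255, 255, 255), 2)   # e.g. 'move forward 0.605'
    return frame
```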
Watch the full follower demo below.
Part 3 - Emotion Recognition and Empathic Response

In addition to being at your side, a faithful companion also offers you emotional support and general conversation. Quadro offers this by way of facial emotion recognition. When Quadro detects that you're feeling sad, for example, it will say: "Hi Phil. You seem sad. Wanna talk about it?" Or, if you're angry, it will say: "Deep breaths, Phil, or you might hurt yourself".
Emotion detection is done via a two-step process:
- First, each video frame is sent to a Haar cascade detector, which looks for possible human face regions within the image.
- Next, when an acceptable region has been found, that region is sent to an emotion detector which outputs one of five classes: {'angry', 'happy', 'neutral', 'sad', 'surprised'}.
The emotion detector model was created in Keras with a TensorFlow backend, using the VGG16 architecture, and trained on the FER2013 face dataset from a Kaggle competition. It works well enough for our purposes.
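Put together, the two-step pipeline looks roughly like the sketch below. The Haar cascade file is OpenCV's stock frontal-face model, but the model path, input size, and class order are assumptions about my setup rather than canonical values:

```python
# Sketch of the two-step pipeline: Haar cascade face detection, then
# emotion classification. Model path, input size, and class order are
# assumptions, not the project's exact values.
import cv2
import numpy as np
from keras.models import load_model

EMOTIONS = ['angry', 'happy', 'neutral', 'sad', 'surprised']

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
model = load_model('emotion_vgg16.h5')  # hypothetical path

def detect_emotion(frame):
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, 1.3, 5)
    if len(faces) == 0:
        return None
    x, y, w, h = faces[0]                   # take the first face found
    face = cv2.resize(frame[y:y+h, x:x+w], (48, 48))  # FER2013-style crop
    face = face.astype('float32') / 255.0   # normalize like training data
    probs = model.predict(face[np.newaxis, ...])[0]
    return EMOTIONS[int(np.argmax(probs))]
```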
For each frame, the detected emotion is sent to the emotion manager, which keeps counters of how many recent frames contained each emotion. When enough recent frames contain a given emotion (e.g. the user has been smiling for long enough), the manager plays a sound file corresponding to that emotion. It also limits how many times a given emotion can be acted on within an n-minute window, so that common emotions (such as neutral) do not trigger sound files too often.
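A hedged sketch of that bookkeeping - the window size, threshold, and cooldown below are placeholders, not my tuned values:

```python
# Sketch of the emotion manager: per-frame counters over a sliding
# window, plus a per-emotion cooldown. All thresholds are placeholders.
import time
from collections import deque, Counter

WINDOW = 30        # consider the last 30 classified frames
THRESHOLD = 20     # ...and react if 20 of them agree
COOLDOWN_S = 120   # don't repeat one emotion's sound within 2 minutes

class EmotionManager(object):
    def __init__(self):
        self.recent = deque(maxlen=WINDOW)
        self.last_played = {}  # emotion -> timestamp of last reaction

    def update(self, emotion):
        """Feed one per-frame classification; return an emotion to react to."""
        if emotion is None:
            return None
        self.recent.append(emotion)
        top, n = Counter(self.recent).most_common(1)[0]
        if n >= THRESHOLD:
            now = time.time()
            if now - self.last_played.get(top, 0) > COOLDOWN_S:
                self.last_played[top] = now
                self.recent.clear()
                return top  # caller plays the matching sound file
        return None
```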
See the demo below. Unfortunately, I could not record the output sound, but the phrase is shown on the screen as well.
Conclusion

This hackathon gives us hackers a prompt: can we solve specific social and livelihood problems? Quadro was meant to tackle an unseen problem that many of us share - a lack of companionship. Particularly in times of pandemic or transitions in life, one may feel lonely or just need someone to talk to. With Quadro, you have a friend that will always be by your side and is always willing to strike up a conversation.
This was a fun exercise for me, and I hope you all enjoyed my project!