The COVID-19 pandemic has profoundly changed many aspects of human life. It has also driven improvements in healthcare technology, along with medical and biological advances coming from research all over the world.
Our daily lives were heavily impacted as well: the way we interact with the world and with each other has changed. One main point of concern is cross-contamination when using public devices. Many public places use touchscreen kiosks to provide a digital interface where people can make requests and look up information. However, touchscreen devices can harbor harmful microbes (https://bmcinfectdis.biomedcentral.com/articles/10.1186/s12879-021-06379-y), and their daily use by many people increases the risk of contamination and forces cleaning routines to be performed more frequently.
What is it?
This project consists of a Gesture-Controlled Self-Service Kiosk. The device can be placed in public spaces (e.g., malls) so that people can use it to access the establishment's services. With a touchless kiosk, people can check in, browse the available products, and make a request without physically interacting with a screen.
Current touchless kiosks use motion-capture sensors that are limited to simple hand patterns, compromising the user experience as a whole. In this project, with the aid of a camera and computer-vision applications running on powerful edge hardware and software, the kiosk can recognize a wide range of gesture patterns, providing better and more natural interactions. This technology stack also has other inherent benefits, such as being able to detect whether a person is properly wearing a mask.
This solution provides a low-cost, easy-to-implement device for public spaces that helps avoid cross-contamination while improving both the efficiency and the interaction of today's public kiosks.
What was used?
- Kria KV260 AI Vision Starter Kit: The FPGA-based Kria KV260 is the brain of this project, acting as the main computer that receives inputs and displays the kiosk's graphics. This platform is designed to handle advanced vision applications on the edge without requiring an expert-level understanding of FPGAs. Just plug-and-make!
- Webcam: The webcam captures the video input for the computer-vision model.
- React app: ReactJS is a JavaScript library for quickly prototyping and developing frontend applications. In this project, ReactJS along with Bootstrap was used to build the user interface;
- Handtrack.js: To detect hands in the video stream, as well as specific gestures like an open or closed hand, Handtrack.js was the library of choice. It is based on TensorFlow.js models for detecting hands and faces. Another advantage is that it is optimized and works well on the Kria (a minimal setup sketch follows this list);
- Ubuntu: The OS running on the Kria KV260 is Ubuntu, using the Ubuntu on Xilinx image. It can easily be flashed onto an SD card for use on the Kria. For more information on how to do this, check this helpful article.
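To give an idea of how the webcam and Handtrack.js come together, here is a minimal detection-loop sketch. The element id, the parameter values, and the handlePredictions handler are illustrative assumptions, not the project's exact code:

```javascript
import * as handTrack from "handtrackjs";

// Illustrative parameters; tune for your camera and lighting.
const modelParams = {
  flipHorizontal: true, // mirror the webcam feed
  maxNumBoxes: 2,       // track at most two hands
  scoreThreshold: 0.6,  // discard low-confidence detections
};

const video = document.getElementById("webcam"); // assumed <video> element id

handTrack.startVideo(video).then((status) => {
  if (!status) return; // camera permission denied or unavailable
  handTrack.load(modelParams).then((model) => {
    const runDetection = () => {
      model.detect(video).then((predictions) => {
        // Each prediction carries a label ("open", "closed", "face", ...),
        // a confidence score, and a bounding box [x, y, width, height].
        handlePredictions(predictions); // gesture handler (sketched below)
        requestAnimationFrame(runDetection);
      });
    };
    runDetection();
  });
});
```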
The Handtrack JavaScript library gives us some out-of-the-box functionality like gesture detection (e.g., detecting an open or closed hand). This was used so the user can select or deselect a product by holding a closed-hand position for a short amount of time, as you can see in the gif below:
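One way to implement that hold-to-select behavior is to count consecutive detections of a closed hand and trigger the selection once the count crosses a threshold. A minimal sketch, assuming a detection rate of roughly 30 frames per second and a hypothetical toggleSelection() callback:

```javascript
// Hold-to-select: the frame threshold and toggleSelection() are assumptions.
const HOLD_FRAMES = 15; // about half a second at ~30 detections per second
let closedCount = 0;

function handlePredictions(predictions) {
  const closedHand = predictions.find((p) => p.label === "closed");
  if (closedHand) {
    closedCount += 1;
    if (closedCount === HOLD_FRAMES) {
      toggleSelection(); // select/deselect the highlighted product (hypothetical)
    }
  } else {
    closedCount = 0; // reset as soon as the closed hand disappears
  }
}
```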
To change from one item to another, 10 consecutive measurements of the open-hand position are registered. After collecting all 10 measurements, a Dx parameter is calculated by subtracting the 10th x value from the very first one. If the value is negative, it was a right swipe; if it is positive, the user wants to make a left swipe. You can see this functionality below:
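A sketch of that swipe logic, called from the same detection loop: the buffer size matches the 10 measurements described above, while nextItem() and previousItem() are hypothetical navigation callbacks:

```javascript
// Buffer of the last 10 x positions of an open hand (center of the bounding box).
const SAMPLE_COUNT = 10;
const xSamples = [];

function trackOpenHand(predictions) {
  const hand = predictions.find((p) => p.label === "open");
  if (!hand) {
    xSamples.length = 0; // restart when the open hand is lost
    return;
  }
  const [x, , width] = hand.bbox;
  xSamples.push(x + width / 2);

  if (xSamples.length === SAMPLE_COUNT) {
    // Dx = first x value minus the 10th one, as described above.
    const dx = xSamples[0] - xSamples[SAMPLE_COUNT - 1];
    if (dx < 0) {
      nextItem(); // negative Dx: right swipe (hypothetical callback)
    } else {
      previousItem(); // positive Dx: left swipe (hypothetical callback)
    }
    xSamples.length = 0; // clear the buffer for the next swipe
  }
}
```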
Happy hacking!