Lucas Fernando Builds a Computer Vision Model for One Task: Spotting Naruto Hand Signs
Built using a custom-trained YOLO model, this computer vision system could prove a ninja's best friend.
Maker Lucas Fernando has put machine learning to work on a very serious topic: detecting and translating Naruto hand signs, for the ninja in a hurry.
"If you've ever watched Naruto, you probably remember the hand seals," Fernando explains, referring to the popular and long-running anime series based on the best-selling manga of the same name. "It's a concept of controlling the five elements using hands gestures. During my AI studies, I thought: what if I could create a computer vision system that recognizes these hand seals in real-time? Well, that’s exactly what I did."
Fernando's seal-recognition system is based on the original 12 signs from the series, each a distinct two-handed gesture named after an animal of the Chinese zodiac: dragon, tiger, dog, rat, ram, horse, monkey, bird, ox, serpent, hare, and boar. First, Fernando gathered his training data, a somewhat laborious process in which a Python script captured the maker performing each of the 12 gestures 100 times, for a dataset of 1,200 images.
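For readers who want to picture how such a dataset might be organized, here is a minimal sketch that lists the files a capture session would produce. The folder layout, file names, and the `capture_plan` helper are hypothetical and are not taken from Fernando's script; the actual frame grabbing from the webcam is left out.

```python
from pathlib import Path

# The 12 basic seals named in the article.
SIGNS = ["dragon", "tiger", "dog", "rat", "ram", "horse",
         "monkey", "bird", "ox", "serpent", "hare", "boar"]


def capture_plan(root="dataset", per_sign=100):
    """Return the file path for every image a capture session would save.

    The one-folder-per-sign layout and the file naming are assumptions
    made for illustration only.
    """
    return [Path(root) / sign / f"{sign}_{i:03d}.jpg"
            for sign in SIGNS
            for i in range(per_sign)]


plan = capture_plan()
print(len(plan))  # 12 signs x 100 captures each = 1200
```

A real capture loop would grab one webcam frame per planned path and write it to disk, pausing between signs so the gesture can be changed.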
"Next, I needed to label the data. When working with object recognition, you must tell the model where the object is located in each image. For this task, I used a special software called CVAT," Fernando explains. "I uploaded the images to CVAT, labeled them according to the hand seal it represented, and then exported the annotations into the YOLO [You Only Look Once] format."
These labeled data, along with more images created by rotating or otherwise modifying the original source images, were used to train a model capable of detecting the presence of a seal in a video stream or still image. Throw a seal up in front of the webcam and the model will locate it and identify which seal it is.
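When an image is rotated to create extra training data, its bounding-box label has to be transformed in the same way. The article does not say which rotations or other modifications were used, so the sketch below assumes the simplest case, a 90-degree clockwise turn, which keeps the box axis-aligned. The `rotate_label_90_cw` helper is hypothetical.

```python
def rotate_label_90_cw(label):
    """Rotate a YOLO label to match an image turned 90 degrees clockwise.

    label is (class_id, x_center, y_center, width, height), normalized to 0..1.
    After the turn, the old top edge becomes the right edge, so the new
    x_center is 1 - y_center, the new y_center is the old x_center, and
    width and height swap.
    """
    class_id, x_center, y_center, width, height = label
    return (class_id, 1.0 - y_center, x_center, height, width)


original = (1, 0.25, 0.5, 0.125, 0.375)
rotated = rotate_label_90_cw(original)
print(rotated)  # (1, 0.5, 0.25, 0.375, 0.125)
```

Rotations by arbitrary angles need more work, since the enclosing axis-aligned box must be recomputed from the rotated corners. Augmentation libraries typically handle that bookkeeping.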
"It was a cool project that taught me a lot about computer vision and how AI [Artificial Intelligence] models are built," Fernando concludes, though it does not appear to have provided mystical powers. "It took about a week to be finished, but I spent most of the time studying and writing code that didn't work. With the guidance I provided [in the tutorial], you should be able to create your own model in a few hours."
The project is documented in full on GitHub, with code, training data, and the model published under the reciprocal GNU General Public License version 3.