Let Your Fingers Do the Talking
SpellRing is a thumb-worn device that uses sonar and AI to translate American Sign Language fingerspelling into text in real time.
It has been estimated that roughly half a million people in the U.S. use American Sign Language (ASL) to communicate. ASL is a highly expressive language that gives a voice to deaf and hard-of-hearing people who would otherwise be isolated by their inability to communicate verbally. Although many people understand the language, they are vastly outnumbered by those who do not. As such, signers still face many communication barriers in their everyday lives.
Of course, things would be easier if we all knew sign language, but learning to sign takes a great deal of effort and practice, which makes that goal unrealistic. A more practical option would be a device that translates signs into text or speech so that anyone can understand what a signer is trying to convey. Many such systems do exist, but they are not usually very practical for real-world use. They often rely on cameras, and video data requires a lot of computing resources to process, which makes many solutions bulky and cumbersome for portable use. Camera-based solutions also raise privacy concerns for their users.
A team led by researchers at Cornell University has come up with a new solution called the SpellRing. No, this is not a prop from Harry Potter, but rather a ring that leverages a privacy-preserving sensor to decode the hand motions associated with ASL in a computationally efficient manner.
The SpellRing is a small, 3D-printed device worn on the thumb that uses micro-sonar technology to track hand and finger movements. Unlike camera-based systems, which require significant processing power and constant visual monitoring, SpellRing utilizes a combination of sound waves and motion sensors to interpret ASL fingerspelling in real time. The device emits inaudible sound waves from a built-in speaker, which bounce off the fingers and are then detected by an embedded microphone. By analyzing the way these waves change as the fingers move, and integrating gyroscopic motion data, SpellRing can reconstruct hand shapes and gestures without the need for visual input.
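To make the sonar idea concrete, here is a minimal sketch of active acoustic sensing in plain Python. It is not the SpellRing team's code: the 48 kHz sample rate, the 18 to 21 kHz chirp, and the function names are all assumptions chosen for illustration. The sketch emits a known chirp, simulates a single attenuated echo, and cross-correlates the "microphone" signal with the chirp to find how late the reflection arrived.

```python
import math

SAMPLE_RATE = 48_000    # Hz; assumed, typical for small audio hardware
SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 degrees C


def make_chirp(n_samples, f_start=18_000.0, f_end=21_000.0):
    """Linear frequency sweep in a near-inaudible band (assumed values)."""
    duration = n_samples / SAMPLE_RATE
    sweep_rate = (f_end - f_start) / duration
    times = (i / SAMPLE_RATE for i in range(n_samples))
    return [math.sin(2 * math.pi * (f_start * t + 0.5 * sweep_rate * t * t))
            for t in times]


def echo_profile(received, emitted):
    """Cross-correlate the received signal with the emitted chirp.

    Entry k measures how strongly a copy of the chirp appears
    k samples after emission, i.e. the reflection at that delay.
    """
    m = len(emitted)
    return [sum(received[k + i] * emitted[i] for i in range(m))
            for k in range(len(received) - m + 1)]


def strongest_echo(profile):
    """Return (lag in samples, one-way distance in metres) of the peak."""
    lag = max(range(len(profile)), key=lambda k: abs(profile[k]))
    round_trip_seconds = lag / SAMPLE_RATE
    return lag, SPEED_OF_SOUND * round_trip_seconds / 2


# Simulated recording: one echo at half amplitude, 14 samples late.
chirp = make_chirp(64)
true_delay = 14
received = [0.0] * 192
for i, sample in enumerate(chirp):
    received[true_delay + i] += 0.5 * sample

lag, distance = strongest_echo(echo_profile(received, chirp))
print(f"echo lag: {lag} samples, reflector at about {distance * 100:.1f} cm")
```

In a real device, the whole echo profile would change from frame to frame as the fingers move, and it is that sequence of profiles, rather than a single distance, that a learned model would consume.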
The key to SpellRing’s functionality is a deep learning algorithm designed to process the sonar and motion data, translating continuous ASL fingerspelling into text. This approach allows the system to recognize words as they are spelled out, without requiring users to pause between letters. That is an important consideration, because natural fingerspelling in ASL is fluid and dynamic. Previous technologies often forced signers to adapt their signing style to accommodate recognition limitations, but SpellRing aims to allow for natural communication.
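One common way to turn a stream of per-frame predictions into a word without pauses between letters is connectionist temporal classification (CTC) decoding. Whether SpellRing uses CTC is not stated here, so treat the following as an illustrative sketch only. The model is assumed to output one score per symbol for each time frame, where the symbols are the 26 letters plus a "blank"; the decoder picks the best symbol per frame, merges consecutive repeats, and drops the blanks. The names `greedy_ctc_decode` and `frames_from_labels` are hypothetical.

```python
BLANK = "_"
ALPHABET = BLANK + "abcdefghijklmnopqrstuvwxyz"  # index 0 is the blank


def greedy_ctc_decode(frame_scores):
    """Collapse per-frame scores into text using greedy CTC decoding.

    frame_scores: list of frames, each a list of len(ALPHABET) scores.
    """
    letters = []
    previous = None
    for scores in frame_scores:
        best = max(range(len(scores)), key=scores.__getitem__)
        # Emit a letter only when the symbol changes and is not a blank.
        if best != previous and ALPHABET[best] != BLANK:
            letters.append(ALPHABET[best])
        previous = best
    return "".join(letters)


def frames_from_labels(labels):
    """Demo helper: fake model output that favours the given symbols."""
    frames = []
    for symbol in labels:
        scores = [0.1] * len(ALPHABET)
        scores[ALPHABET.index(symbol)] = 0.9
        frames.append(scores)
    return frames


# A letter held for several frames collapses to one letter; the blank
# between the two runs of "l" keeps the double letter intact.
decoded = greedy_ctc_decode(frames_from_labels("hh_e_ll_l_oo"))
print(decoded)
```

The blank symbol is what lets a scheme like this cope with fluid spelling: the signer never has to pause, because the model itself can mark the boundary between two identical letters.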
The research team conducted extensive testing to evaluate SpellRing’s accuracy and usability. A group of 20 ASL signers, both fluent users and learners, participated in the evaluation process, contributing over 20,000 words to the study. The results were promising, with accuracy rates ranging from 82% to 92%, depending on the complexity of the words being spelled. These performance levels are comparable to those of bulkier, less practical systems, demonstrating that SpellRing offers a viable alternative for fingerspelling recognition.
While fingerspelling is a crucial part of ASL, it is only a fraction of the full language, which also includes gestures, facial expressions, and body movements. Future iterations of the technology may incorporate additional sensors, possibly embedded in eyeglasses or other wearable devices, to capture a broader range of ASL communication elements. For the deaf and hard-of-hearing community, these advances represent a move toward greater accessibility and independence.