Smart Glasses See Without Vision
Cornell University's sonar-based tech aims to integrate eye tracking and facial expression recognition into glasses, eliminating the bulk that cameras add.
In the wearables market, smartwatches and other wristbands have taken the early lead. With so many people already accustomed to wearing traditional wristwatches, the wrist is the most natural place to fit some electronics onto the body. These devices have also proven to be very useful, easily handling tasks like health and fitness tracking while providing quick access to text messages, emails, and other notifications. And because the hands are so heavily involved in our daily activities, wrist-worn devices have the potential to integrate even more deeply into our lives in the future.
However, to make the most of today's powerful sensing and processing equipment, wearable devices need to move beyond the wrist. Considering the importance of vision in virtually everything we do, adding some intelligence to eyeglasses is the next obvious target. But that is easier said than done. Packing processors, sensors, and batteries into eyeglass frames can make them so bulky and awkward that few people would choose to wear them in their daily lives. This is especially true when the glasses rely on cameras, which drive up the demands on computational resources and battery capacity, further increasing their bulk.
But by leveraging some lower-tech sensing options, Cornell University researchers have demonstrated that the necessary components may fit unobtrusively into something much closer to standard frames. They took on the challenge of building two different sensing capabilities into smart glasses: eye tracking and facial expression recognition. These functions are traditionally supported by cameras, but the researchers showed that a sonar-like technology could also do the job, and at a fraction of the energy budget, size, and computational requirements.
Eye tracking has numerous practical applications, ranging from assistive technologies that help people with disabilities use computers, to gaming, virtual reality, and medical diagnosis. Facial expression recognition is similarly useful in applications like emotion detection, communications systems, and virtual reality.
To achieve their goals, the team developed two different, yet related, technologies: GazeTrak is their eye tracking solution, while EyeEcho handles facial expression tracking. In both cases, speakers direct inaudible sound waves toward the face. These signals reflect back toward their source, modulated in characteristic ways by their interaction with the eyes or skin, and are captured by microphones positioned near the speakers. A Teensy 4.1 board provides onboard processing for GazeTrak, while a Nordic Semiconductor nRF52840 powers EyeEcho. The other primary difference between the systems is the placement of the speakers and microphones: GazeTrak's are pointed at the eyeballs, while EyeEcho's point down from the frames toward the rest of the face.
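To make the sonar-like approach more concrete, the sketch below shows one plausible way a system of this kind might probe the face with an inaudible chirp and reduce the microphone recording to an "echo profile" whose peaks correspond to reflections at different distances. The sweep band, sample rate, and correlation scheme here are illustrative assumptions, not details taken from the Cornell work.

```python
import numpy as np

# Hypothetical parameters -- the actual GazeTrak/EyeEcho signal design is not
# described in this article, so the values below are illustrative assumptions.
FS = 96_000                        # sample rate (Hz); ultrasonic work needs a high rate
F_START, F_END = 18_000, 24_000    # inaudible sweep band (Hz)
CHIRP_LEN = 0.01                   # 10 ms probe chirp

def make_chirp(fs=FS, f0=F_START, f1=F_END, duration=CHIRP_LEN):
    """Linear frequency sweep used as the transmitted probe signal."""
    t = np.arange(int(fs * duration)) / fs
    # Instantaneous phase of a linear chirp: 2*pi*(f0*t + (f1-f0)/(2*T)*t^2)
    phase = 2 * np.pi * (f0 * t + (f1 - f0) / (2 * duration) * t ** 2)
    return np.sin(phase)

def echo_profile(recorded, chirp):
    """Cross-correlate the microphone signal with the transmitted chirp.

    Peaks in the result correspond to reflections arriving after different
    round-trip delays; tracking how this profile changes over time is what
    would let a model infer movement of the eyes or face.
    """
    return np.abs(np.correlate(recorded, chirp, mode="valid"))

# Usage: simulate a single reflection delayed by 0.5 ms (~17 cm round trip)
chirp = make_chirp()
delay = int(0.0005 * FS)
recorded = np.concatenate([np.zeros(delay), 0.3 * chirp, np.zeros(200)])
profile = echo_profile(recorded, chirp)
print("strongest reflection at sample", int(np.argmax(profile)))  # ~48
```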
The key to these innovations was the development of machine learning algorithms that can make sense of the reflected audio signals. Each system has its own deep learning pipeline that infers either the direction of an individual's gaze or the facial expression they are making. Experimentation showed that EyeEcho excelled at expression recognition: it can continuously track changes in expression, where most existing systems can only recognize a few specific expressions. GazeTrak also showed a great deal of promise, although it is not yet as accurate as current state-of-the-art eye trackers. The researchers believe that future optimizations will allow it to catch up.
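As a rough illustration of what such a pipeline might look like, the sketch below feeds a short history of echo profiles through a small convolutional network to regress a continuous output, here a 2-D gaze point. The architecture, input shape, and output dimensions are assumptions made for illustration; the actual GazeTrak and EyeEcho models are not described in this article.

```python
import torch
import torch.nn as nn

class EchoNet(nn.Module):
    """Toy model: a stack of echo profiles in, a continuous prediction out."""

    def __init__(self, n_outputs: int):
        super().__init__()
        # Input shape: (batch, channels=1, time_frames, profile_bins)
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),
        )
        self.head = nn.Linear(32 * 4 * 4, n_outputs)

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

# Gaze might be regressed as a 2-D point on a virtual plane; an expression
# model could instead output a set of blendshape weights (e.g., 52 values).
gaze_model = EchoNet(n_outputs=2)
batch = torch.randn(8, 1, 32, 64)   # 8 samples, 32 frames x 64 range bins
print(gaze_model(batch).shape)      # torch.Size([8, 2])
```

A compact model like this matters for the power argument: inference over a handful of small correlation profiles is far cheaper than processing camera frames, which is what lets such a system run on microcontroller-class hardware.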
The team will present their work at the ACM CHI conference on Human Factors in Computing Systems next month. Perhaps, after further refinement, the technology will find its way into a commercial product that we can all make use of.