(Our 3 Minute Submission Video is all the way at the bottom. Here's a quick link)
Problem Statement

In today's world, designers often create environments that fundamentally rely on their users being able to see. Accessibility for those without this crucial ability is an afterthought, if it is considered at all. For the 285 million visually impaired people around the world, tasks like finding one's keys or walking along busy sidewalks become arduous or even impossible. Vision difficulties are often associated with aging, and with a growing and aging population, the World Health Organization predicts that the number of visually impaired people will triple by the year 2050.
Despite the pace of medical progress in recent years, a permanent cure for blindness remains elusive, and even the most promising current treatments are highly experimental and extremely expensive. Blind people today still rely on sighted guides, seeing-eye dogs, and canes a century after these solutions were introduced, with the obvious functional limitations they entail. Even so-called "cutting-edge" vision technologies can only describe a blind person's environment. Crucially, these devices fail entirely when it comes to actually interacting with that environment. Without a functional, cost-effective way to actively engage with the world around them, people with visual disabilities are relegated to spectator status in our society.
To address this pressing need, our team has designed blindsight, a wearable device that greatly increases the autonomy of the blind.
Before embarking on this ambitious project, our team devised a careful plan for solution development. In order to ensure that blindsight would be a significant improvement in the field of vision technologies, we conducted an extensive review of existing solutions, a selection of which is presented here:
- The KNFB Reader is a text reader application for the blind. The user utilizes a mobile device to take a picture of a document or other text source, and the app subsequently converts that text into speech or Braille. Limitations: Since this technology is only available in app form, a blind user must continuously hold their device out in front of them, an awkward pose to maintain for long periods of time.
- TapTapSee is an image recognition app that can tell blind users what objects are in a picture. By moving a mobile phone around, the blind user can scan through individual items in their vicinity. Limitations: The app is constrained by the range and FOV of the phone's camera, which means it functions well only in small, neatly-ordered environments that are unrepresentative of the real world.
- LookTel Money Reader is a service that calculates the amount of currency present in an image of paper money. Limitations: This app functions well in its specific use case, but its inability to handle anything other than currency means it is a contributor to the "app overload" blind users often experience.
From our analysis of these and other technologies, we were able to generalize the shortcomings of current solutions:
1. Slow and expensive: Many of these solutions rely heavily on offloading the actual processing to a server. Without an elegant way to handle the latency and provide rapid feedback to the user, these technologies will remain imperfect.
2. Uncomfortable and awkward: Since the majority of current solutions are apps, they require a user to continuously direct their mobile phone towards what they wish to interact with. This position is straining on the hands and also particularly taxing for the elderly, a generally weaker population that is much more susceptible to vision loss.
3. Passive and incomplete: Most significantly, no current solution provides a simple and effective way to directly engage the blind user with their environment. Using existing technologies essentially tethers a user to their phone, significantly impacting their overall autonomy.
Project Proposal

By developing blindsight, our team aspires to substantially improve the quality of life of the visually impaired. Our armband product will integrate a camera to view the user's environment and machine learning to interpret this data as a collection of objects and features of interest. Through companion iOS and Android apps used with headphones, users will issue verbal commands to the device. blindsight will also utilize a novel directional haptic feedback system, leveraging the increased sensitivity to touch that often accompanies a loss of vision. By applying vibrational stimuli to specific regions of a user's arm, our device will precisely guide the hand towards a target item, enabling interactions with objects that are otherwise invisible.
With our project proposal defined, we began construction of our MK1 blindsight device. For this first prototype, the device has four main components:
- Control Module: This module contains the Raspberry Pi Zero W, which serves as the computer controlling the entire device, along with the attached Raspberry Pi Camera and a capacitive touch sensor, all enclosed in a 3D printed housing.
- Battery Modules (x2): These modules each contain 2xAA batteries to power the entire device, and are encased in 3D printed housings.
- Stretchable Armband: The armband itself is the fundamental structure of the device. For MK1, our team utilized the stretching fabric of a black sock. The armband is embedded with 8 vibration motors, which are the essential components of our unique directional haptic feedback system. The other modules are all sewn directly onto the armband to complete the device.
MK1 Functionality Summary
blindsight is an assistive vision technology that helps the blind interact with their environment more effectively. Our prototype includes a Raspberry Pi ("RPi") Zero Wireless, an RPi camera module, haptic feedback motors, a Node.js Express server, and an Android application. We created an armband by attaching the RPi, motors, and AA batteries into 3D-printed cases. These modules were then sewn onto a wrist sleeve.
When turned on, the Raspberry Pi waits for a wake command from the user. This can come either from saying the wake word "Hey George" into the microphone of an Android phone running our application, or from placing a finger on the easy-to-find divot in the casing of the device. The user then speaks their command into the phone's microphone, or long-presses the capacitive touch sensor to trigger a default scan of the environment. The command is sent to the server to be relayed to the Raspberry Pi.
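As an illustration of the touch-based wake path, the sketch below shows one way the long-press detection could work. It assumes a capacitive touch breakout with a simple digital output wired to a GPIO pin; the pin number and hold threshold are illustrative, not necessarily the values used in the actual device.

```python
# Sketch of the touch-wake logic, assuming a touch breakout with a digital output
# (e.g. a TTP223-style board). Pin number and threshold are illustrative.
import time
import RPi.GPIO as GPIO

TOUCH_PIN = 17               # assumed BCM pin for the touch sensor output
LONG_PRESS_SECONDS = 1.5     # hold this long to trigger the default environment scan

GPIO.setmode(GPIO.BCM)
GPIO.setup(TOUCH_PIN, GPIO.IN)

def wait_for_touch_command():
    """Block until the user touches the divot; return 'scan' for a long press, 'wake' otherwise."""
    GPIO.wait_for_edge(TOUCH_PIN, GPIO.RISING)
    start = time.time()
    while GPIO.input(TOUCH_PIN):
        if time.time() - start >= LONG_PRESS_SECONDS:
            return "scan"
        time.sleep(0.05)
    return "wake"
```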
Once the Raspberry Pi receives the command from the server, it begins to execute the required action. First, the Raspberry Pi captures an image from the camera and converts it into a base64 string to be sent back to the server. The server loads a TensorFlow instance to describe the scene or pinpoint a specific object.
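The capture-and-upload step can be sketched as follows. The server URL, endpoint, and JSON layout are placeholders rather than the project's actual interface, and the sketch assumes the picamera library is used to drive the RPi camera.

```python
# Minimal sketch: capture one frame, base64-encode it, and post it to the server.
import base64
import io

import requests
from picamera import PiCamera

SERVER_URL = "http://example-server:3000/image"   # placeholder address and endpoint

def capture_and_send(command):
    camera = PiCamera()
    try:
        stream = io.BytesIO()
        camera.capture(stream, format="jpeg")            # grab one frame from the Pi camera
        encoded = base64.b64encode(stream.getvalue()).decode("ascii")
    finally:
        camera.close()
    # The server runs the TensorFlow models and returns either a description or a bounding box.
    response = requests.post(SERVER_URL, json={"command": command, "image": encoded})
    return response.json()
```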
If the user simply wants a general description of the surrounding environment, the server sends its result back to the Android app, which speaks the answer aloud to the user. If the user instead wants to locate a specific object, the process is more complex. The server returns a bounding box of the target object to the Raspberry Pi. The RPi determines the offset between the object's location and the user's hand (whose position is known because the camera is fixed to the armband). Using OpenCV tracking and the vibration motors, the Raspberry Pi vibrates the band in the direction the user's hand must travel to reach the target object, and pulses once the hand has successfully reached it.
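A minimal sketch of how the bounding-box offset could be mapped onto the ring of eight motors is shown below. The GPIO pin assignments and the assumption that the hand sits at the centre of the frame are illustrative.

```python
# Illustrative mapping from a bounding-box offset to one of eight vibration motors.
import math
import RPi.GPIO as GPIO

MOTOR_PINS = [5, 6, 13, 19, 26, 16, 20, 21]   # assumed BCM pins, one per motor around the band

GPIO.setmode(GPIO.BCM)
for pin in MOTOR_PINS:
    GPIO.setup(pin, GPIO.OUT, initial=GPIO.LOW)

def guide_hand(box, frame_w, frame_h):
    """Vibrate the motor pointing from the hand (frame centre) towards the target box."""
    target_x = box[0] + box[2] / 2.0
    target_y = box[1] + box[3] / 2.0
    dx, dy = target_x - frame_w / 2.0, target_y - frame_h / 2.0
    if math.hypot(dx, dy) < 30:                     # close enough: signal success
        return "reached"
    angle = math.atan2(dy, dx) % (2 * math.pi)
    index = int(round(angle / (math.pi / 4))) % 8   # quantise direction to one of 8 motors
    for i, pin in enumerate(MOTOR_PINS):
        GPIO.output(pin, GPIO.HIGH if i == index else GPIO.LOW)
    return "guiding"
```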
With this process, blindsight acts as a "new pair of eyes" for blind users, enabling them to directly engage with their world. Our MK1 prototype is fully functional, but it has some limitations and areas for improvement. The reliance on a server stems primarily from the difficulty we encountered in running the TensorFlow instances on the mobile device itself, as well as from the need to transfer images between the RPi and the processing device. Even though the server slows down the overall process, the RPi remains capable of real-time object tracking through OpenCV, which shows that an eventual server-free version is well within the realm of possibility.
The following are assorted videos and pictures documenting our MK1 build:
MK2 represented the next step forward in our design journey. Recognizing the success of the first prototype, our team decided to focus on reducing the functional barriers that would inhibit successful use of blindsight. Our targeted areas for improvement were:
- Long-term powering and charging solution
- Camera improvements
- Additional user features implementation
- Physical housing improvements
- Mobile app improvements
Long-term Powering Solution
In MK1, power for the entire system was provided solely by 4 AA batteries enclosed in dedicated cases. We had chosen these batteries for their convenience during early prototyping, but we understood that this was one of the most inefficient areas of the prototype. AA batteries have a finite lifespan and a low charge-to-weight ratio. Lithium polymer ("LiPo") batteries are rechargeable and much more compact for the same charge. Because the band needed to wrap around the user's arm, we designed the device to use two smaller batteries wired in parallel instead of a single, larger LiPo. We purchased two of the 2000mAh LiPos listed earlier in our documentation and used them to power the device.
A major advantage of LiPo batteries is that they can be safely recharged without being physically replaced. Typically, LiPo batteries are charged through a specialized JST connector attached to a LiPo-specific charger. Since we recognized that people with visual impairments often struggle with tangles of cables, we instead opted for a wireless charging solution. Through the integration of a Qi charging receiver and a PowerBoost 500C, blindsight can be charged simply by placing it on top of a generic Qi wireless charging pad. Qi is the industry-standard wireless charging technology for smartphones and similar devices, so it was an obvious choice for our product.
Camera Improvements
Another problem we encountered with our MK1 prototype was that the camera's Field of View (FOV) was very narrow. During testing, we realized that the official Raspberry Pi camera we had originally purchased could only see objects that were already within reach of the blind user's hand, which made the object tracking functionality effectively useless. With our upgraded SainSmart camera, however, the Raspberry Pi is able to track objects across a 160-degree FOV, more than double the initial view. This allows the user to point blindsight towards a cluttered table and scan the entire surface at once, making the device much easier to use.
Additional User Features Implementation
During our initial research of existing solutions, our team had recognized the issue of "app clutter" - the blind are often reliant on a multitude of apps that are each tailored to a singular and limited use-case, drastically detracting from the user experience. To address this better in our MK2, we focused on incorporating other standard features that a blind person would find useful.
Since not all text encountered in the real world is accompanied by a Braille equivalent, visually impaired individuals are often unable to read signs or posters bearing important information. By incorporating Optical Character Recognition ("OCR"), blindsight can translate text scanned by the camera into spoken words through the user's headphones.
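A minimal sketch of this OCR step, using the Google Cloud Vision client library mentioned in the Machine Learning section below, might look like the following; credential setup is omitted and the function name is ours.

```python
# Sketch of the OCR step with the Google Cloud Vision client library (recent versions);
# the armband image is assumed to arrive as JPEG bytes.
from google.cloud import vision

def read_text_aloud(jpeg_bytes):
    client = vision.ImageAnnotatorClient()
    response = client.text_detection(image=vision.Image(content=jpeg_bytes))
    annotations = response.text_annotations
    if not annotations:
        return ""
    # The first annotation contains the full detected text block; the companion app's
    # text-to-speech engine then speaks it through the user's headphones.
    return annotations[0].description
```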
An additional challenge for blind people is recognizing the person in front of them. Typically, blind people rely on gradually memorizing the details of a person's voice in order to identify them, but this process takes time and is imperfect. By using keywords to train a facial recognition model on a new acquaintance's face and later identify that person in front of the user, blindsight addresses this specific problem.
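As a rough illustration, the enrol-then-identify flow could look like the sketch below, assuming the open-source face_recognition package (the sections below only specify "a facial recognition library"); the keyword phrases and tolerance value are illustrative.

```python
# Hypothetical enrol/identify flow with the face_recognition package.
import face_recognition

known_encodings = []    # one 128-d encoding per enrolled acquaintance
known_names = []

def enroll(name, image_path):
    """Triggered by a keyword such as 'remember this person'."""
    image = face_recognition.load_image_file(image_path)
    encodings = face_recognition.face_encodings(image)
    if encodings:
        known_encodings.append(encodings[0])
        known_names.append(name)

def identify(image_path):
    """Triggered by a keyword such as 'who is in front of me'."""
    image = face_recognition.load_image_file(image_path)
    for encoding in face_recognition.face_encodings(image):
        matches = face_recognition.compare_faces(known_encodings, encoding, tolerance=0.6)
        if True in matches:
            return known_names[matches.index(True)]
    return "unknown"
```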
Physical Housing Improvements
After assembling and wearing the MK1, we immediately recognized that the design was not particularly comfortable for long-term use. Its uneven weight distribution made the band prone to slippage, and the central module jutted out considerably from the otherwise sleek band. As we moved forward with blindsight, we also understood that we needed a better method of constructing the stretchable armband itself.
In part because of the numerous electronics hardware changes mentioned earlier, the original style of externally-visible 3D printed modules was replaced with a cleaner system of fully embedding electronics into the armband. Better-designed CAD models of these parts are included as attachments. Unfortunately, due to difficulties in accessing a 3D printer, our team was unable to print the parts we required, resulting in a less-than-elegant implementation.
The band itself was now produced with fabric purchased from a local store. Since we could fully manipulate the fabric, we were able to better position the motors and other components throughout the armband, leading to a significantly improved weight distribution. Additionally, Velcro dots added into the design allowed for easy maintenance as we ironed out bugs in our code.
Mobile App Improvements
Our primary improvement was to develop a blindsight companion app for iOS as well, since the majority of visually impaired people actually use iPhones and other Apple devices. More superficial UI changes were also made to the original Android app. Functionality-related improvements included renaming our virtual assistant from George to Christy and beginning the process of developing this assistant into a complete AI persona along the lines of Siri or Google Assistant.
MK2 Functionality Summary:
In almost all respects, the functionality of the MK1 prototype has been significantly improved in the MK2 iteration. The additional videos and images below demonstrate the features we have worked to add into our device:
Next Steps

While MK2 represents a definite improvement over MK1, there remain areas that must be worked on to bring blindsight closer to implementation in the real world:
- 3D print the designed parts: Since we had been unable to print the parts we designed in time for the MK2 prototype, this remains an important area where we must improve the overall aesthetic appeal of the device.
- Localize the processing: In order to avoid relying on a server in the long-term, we need to move the processing onto the user's phone instead, saving time and money.
- Create a Printed Circuit Board: By designing a custom PCB with the minimal required elements for our device, we can reduce the current draw from the RPi and also achieve a better form factor for our use case.
- Improve the Christy virtual assistant: There are a number of additional specific scenarios where our technology can provide a significant benefit to the blind, and so we will continue adding capabilities to our smart assistant.
- Implement night vision: By using a camera module without an InfraRed (IR) filter and shining IR LEDs on the camera's target, we can grant the blind night vision that, intriguingly, is superior to that of people without a visual disability.
- Incorporate depth perception: If our device is able to determine exactly how far away the desired target object is through stereoscopy, we will be able to more carefully direct the blind user's hand towards the target, and can also implement features for collision avoidance.
- Add gesture control: Since voice input is not always possible in extremely loud or quiet surroundings, gesture control through electromyography (EMG) sensors will add an alternative input method for users.
The subsequent sections provide deeper insights into our design and implementation of specific subsystems of blindsight.
Machine Learning
Several deep learning models were used as part of blindsight. Using AWS and Nvidia GPUs, we ran multiple convolutional neural networks with TensorFlow. We ran the Show, Attend and Tell model, which can accurately describe a complex scene to the user. We also ran the TensorFlow Object Detection API with a frozen inference model and a custom-trained model, allowing over a thousand object classes to be recognized. To reduce the time required to load a model onto the GPU, we wrote scripts that preload the model and run as a MongoDB listener. Whenever a new image needs to be recognized, the script runs it against the preloaded model instead of spending roughly 20 seconds loading the model into VRAM.
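A simplified version of this listener pattern is sketched below. It assumes a TF1-style frozen graph exported by the Object Detection API and a MongoDB collection of pending requests; the database, collection, field names, and file path are placeholders rather than our production schema.

```python
# Sketch: keep the detection model resident in GPU memory and serve requests that
# arrive through MongoDB, paying the ~20 s model-loading cost only once at startup.
import base64
import io
import time

import numpy as np
import tensorflow as tf
from PIL import Image
from pymongo import MongoClient

graph_def = tf.compat.v1.GraphDef()
with tf.io.gfile.GFile("frozen_inference_graph.pb", "rb") as f:   # placeholder path
    graph_def.ParseFromString(f.read())
graph = tf.Graph()
with graph.as_default():
    tf.compat.v1.import_graph_def(graph_def, name="")
session = tf.compat.v1.Session(graph=graph)

jobs = MongoClient()["blindsight"]["detection_requests"]          # placeholder names

while True:
    job = jobs.find_one({"result": None})      # poll for an unprocessed request
    if job is None:
        time.sleep(0.1)
        continue
    image = Image.open(io.BytesIO(base64.b64decode(job["image_b64"])))
    tensor = np.expand_dims(np.array(image), axis=0)
    boxes, scores, classes = session.run(
        ["detection_boxes:0", "detection_scores:0", "detection_classes:0"],
        feed_dict={"image_tensor:0": tensor},   # standard Object Detection API tensor names
    )
    jobs.update_one(
        {"_id": job["_id"]},
        {"$set": {"result": {"boxes": boxes[0].tolist(),
                             "scores": scores[0].tolist(),
                             "classes": classes[0].tolist()}}},
    )
```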
Using OpenCV, the Raspberry Pi can take a picture and request a bounding box from the server. The Raspberry Pi then uses a Median Flow tracker to follow that bounding box locally, instead of repeatedly calling the server for new bounding boxes. This approach avoids wasting server processing power and time, while also being more fault tolerant. In the future, we plan to run the TensorFlow models on the user's phone instead of relying on expensive cloud computing services. We also plan to create a distributed network of phones that share compute power with users whose devices have less GPU power. We used the Google Cloud Vision API for OCR, along with a facial recognition library to detect faces locally on the Raspberry Pi. However, due to the Raspberry Pi's limited compute power, face detection runs slowly. We plan to evaluate open source OCR alternatives to Google Cloud Vision and to consider offloading facial recognition onto the user's phone instead.
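The local tracking loop can be sketched roughly as follows, assuming OpenCV with the contrib modules (on OpenCV 4.5+ the constructor moves under cv2.legacy); the initial bounding box and camera index are illustrative.

```python
# Minimal sketch of the on-device Median Flow tracking loop.
import cv2

camera = cv2.VideoCapture(0)          # Pi camera exposed as a V4L2 device (assumption)
ok, frame = camera.read()

initial_box = (120, 80, 60, 60)       # (x, y, w, h) from the server's detection, illustrative
tracker = cv2.TrackerMedianFlow_create()
tracker.init(frame, initial_box)

while True:
    ok, frame = camera.read()
    if not ok:
        break
    ok, box = tracker.update(frame)   # follow the object locally, no server round-trip
    if not ok:
        # Tracking lost: fall back to requesting a fresh detection from the server.
        break
    x, y, w, h = [int(v) for v in box]
    # (x + w/2, y + h/2) is the target's centre, used to drive the haptic guidance.
```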
CAD and 3D Printing
The physical plastic parts we designed for blindsight served important structural purposes. The first component was a two-part central module to enclose the Raspberry Pi and camera module. This case was difficult to design: it had to keep the individual electrical pieces from sliding around, while maintaining tolerances loose enough to be 3D printed reliably. The flexing of the jumper wires connecting to the Raspberry Pi also had to be considered, to ensure that repeated motion of the band would not wear down the wires and potentially cause a catastrophic short.
The second part was an LED case, designed to hold eight LEDs that mirror the eight motors within the armband. Since outside observers cannot feel the vibrations from the device, the LED indicators provide visible verification to spectators that blindsight is, in fact, functioning correctly.
The final part was a backing for the Qi receiver. The Qi receiver is a specially shaped coil of wire that charges wirelessly by induction. Its thin structure makes it easy to incorporate into the armband, but also leaves it prone to damage. A 3D printed backing piece was designed to reinforce the coil, while also allowing the PowerBoost 500C board to be attached directly to the receiver for easier cable management.
Electronics
While designing the electrical circuit to power and run the blindsight wristband, we took everyday usability and reliability into account. We expect blindsight to be in sleep mode for approximately 8 hours per day, only entering high-power mode on wake, for perhaps 90 minutes of total usage across the day. Since the device will not be needed while the user sleeps, it can recharge overnight.
On MK1, 4 AA batteries were used to power blindsight, but they could not be recharged and had a bulky form factor. For MK2, we needed a battery that was lightweight, relatively thin, and able to be recharged overnight. We chose two 3.7V 1S 2000mAh lithium polymer batteries connected in parallel, for an overall capacity of 4000mAh. We chose lithium polymer chemistry for its higher energy density and minimal capacity loss over time compared to lead or nickel batteries. Additionally, we chose a 1S 3.7V battery over a 2S 7.4V battery because it is lighter and thinner and does not require a voltage regulator.
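As a rough sanity check on this capacity, the back-of-envelope calculation below uses the usage model described above; the current draws are assumed typical figures for a Pi Zero W system, not measured values from our device.

```python
# Back-of-envelope runtime estimate for the 4000 mAh pack (assumed currents).
CAPACITY_MAH = 4000          # two 2000 mAh 1S LiPos in parallel
SLEEP_CURRENT_MA = 120       # assumed idle draw of the Pi Zero W through the boost converter
ACTIVE_CURRENT_MA = 450      # assumed draw with camera, Wi-Fi, and motors active

sleep_hours, active_hours = 8.0, 1.5   # daily usage model from the text
daily_mah = sleep_hours * SLEEP_CURRENT_MA + active_hours * ACTIVE_CURRENT_MA
print(f"~{daily_mah:.0f} mAh per day -> {CAPACITY_MAH / daily_mah:.1f} days per charge")
```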
The Raspberry Pi, vibration motors, and LEDs all require a 5V supply, so we used a PowerBoost 500C. This board boosts the 3.7V battery voltage to 5V, charges the LiPo battery, and provides short-circuit and over-discharge protection. One important factor for blindsight is ease of use, and one potential issue we anticipated was finding a user-friendly method of charging the band. Instead of a traditional barrel plug or micro-USB charger, we opted for a wireless charging solution, so that the user can charge the band on any Qi universal wireless charging pad. When charging on a Qi pad at 2A, the band can go from 10% charge to 100% in as little as seven hours.
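The quoted charge time can be sanity-checked with a quick calculation, assuming the PowerBoost 500C's nominal charge rate of roughly 500 mA (so the Qi pad's 2A rating is not the limiting factor):

```python
# Quick check on the ~7 hour charge figure, assuming a ~500 mA charge rate.
CAPACITY_MAH = 4000
CHARGE_CURRENT_MA = 500                     # assumed PowerBoost 500C charge rate
missing_mah = 0.9 * CAPACITY_MAH            # topping up from 10% to 100%
print(f"~{missing_mah / CHARGE_CURRENT_MA:.1f} hours")   # about 7.2 hours
```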
Mobile Apps
In both our Android and iOS apps, the user has access to speech recognition and text-to-speech technologies. The Android app uses CMU's PocketSphinx for wake-word detection and Google speech recognition for commands, while the iOS app uses Apple's built-in speech recognition. When the user says the wake word or presses the button on the armband, speech recognition is activated. After the user's speech is processed in the app, the command is sent to our server, which activates the camera on the armband. After the server processes the image, information is sent back to the app, where it is read out through text-to-speech.
Demo Videos

Below are demonstration videos of the MK1 and MK2 iterations. Our 3 minute video submission is the second of the two videos, labelled "MK2 Demo."
MK1 Demo
MK2 Demo - Actual Contest Submission Video
NOTE: Since viewers cannot feel the haptic vibrations shown in the video, we have included LED indicator lights that display the same pattern. The red LED represents the motor on the armband that rests against the forearm's upper surface, with the ring of LEDs paralleling the ring of motors throughout the band. We hope this will serve as a proof of our technology's functionality.
Works Cited

Mariotti, Silvio P. "Global Data on Visual Impairments 2010." World Health Organization, www.who.int/blindness/GLOBALDATAFINALforweb.pdf.
- Most recent global report on blindness from World Health Organization
Riccobono, Mark. “Blindness Statistics.” National Federation of the Blind, 12 Mar. 2018, nfb.org/blindness-statistics.
- Shows demographics and basic information on the blind population in America
Kurzweil, Ray. "What Can KNFB Reader Do for You?" KNFB Reader, knfbreader.com/the-app.
Soon-Shiong, Patrick. “LookTel Products.” LookTel Money Reader, LookTel, www.looktel.com/moneyreader.
Jensen, Alexander. “Press Resources.” Be My Eyes - Bringing Sight to Blind and People with Low Vision, Be My Eyes, www.bemyeyes.com/press.
“TapTapSee.” TapTapSee - Blind and Visually Impaired Assistive Technology - Powered by CloudSight.ai Image Recognition API, MIT, taptapseeapp.com/.