The most difficult part of this project, as every member of Team FutureVision will agree, was deciding how our work could bring about change in our own lives, but more importantly in the lives of those around us. Then one of us pointed out that AI and robotics are of little use if they are not helpful and accessible to everyone. With that social impact in mind, we came up with the project "ThirdEye".
As of 2020, an estimated 43 million people worldwide are blind. But what if they could use their other senses to "see"? Just as prosthetics have become common replacements for missing limbs, we hope to realize a similar idea using glasses fitted with a camera, a microphone, and earphones.
Given our money and time constraints, we chose the ESP32-CAM as our portable camera: it is smaller than comparably cheap alternatives. The camera feed is read using the OpenCV library and sent to the Kria KR260 FPGA board, where we used the Gemini API to implement an object and voice detection algorithm.
Figure 1 shows the Kria KR260 in action with the following connections:
- Ethernet
- Microphone
- Web Camera
- Display Port
- Power Source
The algorithm runs in a loop, listening for a keyword within each spoken sentence; in our case the keyword is "Product". Upon detecting the keyword, OpenCV captures a single frame and the object detection model runs on that image. A detailed description of the product in the image is then read out to the user through the earphones, giving them a sense of the obstacles in front of them. Once the report finishes, the loop resumes waiting for the keyword.
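The loop described above can be sketched as follows. The `transcribe`, `capture_frame`, `describe`, and `speak` callables are hypothetical placeholders for the project's speech-to-text, camera, object-detection, and text-to-speech stages; only the keyword-matching and loop structure are shown concretely.

```python
KEYWORD = "product"  # the trigger word described above

def contains_keyword(sentence, keyword=KEYWORD):
    """Case-insensitive check for the trigger word in a transcribed sentence."""
    return keyword.lower() in sentence.lower().split()

def listen_loop(transcribe, capture_frame, describe, speak, running):
    """Wait for the keyword, grab one frame, read out its description.

    All four stage functions are placeholders for the real pipeline;
    `running` lets the caller stop the otherwise endless loop.
    """
    while running():
        sentence = transcribe()      # blocks until a sentence is heard
        if contains_keyword(sentence):
            frame = capture_frame()  # single frame from the ESP32-CAM
            report = describe(frame) # object detection on that frame
            speak(report)            # read the report over the earphones
```

Because the trigger check runs on whole transcribed sentences, the camera and detection model stay idle until the user actually asks for a description.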
For a demonstration of the project, please visit the following URL.