This is my entry in the Swimming category for the Build2gether Inclusive Innovation Challenge.
"What were the needs or pain points that you attended to and identified when you were solving problems faced by the Contest Masters?"
I have been pondering what it is that I actually need to see in the swimming pool, and what I would be disadvantaged by if I could not see it. Obviously there is where you are going, but I still can’t bring myself to open my eyes under water and I have not crashed into the wall yet (there is still time though ;-) ). The things I do look at are the clock and the matrix information board.
Granted, things like the water temperature are not very important, and I can normally remember the session times, but knowing the time of day is genuinely useful.
This project is to build a device that scans across a pool and looks for a gesture. When it sees this gesture it will make a verbal announcement through the PA system.
Initially there will be one gesture and that will read out the time, but other gestures can be added in the future.
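To give a feel for how further gestures could be added later, here is a minimal sketch of a gesture table that pairs a check function with the announcement it should trigger. This is purely my own illustration of the idea; the function names and the second gesture are hypothetical placeholders, and the actual check used in this build is shown in the code breakdown further down.
def both_wrists_raised(pose):
    # Placeholder for the real check used later (both wrists above the nose).
    return False

def one_wrist_raised(pose):
    # Hypothetical second gesture that could be added in the future.
    return False

# Each entry pairs a gesture check with the announcement it should trigger.
GESTURES = [
    (both_wrists_raised, "read out the time"),
    (one_wrist_raised, "read out the session times"),
]

def handle(pose, announce):
    # Run the first matching gesture's announcement, if any.
    for check, action in GESTURES:
        if check(pose):
            announce(action)
            break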
Build
This project is mostly software, but you will need a device with a Tensor Processing Unit (TPU) in order to do the required body tracking in real time. This project uses the Google Coral Dev Board Mini with an external webcam.
First, we must follow the setup instructions on the Google Coral website so that we can log in to the board across the network.
Second, we need to install project-posenet. Running the command “git clone https://github.com/google-coral/project-posenet.git” on the Google Coral will download it locally for us. We then run “cd project-posenet” and “sh install_requirements.sh” to finish installing all the requirements. Running “python3 simple_pose.py” will test the code.
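If you would rather test from your own script, something like this minimal sketch should confirm that the Edge TPU and the model load correctly. It assumes the same pose_engine API and model file used in the code breakdown below; the blank test image is just a placeholder.
from PIL import Image
from pose_engine import PoseEngine

# Load the same PoseNet model used later in swimming.py.
engine = PoseEngine('models/mobilenet/posenet_mobilenet_v1_075_481_641_quant_decoder_edgetpu.tflite')
# Run one inference on a blank image just to prove everything loads.
blank = Image.new('RGB', (641, 481))
poses, inference_time = engine.DetectPosesInImage(blank)
print('Inference time: %.0f ms, poses found: %d' % (inference_time, len(poses)))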
Third, copy the code from this project page into a file called swimming.py in the project-posenet directory.
Fourth, we need to install espeak to do the text to speech part of the project. Run the command “sudo apt install espeak python3-espeak” to install all the required parts.
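Before wiring everything together it is worth checking the speech output on its own. A quick sketch like this, using the python3-espeak bindings we just installed, should speak a test phrase through the headphone socket:
from espeak import espeak
from time import sleep

espeak.set_voice("en")
espeak.synth("Testing, one two three")
# synth() returns straight away, so wait for playback to finish before exiting.
while espeak.is_playing():
    sleep(0.1)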
Fifth, we need to plug some speakers or a PA system into the headphone socket.
That is us all set up. Just run the code with the command “python3 swimming.py” and the device will be up and running.
Code breakdown
First we import all the required libraries and initialise the espeak library ready for when we need it.
import cv2
from pose_engine import PoseEngine, KeypointType
from PIL import Image
from PIL import ImageDraw
from espeak import espeak
from time import sleep
import datetime
import sys
import os
espeak.set_voice("en")
Now we initialise the video capture using OpenCV2 and the pose detection engine.
cap = cv2.VideoCapture()
cap.open(1, apiPreference=cv2.CAP_V4L2)
cap.set(cv2.CAP_PROP_FOURCC, cv2.VideoWriter_fourcc('Y', 'U', 'Y', '2'))
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1024)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 768)
cap.set(cv2.CAP_PROP_FPS, 10.0)
if not cap.isOpened():
    sys.exit('Could not open video device')
engine = PoseEngine('models/mobilenet/posenet_mobilenet_v1_075_481_641_quant_decoder_edgetpu.tflite')
Now we start the main loop.
while True:
First thing in the loop, we capture a video image.
    ret, frame = cap.read()
If we got an image, we convert it into a usable format and detect any people in it.
    if ret:
        pil_image = Image.fromarray(frame)
        poses, inference_time = engine.DetectPosesInImage(pil_image)
Next we loop through each of the people that were detected, as we may have detected several and want to activate if any of them makes the pose.
        for pose in poses:
If we are not certain about this pose we skip it. We don’t want to randomly announce things all the time when no one asked.
            if pose.score < 0.4:
                continue
Now we gather all the positions we are interested in and check if both wrists are above the nose. I was originally using the shoulders but found that less reliable, which is why the variable names below still say shoulder. I will have to tidy this code up when I get a moment.
            left_hand = pose.keypoints[KeypointType.LEFT_WRIST]
            right_hand = pose.keypoints[KeypointType.RIGHT_WRIST]
            left_shoulder = pose.keypoints[KeypointType.NOSE]
            right_shoulder = pose.keypoints[KeypointType.NOSE]
            if ( left_hand.point[1] < left_shoulder.point[1] ) and ( right_hand.point[1] < right_shoulder.point[1] ):
If we have detected the gesture then we simply work out the time and say it.
                now = datetime.datetime.now()
                espeak.synth("The time is " + now.strftime("%-I %M %p and %-S seconds"))
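As an aside, here is roughly what that format string produces, using a made-up timestamp for illustration (the %-I and %-S forms drop the leading zero; they work with the Linux strftime on the Coral but are not portable everywhere).
import datetime
example = datetime.datetime(2024, 1, 1, 15, 4, 5)  # hypothetical time: 3:04:05 pm
print("The time is " + example.strftime("%-I %M %p and %-S seconds"))
# prints: The time is 3 04 PM and 5 seconds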
We don’t want to detect the gesture a second time while we are reading out the current time, so we wait here until that has finished. While this is happening we also grab and discard video frames. If we don't do this it will trigger several times.
                while espeak.is_playing():
                    cap.grab()
                    pass
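As a pointer to how a second gesture could be added in the future, here is a rough, untested sketch of a fragment that would sit inside the same for loop, next to the existing check. The gesture itself (only the left wrist above the nose, announcing just the hour) is a hypothetical example; the keypoint names and calls are the same ones used above.
# Hypothetical extra check, to go inside the "for pose in poses:" loop above.
nose = pose.keypoints[KeypointType.NOSE]
left_hand = pose.keypoints[KeypointType.LEFT_WRIST]
right_hand = pose.keypoints[KeypointType.RIGHT_WRIST]
if (left_hand.point[1] < nose.point[1]) and (right_hand.point[1] >= nose.point[1]):
    now = datetime.datetime.now()
    espeak.synth("It is " + now.strftime("%-I %p"))
    # Discard frames while speaking, just like the main gesture does.
    while espeak.is_playing():
        cap.grab()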
Testing
As mentioned before, the example code here identifies the gesture of both arms being held up above a person's head. In this video you can see me walking around the lab and making the gesture, and in the background you will hear the time being read out when I do.
Here is a photograph of the test setup that is running this. The Google Coral board is hanging in front of the white cupboard door, the webcam is on the tripod, and the sound comes from the sound system on top of the white cupboard.
If you were wondering how I filmed this when I was already using the tripod, well, duct tape is a wonderful thing and I taped my phone to the cupboard door. :-)