When rock climbing was first introduced as a discipline in the Olympics a few years back, I wondered if the current state of pose estimation algorithms could detect the incredible poses that humans take while performing this sport.
Applying AI to Climbing - The Long Road to the Tokyo Olympics
My first attempts were really bad, and it was clear that the data sets used to train these algorithms did not include these human poses.
Applying AI to Climbing - Deep Learning Meets the Odd Human Data Set
More than a year later, LearnOpenCV published an article that caught my attention. They described the new YOLOv7 model, which not only drastically improved the detection of rock climbing poses, but also provided keypoint detection.
Applying AI to Climbing - Pose Estimation with YOLOv7
This was a great breakthrough, but I did not find time to pursue the project further.
I did, however, manually create this cool video with a combination of video editing and Python scripting:
This type of montage allows climbers to analyze their performance, and compare themselves to other climbers. It should be obvious in the video that a 5'8" climber (me, on the left) will not go about it the same way as a 5'2" climber (my wife, on the right).
Elite climbers can compare a performance with previous ones to track progress and identify areas that need improvement.
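For readers curious how such a montage can be stitched together, below is a minimal OpenCV sketch that stacks two clips side by side. It assumes both clips have already been trimmed to start at the same moment; the file names are placeholders, not the scripts I actually used.

```python
import cv2

# Minimal side-by-side montage: read two pre-trimmed clips and stack each
# pair of frames horizontally. File names are placeholders.
cap_a = cv2.VideoCapture("climb_left.mp4")
cap_b = cv2.VideoCapture("climb_right.mp4")

fps = cap_a.get(cv2.CAP_PROP_FPS) or 30.0
writer = None

while True:
    ok_a, frame_a = cap_a.read()
    ok_b, frame_b = cap_b.read()
    if not (ok_a and ok_b):
        break  # stop at the end of the shorter clip

    # Scale the second frame to the height of the first before stacking.
    h = frame_a.shape[0]
    scale = h / frame_b.shape[0]
    frame_b = cv2.resize(frame_b, (int(frame_b.shape[1] * scale), h))

    combined = cv2.hconcat([frame_a, frame_b])
    if writer is None:
        writer = cv2.VideoWriter("side_by_side.mp4",
                                 cv2.VideoWriter_fourcc(*"mp4v"),
                                 fps, (combined.shape[1], combined.shape[0]))
    writer.write(combined)

cap_a.release()
cap_b.release()
if writer is not None:
    writer.release()
```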
The Goal
Creating a side-by-side comparison view of two or more climbs is a tedious process.
A good strategy for this type of project is to start small and focused. My hope is to create a useful tool for climbing coaches who could use this type of montage as a training aid for their climbers (perhaps the next Olympic candidates).
Therefore, the first goal will be to align the videos of two climbs in order to perform a side-by-side comparison.
Automating this process with computer vision and machine learning is a great challenge to take on before the next Olympics, which will be held in Paris in July 2024.
The best way to get a project done in time is to establish a deadline.
For this, I will enter this project in the new OpenCV AI Competition, being hosted on Hackster.io.
An international open source competition on computer vision & AI by the OpenCV Foundation.
With up to $48,000+ in prizes, it is a great opportunity to flex your computer vision skills 😃
OpenCV AI Competition 2023
Whether or not my project is "accepted" in the competition, I will move forward with the project and use the competition deadlines and milestones as motivation 😃😃. Hopefully, it will also inspire others.
Standing on the Shoulders of Giants
I am a big fan of LearnOpenCV, from their blogs to their on-line courses.
The LearnOpenCV team provides tutorials and documentation for all the components that you may need to create your own project.
They also have on-line courses for every budget, from the free bootcamp courses to the premium in-depth courses.
https://opencv.org/university/free-courses/
https://opencv.org/university/cvdl-master/
I will leverage (and acknowledge) their content to implement this project.
The Proposed Solution
This project will be broken down into two main components:
- Video Annotation
- Side-by-Side Viewing
The Video Annotation component will use pose estimation to annotate each video. Since I will want to validate and edit the annotations, an interactive tool will be ideal. It will provide the option to correct the annotations (if required).
The starting point for this component will be the open-source annotation tool from OpenCV: pyOpenAnnotate
Building An Automated Image Annotation Tool: PyOpenAnnotate
Roadmap To an Automated Image Annotation Tool Using OpenCV Python
Several features make this tool interesting for reuse and customization:
- ability to annotate each frame of video
- ability to manually correct annotations
The following modifications will be made for the purpose of this project:
- replace existing thresholding based annotations (bounding boxes) with pose estimation annotations (bounding box + keypoints)
- identify climber of interest and track climber in video
- filter out unwanted annotations (i.e. not the climber of interest)
The Side-by-Side Viewer will use the pre-calculated annotations to view multiple climbs together. For this purpose, the following features will also need to be created:
- creation of synchronization points of climber on route
- video stretching to align climbs together (a rough sketch follows this list)
- identify background image of route (used for viewing options)
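To make the video-stretching idea above concrete, here is a rough sketch that maps frame indices of one climb onto another by piecewise-linear interpolation between synchronization points. The sync frame numbers are invented for the example; in the real tool they would come from the annotation step.

```python
import numpy as np

# Synchronization points: pairs of frame indices where both climbers are at
# the same spot on the route. The numbers below are made up for illustration.
sync_a = np.array([0, 120, 300, 450])   # frames in climb A (start, crux 1, crux 2, top)
sync_b = np.array([0, 180, 390, 600])   # matching frames in climb B

def frame_in_b(frame_a: float) -> int:
    """Return the frame of climb B that corresponds to a given frame of climb A."""
    return int(round(np.interp(frame_a, sync_a, sync_b)))

# Stepping through climb A, look up the aligned frame of climb B.
for f_a in (0, 60, 120, 300, 450):
    print(f"A frame {f_a:3d} -> B frame {frame_in_b(f_a)}")
```

Piecewise-linear interpolation keeps both climbers at the same hold at every sync point while stretching or compressing the sections in between.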
The Viewer will also provide several viewing options:
- side-by-side or overlay
- climber or stick-man
It all starts with the data. For this project, I gathered several videos of climbing footage of my wife Josie-Anne and me.
The footage covers several scenarios:
- 6 indoor videos / 42 outdoor videos
- mostly up-climbs, a few down-climbs (i.e. climbing back down)
- one climb with two different viewpoints
The videos were taken with an iPhone 14 Pro, in "accelerated mode".
The motivations (and advantages) of using this mode are:
- lower battery consumption (i.e. important for outdoor climbs)
- lower storage requirements
- smaller video files
Some disadvantages of using this mode are:
- a delay of 10 sec between frames, which is too long for climbing (i.e. a lot of movement is lost)
- in many frames the climber is blurred (i.e. less than ideal for pose estimation)
Another technique for gathering data will have to be considered, but for now, I will make use of these videos, which contain between 600 and 1200 frames each.
For this first iteration, the pyOpenAnnotate utility was modified as follows:
- modify the annotation read/write code to include 1 bounding box + 17 keypoints per person, including the confidence scores
- modify the annotation tool to reuse annotations if present
The annotation data contains the following information:
- bbox.id (integer) : unused for now (0)
- bbox.x (float) : normalized x coordinate for center of bounding box
- bbox.y (float) : normalized y coordinate for center of bounding box
- bbox.w (float) : normalized width for bounding box
- bbox.h (float) : normalized height for bounding box
- bbox.c (float) : confidence score for bounding box
- keypoint[0].x (float) : normalized x coordinate for keypoint[0]
- keypoint[0].y (float) : normalized y coordinate for keypoint[0]
- keypoint[0].c (float) : confidence score for keypoint[0]
- ...
- keypoint[16].x (float) : normalized x coordinate for keypoint[16]
- keypoint[16].y (float) : normalized y coordinate for keypoint[16]
- keypoint[16].c (float) : confidence score for keypoint[16]
Note that the 17 pose landmarks, or keypoints, follow the COCO keypoint convention (nose, eyes, ears, shoulders, elbows, wrists, hips, knees, ankles).
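To make the format concrete, the snippet below parses one such annotation into a bounding box and 17 keypoints. It assumes one whitespace-separated line per detected person; the exact file layout in my modified pyOpenAnnotate may differ slightly.

```python
# Parse one annotation line: id, cx, cy, w, h, conf, then 17 keypoints of (x, y, conf).
# Assumes one whitespace-separated line per detected person.
def parse_annotation_line(line: str):
    values = [float(v) for v in line.split()]
    bbox = {
        "id": int(values[0]),
        "x": values[1], "y": values[2],   # normalized center
        "w": values[3], "h": values[4],   # normalized size
        "c": values[5],                   # confidence
    }
    keypoints = [
        {"x": values[6 + 3 * i], "y": values[7 + 3 * i], "c": values[8 + 3 * i]}
        for i in range(17)                # COCO order: nose, eyes, ears, shoulders, ...
    ]
    return bbox, keypoints

# Example with made-up values: one bounding box followed by 17 identical keypoints.
example = "0 0.52 0.40 0.18 0.55 0.91 " + " ".join(["0.5 0.4 0.8"] * 17)
bbox, kpts = parse_annotation_line(example)
print(bbox["c"], len(kpts))
```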
On a PC (without a GPU), it took about 20 hours of processing time to annotate the ~50 videos, or approximately 30 minutes per video.
The processing time per frame was approximately 2 seconds. The second graph shows this as fairly stable at 1.8 sec per frame for the first videos, then an unexpected spread of 1.5-3.0 sec per frame near the end.
Viewing Data (2023/09/03)
The following video illustrates the current status of the project:
The open-source annotation tool (pyOpenAnnotate) has been modified to use the YOLOv7 pose estimation model to annotate the video frames.
As seen previously, the annotation process is quite long, so it is executed in batch mode. The annotation tool has been modified to reuse the pre-calculated annotations for viewing and analysis.
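The core of that batch step could look roughly like the sketch below, loosely based on the LearnOpenCV YOLOv7 pose tutorial. It assumes the yolov7 repository is on the Python path and that the yolov7-w6-pose.pt weights have been downloaded; the helper names are quoted from memory, so check them against the tutorial before reuse.

```python
# Rough sketch of the batch pose-annotation step, loosely following the
# LearnOpenCV YOLOv7 pose tutorial. Helper names (letterbox,
# non_max_suppression_kpt, output_to_keypoint) come from the yolov7 repo
# and are quoted from memory.
import torch
from torchvision import transforms
from utils.datasets import letterbox
from utils.general import non_max_suppression_kpt
from utils.plots import output_to_keypoint

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = torch.load("yolov7-w6-pose.pt", map_location=device)["model"].float().eval().to(device)

def annotate_frame(frame):
    """Return one row per detected person: [batch, class, cx, cy, w, h, conf, 17 x (x, y, c)]."""
    img = letterbox(frame, 960, stride=64, auto=True)[0]     # resize + pad to model input size
    tensor = transforms.ToTensor()(img).unsqueeze(0).to(device)
    with torch.no_grad():
        output, _ = model(tensor)
    output = non_max_suppression_kpt(output, 0.25, 0.65,
                                     nc=model.yaml["nc"], nkpt=model.yaml["nkpt"],
                                     kpt_label=True)
    return output_to_keypoint(output)   # coordinates are in letterboxed pixels, not normalized
```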
Three new sliders (track bars) have been added to the annotation tool (a minimal sketch follows the list):
- frameNum : quickly navigate in video to identify areas of interest
- threshBBOX : confidence threshold for bounding boxes
- threshKPTS : confidence threshold for keypoints
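In OpenCV's HighGUI, such sliders are created with cv2.createTrackbar. The sketch below is a minimal stand-in, not the actual tool code; window name, ranges and callbacks are placeholders.

```python
import cv2
import numpy as np

# Minimal sketch of the three sliders as OpenCV trackbars. Window name, ranges
# and callbacks are placeholders, not the actual annotation tool code.
WINDOW = "annotate"
cv2.namedWindow(WINDOW)

state = {"frameNum": 0, "threshBBOX": 50, "threshKPTS": 50}

def on_change(name):
    def callback(value):
        state[name] = value   # the real tool would redraw the frame here
    return callback

cv2.createTrackbar("frameNum",   WINDOW, 0,  999, on_change("frameNum"))
cv2.createTrackbar("threshBBOX", WINDOW, 50, 100, on_change("threshBBOX"))  # percent
cv2.createTrackbar("threshKPTS", WINDOW, 50, 100, on_change("threshKPTS"))  # percent

while True:
    # The real tool would show the selected frame with annotations filtered by
    # the two confidence thresholds; here we just display a blank canvas.
    canvas = np.zeros((200, 400, 3), dtype=np.uint8)
    cv2.imshow(WINDOW, canvas)
    if cv2.waitKey(30) & 0xFF == 27:   # Esc to quit
        break
cv2.destroyAllWindows()
```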
Although the ultimate goal is to identify the following frames automatically, the current annotation tool allows the user to manually specify (one possible way to store these markers is sketched after the list):
- background image
- start of climb
- end of climb
- cruxes (difficult parts of climb, areas of interest)
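One simple way to keep these manual markers next to the annotations is a small per-video JSON file. The layout below is hypothetical; the field names are mine, not the tool's.

```python
import json

# Hypothetical per-video marker file; field names are my own, not pyOpenAnnotate's.
markers = {
    "background_frame": 12,        # frame with an unobstructed view of the route
    "climb_start_frame": 35,
    "climb_end_frame": 410,
    "crux_frames": [120, 260],     # difficult parts of the climb, areas of interest
}
with open("climb_markers.json", "w") as f:
    json.dump(markers, f, indent=2)
```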
Note that the video is not showing the final viewer, but rather two instances of the annotation tool, giving a glimpse of what we will want to achieve with our side-by-side viewer.
Identifying the Climber of Interest (2023/09/13)
It is not always obvious which person is the climber of interest. The following series of four frames was taken from a video where the camera was placed on a trail with a lot of activity, in addition to another climber. Can you guess which climber is of interest?
Notice how the IDs of the detected people change at each frame. This is due to the lack of tracking in the current annotation tool.
In order to identify the climber of interest, it may be necessary for the user to specify the climber at the start of the climb, and possibly use tracking to follow the climber throughout the video.
The following LearnOpenCV article, describing Deep SORT tracking, will be used to "follow" a climber through the sequence of frames in the videos:
LearnOpenCV - Real Time Deep SORT with Torchvision Detectors
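The article relies on the deep-sort-realtime package. The sketch below shows how the pre-computed person detections could be fed to the tracker; the file name and the single hard-coded detection are placeholders.

```python
import cv2
from deep_sort_realtime.deepsort_tracker import DeepSort

# Sketch of feeding per-frame person detections to Deep SORT, assuming the
# deep-sort-realtime package used in the LearnOpenCV article. Each detection
# is a ([left, top, width, height], confidence, class) tuple in pixels.
tracker = DeepSort(max_age=30, embedder="mobilenet")

cap = cv2.VideoCapture("climb.mp4")          # placeholder file name
while True:
    ok, frame = cap.read()
    if not ok:
        break

    # In the real tool these would come from the pre-computed YOLOv7 annotations;
    # the single detection below is a placeholder.
    detections = [([100, 200, 80, 180], 0.9, "person")]

    tracks = tracker.update_tracks(detections, frame=frame)
    for track in tracks:
        if not track.is_confirmed():
            continue
        left, top, right, bottom = track.to_ltrb()
        print(f"person id={track.track_id} box=({left:.0f},{top:.0f},{right:.0f},{bottom:.0f})")
cap.release()
```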
Tracking the Climber (2023/09/15)
My first attempt to use the Deep SORT algorithm to track the climber of interest was not successful.
To be fair, this video is fairly complex, as there are a lot of people in the scene. When I start climbing, Deep SORT has allocated the following IDs:
- belayer (my wife, Josie-Anne) : ID 20
- climber (myself, Mario) : ID 21
What is very impressive is that despite the continual flow of people moving around the belayer, ID 20 remains the same throughout the video. This is a solid success!
For the climber, however, the ID changes from 21 to 45, to 47, to 49, to 56, to 57, to 61, to 68, to 69, to 72, to 74, to 75, to 76, to 83, to 92, etc. This is an epic failure!
But why ? What is going on ?
For this test, I used the "mobilenet" re-identification model. I wanted to also test the "torchreid" and "clip" re-id models, but could not get those working ...
The tracking is lost almost every time that I fall, which results in a sudden change in position. Since these videos were taken in "accelerated" mode, the changes in position are perhaps too great for the algorithm to work correctly.
To test this first theory, I will have to re-try with a normal video (i.e. 30 frames/sec instead of 0.1 frames/sec).
Another theory is that the re-id model has its attention on the rock features instead of the climber. One way to test this could be to do background subtraction, to remove the rock textures from the video being processed by the Deep SORT algorithm.
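OpenCV ships ready-made background subtractors that could serve for this experiment. Here is a minimal sketch using MOG2 to black out the (mostly static) rock before the frames reach Deep SORT; the file name is a placeholder, and the long gaps between "accelerated" frames may limit how well it works.

```python
import cv2

# Sketch of removing the (mostly static) rock background with OpenCV's MOG2
# subtractor before handing frames to the tracker. File name is a placeholder.
subtractor = cv2.createBackgroundSubtractorMOG2(history=200, detectShadows=False)

cap = cv2.VideoCapture("climb.mp4")
while True:
    ok, frame = cap.read()
    if not ok:
        break

    mask = subtractor.apply(frame)                        # 0 = background, 255 = moving
    mask = cv2.medianBlur(mask, 5)                        # clean up speckle noise
    foreground = cv2.bitwise_and(frame, frame, mask=mask)

    # 'foreground' (climber kept, rock blacked out) would then be fed to Deep SORT.
    cv2.imshow("foreground", foreground)
    if cv2.waitKey(1) & 0xFF == 27:                       # Esc to quit
        break
cap.release()
cv2.destroyAllWindows()
```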
Revisions
2023/08/25 - Project Draft
2023/08/29 - Project Entered in OpenCV AI Competition 2023
2023/08/31 - Update on Gathering and Annotating Data
2023/09/03 - Update on Viewing Data
2023/09/13 - Update on Isolating the Climber of Interest
2023/09/15 - Update on Tracking the Climber