Shreyashree Neogi

Gulafsa Bano

Created December 6, 2023

Video Analytics on Football Match Using AMD Radeon Pro GPU

Things used in this project

Hardware components

AMD Radeon Pro W7900 GPU

Software apps and online services

AMD ROCm™ Software

PyTorch Framework

Ultralytics Framework

Roboflow DataSet Repository

Story

The System Configuration we have used is as below:

Intel Core i9-13900KF 13th Gen. Processor (24 Core, 32 Thread, 36MB Cache)
Motherboard Z790 Chipset
Corsair 64GB (2x32GB) DDR5 6000MHz
WD 1 TB Blue NVME x 2
CPU Air Cooler Deepcool AG620
Power Supply 850 Watt
Cabinet Full Tower Antesports 410TG

Before starting the video analytics implementation, we installed the libraries required for the AMD Radeon Pro W7900 GPU along with the PyTorch and TensorFlow frameworks, and verified that all of them worked.
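A check of this kind can be sketched in Python as below. This is a minimal sketch and not the exact test we ran; the function name describe_accelerator is hypothetical. It relies on the fact that ROCm builds of PyTorch expose AMD GPUs through the torch.cuda namespace, which is why the logs later in this article report the W7900 as CUDA:0.

```python
import importlib.util


def describe_accelerator():
    """Report whether PyTorch is installed and which GPU, if any, it can see.

    Hypothetical helper for illustration only.
    """
    info = {
        "torch_installed": False,
        "gpu_available": False,
        "device_name": None,
        "hip_version": None,
    }
    if importlib.util.find_spec("torch") is None:
        return info  # PyTorch is not installed in this environment

    import torch  # imported lazily so the check also runs without PyTorch

    info["torch_installed"] = True
    # ROCm builds of PyTorch expose AMD GPUs through the torch.cuda namespace.
    info["gpu_available"] = bool(torch.cuda.is_available())
    # torch.version.hip is a version string on ROCm builds and None otherwise.
    info["hip_version"] = getattr(torch.version, "hip", None)
    if info["gpu_available"]:
        info["device_name"] = torch.cuda.get_device_name(0)
    return info


print(describe_accelerator())
```

On a working ROCm setup, gpu_available should be True and hip_version should be a version string rather than None.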

N.B.: Our main project problem statement concerned a VLM (Visual Language Model) on a customized medical dataset. Because AMD is a relatively new entrant in this part of the GPU market, some of the necessary libraries are still under development, and we ran into issues implementing them. Time was also a constraint, because the AMD Radeon Pro W7900 GPU was shipped and delivered late.
We are therefore submitting a different, relatively simpler application: video analytics for a football match.

We downloaded pre-annotated datasets of live football matches from Roboflow.
Dataset used: https://universe.roboflow.com/nikhil-chapre-xgndf/detect-players-dgxz0

We use the Ultralytics library, which maintains the YOLO models:

pip install ultralytics

We define the YAML file for the custom dataset as below:

train: /home/smss/datasets/football_yolov8/train/images
val: /home/smss/datasets/football_yolov8/valid/images
test: /home/smss/datasets/football_yolov8/test/images

nc: 3
names: ['Ball', 'Player', 'Refree']

roboflow:
  workspace: nikhil-chapre-xgndf
  project: detect-players-dgxz0
  version: 7
  license: CC BY 4.0
  url: https://universe.roboflow.com/nikhil-chapre-xgndf/detect-players-dgxz0/dataset/7
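The nc value has to agree with the number of entries in names. A small consistency check is sketched below, assuming the flat nc: and names: lines shown above. The function check_class_config is hypothetical, and it reads only those two keys rather than parsing YAML in general.

```python
import ast


def check_class_config(yaml_text):
    """Return (nc, names) after confirming that nc matches len(names).

    Only the flat `nc:` and `names:` lines are read; this is not a YAML parser.
    """
    nc, names = None, None
    for line in yaml_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "nc":
            nc = int(value)
        elif key.strip() == "names":
            # The names list uses Python-compatible literal syntax.
            names = ast.literal_eval(value.strip())
    if nc is None or names is None:
        raise ValueError("both 'nc' and 'names' must be present")
    if nc != len(names):
        raise ValueError(f"nc={nc} but {len(names)} names are listed")
    return nc, names


sample = """
nc: 3
names: ['Ball', 'Player', 'Refree']
"""
print(check_class_config(sample))
```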

Next, we train a YOLOv8n model on the downloaded dataset using the CLI command:

yolo detect train data=data.yaml model=yolov8n.yaml epochs=100 imgsz=640
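The same run can also be started from Python. The sketch below is an assumed equivalent of the CLI command, not the script we actually ran; TRAIN_ARGS and train_from_scratch are hypothetical names. It uses the Ultralytics YOLO class and its train method with the same data, epochs and imgsz values.

```python
# Same settings as the CLI command above.
TRAIN_ARGS = {"data": "data.yaml", "epochs": 100, "imgsz": 640}


def train_from_scratch(model_cfg="yolov8n.yaml", **overrides):
    """Build a model from a YAML config and train it (hypothetical wrapper)."""
    from ultralytics import YOLO  # lazy import: only needed when training

    # Passing a .yaml file builds the architecture without loading
    # pretrained weights; a .pt file would load a checkpoint instead.
    model = YOLO(model_cfg)
    return model.train(**{**TRAIN_ARGS, **overrides})
```

Calling train_from_scratch() starts the run. Passing model_cfg="yolov8n.pt" would start from pretrained weights instead, which is a different setup from the one used in this article.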

The command-line output and batch training results are shown below:

/home/smss/.local/lib/python3.10/site-packages/matplotlib/projections/__init__.py:63: UserWarning: Unable to import Axes3D. This may be due to multiple versions of Matplotlib being installed (e.g. as a system package and as a pip package). As a result, the 3D projection is not available.
  warnings.warn("Unable to import Axes3D. This may be due to multiple versions of "
Ultralytics YOLOv8.2.70 🚀 Python-3.10.12 torch-2.3.1+rocm6.0 CUDA:0 (AMD Radeon PRO W7900, 46064MiB)
engine/trainer: task=detect, mode=train, model=yolov8n.yaml, data=data.yaml, epochs=100, time=None, patience=100, batch=16, imgsz=640, save=True, save_period=-1, cache=False, device=None, workers=8, project=None, name=train3, exist_ok=False, pretrained=True, optimizer=auto, verbose=True, seed=0, deterministic=True, single_cls=False, rect=False, cos_lr=False, close_mosaic=10, resume=False, amp=True, fraction=1.0, profile=False, freeze=None, multi_scale=False, overlap_mask=True, mask_ratio=4, dropout=0.0, val=True, split=val, save_json=False, save_hybrid=False, conf=None, iou=0.7, max_det=300, half=False, dnn=False, plots=True, source=None, vid_stride=1, stream_buffer=False, visualize=False, augment=False, agnostic_nms=False, classes=None, retina_masks=False, embed=None, show=False, save_frames=False, save_txt=False, save_conf=False, save_crop=False, show_labels=True, show_conf=True, show_boxes=True, line_width=None, format=torchscript, keras=False, optimize=False, int8=False, dynamic=False, simplify=False, opset=None, workspace=4, nms=False, lr0=0.01, lrf=0.01, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=7.5, cls=0.5, dfl=1.5, pose=12.0, kobj=1.0, label_smoothing=0.0, nbs=64, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, bgr=0.0, mosaic=1.0, mixup=0.0, copy_paste=0.0, auto_augment=randaugment, erasing=0.4, crop_fraction=1.0, cfg=None, tracker=botsort.yaml, save_dir=runs/detect/train3
Overriding model.yaml nc=80 with nc=3

                   from  n    params  module                                       arguments                     
  0                  -1  1       464  ultralytics.nn.modules.conv.Conv             [3, 16, 3, 2]                 
  1                  -1  1      4672  ultralytics.nn.modules.conv.Conv             [16, 32, 3, 2]                
  2                  -1  1      7360  ultralytics.nn.modules.block.C2f             [32, 32, 1, True]             
  3                  -1  1     18560  ultralytics.nn.modules.conv.Conv             [32, 64, 3, 2]                
  4                  -1  2     49664  ultralytics.nn.modules.block.C2f             [64, 64, 2, True]             
  5                  -1  1     73984  ultralytics.nn.modules.conv.Conv             [64, 128, 3, 2]               
  6                  -1  2    197632  ultralytics.nn.modules.block.C2f             [128, 128, 2, True]           
  7                  -1  1    295424  ultralytics.nn.modules.conv.Conv             [128, 256, 3, 2]              
  8                  -1  1    460288  ultralytics.nn.modules.block.C2f             [256, 256, 1, True]           
  9                  -1  1    164608  ultralytics.nn.modules.block.SPPF            [256, 256, 5]                 
 10                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']          
 11             [-1, 6]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 12                  -1  1    148224  ultralytics.nn.modules.block.C2f             [384, 128, 1]                 
 13                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']          
 14             [-1, 4]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 15                  -1  1     37248  ultralytics.nn.modules.block.C2f             [192, 64, 1]                  
 16                  -1  1     36992  ultralytics.nn.modules.conv.Conv             [64, 64, 3, 2]                
 17            [-1, 12]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 18                  -1  1    123648  ultralytics.nn.modules.block.C2f             [192, 128, 1]                 
 19                  -1  1    147712  ultralytics.nn.modules.conv.Conv             [128, 128, 3, 2]              
 20             [-1, 9]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 21                  -1  1    493056  ultralytics.nn.modules.block.C2f             [384, 256, 1]                 
 22        [15, 18, 21]  1    751897  ultralytics.nn.modules.head.Detect           [3, [64, 128, 256]]           
YOLOv8n summary: 225 layers, 3,011,433 parameters, 3,011,417 gradients, 8.2 GFLOPs

TensorBoard: Start with 'tensorboard --logdir runs/detect/train3', view at http://localhost:6006/
Freezing layer 'model.22.dfl.conv.weight'
AMP: running Automatic Mixed Precision (AMP) checks with YOLOv8n...
AMP: checks passed ✅
train: Scanning /home/smss/datasets/football_yolov8/train/labels.cache... 1023 images, 4 backgrounds,
val: Scanning /home/smss/datasets/football_yolov8/valid/labels.cache... 293 images, 0 backgrounds, 0 
Plotting labels to runs/detect/train3/labels.jpg... 
optimizer: 'optimizer=auto' found, ignoring 'lr0=0.01' and 'momentum=0.937' and determining best 'optimizer', 'lr0' and 'momentum' automatically... 
optimizer: AdamW(lr=0.001429, momentum=0.9) with parameter groups 57 weight(decay=0.0), 64 weight(decay=0.0005), 63 bias(decay=0.0)
TensorBoard: model graph visualization added ✅
Image sizes 640 train, 640 val
Using 8 dataloader workers
Logging results to runs/detect/train3
Starting training for 100 epochs...

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
      1/100     0.388G      4.832      4.859      4.046        530        640: 100%|██████████| 64/64
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████
                   all        293       4381   0.000511    0.00269    0.00028   0.000106

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
      2/100      2.58G      2.965      2.017      2.256        336        640: 100%|██████████| 64/64
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████
                   all        293       4381      0.797      0.101     0.0862     0.0258
........
........
........
........
........
........
........
      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
     98/100      2.17G      1.136      0.543     0.9401        224        640: 100%|██████████| 64/64
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████
                   all        293       4381       0.83      0.666      0.711      0.437

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
     99/100      2.21G      1.136     0.5415     0.9413        180        640: 100%|██████████| 64/64
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████
                   all        293       4381      0.861      0.655      0.712       0.44

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
    100/100      2.21G       1.13     0.5428     0.9424        199        640: 100%|██████████| 64/64
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████
                   all        293       4381      0.864      0.661      0.715       0.44

100 epochs completed in 0.186 hours.
Optimizer stripped from runs/detect/train3/weights/last.pt, 6.2MB
Optimizer stripped from runs/detect/train3/weights/best.pt, 6.2MB

Validating runs/detect/train3/weights/best.pt...
Ultralytics YOLOv8.2.70 🚀 Python-3.10.12 torch-2.3.1+rocm6.0 CUDA:0 (AMD Radeon PRO W7900, 46064MiB)
YOLOv8n summary (fused): 168 layers, 3,006,233 parameters, 0 gradients, 8.1 GFLOPs
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████
                   all        293       4381      0.865      0.662      0.715      0.439
                  Ball        214        215      0.883      0.293      0.346      0.148
                Player        293       3811      0.924      0.954      0.974       0.65
                Player        220        355      0.789      0.739      0.826      0.519
Speed: 0.0ms preprocess, 0.4ms inference, 0.0ms loss, 0.4ms postprocess per image
Results saved to runs/detect/train3
💡 Learn more at https://docs.ultralytics.com/modes/train
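The precision and recall columns in the validation table can be combined into a single F1 score, and the total training time can be converted into seconds per epoch. The sketch below does this with the values copied from the log above; the helper names are hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def seconds_per_epoch(total_hours, epochs):
    """Average wall-clock time per epoch, validation passes included."""
    return total_hours * 3600 / epochs


# Values copied from the validation table and the training summary above.
print("all  F1:", round(f1_score(0.865, 0.662), 3))
print("Ball F1:", round(f1_score(0.883, 0.293), 3))
print("s/epoch:", round(seconds_per_epoch(0.186, 100), 2))
```

This gives an F1 of roughly 0.75 over all classes but only about 0.44 for the Ball class, where the low recall (0.293) dominates. Each epoch takes a little under 7 seconds on average.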

Train Batch Results

After training, we use the best weights from the custom-dataset run to detect players and the ball in a separately downloaded video that was not part of the training data, using the CLI command:

yolo detect predict model=/home/smss/datasets/football_yolov8/runs/detect/train3/weights/best.pt source=video.mp4 show=True
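For analytics beyond the annotated video, the per-frame detections can be read from Python. The sketch below is an illustration under assumptions, not our submitted code. The names count_by_class and analyse_video are hypothetical. It uses the Ultralytics predict method with stream=True, and reads the class ids from result.boxes.cls and the class names from result.names.

```python
from collections import Counter


def count_by_class(class_ids, names):
    """Map the detected class ids of one frame to {class name: count}."""
    return dict(Counter(names[int(i)] for i in class_ids))


def analyse_video(weights, source):
    """Yield per-frame class counts for a video (requires ultralytics)."""
    from ultralytics import YOLO  # lazy import: only needed for inference

    model = YOLO(weights)
    # stream=True returns a generator, so frames are processed one at a time
    # instead of holding every result in memory.
    for result in model.predict(source=source, stream=True):
        yield count_by_class(result.boxes.cls.tolist(), result.names)


# Stand-alone example of the counting step with made-up class ids.
names = {0: "Ball", 1: "Player", 2: "Refree"}
print(count_by_class([1.0, 1.0, 0.0, 2.0], names))
```

Iterating over analyse_video("best.pt", "video.mp4") would then give one dictionary of counts per frame. The two file names here are placeholders.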

The command-line output and the output prediction video are shown below:

/home/smss/.local/lib/python3.10/site-packages/matplotlib/projections/__init__.py:63: UserWarning: Unable to import Axes3D. This may be due to multiple versions of Matplotlib being installed (e.g. as a system package and as a pip package). As a result, the 3D projection is not available.
  warnings.warn("Unable to import Axes3D. This may be due to multiple versions of "
Warning: Ignoring XDG_SESSION_TYPE=wayland on Gnome. Use QT_QPA_PLATFORM=wayland to run on Wayland anyway.
Ultralytics YOLOv8.2.70 🚀 Python-3.10.12 torch-2.3.1+rocm6.0 CUDA:0 (AMD Radeon PRO W7900, 46064MiB)
YOLOv8n summary (fused): 168 layers, 3,006,233 parameters, 0 gradients, 8.1 GFLOPs

video 1/1 (frame 1/258) /home/smss/datasets/football_yolov8/video.mp4: 384x640 8 Players, 1 Player, 5.3ms
video 1/1 (frame 2/258) /home/smss/datasets/football_yolov8/video.mp4: 384x640 7 Players, 1 Player, 2.8ms
........
........
........
........
........
........
........
video 1/1 (frame 256/258) /home/smss/datasets/football_yolov8/video.mp4: 384x640 10 Players, 3 Players, 2.8ms
video 1/1 (frame 257/258) /home/smss/datasets/football_yolov8/video.mp4: 384x640 10 Players, 3 Players, 2.7ms
video 1/1 (frame 258/258) /home/smss/datasets/football_yolov8/video.mp4: 384x640 11 Players, 4 Players, 2.7ms
Speed: 1.3ms preprocess, 2.8ms inference, 0.9ms postprocess per image at shape (1, 3, 384, 640)
Results saved to runs/detect/predict14
💡 Learn more at https://docs.ultralytics.com/modes/predict
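The per-image timings in the Speed line can be converted into an approximate throughput. The sketch below uses the figures from the log above. It is a rough estimate that ignores video decoding and the on-screen display enabled by show=True; the function name is hypothetical.

```python
def frames_per_second(preprocess_ms, inference_ms, postprocess_ms):
    """Frames per second implied by the per-image pipeline latency."""
    total_ms = preprocess_ms + inference_ms + postprocess_ms
    return 1000.0 / total_ms


# Timings from the Speed line above, in milliseconds per image.
fps = frames_per_second(1.3, 2.8, 0.9)
print(f"about {fps:.0f} frames per second")
print(f"about {258 / fps:.2f} s of model time for the 258-frame video")
```

With about 5 ms per frame in total, the model pipeline alone could handle roughly 200 frames per second, well above the frame rate of typical video.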

The rocm-smi terminal command shows the utilization details of the AMD Radeon Pro W7900 GPU. GPU usage is initially 7% and jumps to 99% as soon as the train command runs, which confirms that training is running on the AMD Radeon Pro W7900 GPU.

smss@smss:~/datasets/football_yolov8$ rocm-smi


========================================== ROCm System Management Interface ==========================================
==================================================== Concise Info ====================================================
Device  Node  IDs              Temp    Power  Partitions          SCLK    MCLK     Fan     Perf  PwrCap  VRAM%  GPU%  
              (DID,     GUID)  (Edge)  (Avg)  (Mem, Compute, ID)                                                      
======================================================================================================================
0       1     0x7448,   19246  33.0°C  30.0W  N/A, N/A, 0         158Mhz  1124Mhz  20.78%  auto  241.0W  30%    7%    
======================================================================================================================
================================================ End of ROCm SMI Log =================================================
smss@smss:~/datasets/football_yolov8$ rocm-smi


=========================================== ROCm System Management Interface ===========================================
===================================================== Concise Info =====================================================
Device  Node  IDs              Temp    Power   Partitions          SCLK     MCLK     Fan     Perf  PwrCap  VRAM%  GPU%  
              (DID,     GUID)  (Edge)  (Avg)   (Mem, Compute, ID)                                                       
========================================================================================================================
0       1     0x7448,   19246  68.0°C  241.0W  N/A, N/A, 0         2042Mhz  1124Mhz  48.63%  auto  241.0W  30%    99%   
========================================================================================================================
================================================= End of ROCm SMI Log ==================================================
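To log utilization during a run instead of reading it by eye, the concise table can be parsed with a short script. The sketch below assumes the column layout shown above, where GPU% is the last column of each device row. Other rocm-smi versions may print a different layout. The function names are hypothetical.

```python
import subprocess


def gpu_utilisation(smi_output):
    """Return {device index: GPU%} parsed from rocm-smi concise output."""
    usage = {}
    for line in smi_output.splitlines():
        fields = line.split()
        # Device rows start with a numeric index and end with the GPU% column;
        # header and separator lines do not match and are skipped.
        if len(fields) > 2 and fields[0].isdigit() and fields[-1].endswith("%"):
            usage[int(fields[0])] = float(fields[-1].rstrip("%"))
    return usage


def read_gpu_utilisation():
    """Run rocm-smi and parse its output (requires ROCm to be installed)."""
    out = subprocess.run(
        ["rocm-smi"], capture_output=True, text=True, check=True
    ).stdout
    return gpu_utilisation(out)


# Device row copied from the idle reading above.
idle_row = (
    "0       1     0x7448,   19246  33.0°C  30.0W  N/A, N/A, 0         "
    "158Mhz  1124Mhz  20.78%  auto  241.0W  30%    7%    "
)
print(gpu_utilisation(idle_row))
```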

Schematics

AMD Radeon W7900 Pro GPU installed System

Code

Code for training & prediction on Video Data

Credits

Parikshit Saha

Shreyashree Neogi

Gulafsa Bano
