How it works
Steps
1. Prepare your system, install pre-requisites and dependencies
2. Install Ryzen AI Software
3. Build the program based on YOLO v8 object detection
4. Install Huggingface Optimum AMD pipeline and Ryzen AI’s YOLOv8
5. Modify the program with YOLOv8m from Ryzen AI Model Zoo
Conclusion

Published July 24, 2024 © Apache-2.0

Traffic Analysis using optimized YOLOv8 with AMD Ryzen AI

An object detection model deployed for a vehicle traffic analysis use case: it tracks the location, speed, and direction of each object (car).

Intermediate · Full instructions provided · 8 hours · 1,380


Things used in this project

Hardware components

AMD Ryzen AI Laptop/Mini PC, e.g. Minisforum UM790 Pro with Ryzen 9 7940HS

Display, Keyboard and Mouse

Webcam, Logitech® HD Pro

Software apps and online services

Microsoft Windows 11

AMD Ryzen AI SW

PyTorch

Anaconda/Miniconda

Python >=3.9

Microsoft Visual Studio 2019

OpenCV

Ultralytics YOLOv8

Huggingface optimum-amd for Ryzen AI

Story

How it works:

AMD Ryzen AI laptop/MiniPC - Minisforum UM790 Pro

AMD’s Ryzen AI family of laptop processors now integrates a Neural Processing Unit (NPU), which frees up the CPU and GPU for other tasks and improves power efficiency. This is possible because Ryzen AI is built on the XDNA architecture, which is purpose-built to run AI workloads locally. To take advantage of this, I will simulate a traffic analysis tool that needs real-time object detection. The coordinates and timestamps of the bounding boxes are the input to a Python program that estimates vehicle speed, counts vehicles, and determines their direction, which is useful for collecting data for traffic management or enforcement. Because detection runs locally and in real time on an AMD laptop or mini PC, the project is flexible and cost-efficient to apply to real-world applications.

To make the process easier, we will use a pre-trained YOLOv8 model from Huggingface that has been optimized for AMD’s IPU/NPU.

Steps:

1. Prepare your system, install pre-requisites and dependencies

Check whether the IPU on your Ryzen AI laptop is enabled or disabled

In May 2023, AMD launched products with a dedicated AI engine that complements its Windows x86 processors. This project explores that AI engine for object detection. We use a Minisforum UM790 Pro with an AMD Ryzen 9 7940HS, and its IPU/NPU must be enabled in the firmware settings. To check whether it is enabled on your AMD Ryzen AI laptop/mini PC, open Windows Search, enter "Device Manager", expand "System devices" and look for "AMD IPU Device". If it doesn’t appear in the list, you’ll need to enable it by rebooting into the firmware settings from the Recovery options.

In Windows Search, enter "Advanced Startup" > "Recovery Options" > click "Restart Now". After the PC reboots, select "Troubleshoot" > "Advanced options" > "UEFI Firmware Settings", then Restart. In the firmware menu, go to "Advanced" > "CPU Configuration", set "IPU Control" to "Enabled", then choose "Save & Exit". After the reboot, download the NPU driver from the NPU Driver link and extract the downloaded zip file. Open a command prompt in admin mode and execute the bat file:

.\amd_install_kipudrv.bat

Then confirm that the NPU driver is installed: in Device Manager, go to System devices -> AMD IPU Device, as shown in the following image.

The next step is to make sure the dependencies for installing Ryzen AI SW are in place: Visual Studio 2019, CMake >=3.26, Python >=3.9, and the latest Anaconda/Miniconda.

Now ensure that all the prerequisites listed above have been met and that the Windows PATH variable is set correctly for each component. For example, Anaconda/Miniconda requires the following paths in the PATH variable:

path\to\anaconda3\
path\to\anaconda3\Scripts\
path\to\anaconda3\Lib\bin\
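
A quick way to confirm the PATH entries before continuing is to ask Python which executables it can resolve. The sketch below is only a convenience check and not part of the Ryzen AI installer. The tool names in the list (conda, cmake, python) are my assumption about what the prerequisites expose on PATH; adjust them to your setup.

```python
import shutil


def find_tools(names):
    """Map each executable name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # Hypothetical list of tools; edit it to match the prerequisites you installed.
    for name, path in find_tools(["conda", "cmake", "python"]).items():
        status = path if path else "NOT FOUND on PATH"
        print(f"{name:8s} {status}")
```

Any tool reported as not found usually means its folder is missing from PATH or the command prompt was opened before PATH was updated.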

2. Install Ryzen AI Software

Now download the Ryzen AI SW package and extract it.

Then, open command prompt in admin mode, navigate to the extracted folder and install Ryzen AI SW with:

.\install.bat -env <env name>

This creates the conda environment and installs the Vitis AI Quantizer for ONNX, ONNX Runtime, and the Vitis AI Execution Provider (EP). Next, activate the conda environment:

conda activate <env name>

Then run the test with:

cd ryzen-ai-sw-1.1\quicktest
python quicktest.py

The test runs a simple CNN model. If it succeeds, the output looks like the lines below, which indicates that the model is running on the NPU and that the Ryzen AI SW installation was successful.

[Vitis AI EP] No. of Operators :  CPU  2 IPU  398 99.50%
[Vitis AI EP] No. of Subgraphs :  CPU  1 IPU  1 Actually running on IPU  1
...
Test Passed
...
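
To read the operator line at a glance, the share of operators placed on the IPU can be recomputed from the two counts. The sketch below assumes the log line has exactly the shape shown above; the format may differ between Ryzen AI SW releases, so treat the regular expression as an assumption.

```python
import re

# Pattern based on the sample log line above; other releases may print it differently.
OPERATOR_LINE = re.compile(r"No\. of Operators\s*:\s*CPU\s+(\d+)\s+IPU\s+(\d+)")


def ipu_share(log_text):
    """Return the fraction of operators assigned to the IPU, or None if no line matches."""
    match = OPERATOR_LINE.search(log_text)
    if match is None:
        return None
    cpu_ops, ipu_ops = (int(group) for group in match.groups())
    total = cpu_ops + ipu_ops
    return ipu_ops / total if total else None


sample = "[Vitis AI EP] No. of Operators :  CPU  2 IPU  398 99.50%"
print(f"{ipu_share(sample):.2%}")  # prints 99.50%
```

Here 398 of the 400 operators run on the IPU, which matches the 99.50% printed by the quicktest.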

After everything is installed, we can configure the IPU/NPU execution profiles before running the program from a new environment. Follow the Runtime setup guide at https://ryzenai.docs.amd.com/en/latest/runtime_setup.html until you have the vaip_config.json file. It is recommended to copy vaip_config.json into your project directory and point to this copy when initializing the inference session.
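
As a rough sketch of what "pointing to this copy" can look like, the helper below builds the arguments for an ONNX Runtime session that uses the Vitis AI execution provider. The provider name and the config_file option follow my reading of the Ryzen AI runtime setup guide, and the helper names and the model file name are hypothetical, so check them against the documentation for your release.

```python
from pathlib import Path


def vitis_session_args(model_path, config_path="vaip_config.json"):
    """Build keyword arguments for onnxruntime.InferenceSession on the Vitis AI EP."""
    return {
        "path_or_bytes": str(model_path),
        "providers": ["VitisAIExecutionProvider"],
        # Point the provider at the project-local copy of vaip_config.json.
        "provider_options": [{"config_file": str(Path(config_path))}],
    }


def create_session(model_path, config_path="vaip_config.json"):
    """Create the session; requires onnxruntime with the Vitis AI EP installed."""
    # Imported lazily so vitis_session_args can be inspected without onnxruntime.
    import onnxruntime as ort

    return ort.InferenceSession(**vitis_session_args(model_path, config_path))
```

Inside the activated Ryzen AI conda environment, a call such as create_session("yolov8m.onnx") would then load the model with the local configuration.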

3. Build the program based on YOLO v8 object detection

For the next steps we will use YOLO object detection to detect cars. Huggingface's model zoo has a YOLOv8 model that has already been quantized and optimized for AMD Ryzen AI, so we don't need to create and optimize our own ONNX model; we will download it in Step 4. First, though, we create a Python program that detects cars in a video of a two-way highway using the stock Ultralytics YOLOv8 model, which is not optimized for AMD's IPU, to make sure the program works as intended.

This Python code combines OpenCV, the YOLOv8 model from Ultralytics, and a custom tracker to perform object detection and tracking on a video. The primary goal is to detect and count the cars that pass certain lines in each direction.

To build this program, we need the following dependencies: Python >=3.9 (already installed), opencv-python for webcam/video capture, and the Ultralytics library for YOLOv8.

Open a command prompt and run the following commands one by one to create the conda environment and install the required libraries:
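
The core idea of counting cars that pass a line can be sketched without any video at all. The example below is a simplified illustration and not the logic used in carcounter.py (which checks bounding boxes against fixed pixel values): it watches the vertical center of each tracked object and counts a crossing when two consecutive positions fall on opposite sides of the line. The function names and the sample tracks are made up for illustration.

```python
def crossing_direction(prev_cy, cy, line_y):
    """Return 'down', 'up', or None for a center moving from prev_cy to cy."""
    if prev_cy < line_y <= cy:
        return "down"  # image y grows toward the bottom of the frame
    if cy <= line_y < prev_cy:
        return "up"
    return None


def count_crossings(tracks, line_y):
    """tracks maps object id -> list of center y values, one per frame.

    Each object is counted at most once, at its first crossing.
    """
    counts = {"down": 0, "up": 0}
    for ys in tracks.values():
        for prev_cy, cy in zip(ys, ys[1:]):
            direction = crossing_direction(prev_cy, cy, line_y)
            if direction:
                counts[direction] += 1
                break
    return counts


# Made-up tracks: id 0 moves down across y=300, id 1 moves up, id 2 never crosses.
tracks = {0: [280, 295, 310], 1: [330, 305, 290], 2: [100, 120, 140]}
print(count_crossings(tracks, line_y=300))  # prints {'down': 1, 'up': 1}
```

Comparing consecutive positions avoids missing a fast car that jumps over a narrow tolerance band between two processed frames.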

conda create -n YOLOv8Env python=3.9
conda activate YOLOv8Env
pip install opencv-python
pip install ultralytics==8.0.0

Now open VS Code and build your own program, or download my Python code carcounter.py and tracker.py (in the Code section) and run it from the command prompt:

python carcounter.py

When the program runs for the first time, the YOLOv8 PyTorch weights file is downloaded into the same folder.

4. Install Huggingface Optimum AMD pipeline and Ryzen AI’s YOLOv8.

Before installing this library into the Ryzen AI environment, make sure you have completed Steps 1 and 2. To install optimum-amd, run:

git clone https://github.com/huggingface/optimum-amd.git
cd optimum-amd
pip install -e .[ryzenai]
pip install Pillow

For reference on using the optimum-amd pipeline with Ryzen AI’s YOLO models, you can check the Huggingface link.

Now let's check that all the prerequisites are in place by running this Python code (the cpuinfo module comes from the py-cpuinfo package):

import platform
import sys
import cpuinfo
from pprint import pprint
print("Python version:", sys.version)
print("Platform:", platform.platform())
print("Processor Architecture:", platform.architecture())
print("Machine:", platform.machine())
print("System:", platform.system())
cpu_info = cpuinfo.get_cpu_info()
print("Processor:", cpu_info["brand_raw"])

Results:

Python version: 3.9.20 (main, Jul 23 2024, 18:19:13) [MSC v.1916 64 bit (AMD64)]
Platform: Windows-10-10.0.22634-SP0
Processor Architecture: ('64bit', 'WindowsPE')
Machine: AMD64
System: Windows
Processor: AMD Ryzen 9 7940HS w/ Radeon 780M Graphics

If this succeeds, download yolov8m.onnx from https://huggingface.co/amd/yolov8m/tree/main, then download an image of cars on a road and save it as vehicles.png in the same directory.

Then run the following Python code:

from pprint import pprint

from PIL import Image
from optimum.amd.ryzenai import pipeline
from optimum.amd.ryzenai.utils import plot_bbox

model_id = "amd/yolov8m"
detector = pipeline("object-detection", model=model_id, model_type="yolov8")

# Load an image
image = Image.open("vehicles.png")
# Run detection, print the raw results, then draw the boxes on a copy of the image
outputs = detector(image)
pprint(outputs)
plot_bbox(image.copy(), outputs)

After that, we get our image annotated with bounding boxes around the detected cars, like the example below:

For more details, you can check this link as a reference from the Huggingface team.
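
Before feeding the detections to a tracker, it helps to reduce the pipeline output to plain corner boxes for one class. The sketch below assumes the output is a list of dictionaries with label, score, and box entries, where box is either a dict with xmin/ymin/xmax/ymax keys (the usual Huggingface layout) or a plain four-value sequence. That layout is my assumption, so print one result from your own run to confirm it. The helper name and the sample detections are made up.

```python
def boxes_for_label(outputs, label="car", min_score=0.0):
    """Return [x1, y1, x2, y2] integer boxes for detections of the given label."""
    boxes = []
    for det in outputs:
        if det.get("label") != label or det.get("score", 1.0) < min_score:
            continue
        box = det["box"]
        if isinstance(box, dict):
            # Assumed Huggingface-style layout: named corner coordinates.
            box = (box["xmin"], box["ymin"], box["xmax"], box["ymax"])
        boxes.append([int(value) for value in box])
    return boxes


# Made-up detections, only to show the expected shapes.
sample = [
    {"score": 0.91, "label": "car", "box": {"xmin": 10.2, "ymin": 20.7, "xmax": 110, "ymax": 80}},
    {"score": 0.88, "label": "person", "box": {"xmin": 0, "ymin": 0, "xmax": 5, "ymax": 5}},
    {"score": 0.30, "label": "car", "box": [1, 2, 3, 4]},
]
print(boxes_for_label(sample, min_score=0.5))  # prints [[10, 20, 110, 80]]
```

A score threshold like min_score is optional, but it keeps low-confidence boxes from creating short-lived tracks.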

5. Modify the program with YOLOv8m from Ryzen AI Model Zoo

Before creating this program, make sure you have a sample video (.mp4), for example traffic at a road intersection or on a two-way highway (try searching for free traffic videos for object detection). At this stage we modify the Python program created in Step 3 by replacing the Ultralytics library with the Huggingface optimum-amd pipeline we learned about in Step 4 and making a few adjustments. We also add speed detection, which uses the bounding box location (coordinates) and timestamps.

For details, see the file speedncount_amd.py in the Code section.

The following video demo shows our program running: it counts the cars going in and out based on their direction, and shows the speed of each car.
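
The speed calculation itself is just distance divided by time, converted from m/s to km/h. The scripts in the Code section take the time from the wall clock, so for a recorded video the result depends on how fast the machine processes frames. One alternative, sketched below under my own assumptions, is to derive the elapsed time from frame numbers and the video frame rate (which OpenCV exposes as cv2.CAP_PROP_FPS). The function name and the numbers in the example are illustrative only.

```python
def speed_kmh(distance_m, start_frame, end_frame, fps):
    """Estimate speed from the frames at which a car crossed two lines.

    distance_m is the real-world distance between the two lines in meters.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    frames = end_frame - start_frame
    if frames <= 0:
        raise ValueError("end_frame must come after start_frame")
    elapsed_s = frames / fps            # seconds of video between the crossings
    return distance_m / elapsed_s * 3.6  # m/s -> km/h


# A car covering 20 m in 30 frames of a 30 fps video: 20 m/s, i.e. 72 km/h.
print(round(speed_kmh(20, start_frame=90, end_frame=120, fps=30), 1))  # prints 72.0
```

With a live webcam the wall clock is the right time source, so this variant mainly matters when replaying video files.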

Conclusion:

In this project we configured the necessary settings and tested the capabilities of AMD chips with an integrated IPU/NPU (on Ryzen AI PCs/laptops), which can reduce the CPU workload. We first detected car speed, direction, and count using Ultralytics YOLOv8, and then replaced it with the Ryzen AI version of YOLOv8 and the Huggingface optimum-amd libraries. By dividing the workload between the IPU and the CPU/GPU in this way, energy-efficient and low-latency solutions can be applied to more use cases in the future.

Code

import cv2
from ultralytics import YOLO

# Load YOLOv8 model
model = YOLO('yolov8n.pt')  # You can choose other versions like 'yolov8s.pt', 'yolov8m.pt', etc.

# Path to the video file
video_path = r'C:\Users\jalls\Documents\yolo-cars\YOLOv8\carvideo3.mp4'

# Initialize video capture
cap = cv2.VideoCapture(video_path)

if not cap.isOpened():
    print("Error: Could not open video file.")
    exit()

# Initialize car counts and tracking dictionary
car_out_count = 0
car_in_count = 0
tracked_cars = {}
next_car_id = 0

while True:
    ret, frame = cap.read()
    if not ret:
        print("End of video file.")
        break

    # Perform detection
    results = model(frame)

    # Process the results
    current_frame_cars = []

    for result in results:
        # Iterate over detected objects
        for obj in result.boxes:
            # Check if the detected object is a car (class 2 in COCO dataset)
            cls_id = int(obj.cls[0])  # Ensure it's an integer
            if cls_id == 2:
                # Get bounding box coordinates
                x1, y1, x2, y2 = map(int, obj.xyxy[0])  # Extract coordinates from tensor
                bbox = (x1, y1, x2, y2)

                # Check if this car is already being tracked
                car_found = False
                for car_id, car_data in tracked_cars.items():
                    prev_bbox = car_data['bbox']
                    if abs(prev_bbox[0] - x1) < 50 and abs(prev_bbox[1] - y1) < 50:
                        tracked_cars[car_id]['bbox'] = bbox
                        car_found = True
                        
                        # Check for "car-out" crossing
                        if not car_data['crossed_out'] and y1 < 640 and y2 >= 640 and 0 <= x1 <= 600:
                            car_out_count += 1
                            tracked_cars[car_id]['crossed_out'] = True

                        # Check for "car-in" crossing
                        if not car_data['crossed_in'] and y1 < 640 and y2 >= 640 and 680 <= x1 <= 1280:
                            car_in_count += 1
                            tracked_cars[car_id]['crossed_in'] = True
                        
                        break

                if not car_found:
                    tracked_cars[next_car_id] = {'bbox': bbox, 'crossed_out': False, 'crossed_in': False}
                    next_car_id += 1

                current_frame_cars.append(bbox)

                # Draw bounding box
                cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
                # Put label
                label = f'{model.names[cls_id]} {obj.conf[0]:.2f}'
                cv2.putText(frame, label, (x1, y1 - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

    # Remove cars not detected in the current frame from the tracking dictionary
    tracked_cars = {car_id: car_data for car_id, car_data in tracked_cars.items() if car_data['bbox'] in current_frame_cars}

    # Display car-out count on the frame (upper left)
    car_out_label = f'Cars out: {car_out_count}'
    cv2.putText(frame, car_out_label, (20, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (139, 0, 0), 2)

    # Display car-in count on the frame (upper right)
    car_in_label = f'Cars in: {car_in_count}'
    cv2.putText(frame, car_in_label, (frame.shape[1] - 300, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (139, 0, 0), 2)

    # Draw the y-coordinate line
    cv2.line(frame, (0, 640), (frame.shape[1], 640), (0, 0, 255), 3)

    # Display the frame with detections
    cv2.imshow('YOLOv8 Object Detection', frame)

    # Break the loop if 'q' key is pressed
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release the video capture and close windows
cap.release()
cv2.destroyAllWindows()

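# tracker.py: simple centroid tracker, imported by the scripts below via `from tracker import Tracker`
import math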


class Tracker:
    def __init__(self):
        # Store the center positions of the objects
        self.center_points = {}
        # Keep the count of the IDs
        # each time a new object id detected, the count will increase by one
        self.id_count = 0


    def update(self, objects_rect):
        # Objects boxes and ids
        objects_bbs_ids = []

        # Get center point of new object
        for rect in objects_rect:
            # The calling scripts pass corner boxes (x1, y1, x2, y2),
            # so the center is the midpoint of the two corners.
            x1, y1, x2, y2 = rect
            cx = (x1 + x2) // 2
            cy = (y1 + y2) // 2

            # Find out if that object was detected already
            same_object_detected = False
            for obj_id, pt in self.center_points.items():
                dist = math.hypot(cx - pt[0], cy - pt[1])

                # Treat it as the same object if the center moved less than 35 px
                if dist < 35:
                    self.center_points[obj_id] = (cx, cy)
                    objects_bbs_ids.append([x1, y1, x2, y2, obj_id])
                    same_object_detected = True
                    break

            # New object is detected, so we assign a new ID to it
            if not same_object_detected:
                self.center_points[self.id_count] = (cx, cy)
                objects_bbs_ids.append([x1, y1, x2, y2, self.id_count])
                self.id_count += 1

        # Clean the dictionary by center points to remove IDS not used anymore
        new_center_points = {}
        for obj_bb_id in objects_bbs_ids:
            _, _, _, _, object_id = obj_bb_id
            center = self.center_points[object_id]
            new_center_points[object_id] = center

        # Update dictionary with IDs not used removed
        self.center_points = new_center_points.copy()
        return objects_bbs_ids

import time

import cv2
import pandas as pd
from ultralytics import YOLO

from tracker import Tracker

model = YOLO('yolov8s.pt')

def RGB(event, x, y, flags, param):
    if event == cv2.EVENT_MOUSEMOVE:
        colorsBGR = [x, y]
        print(colorsBGR)

cv2.namedWindow('RGB')
cv2.setMouseCallback('RGB', RGB)

# Video path
video_path = r'C:\Users\s4mue\Documents\yolo-cars\YOLOv8\carvideo4.mp4'

# Video capture
cap = cv2.VideoCapture(video_path)

# Read the COCO class names (one per line) and close the file afterwards
with open("coco.txt", "r") as my_file:
    class_list = my_file.read().split("\n")

count = 0
tracker = Tracker()

cy1 = 300
cy2 = 360
offset = 4

vh_down = {}
counter = []

vh_up = {}
counter1 = []

vehicle_speeds = {}  # Dictionary to store calculated speeds
speed_display_timestamps = {}  # Dictionary to store timestamps for speed display

while True:
    ret, frame = cap.read()
    if not ret:
        break
    count += 1
    if count % 3 != 0:
        continue
    frame = cv2.resize(frame, (1020, 500))

    results = model.predict(frame)
    a = results[0].boxes.data
    px = pd.DataFrame(a).astype("float")
    car_boxes = []  # corner boxes (x1, y1, x2, y2) of the cars in this frame

    for index, row in px.iterrows():
        x1 = int(row[0])
        y1 = int(row[1])
        x2 = int(row[2])
        y2 = int(row[3])
        d = int(row[5])  # class index
        c = class_list[d]
        if 'car' in c:
            car_boxes.append([x1, y1, x2, y2])
    bbox_id = tracker.update(car_boxes)
    for bbox in bbox_id:
        x3, y3, x4, y4, id = bbox
        cx = int(x3 + x4) // 2
        cy = int(y3 + y4) // 2

        cv2.rectangle(frame, (x3, y3), (x4, y4), (0, 255, 0), 1)

        # Vehicle crosses L2 (cy2) and then L1 (cy1), i.e. moves up the frame; counted as "out"
        if cy2 < (cy + offset) and cy2 > (cy - offset):
            vh_down[id] = time.time()
        if id in vh_down:
            if cy1 < (cy + offset) and cy1 > (cy - offset):
                elapsed_time = time.time() - vh_down[id]
                if counter.count(id) == 0:
                    counter.append(id)
                    distance = 20  # meters
                    a_speed_ms = distance / elapsed_time
                    a_speed_kh = a_speed_ms * 3.6
                    vehicle_speeds[id] = a_speed_kh
                    speed_display_timestamps[id] = time.time()  # Store timestamp for speed display

        # Vehicle crosses L1 (cy1) and then L2 (cy2), i.e. moves down the frame; counted as "in"
        if cy1 < (cy + offset) and cy1 > (cy - offset):
            vh_up[id] = time.time()
        if id in vh_up:
            if cy2 < (cy + offset) and cy2 > (cy - offset):
                elapsed1_time = time.time() - vh_up[id]
                if counter1.count(id) == 0:
                    counter1.append(id)
                    distance1 = 15  # meters
                    a_speed_ms1 = distance1 / elapsed1_time
                    a_speed_kh1 = a_speed_ms1 * 3.6
                    vehicle_speeds[id] = a_speed_kh1
                    speed_display_timestamps[id] = time.time()  # Store timestamp for speed display

        # Display speed information with bounding box for 2 seconds
        if id in vehicle_speeds and time.time() - speed_display_timestamps[id] <= 2:
            speed_text = f"{int(vehicle_speeds[id])} kph"
            cv2.putText(frame, speed_text, (x3, y3 - 20), cv2.FONT_HERSHEY_COMPLEX, 0.7, (10, 255, 0), 2)

           

    # Draw the two reference lines and their labels
    cv2.line(frame, (274, cy1), (814, cy1), (200, 200, 150), 1)
    cv2.putText(frame, 'L1', (277, 295), cv2.FONT_HERSHEY_COMPLEX, 0.5, (150, 150, 255), 1)

    cv2.line(frame, (177, cy2), (927, cy2), (200, 200, 150), 1)
    cv2.putText(frame, 'L2', (180, 380), cv2.FONT_HERSHEY_COMPLEX, 0.5, (150, 150, 255), 1)

    # Show the running counts for each direction
    d = len(counter)
    u = len(counter1)
    cv2.putText(frame, 'out: ' + str(d), (40, 60), cv2.FONT_HERSHEY_COMPLEX, 0.6, (0, 255, 0), 2)
    cv2.putText(frame, 'in: ' + str(u), (40, 100), cv2.FONT_HERSHEY_COMPLEX, 0.6, (0, 255, 0), 2)
    cv2.imshow("RGB", frame)
    if cv2.waitKey(1)&0xFF==27:
        break
cap.release()
cv2.destroyAllWindows()

import time

import cv2
from optimum.amd.ryzenai import pipeline  # Use Huggingface optimum pipeline

from tracker import Tracker

model_id = "amd/yolov8m"
# Load Huggingface object detection pipeline
# model = pipeline('object-detection', model='amd/yolox-s')
detector = pipeline("object-detection", model=model_id, model_type="yolov8")

def RGB(event, x, y, flags, param):
    if event == cv2.EVENT_MOUSEMOVE:
        colorsBGR = [x, y]
        print(colorsBGR)

cv2.namedWindow('RGB')
cv2.setMouseCallback('RGB', RGB)

# Video path
video_path = r'C:\Users\s4mue\Documents\yolo-cars\YOLOv8\optimum-amd\carvideo4.mp4'

# Video capture
cap = cv2.VideoCapture(video_path)

# Read the COCO class names (one per line) and close the file afterwards
with open("coco.txt", "r") as my_file:
    class_list = my_file.read().split("\n")

count = 0
tracker = Tracker()

cy1 = 300
cy2 = 360
offset = 4

vh_down = {}
counter = []

vh_up = {}
counter1 = []

vehicle_speeds = {}  # Dictionary to store calculated speeds
speed_display_timestamps = {}  # Dictionary to store timestamps for speed display

while True:
    ret, frame = cap.read()
    if not ret:
        break
    count += 1
    if count % 3 != 0:
        continue
    frame = cv2.resize(frame, (1020, 500))

    # Perform detection
    results = detector(frame)

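    car_boxes = []  # corner boxes (x1, y1, x2, y2) of the cars in this frame

    for result in results:
        if result['label'] == 'car':
            box = result['box']
            # Huggingface-style pipelines usually return the box as a dict with
            # xmin/ymin/xmax/ymax keys; a plain (x1, y1, x2, y2) sequence also works here.
            if isinstance(box, dict):
                box = (box['xmin'], box['ymin'], box['xmax'], box['ymax'])
            x1, y1, x2, y2 = map(int, box)
            car_boxes.append([x1, y1, x2, y2])

    bbox_id = tracker.update(car_boxes)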
    for bbox in bbox_id:
        x3, y3, x4, y4, id = bbox
        cx = int(x3 + x4) // 2
        cy = int(y3 + y4) // 2

        cv2.rectangle(frame, (x3, y3), (x4, y4), (0, 255, 0), 1)

        # Vehicle crosses L2 (cy2) and then L1 (cy1), i.e. moves up the frame; counted as "out"
        if cy2 < (cy + offset) and cy2 > (cy - offset):
            vh_down[id] = time.time()
        if id in vh_down:
            if cy1 < (cy + offset) and cy1 > (cy - offset):
                elapsed_time = time.time() - vh_down[id]
                if counter.count(id) == 0:
                    counter.append(id)
                    distance = 20  # meters
                    a_speed_ms = distance / elapsed_time
                    a_speed_kh = a_speed_ms * 3.6
                    vehicle_speeds[id] = a_speed_kh
                    speed_display_timestamps[id] = time.time()  # Store timestamp for speed display

        # Vehicle crosses L1 (cy1) and then L2 (cy2), i.e. moves down the frame; counted as "in"
        if cy1 < (cy + offset) and cy1 > (cy - offset):
            vh_up[id] = time.time()
        if id in vh_up:
            if cy2 < (cy + offset) and cy2 > (cy - offset):
                elapsed1_time = time.time() - vh_up[id]
                if counter1.count(id) == 0:
                    counter1.append(id)
                    distance1 = 15  # meters
                    a_speed_ms1 = distance1 / elapsed1_time
                    a_speed_kh1 = a_speed_ms1 * 3.6
                    vehicle_speeds[id] = a_speed_kh1
                    speed_display_timestamps[id] = time.time()  # Store timestamp for speed display

        # Display speed information with bounding box for 2 seconds
        if id in vehicle_speeds and time.time() - speed_display_timestamps[id] <= 2:
            speed_text = f"{int(vehicle_speeds[id])} kph"
            cv2.putText(frame, speed_text, (x3, y3 - 20), cv2.FONT_HERSHEY_COMPLEX, 0.7, (10, 255, 0), 2)

    cv2.line(frame, (274, cy1), (814, cy1), (200, 200, 150), 1)
    cv2.putText(frame, 'L1', (277, 295), cv2.FONT_HERSHEY_COMPLEX, 0.5, (150, 150, 255), 1)

    cv2.line(frame, (177, cy2), (927, cy2), (200, 200, 150), 1)
    cv2.putText(frame, 'L2', (180, 380), cv2.FONT_HERSHEY_COMPLEX, 0.5, (150, 150, 255), 1)
    d = len(counter)
    u = len(counter1)
    cv2.putText(frame, 'out: ' + str(d), (40, 60), cv2.FONT_HERSHEY_COMPLEX, 0.6, (0, 255, 0), 2)
    cv2.putText(frame, 'in: ' + str(u), (40, 100), cv2.FONT_HERSHEY_COMPLEX, 0.6, (0, 255, 0), 2)
    
    cv2.imshow("RGB", frame)
    if cv2.waitKey(1) & 0xFF == 27:
        break

cap.release()
cv2.destroyAllWindows()