EdgeVision is a project that combines edge computing and wireless networking to enable real-time video processing. The system uses a Seeed XIAO ESP32S3 Sense as the video input device and the AMD-Xilinx KR260 development kit as the processing hub.
Objectives:
- Implement a wireless video streaming solution from the ESP32S3 Sense to the KR260.
- Develop efficient video processing algorithms on the KR260's FPGA.
- Achieve real-time performance for selected video processing tasks.
- Create a flexible, scalable system for various video analytics applications.
The EdgeVision system begins its operation at the Seeed XIAO ESP32S3 Sense, where video capture and preprocessing take place. The onboard OV2640 camera captures raw image data at 30 fps in 720p resolution, with the ESP32-S3 microcontroller managing the camera through I2C control and parallel data transfer. Once captured, the raw image data is converted to RGB565 format for efficient handling, then compressed with a lightweight JPEG algorithm to reduce transmission size. Each frame is assigned a timestamp and a unique identifier. To keep capture and transmission overlapping smoothly, a double-buffering scheme is used: one buffer is filled with a new frame while the other is being transmitted.
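The double-buffering and frame-tagging scheme described above can be sketched as follows. This is a Python illustration of the logic only (the actual firmware would be C/ESP-IDF on the ESP32-S3), and names such as `DoubleBuffer` and `make_frame` are hypothetical, not from the project source.

```python
import itertools
import time

class DoubleBuffer:
    """Two fixed buffers: one is filled by capture while the other drains
    to the radio, so capture never stalls waiting on transmission."""
    def __init__(self):
        self.buffers = [bytearray(), bytearray()]
        self.fill_idx = 0  # index of the buffer currently receiving a frame

    def capture(self, frame_bytes: bytes):
        """Write a freshly captured frame into the fill buffer."""
        self.buffers[self.fill_idx][:] = frame_bytes

    def swap(self) -> bytearray:
        """After a capture completes, swap roles and return the buffer
        that is now ready to transmit."""
        self.fill_idx ^= 1
        return self.buffers[self.fill_idx ^ 1]

_frame_counter = itertools.count()

def make_frame(payload: bytes) -> dict:
    """Tag each frame with a timestamp and a unique identifier."""
    return {"id": next(_frame_counter), "ts": time.time(), "data": payload}
```

The swap is just an index flip, so no frame data is ever copied between the two roles.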
The wireless transmission phase leverages the ESP32S3's WiFi capabilities to send preprocessed frames to the KR260. The ESP32S3 maintains a persistent WiFi connection on the same network as the KR260, continuously monitoring connection quality and automatically reconnecting if needed. A custom UDP-based protocol facilitates low-latency transmission, with each frame split into multiple packets containing headers with frame ID, packet sequence, and timestamp information. The system employs adaptive bit rate control, adjusting compression levels based on network conditions, and synchronizes the transmission rate with frame capture to maintain real-time performance.
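A minimal sketch of the frame-splitting side of the custom UDP protocol, in Python for illustration. The source specifies only that each packet header carries a frame ID, packet sequence, and timestamp; the exact field widths, byte order, and the 1400-byte payload cap below are assumptions, not the project's actual wire format.

```python
import struct

# Assumed header layout: frame_id (uint32), seq (uint16), total (uint16),
# timestamp_us (uint64), big-endian -> 16 bytes per packet header.
HEADER = struct.Struct("!IHHQ")
MAX_PAYLOAD = 1400  # stay under a typical Ethernet MTU after IP/UDP headers

def packetize(frame_id: int, timestamp_us: int, jpeg: bytes) -> list:
    """Split one compressed frame into UDP-sized packets with headers."""
    chunks = [jpeg[i:i + MAX_PAYLOAD]
              for i in range(0, len(jpeg), MAX_PAYLOAD)] or [b""]
    total = len(chunks)
    return [HEADER.pack(frame_id, seq, total, timestamp_us) + chunk
            for seq, chunk in enumerate(chunks)]

def parse(packet: bytes):
    """Split a received packet back into header fields and payload."""
    frame_id, seq, total, ts = HEADER.unpack_from(packet)
    return frame_id, seq, total, ts, packet[HEADER.size:]
```

Carrying the total packet count in every header lets the receiver know when a frame is complete without any separate end-of-frame marker.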
On the receiving end, the KR260 manages reception and frame reconstruction. It runs a custom network stack optimized for high-speed packet processing, with incoming packets handled by a high-priority dedicated network thread. The system reassembles frames by sorting and combining packets based on their frame ID and sequence numbers, using a jitter buffer to manage varying network latencies. Incomplete frames due to packet loss are either discarded or reconstructed using forward error correction when available. Once reassembled, the JPEG data is decoded using a hardware-accelerated decoder implemented in the FPGA, with decoded frames stored in the KR260's DDR4 memory.
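The reassembly step on the KR260 side can be sketched like this, again as an illustrative Python model rather than the project's actual network stack; the `FrameAssembler` name and its interface are hypothetical. Jitter buffering and forward error correction are omitted here; the caller would drop stale incomplete frames after a timeout.

```python
from collections import defaultdict

class FrameAssembler:
    """Collect packets per frame ID; emit the full frame once every
    sequence number has arrived."""
    def __init__(self):
        self.partial = defaultdict(dict)  # frame_id -> {seq: payload}

    def add(self, frame_id: int, seq: int, total: int, payload: bytes):
        """Register one packet. Returns the reassembled frame bytes when
        complete, or None while packets are still outstanding."""
        self.partial[frame_id][seq] = payload
        if len(self.partial[frame_id]) == total:
            chunks = self.partial.pop(frame_id)
            # Sort by sequence number and concatenate into one JPEG buffer;
            # this tolerates out-of-order UDP delivery.
            return b"".join(chunks[s] for s in sorted(chunks))
        return None  # frame still incomplete
```

Because packets carry their own sequence numbers, arrival order never matters; only completeness does.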
Video processing takes place within the KR260's FPGA, which implements a pipelined architecture for efficient processing. Multiple processing stages operate concurrently on different frames, executing a variety of algorithms. These include Canny edge detection for identifying edges in video frames and modules for image enhancement such as noise reduction and contrast enhancement. The FPGA utilizes dynamic partial reconfiguration to switch between different processing algorithms based on application needs, while a custom scheduler optimizes resource utilization across various processing tasks. After processing, the system manages result output and display. Processed data is stored in a results buffer and transferred to the ARM cores via a DMA engine for further handling. The system can output processed frames through DVI for real-time monitoring. Additionally, processing results are logged to onboard storage for later analysis, with an option for real-time transmission to a central server in multi-node setups.
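As a software reference for the edge-detection stage above: the first step of Canny is a gradient computation, commonly done with the Sobel operator. The sketch below is a plain-Python model of that one stage, useful for checking FPGA results against; it is not the project's HDL implementation, and the full Canny pipeline (smoothing, non-maximum suppression, hysteresis) is omitted.

```python
def sobel_magnitude(img):
    """Approximate gradient magnitude (the first stage of Canny) on a
    grayscale image given as a list of rows. Border pixels stay 0."""
    h, w = len(img), len(img[0])
    gx_k = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal gradient kernel
    gy_k = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical gradient kernel
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(gx_k[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(gy_k[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            # |gx| + |gy| is a common cheap substitute for sqrt(gx^2 + gy^2),
            # and maps well to FPGA logic (no multiplier or square root).
            out[y][x] = abs(gx) + abs(gy)
    return out
```

The per-pixel 3x3 window access pattern is also what makes this stage a natural fit for a line-buffered streaming pipeline in the FPGA fabric.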
Key Components and Implementation:
Video Capture Device: Seeed XIAO ESP32S3 Sense
- Successfully configured to capture high-quality video frames using the OV2640 camera.
- Implemented efficient frame preprocessing to optimize transmission.
- Achieved reliable WiFi transmission of frame data with minimal latency.
Processing Hub: AMD-Xilinx KR260 Development Kit
- Configured for seamless wireless video stream reception using a dual-band USB WiFi adapter.
- Implemented multiple FPGA-based video processing algorithms.
- Optimized for real-time processing and output of results.
Wireless Communication:
- Developed a custom WiFi-based frame transmission protocol.
- Achieved an effective balance between low latency and reliability.
Video Processing Algorithms:
- Successfully implemented and tested the Canny edge detection algorithm.
This system architecture demonstrates EdgeVision's end-to-end approach to real-time wireless video processing and the interplay between its hardware and software components. From efficient video capture and wireless transmission to FPGA-based processing and flexible output options, EdgeVision provides a practical foundation for edge-based video analytics applications.
Future Work:
- Integration with LiDAR: Adding LiDAR sensors to enhance spatial awareness and depth perception.
- Chassis Integration with Robotic Arm: Integrating the KR260 with a mobile chassis and robotic arm for expanded operational capabilities.