Welcome to “EagleEye AI”! In this project, we’ll demonstrate how to build an advanced security surveillance system using the KRIA™ KR260 Robotics Starter Kit. Learn to integrate AI algorithms with a pan-tilt camera for real-time facial tracking. Follow our detailed instructions to harness the power of DIY security surveillance at home.
Background Information
Many existing IP cameras, despite being equipped with Pan-Tilt rotators for versatile coverage, lack intelligent control mechanisms. The absence of AI capabilities limits their ability to autonomously track and respond to dynamic scenarios. This project addresses this gap by introducing the “EagleEye AI” system, leveraging the KRIA™ KR260 Robotics Starter Kit. The goal is to provide an accessible and seamless upgrade for already installed IP cameras, enhancing their functionality with AI-driven face recognition and robust robotic control (ROS2) through an RS-485 interface. By doing so, we aim to empower security camera installations with intelligent, automated surveillance capabilities, bridging the gap between traditional setups and cutting-edge AI-enhanced solutions.
Solution Workflow
1. Camera Stream Integration:
The IP Outdoor Security Camera mounted on the Pan-Tilt rotator streams video to the KRIA™ KR260 Robotics Starter Kit via Ethernet.
2. Face Recognition AI Model:
The “Smartcam” demo application, adapted for the project, employs a face recognition AI model running on the DPU IP of the KRIA™ KR260’s FPGA. The model is integrated with VVAS (Vitis Video Analytics SDK), leveraging the platform’s video acceleration capabilities.
3. Overlay and Marking:
When a face is detected in the video stream, the system overlays the stream with marked frames around the detected face, enhancing situational awareness.
4. Orientation Calculation:
The system calculates the approximate location of the detected person and determines the desired camera orientation for optimal framing.
5. Real-time Adjustment with ROS2:
Using ROS2, the system dynamically adjusts the Pan-Tilt rotator to follow the detected face, ensuring that the person remains centered in the video stream.
6. Pelco-D Protocol Commands:
Python code generates Pelco-D protocol commands based on the calculated orientation and sends them through the RS-485 interface. This communication controls the movement of the Pan-Tilt rotator (a conceptual sketch of steps 4-6 follows this list).
7. Feedback Integration (Optional):
Optionally, an RS-485 Accelerometer + Magnetometer sensor may be used for additional feedback on the camera’s orientation, enhancing precision.
8. KRIA™ KR260 Hardware Utilization:
The KRIA™ KR260 Robotics Starter Kit serves as the central hub, leveraging its FPGA capabilities and DPU for accelerated AI processing. The VVAS SDK enhances video analytics, while the RS-485 interface facilitates communication with the Pan-Tilt rotator. The modular design supports flexibility in hardware interfaces, enabling seamless integration with existing camera setups.
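To make steps 4-6 concrete, below is a minimal Python sketch of the control idea. It is not the project’s actual implementation: the inference frame size, serial device, rotator address, and speed mapping are assumptions for illustration, and it relies on the third-party pyserial package for the RS-485 link.
# Conceptual sketch of steps 4-6: map a detected face position to a Pelco-D command.
# Assumptions: pyserial installed, RS-485 adapter on /dev/ttyUSB1, rotator address 1.
import serial

FRAME_W, FRAME_H = 640, 360   # resolution of the AI inference frames (assumed)
CAM_ADDR = 0x01               # Pelco-D address of the rotator (assumed)

def pelco_d(cmd1, cmd2, data1, data2, addr=CAM_ADDR):
    # A Pelco-D frame is 7 bytes; the checksum is the sum of bytes 2-6 modulo 256.
    body = [addr, cmd1, cmd2, data1, data2]
    return bytes([0xFF] + body + [sum(body) % 256])

def track_step(face_cx, face_cy, port):
    # Steer the rotator so the detected face moves toward the frame center.
    err_x = face_cx - FRAME_W / 2          # > 0: face is right of center
    err_y = face_cy - FRAME_H / 2          # > 0: face is below center
    pan_speed = min(0x3F, int(abs(err_x) * 0x3F / (FRAME_W / 2)))
    tilt_speed = min(0x3F, int(abs(err_y) * 0x3F / (FRAME_H / 2)))
    cmd2 = (0x02 if err_x > 0 else 0x04) | (0x10 if err_y > 0 else 0x08)
    port.write(pelco_d(0x00, cmd2, pan_speed, tilt_speed))

with serial.Serial("/dev/ttyUSB1", 9600) as port:
    track_step(face_cx=500, face_cy=120, port=port)   # example face position
The real application splits this logic across ROS2 nodes, as described in the detailed tutorial sections below.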
Quick Launch and Testing the Application
If you have already set up your KRIA™ KR260 Robotics Starter Kit with the Ubuntu image and installed Docker for running the official demo applications, you can quickly test our Eagle-Eye-AI application, even without the pan-tilt rotator and PMOD module setup.
This section is designed for quickly testing the software, building the Docker image, installing, and launching it with the appropriate FPGA firmware from our GitHub repository. Additionally, if you have an IP camera and a Pan-Tilt camera rotator with an RS-485 interface connected, the application will be fully functional.
Simply follow the steps below. Alternatively, you can explore our comprehensive tutorial, which builds everything from scratch step-by-step and explains every procedure in detail. Feel free to skip this quick test section if you prefer a more thorough walkthrough.
Steps to Follow
To build and install the application and firmware, execute the following commands on the running Kria board. The build process will take approximately 10 minutes:
# Get the Eagle-Eye-AI repository from GitHub
git clone https://github.com/s59mz/eagle-eye-ai.git
cd eagle-eye-ai
# Install Firmware Binaries
cp fpga-firmware/firmware-kr260-eagle-eye.deb /tmp
sudo apt install /tmp/firmware-kr260-eagle-eye.deb
# Build the docker image - takes about 10 min on Kria Board
./build.sh
Once the build process on the Kria board is complete, you should see an output resembling the one shown below:
The entire build and installation process is demonstrated in this video:
Launch the Application
- Connect an IP camera that supports RTSP and set the output resolution to 1920x1080. Ensure both the camera and the Kria board are on the same local network, using a standard 1 Gbps Ethernet switch.
- Before launching the application, connect a high-resolution monitor to the DisplayPort of the Kria board. If your monitor has an HDMI input, use an Active DisplayPort to HDMI adapter (passive adapters will not work).
- Execute the following commands on the running Kria board:
# Load the FPGA firmware
sudo xmutil unloadapp
sudo xmutil loadapp kr260-eagle-eye
sudo xmutil desktop_disable
# go to the Eagle-Eye-AI app git repository
cd ~/eagle-eye-ai
# Launch the eagle-eye-ai docker image
./run.sh
# Launch the app with your camera URL
./run_eagle_eye_ai.sh rtsp://192.168.1.11:554/stream1
# To Exit press Ctrl-C
Expected Output
When you run the Eagle-Eye-AI app inside the Docker container, you should see an output similar to the example below. You can safely ignore GStreamer warning messages like “GLib-GObject-CRITICAL **”, which only appear the first time the application is launched.
To exit, press Ctrl-C.
You should see the camera’s captured images on the monitor connected to the board.
When a face is detected, a blue box will appear around it, tracking its movement. At the bottom of the video screen, a white status text shows the camera’s current orientation (Azimuth and Elevation), updated in real-time from the camera’s inclinometer data.
The camera rotator will adjust to keep the closest detected face centered on the screen, with the blue box remaining around the face.
An orange signal lamp on top of the camera lights up when a face is detected. The red and blue LEDs on the RS-485 PMOD module light up when the pan or tilt motor of the rotator is active, respectively.
The complete process for starting the application is demonstrated in this video:
Note
If the camera rotator and RS-485 PMOD module are not set up yet, you can listen for motor command messages on the ROS2 topic named /motor_control from your host PC connected to the same local network as the Kria board.
ros2 topic echo /motor_control
When a face is detected, tracked, and already centered, the output should look like this. Otherwise, you will see different motor speed values, depending on the face’s position on the screen.
pan_speed: 0
tilt_speed: 0
---
pan_speed: 0
tilt_speed: 0
---
Ensure that ROS2 is installed on your host PC and that you have built both the eagle_eye_interfaces and rotator_interfaces packages using the colcon build command. For more information, refer to our detailed tutorial that follows.
Complete Demonstration Video
This video demonstrates our Eagle-Eye-AI face-tracking camera system using an IP camera and a pan-tilt rotator, managed by the KRIA™ KR260 Robotics Kit.
When a face is detected, it is enclosed in a blue boundary box, and the pan-tilt rotator adjusts the camera to keep the face centered on the screen.
An orange signal lamp atop the camera, visible in the large mirror behind the scene, illuminates when a face is detected. Observe how the orange lamp turns on and off as a face is covered or uncovered.
At the bottom of the video screen, a white status text displays the camera's current orientation (Azimuth and Elevation), updated in real-time from the camera's inclinometer data. This status text only appears when a face is being tracked. Notice how the AZ (azimuth) parameter changes as the camera rotates.
This behind-the-scenes video reveals the Eagle-Eye-AI system in action, featuring both the Eagle-Eye-AI camera and the demonstrator. Unlike the previous video, which shows only the camera's view, this footage captures the entire setup from the background.
You'll see the camera tracking the demonstrator's movements, demonstrating the system's responsiveness.
The orange signal lamp atop the camera lights up when a face is detected. Watch as the lamp turns on and off as the face is covered or uncovered.
Welcome to our comprehensive tutorial, where we guide you through building everything step-by-step from scratch, explaining each procedure in detail. If you get stuck at any point, feel free to check or use the code from our GitHub repository. Enjoy!
Hardware Setup
To start assembling the hardware for our “Eagle Eye AI” project, attach the Pan-Tilt rotator to a stable base using screws. Make sure the rotator is firmly secured to the base to avoid any movement during operation. In this demo application, we are using a tripod that is sturdy enough to support both the rotator and the camera.
Next, mount the IP camera onto the rotator. We use a general-purpose white bracket that holds the camera on the rotator. Take care to align the camera properly to achieve optimal viewing angles for surveillance purposes.
Optionally, you can mount a 12V signaling warning light on the same bracket that holds the IP camera. Connect it to the same 12V power supply cable that powers the rotator and camera through the auxiliary switch integrated into the rotator.
When a person is detected by the AI, the Kria board sends a Pelco-D command through the RS-485 bus to the rotator, activating the auxiliary switch and turning on the warning signaling light. Ensure that the light is waterproof to withstand outdoor conditions and enhance visibility in surveillance scenarios.
Mount an inclinometer on the same base as the camera to enable monitoring and feedback on the camera’s orientation, such as azimuth and elevation.
Use a 5.1 mm power jack to connect the camera to a 12V power source.
Schematic Diagram
To connect the IP camera, signal lamp, and inclinometer, all devices can share the same power cable and RS-485 bus used by the pan-tilt rotator. This setup simplifies the wiring. The schematic diagram is shown below:
Using the 6-wire cable provided by the rotator, which includes connections for 12V power, RS-485 bus, and the auxiliary switch, simplifies the setup.
Pinout of the Rotator’s 6-wire cable:
- Red: +12V power line
- Black: GND power line
- Orange: Line-A of RS-485 bus
- Yellow: Line-B of RS-485 bus
- Green and Blue: Auxiliary switch for controlling the signal lamp.
Mounting a small junction box on the back side of the bracket holding the camera provides a centralized location for wiring connections and the inclinometer. Cable glands can be used to secure the cables and prevent any water ingress.
Inside the junction box, connect the camera’s power cable to the 12V power lines. Additionally, connect the inclinometer to the same power lines and the RS-485 bus, as illustrated in the schematic diagram above.
Additionally, wire the signaling lamp to the 12V power supply through the auxiliary switch wires located within the junction box. This streamlined approach ensures efficient connectivity and operation of both the IP camera and the inclinometer within the surveillance system.
When you finish all connections in the junction box, mount the bracket that holds everything together on the top of the rotator.
Now, connect the 4-wire cable to the main connector of the pan-tilt rotator. This cable serves a dual purpose by providing both power and communication for the rotator. Two wires supply 12V power, while the other two carry RS-485 signals (A and B) for communication with the Kria board.
At the other end of the 4-wire cable from the rotator, mount another small junction box to serve as a splitter. This junction box divides the cable into two connections: one for the 12V external power supply and the other for the RS-485 A and B lines, which connect to the RS-485 module attached to the Kria board.
Connect the RS-485 bus from the splitter to the Kria board either through a USB to RS-485 adapter plugged into one of the Kria’s USB ports, or through a custom-made RS-485 PMOD module connected to one of the available PMOD interfaces on the Kria board. Users have the option to choose between these two methods based on their preference and requirements.
We use a custom-made RS-485 PMOD module for the Kria board and a USB to RS-485 adapter plugged into the host PC for debugging purposes. Both RS-485 adapters are connected to the same RS-485 A and B lines. The USB adapter acts as a sniffer on the host PC, providing a convenient way to debug Pelco-D and Modbus protocol messages.
Provide seamless connectivity by using an Ethernet cable long enough to reach from the IP camera to the Ethernet switch. This cable ensures reliable communication between the camera and the Kria board, enabling smooth operation and data transfer for your surveillance system.
Connect both the Ethernet cable from the IP camera and one of the RJ-45 connectors on the Kria board’s PS side, specifically J10C or J10D, to the same local network. We used a standard 1G Ethernet switch for this purpose.
Ensure that the Kria board is powered using its included power supply, while the pan-tilt rotator requires an external 12V power supply rated at 4A for optimal performance. Additionally, connect an external HDMI monitor to the Kria board using an active DisplayPort to HDMI adapter to facilitate visualization and configuration during setup and operation. Passive DP-to-HDMI adapters won't work; be sure you have an active adapter.
With the hardware setup complete, our Eagle-Eye-AI project is ready to move on to the software configuration and implementation phase.
Before moving on to the next project phase (software configuration), we need to conduct some preliminary tests to ensure everything is connected properly. We’ll start by testing the rotator connectivity and then proceed to test the camera.
Testing the Rotator Connectivity
1. Power On:
Apply external 12V power at the splitter end (Red & Black wires) to power the rotator and camera.
2. Self-Test:
The rotator should begin moving in all directions for about 2 minutes, performing a self-test to ensure it is functioning correctly.
Testing and Configuring the Camera
1. Power Check
The camera should turn on. Verify this by checking the blinking LED on the Ethernet cable connector of the camera.
2. Configuring the Camera
Before we can use the camera’s RTSP stream, we need to configure it properly. Most IP cameras provide two streams with different resolutions: a main stream with higher resolution and a sub-stream with lower resolution. It’s important to set both video resolutions to values compatible with the RTSP decoder in the Kria’s Smartcam demo application. We found that the resolutions 1920x1080 and 640x480 work properly with the decoder.
Configure the IP camera so that the main stream is set to 1920x1080 with a frame rate of 25, and the sub-stream is set to 640x480 with a frame rate of 15. Both streams should use H.265 encoding.
Most IP cameras have a web interface available for configuration. Configuring and finding the correct URLs for both RTSP streams on our inexpensive Chinese camera without a user manual or default login credentials was challenging. After some research and trial and error, we discovered the following login credentials for our SIMLINK IP camera from AliExpress:
Web Interface
- url: http://192.168.1.11
- username: admin
- password: 123456
Main-Stream RTSP url
- rtsp://192.168.1.11/stream1
Sub-Stream RTSP url
- rtsp://192.168.1.11/stream2
After entering the provided URL in a web browser, a login window appears. Enter the login credentials.
Next, navigate to the Configure tab and select the Stream Manager option from the left-hand side menu. For Stream Type, choose Main Stream and set the resolution to 1920x1080 with a frame rate of 25. Ensure the Encoding Format is set to H265, then click the Save button.
On the same page, select the Sub Stream option. Set the resolution to 640x480, the frame rate to 15, and the encoding format to H265. Click the Save button.
The configuration of the camera is now complete. Log out by clicking the button in the upper right corner.
To test the camera’s RTSP streams, open the VLC media player on your host computer. Follow these steps to connect to the camera stream:
- Press Ctrl-N to open the “Open Media” dialog.
- Enter the following URL (replacing with the correct IP address of your camera):
rtsp://192.168.1.11:554/stream1
- Click “Play.”
You should now see a real-time video feed from the camera. To verify the video resolution, go to Tools -> Codec Information.
Repeat the process with the other stream by using the URL:
rtsp://192.168.1.11:554/stream2
Again, check the video resolution to ensure it matches the configured settings.
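As an alternative to VLC, you can run a short OpenCV script on the host PC to confirm that both streams open and report the expected resolutions. This is only a convenience check; it assumes the opencv-python package is installed and that the IP address matches your camera.
# Quick programmatic check of both RTSP streams using OpenCV on the host PC.
# Assumption: opencv-python is installed; replace the IP address with your camera's.
import cv2

for url in ("rtsp://192.168.1.11:554/stream1", "rtsp://192.168.1.11:554/stream2"):
    cap = cv2.VideoCapture(url)
    ok, frame = cap.read()
    if not ok:
        print(f"{url}: could not read a frame")
    else:
        h, w = frame.shape[:2]
        print(f"{url}: {w}x{h}")   # expect 1920x1080 and 640x480
    cap.release()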
Testing the Rotator
To test the rotator, we will use a host PC connected to the rotator’s RS-485 interface through a USB to RS-485 dongle from AliExpress. This setup allows us to send simple, hardcoded Pelco-D hexadecimal messages through the USB port to the rotator. By doing so, we can verify if the rotator responds correctly and starts moving according to the commands sent. This preliminary test ensures that our hardware connections are properly configured before integrating the rotator with the Kria board.
Follow these steps:
1. Find the correct serial device for the RS485-USB adapter:
Open the Terminal program on the host computer, then detach and reattach the RS-485 sniffer dongle. Run the dmesg Linux command to identify the correct serial device for the RS485-USB dongle.
$ dmesg | tail
[18773.603338] usb 2-1.3: New USB device strings: Mfr=0, Product=2, SerialNumber=0
[18773.603344] usb 2-1.3: Product: USB2.0-Ser!
[18773.604060] ch341 2-1.3:1.0: ch341-uart converter detected
[18773.604931] ch341-uart ttyUSB1: break control not supported, using simulated break
[18773.605110] usb 2-1.3: ch341-uart converter now attached to ttyUSB1
In our case, it was detected as /dev/ttyUSB1, which will be used in the following commands.
We will use the echo Linux command to send simple Pelco-D protocol messages to the rotator through the RS485-USB dongle and verify if the rotator responds correctly. Ensure that the rotator’s baud rate is set correctly by checking the DIP switches on the rotator according to the rotator’s user manual. We observed that our host computer defaults to a baud rate of 9600 when using the echo command, so we set the rotator to the same baud rate of 9600.
2. Command Test:
Send some Pelco-D protocol messages from the terminal to the rotator. Coding Pelco-D messages is beyond the scope of this project; instead, we provide some hardcoded messages that work with this rotator (a short sketch after the commands below shows how the trailing checksum byte is derived). For more details on the Pelco-D protocol, refer to this tutorial:
https://www.commfront.com/pages/pelco-d-protocol-tutorial
- Send a command to turn the signal lamp on top of the rotator on and off.
# turn on
echo -en '\xff\x01\x00\x09\x00\x01\x0b' > /dev/ttyUSB1
# turn off
echo -en '\xff\x01\x00\x0b\x00\x01\x0d' > /dev/ttyUSB1
- Send a command to pan left and then right.
# pan left
echo -en '\xff\x01\x00\x04\xff\x00\x04' > /dev/ttyUSB1
# pan right
echo -en '\xff\x01\x00\x02\xff\x00\x02' > /dev/ttyUSB1
- Send a command to tilt up and then down.
# tilt up
echo -en '\xff\x01\x00\x08\x00\x3f\x48' > /dev/ttyUSB1
# tilt down
echo -en '\xff\x01\x00\x10\x00\x3f\x50' > /dev/ttyUSB1
- To stop any motor activity at any time, use this command.
# stop
echo -en '\xff\x01\x00\x00\x00\x00\x01' > /dev/ttyUSB1
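If you prefer a scriptable alternative to echo, the short Python sketch below rebuilds the same frames and shows where the trailing checksum byte comes from: it is the sum of the address, command, and data bytes modulo 256. It assumes the pyserial package and the /dev/ttyUSB1 device found above; you can also just print the frames and compare them with the hardcoded ones.
# Rebuild the hardcoded Pelco-D frames and verify their checksums.
# Assumptions: pyserial installed, dongle at /dev/ttyUSB1, rotator set to 9600 baud.
import serial

def pelco_frame(addr, cmd1, cmd2, data1, data2):
    body = [addr, cmd1, cmd2, data1, data2]
    return bytes([0xFF] + body + [sum(body) % 256])  # last byte is the checksum

lamp_on = pelco_frame(0x01, 0x00, 0x09, 0x00, 0x01)   # ff 01 00 09 00 01 0b
pan_left = pelco_frame(0x01, 0x00, 0x04, 0xFF, 0x00)  # ff 01 00 04 ff 00 04
stop = pelco_frame(0x01, 0x00, 0x00, 0x00, 0x00)      # ff 01 00 00 00 00 01

print(lamp_on.hex(), pan_left.hex(), stop.hex())

with serial.Serial("/dev/ttyUSB1", 9600) as port:
    port.write(pan_left)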
These tests should be conducted with the Kria board switched off. We are only verifying the proper connection and functionality of the rotator and camera to the external power supply, Ethernet, and RS-485 bus.
You can repeat the Command Tests from the Kria board after setting up the Ubuntu software on the Kria board. However, this can only be done through the attached USB to RS-485 adapter by using the patched CH340 serial driver, and not the PMOD module. To use the PMOD module, we need to build a new hardware platform that supports it, which will be covered in a later section.
Once these tests are successfully completed, we can proceed to prepare the Ubuntu image for the Kria board.
Preparing the Ubuntu Image for the Kria Board
In this section, we will prepare the official Ubuntu image for the Kria board, using the Kria’s Smartcam demo application as a basis. We will also install Docker, ROS2, and other necessary tools for our project development.
Booting the Kria Board with Linux
To get started, boot the Kria board with Linux by following the official guide provided here.
Steps to Follow:
1. Download and Prepare the Ubuntu Image:
- Download the Ubuntu image for the Kria board.
- Flash the image to an SD card with Balena Etcher or a similar program.
- Insert the SD card into the Kria board and boot it up.
2. Update and Install Required Software:
- Add necessary repositories and update the Ubuntu system on the board. This process can last more than an hour.
sudo add-apt-repository ppa:xilinx-apps --yes &&
sudo add-apt-repository ppa:ubuntu-xilinx/sdk --yes &&
sudo add-apt-repository ppa:xilinx-apps/xilinx-drivers --yes &&
sudo apt update --yes &&
sudo apt upgrade --yes
- Install Docker to manage and deploy applications in a containerized environment. Follow the instructions from docker.com to install Docker.
# Add Docker's official GPG key:
sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
# Add the repository to Apt sources:
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
$(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
# install it
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
# set user groups
sudo groupadd docker
sudo usermod -a -G docker $USER
# verify it
sudo docker run hello-world
Install ROS2 Humble Desktop
To proceed, install ROS2 on the Kria board. ROS2 provides the essential tools and libraries required for developing and running robot applications.
# Install ROS 2 humble
# get GPG key
sudo curl -sSL https://raw.githubusercontent.com/ros/rosdistro/master/ros.key -o /usr/share/keyrings/ros-archive-keyring.gpg
# and install it
echo "deb [arch=arm64 signed-by=/usr/share/keyrings/ros-archive-keyring.gpg] http://packages.ros.org/ros2-testing/ubuntu jammy main" | sudo tee /etc/apt/sources.list.d/ros2.list > /dev/null
# update the system
sudo apt update
sudo apt upgrade
# install ROS2 humble desktop version
sudo apt install ros-humble-desktop
# install additional modules
sudo apt install lttng-modules-dkms lttng-tools ros-humble-tracetools-launch
# install network manager packages
sudo apt install lldpad ethtool
# install ros2 interfaces
sudo apt install ros-humble-example-interfaces
# install ros2 compilers
sudo apt install ros-dev-tools
# edit ~/.bashrc startup script and add the following at the end:
source /opt/ros/humble/setup.bash
source /usr/share/colcon_argcomplete/hook/colcon-argcomplete.bash
Note
These steps are required only for debugging ROS2 nodes and listening to ROS2 topic messages on the Kria board. The final application only needs ROS2 installed in the application Docker container.
Set up the timezone and locale
# set timezone
sudo timedatectl set-ntp true
sudo timedatectl set-timezone Europe/Ljubljana
timedatectl
# set up locale
# check for UTF-8
locale
sudo apt update && sudo apt install locales
sudo locale-gen en_US en_US.UTF-8
sudo update-locale LC_ALL=en_US.UTF-8 LANG=en_US.UTF-8
export LANG=en_US.UTF-8
# verify settings
locale
Install some other programs
1. Install xrt zocl
# install xrt zocl
sudo apt install xrt-dkms
2. Install the patched CH340 serial driver, needed only when using the USB-RS485 dongle.
git clone https://github.com/juliagoda/CH341SER
cd CH341SER
make clean
make
sudo make load
sudo rmmod ch34x
lsmod | grep ch34
# plug & unplug the USB device
sudo dmesg | tail
Install the Official Smartcam Demo Application
Next, we will install the Smartcam demo application by following the official guide here.
Steps to Follow:
1. Install FPGA Firmware:
sudo apt install xlnx-firmware-kv260-smartcam
- Disable the GNOME desktop environment, because the Smartcam application uses the same monitor.
sudo xmutil desktop_disable
- Load the FPGA firmware required for the Smartcam application.
sudo xmutil unloadapp
sudo xmutil loadapp kv260-smartcam
2. Get the demo video file suitable for the application (on your host machine)
You can download a video file from the following link, which is in MP4 format:
Then, transcode it to an H.264 file (named alley.nv12.h264), which is one of the supported input formats.
ffmpeg -i input-video.mp4 -c:v libx264 -pix_fmt nv12 -vf scale=1920:1080 -r 30 alley.nv12.h264
Finally, upload or copy the transcoded H.264 file to the board and place it somewhere under /tmp/Videos; the /tmp directory is mapped to /tmp in the Docker container as well.
3. Deploy the Smartcam Application (on the target Kria board):
- Install the Smartcam app as a Docker image, ensuring that it runs smoothly on the Kria board.
docker pull xilinx/smartcam:2022.1
- To start the docker image run this command:
docker run \
--rm \
--env="DISPLAY" \
-h "xlnx-docker" \
--env="XDG_SESSION_TYPE" \
--net=host \
--privileged \
--volume="$HOME/.Xauthority:/root/.Xauthority:rw" \
-v /tmp:/tmp \
-v /dev:/dev \
-v /sys:/sys \
-v /etc/vart.conf:/etc/vart.conf \
-v /lib/firmware/xilinx:/lib/firmware/xilinx \
-v /run:/run \
-it xilinx/smartcam:2022.1 bash
Run the Smartcam demo application within the Docker container using the provided demo video file to verify that everything is working correctly. Press Ctrl-C to exit.
smartcam --file tmp/Videos/alley.nv12.h264 -i h264 -W 1920 -H 1080 -r 30 --target dp
Introduction to GStreamer
GStreamer is a powerful multimedia framework used for building applications that handle audio and video processing. It provides a pipeline-based architecture, allowing developers to create complex media-handling workflows using simple command-line tools or more sophisticated programming interfaces. For an in-depth understanding and official tutorials, you can visit the GStreamer documentation.
In this project, we will use the gst-launch-1.0 command-line tool extensively. gst-launch-1.0 allows us to build and run GStreamer pipelines directly from the command line, enabling us to test various configurations and ensure our IP camera streams are processed correctly.
Using gst-launch-1.0, we can attach our RTSP IP camera as an input source and verify that the GStreamer pipeline processes the video stream as expected. This hands-on approach lets us experiment with different pipeline elements and understand how the Smartcam application works, making it easier to troubleshoot and fine-tune our setup.
In the tests that follow, we will use gst-launch-1.0 to connect to our IP camera, decode the video stream, and process it using various GStreamer elements. This practical approach provides immediate feedback and helps us ensure that all components of the Smartcam application are functioning correctly.
Testing the Smartcam Demo Application with Our IP Camera
First, verify that you can run the following GStreamer pipeline (within the same Docker container for the Smartcam application) without any issues. This pipeline uses the same video file and streams the converted video directly to the attached DisplayPort monitor, without utilizing any hardware acceleration. Press Ctrl-C to exit.
# run gstreamer pipeline on target - file as input
gst-launch-1.0 filesrc location=tmp/Videos/alley.nv12.h264 ! h264parse ! omxh264dec ! video/x-raw, width=1920, height=1080, format=NV12, framerate=30/1 ! queue ! kmssink driver-name=xlnx plane-id=39 sync=false fullscreen-overlay=true
Now, we will replace the input file source with our IP camera’s RTSP stream to verify if the stream can be decoded properly. We will use the camera’s low-resolution sub-stream for this test. As before, we will not use any hardware accelerators in the GStreamer pipeline.
# run gstreamer pipeline on target - RTSP camera stream 2 as input
# low resolution, no accelerators
gst-launch-1.0 rtspsrc location=rtsp://192.168.1.11:554/stream2 ! rtph265depay ! h265parse ! omxh265dec ! videoconvert ! video/x-raw, format=NV12 ! kmssink driver-name=xlnx plane-id=39 sync=false fullscreen-overlay=true
Finally, we verify the full functionality of the Smartcam demo application using the high-resolution main stream from our IP camera as the input, while utilizing hardware accelerators and the DPU for face detection.
This GStreamer pipeline was extracted from the official Smartcam demo application via the Jupyter Notebook version. We added a simple print function in the notebook to display the pipeline, and then modified it by replacing the input MIPI or USB camera with our RTSP IP camera.
# run gstreamer pipeline on target - RTSP camera stream as an input
# high resolution, using DPU for inference and accelerators
gst-launch-1.0 rtspsrc location=rtsp://192.168.1.11:554/stream1 ! rtph265depay ! h265parse ! omxh265dec ! videorate ! videoconvert ! video/x-raw, framerate=30/1, format=NV12 ! tee name=t ! queue ! vvas_xmultisrc kconfig="/opt/xilinx/kv260-smartcam/share/vvas/facedetect/preprocess.json" ! queue ! vvas_xfilter kernels-config="/opt/xilinx/kv260-smartcam/share/vvas/facedetect/aiinference.json" ! ima.sink_master vvas_xmetaaffixer name=ima ima.src_master ! fakesink t. ! queue max-size-buffers=1 leaky=2 ! ima.sink_slave_0 ima.src_slave_0 ! queue ! vvas_xfilter kernels-config="/opt/xilinx/kv260-smartcam/share/vvas/facedetect/drawresult.json" ! queue ! kmssink driver-name=xlnx plane-id=39 sync=false fullscreen-overlay=true
Verify that the application can detect faces in the video stream. All detected faces should be outlined with a blue box. Check the latency, which should be approximately half a second in our case.
By completing these steps, you will have a fully functional Ubuntu environment on your Kria board, equipped with Docker, ROS2, the Smartcam demo application, and the IP camera as an input, all ready for further development.
Understanding the GStreamer Pipeline
The previous pipeline works as follows:
- rtspsrc: Connects to the RTSP stream from the IP camera.
- rtph265depay: Depayloads the RTP packets to extract the H.265 video stream.
- h265parse: Parses the H.265 stream to prepare it for decoding.
- omxh265dec: Uses hardware acceleration to decode the H.265 video stream.
- videorate: Sets the framerate to 30 fps.
- videoconvert: Converts the video format to NV12.
- tee: Splits the video stream into multiple branches for parallel processing.
- vvas_xmultisrc: Pre-processes the video frames using the configuration in preprocess.json.
- vvas_xfilter: Performs AI inference using the DPU with the configuration in aiinference.json.
- vvas_xmetaaffixer: Affixes metadata (like detected faces) to the video stream.
- vvas_xfilter: Draws results (e.g., bounding boxes around detected faces) using the configuration in drawresult.json.
- kmssink: Displays the final video stream on the screen with low latency.
By using this pipeline, we ensure that the face detection runs efficiently, leveraging the hardware capabilities of the Kria board for real-time processing and display.
Data Flow Explanation
1. Video Capture: The video is captured from the IP camera using rtspsrc.
2. Decoding: The video is parsed and decoded from H.265 format to raw video frames.
3. Conversion: The raw video frames are converted to the NV12 format and the framerate is fixed to 30 fps.
4. Tee Element: The video stream is split into two branches using the tee element. The first branch processes the video for AI inference, affixes metadata to the frames, and sends it to a fakesink. The second branch receives the same video frames, processes them to draw the AI inference results (e.g., bounding boxes), and displays the final video on the monitor.
Branch 1 (Inference Path):
• The video frames are preprocessed and passed through an AI inference model using vvas_xfilter configured with aiinference.json.
• The inference results (e.g., face detection bounding boxes) are merged with the video frames as metadata using vvas_xmetaaffixer.
• The frames are sent to a fakesink (just to complete the pipeline; actual display is not needed here).
Branch 2 (Display Path):
• The video frames, along with the metadata from the inference path, are sent to another instance of vvas_xfilter configured with drawresult.json.
• This element draws bounding boxes around the detected faces using the metadata.
• Finally, the video frames with drawn bounding boxes are displayed on the monitor using kmssink.
By following this setup, the pipeline ensures that metadata generated during AI inference is accurately applied to the correct frames before they are displayed. This setup also allows for efficient parallel processing of video streams.
Detailed Explanation of the GStreamer Pipeline
The GStreamer pipeline provided is used to process and analyze video streams from an IP camera, applying AI inference for face detection, and displaying the results on a DisplayPort monitor. Here is a detailed breakdown of each element in the pipeline:
gst-launch-1.0 rtspsrc location=rtsp://192.168.1.11:554/stream1 ! rtph265depay ! h265parse ! omxh265dec ! videorate ! videoconvert ! video/x-raw, framerate=30/1, format=NV12 ! tee name=t ! queue ! vvas_xmultisrc kconfig="/opt/xilinx/kv260-smartcam/share/vvas/facedetect/preprocess.json" ! queue ! vvas_xfilter kernels-config="/opt/xilinx/kv260-smartcam/share/vvas/facedetect/aiinference.json" ! ima.sink_master vvas_xmetaaffixer name=ima ima.src_master ! fakesink t. ! queue max-size-buffers=1 leaky=2 ! ima.sink_slave_0 ima.src_slave_0 ! queue ! vvas_xfilter kernels-config="/opt/xilinx/kv260-smartcam/share/vvas/facedetect/drawresult.json" ! queue ! kmssink driver-name=xlnx plane-id=39 sync=false fullscreen-overlay=true
Breakdown of Elements
1. rtspsrc location=rtsp://192.168.1.11:554/stream1: This element fetches the RTSP stream from the specified URL.
2. rtph265depay: Depayloads the RTP packets into raw H.265 encoded data.
3. h265parse: Parses the H.265 stream to ensure it is correctly formatted for decoding.
4. omxh265dec: Hardware-accelerated decoder for H.265 video.
5. videorate and videoconvert: Adjust the framerate and convert the video into the required raw format.
6. video/x-raw, framerate=30/1, format=NV12: Specifies the format NV12 of the raw video with the framerate set to 30 fps.
7. tee name=t: This element, named “t”, duplicates the video stream into two separate branches for parallel processing.
Branch-1 (Inference Path):
1. queue: Buffers data to ensure smooth processing in each branch.
2. vvas_xmultisrc kconfig="/opt/xilinx/kv260-smartcam/share/vvas/facedetect/preprocess.json": This element pre-processes the video for AI inference based on the provided configuration file.
3. queue: Buffers data again to ensure smooth processing in each branch.
4. vvas_xfilter kernels-config="/opt/xilinx/kv260-smartcam/share/vvas/facedetect/aiinference.json": Performs AI inference using the preprocessed video and the specified AI kernel configuration.
5. ima.sink_master vvas_xmetaaffixer name=ima ima.src_master: This element, named ima, affixes metadata to the video frames. Metadata includes information like detected faces and their coordinates. It has four ports, but in the first branch (Inference Path) only the master ports are used:
- ima.sink_master: Receives the AI-inference frames (with metadata) into vvas_xmetaaffixer.
- ima.src_master: Outputs the video frames from vvas_xmetaaffixer with metadata.
6. fakesink: Consumes the video stream from ima.src_master and discards it. This is used when you need to process data but don’t need to display it.
Branch-2 (Display Path):
1. t. : The beginning of the second branch of the tee.
2. queue max-size-buffers=1 leaky=2: Buffers data with a maximum of one buffer and discards (leaks) buffers that are not processed in time. This helps manage latency.
3. ima.sink_slave_0 ima.src_slave_0: This is the same vvas_xmetaaffixer element named ”ima” that we used in the first branch, but here in the second branch (Display Path) we use its two slave ports:
- ima.sink_slave_0: Receives data from the main video stream.
- ima.src_slave_0: Outputs the video frames affixed with the metadata extracted from the master port, handled in the first branch (Inference Path).
4. queue: Buffers data.
5. vvas_xfilter kernels-config="/opt/xilinx/kv260-smartcam/share/vvas/facedetect/drawresult.json": Draws bounding boxes around detected faces based on the AI inference results (the metadata previously affixed on the ima slave path).
6. queue: Buffers data again.
7. kmssink driver-name=xlnx plane-id=39 sync=false fullscreen-overlay=true: Renders the final video with overlaid bounding boxes on the DisplayPort monitor using the Xilinx KMS (Kernel Mode Setting) driver.
Explanation of vvas_xmetaaffixer
The vvas_xmetaaffixer element is crucial for managing metadata within the GStreamer pipeline. It takes video frames and metadata (e.g., detected faces) and ensures that this metadata is properly associated with the correct frames as they move through the pipeline.
Ports
- sink_master: Receives frames that have undergone AI inference and contain metadata. These frames have the resolution required by the AI inference model, normally lower than the main stream from the video capture device.
- src_master: Outputs frames to a placeholder sink (fakesink), allowing metadata to be generated and handled.
- sink_slave_0: Receives the duplicated video frames from the second branch of the tee. These video frames can have a much higher resolution than the inference frames.
- src_slave_0: Outputs frames with metadata (extracted from frames received on the master port) to the next element, which draws the detection results on the frames.
To control the rotator of the IP camera, we need to extract metadata from inference frames. Specifically, we need to extract the bounding box coordinates and sizes of detected faces. This involves installing a probe on the selected pad in the GStreamer pipeline and adding a callback function to process the metadata.
Steps to Extract Metadata
1. Install a Probe on the Selected Pad
We install a probe on the master source pad of the vvas_xmetaaffixer element. This allows us to intercept frames passing through this pad and extract the metadata.
2. Add a Callback Function
In the callback function, we call gst_buffer_get_meta() to extract the metadata of the correct type (GstVvasInferenceMetaAPI). We then process this metadata to extract and print the bounding box coordinates of detected faces.
C++ Code Implementation
Below is the complete C++ code that demonstrates how to set up the probe, extract metadata, and print the bounding box coordinates of detected faces. This code runs in the Docker container from the Smartcam demo application provided by AMD Xilinx.
Create two files in a new test folder inside the running Docker container:
- eagleeye.cpp
- gst_inference_meta.h
Filename: eagleeye.cpp
#include <gst/gst.h>
#include <iostream>
#include "gst_inference_meta.h"
// Function to recursively traverse predictions and print bounding boxes
void print_bounding_box(GstInferencePrediction * prediction) {
if (!prediction)
return;
BoundingBox bbox = prediction->bbox;
std::cout << "Bounding Box Coordinates: " << std::endl;
std::cout << "x: " << bbox.x << ", y: " << bbox.y << std::endl;
std::cout << "width: " << bbox.width << ", height: " << bbox.height << std::endl;
std::cout << " " << std::endl;
}
void handle_inference_meta(GstMeta *meta) {
static unsigned int frame_count = 0;
// handle only one frame per second and skip the others to limit debug prints
if ((frame_count++) % 25)
return;
GstInferenceMeta * inference_meta = reinterpret_cast<GstInferenceMeta *>(meta);
if (inference_meta && inference_meta->prediction) {
GstInferencePrediction * prediction = inference_meta->prediction;
std::cout << "=========== Metadata found ===========" << std::endl;
std::cout << "Frame No: " << frame_count << std::endl;
std::cout << " " << std::endl;
print_bounding_box(prediction);
std::cout << "-------------------------" << std::endl;
GNode * predictions = prediction->predictions;
if (predictions) {
for (GNode * node = predictions->children; node; node = node->next) {
GstInferencePrediction * child_pred = (GstInferencePrediction *) node->data;
print_bounding_box(child_pred);
}
}
}
}
// Callback function to handle buffer probes and extract metadata
GstPadProbeReturn probe_callback(GstPad *pad, GstPadProbeInfo *info, gpointer user_data) {
GstBuffer *buffer = GST_PAD_PROBE_INFO_BUFFER(info);
if (!buffer) {
return GST_PAD_PROBE_OK;
}
static GType api_type = g_type_from_name("GstVvasInferenceMetaAPI");
GstMeta * meta = gst_buffer_get_meta(buffer, api_type);
// Print or process the overlay metadata
if (meta) {
handle_inference_meta(meta);
}
return GST_PAD_PROBE_OK;
}
int main(int argc, char *argv[]) {
GstElement *pipeline, *ima;
GstBus *bus;
GstMessage *msg;
// Initialize GStreamer
gst_init(&argc, &argv);
// Build the pipeline
pipeline = gst_parse_launch("rtspsrc location=rtsp://192.168.1.11:554/stream1 ! "
"rtph265depay ! h265parse ! omxh265dec ! videorate ! videoconvert ! "
"video/x-raw, framerate=30/1, format=NV12 ! tee name=t ! queue ! "
"vvas_xmultisrc kconfig=\"/opt/xilinx/kv260-smartcam/share/vvas/facedetect/preprocess.json\" ! queue ! "
"vvas_xfilter kernels-config=\"/opt/xilinx/kv260-smartcam/share/vvas/facedetect/aiinference.json\" ! "
"ima.sink_master vvas_xmetaaffixer name=ima ima.src_master ! fakesink "
"t. ! queue max-size-buffers=1 leaky=2 ! ima.sink_slave_0 ima.src_slave_0 ! queue ! "
"vvas_xfilter name=\"draw\" kernels-config=\"/opt/xilinx/kv260-smartcam/share/vvas/facedetect/drawresult.json\" ! queue ! "
"kmssink driver-name=xlnx plane-id=39 sync=false fullscreen-overlay=true", nullptr);
if (!pipeline) {
std::cerr << "Failed to create pipeline" << std::endl;
return -1;
}
// Get the vvas_xmetaaffixer element
ima = gst_bin_get_by_name(GST_BIN(pipeline), "ima");
if (!ima) {
std::cerr << "Failed to get ima element" << std::endl;
gst_object_unref(pipeline);
return -1;
}
// Attach a probe to the src pad of the ima element
GstPad *src_pad = gst_element_get_static_pad(ima, "src_master");
if (!src_pad) {
std::cerr << "Failed to get src_master pad from ima" << std::endl;
gst_object_unref(ima);
gst_object_unref(pipeline);
return -1;
}
gst_pad_add_probe(src_pad, GST_PAD_PROBE_TYPE_BUFFER, probe_callback, nullptr, nullptr);
gst_object_unref(src_pad);
// Start playing
gst_element_set_state(pipeline, GST_STATE_PLAYING);
// Wait until error or EOS
bus = gst_element_get_bus(pipeline);
msg = gst_bus_timed_pop_filtered(bus, GST_CLOCK_TIME_NONE,
static_cast<GstMessageType>(GST_MESSAGE_ERROR | GST_MESSAGE_EOS));
// Parse message
if (msg != nullptr) {
GError *err;
gchar *debug_info;
switch (GST_MESSAGE_TYPE(msg)) {
case GST_MESSAGE_ERROR:
gst_message_parse_error(msg, &err, &debug_info);
std::cerr << "Error received from element " << GST_OBJECT_NAME(msg->src) << ": " << err->message << std::endl;
std::cerr << "Debugging information: " << (debug_info ? debug_info : "none") << std::endl;
g_clear_error(&err);
g_free(debug_info);
break;
case GST_MESSAGE_EOS:
std::cout << "End-Of-Stream reached." << std::endl;
break;
default:
// We should not reach here because we only asked for ERRORs and EOS
std::cerr << "Unexpected message received." << std::endl;
break;
}
gst_message_unref(msg);
}
// Free resources
gst_object_unref(bus);
gst_element_set_state(pipeline, GST_STATE_NULL);
gst_object_unref(ima);
gst_object_unref(pipeline);
return 0;
}
And also the included header file that defines the GstInferenceMeta data structure. For more information about the Inference Meta data structures, please consult the official VVAS 2.0 documentation from AMD Xilinx.
Filename gst_inference_meta.h:
#pragma once

#include <cstdint>
#include <gst/gst.h>

#define MAX_SEGOUTFMT_LEN 6
enum seg_type
{
SEMANTIC = 0,
MEDICAL,
SEG3D,
};
typedef struct _Segmentation {
enum seg_type type;
uint32_t width;
uint32_t height;
char fmt[MAX_SEGOUTFMT_LEN];
void *data;
bool (*free) (void *);
bool (*copy) (const void *, void *);
} Segmentation;
typedef struct _VvasColorMetadata {
uint8_t red;
uint8_t green;
uint8_t blue;
uint8_t alpha;
} VvasColorMetadata;
typedef struct _BoundingBox
{
gint x;
gint y;
guint width;
guint height;
VvasColorMetadata box_color;
} BoundingBox;
/**
* GstInferencePrediction:
* @prediction_id: unique id for this specific prediction
* @enabled: flag indicating whether or not this prediction should be
* used for further inference
* @bbox: the BoundingBox for this specific prediction
* @classifications: a linked list of GstInferenceClassification
* associated to this prediction
* @predictions: a n-ary tree of child predictions within this
* specific prediction. It is not recommended to access the tree
* directly, but to use this module's API to interact with the
* children.
* @sub_buffer: A buffer created from the main buffer.
* @bbox_scaled: bbox co-ordinates scaled to root node resolution or not
* @segmentation: contains output of segmentation model wrapped in GstBuffer
* @obj_track_label: This is currently unused, kept for future use
* Abstraction that represents a prediction
*/
typedef struct _GstInferencePrediction GstInferencePrediction;
struct _GstInferencePrediction
{
/*<private>*/
GstMiniObject base;
GMutex mutex;
/*<public>*/
guint64 prediction_id;
gboolean enabled;
BoundingBox bbox;
GList * classifications;
GNode * predictions;
GstBuffer *sub_buffer;
gboolean bbox_scaled; /* bbox co-ordinates scaled to root node resolution or not */
Segmentation segmentation;
gchar *obj_track_label;
/* for future extension */
void * reserved_1;
void * reserved_2;
void * reserved_3;
void * reserved_4;
void * reserved_5;
};
/**
* Implements the placeholder for inference information.
*/
typedef struct _GstInferenceMeta GstInferenceMeta;
struct _GstInferenceMeta
{
GstMeta meta;
GstInferencePrediction *prediction;
gchar *stream_id;
};
Step-by-Step Instructions
Ensure the Smartcam firmware is loaded on the FPGA before starting the Docker container.
# Disable the desktop environment
sudo xmutil desktop_disable
# load smartcam firmware
sudo xmutil unloadapp
sudo xmutil loadapp kv260-smartcam
1. Start the Smartcam Docker Container (if it’s not already running)
docker run \
--rm \
--env="DISPLAY" \
-h "xlnx-docker" \
--env="XDG_SESSION_TYPE" \
--net=host \
--privileged \
--volume="$HOME/.Xauthority:/root/.Xauthority:rw" \
-v /tmp:/tmp \
-v /dev:/dev \
-v /sys:/sys \
-v /etc/vart.conf:/etc/vart.conf \
-v /lib/firmware/xilinx:/lib/firmware/xilinx \
-v /run:/run \
-it xilinx/smartcam:2022.1 bash
2. Create New Files
Create new files named eagleeye.cpp and gst_inference_meta.h and copy the above C++ code into them.
To edit the files, we first need to install a text editor; we used Vim in this example. We also need to install the g++ compiler to compile the code.
apt update
apt install vim g++
mkdir test
cd test
touch eagleeye.cpp
touch gst_inference_meta.h
Copy the content above into these newly created files.
3. Compile the Code
Go to the directory where eagleeye.cpp is located and run the following command to compile the code:
g++ -o eagleeye eagleeye.cpp `pkg-config --cflags --libs gstreamer-1.0`
This command uses pkg-config to fetch the necessary flags and libraries for GStreamer.
4. Run the Program
After compiling, run the program using the following command:
./eagleeye
This will start the GStreamer pipeline, and the program will begin processing the frames, printing the coordinates of the bounding boxes for detected faces to the console approximately once per second.
=========== Metadata found ===========
Frame No: 601
Bounding Box Coordinates:
x: 0, y: 0
width: 640, height: 360
-------------------------
Bounding Box Coordinates:
x: 396, y: 3
width: 131, height: 100
Bounding Box Coordinates:
x: 98, y: 236
width: 24, height: 30
Please note that the first printed bounding box coordinates are always 0, 0 with the width and height set to the inference frame resolution of 640x360. These values are extracted from the GstInferencePrediction structure, which serves as the parent node of the linked list containing all the bounding box data structures for the detected faces. This resolution, along with the bounding box positions relative to the frame, will be needed later to calculate the required camera angle adjustments.
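As a preview of that calculation, the short sketch below converts a face bounding box from the 640x360 inference frame into approximate pan and tilt angle errors. The 90-degree horizontal field of view is an assumption for a typical wide-angle IP camera, and the real application ultimately works with rotation speeds rather than absolute angles, so treat this purely as an illustration.
# Convert a detected face's bounding box into approximate angular offsets.
# Assumption: ~90 degree horizontal FOV; vertical FOV scaled by the aspect ratio.
FRAME_W, FRAME_H = 640, 360        # parent bbox = inference frame resolution
HFOV_DEG = 90.0                    # assumed horizontal field of view
VFOV_DEG = HFOV_DEG * FRAME_H / FRAME_W

def angle_error(x, y, width, height):
    # Return (pan_error, tilt_error) in degrees for a face bounding box.
    face_cx = x + width / 2
    face_cy = y + height / 2
    pan_err = (face_cx - FRAME_W / 2) / FRAME_W * HFOV_DEG
    tilt_err = (face_cy - FRAME_H / 2) / FRAME_H * VFOV_DEG
    return pan_err, tilt_err

# Example: the face at x=396, y=3, 131x100 from the output above
print(angle_error(396, 3, 131, 100))   # roughly (19.9, -17.9) degrees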
Explanation
• Pipeline Setup: The GStreamer pipeline is configured to process an RTSP video stream, perform face detection, and output the results.
• Probe Installation: A probe is installed on the master source pad of the vvas_xmetaaffixer element to intercept the frames and extract metadata.
• Metadata Handling: The probe_callback function retrieves the inference metadata from the buffer and passes it to handle_inference_meta for processing. The bounding box coordinates are printed for each detected face.
• Compilation and Execution: The code is compiled using g++ and run to test the functionality. For now, it only prints the detected bounding boxes without any further processing.
This tutorial section provides a clear and detailed explanation of how to extract and handle metadata in a GStreamer pipeline using C++. By following these instructions, readers can set up and test the functionality on their own systems.
Selecting the Biggest Bounding Box
In this section, we will extend the previous example to select the largest bounding box from the detected faces. The selection will be based on the width of the bounding box for simplicity. Additionally, we will add a queue element named “probe” with the leaky=2 parameter before the fakesink element in the GStreamer pipeline to handle latency by skipping frames if necessary, and attach the probe to that queue element.
We also rearrange the bounding box print function to prepare for ROS2 integration. This function will be replaced with a ROS2 function to publish the topic about the closest or largest detected face. Currently, we print out only the largest bounding box once per second for debugging purposes. This interval is sufficient given that our video stream runs at 25 fps. We also handle cases where no faces are detected.
Below is the updated eagleeye.cpp file:
#include <gst/gst.h>
#include <iostream>
#include "gst_inference_meta.h"
// Print the bounding box coordinates
void publish_max_bounding_box(const BoundingBox *res_bbox, const BoundingBox *face_bbox) {
static unsigned int frame_count = 0;
// Handle only one frame per second and skip the others
if ((frame_count++) % 25)
return;
if (face_bbox != nullptr) {
std::cout << "=========== Face Detected ===========" << std::endl;
std::cout << "Frame No: " << frame_count << std::endl;
std::cout << " " << std::endl;
std::cout << "Inference Frame Resolution: " << std::endl;
std::cout << "width: " << res_bbox->width << ", height: " << res_bbox->height << std::endl;
std::cout << " " << std::endl;
std::cout << "Largest Bounding Box:" << std::endl;
std::cout << "x: " << face_bbox->x << ", y: " << face_bbox->y << std::endl;
std::cout << "width: " << face_bbox->width << ", height: " << face_bbox->height << std::endl;
std::cout << " " << std::endl;
} else {
std::cout << "------------- No Face ---------------" << std::endl;
std::cout << "Frame No: " << frame_count << std::endl;
std::cout << " " << std::endl;
}
}
// Handle the metadata
void handle_inference_meta(GstMeta *meta) {
GstInferenceMeta *inference_meta = reinterpret_cast<GstInferenceMeta *>(meta);
if (inference_meta && inference_meta->prediction) {
GstInferencePrediction *prediction = inference_meta->prediction;
// Parent bbox holds the resolution of the inference frame
BoundingBox *parent_bbox = &prediction->bbox;
// Node that holds all detected faces (child nodes)
GNode *predictions = prediction->predictions;
if (predictions) {
GstInferencePrediction *largest_child_pred = nullptr;
guint max_width = 0;
for (GNode *node = predictions->children; node; node = node->next) {
GstInferencePrediction *child_pred = (GstInferencePrediction *) node->data;
if (child_pred->bbox.width > max_width) {
max_width = child_pred->bbox.width;
largest_child_pred = child_pred;
}
}
if (largest_child_pred) {
publish_max_bounding_box(parent_bbox, &largest_child_pred->bbox);
}
}
}
}
// Callback function to handle buffer probes and extract metadata
GstPadProbeReturn probe_callback(GstPad *pad, GstPadProbeInfo *info, gpointer user_data) {
GstBuffer *buffer = GST_PAD_PROBE_INFO_BUFFER(info);
if (!buffer) {
return GST_PAD_PROBE_OK;
}
static GType api_type = g_type_from_name("GstVvasInferenceMetaAPI");
GstMeta *meta = gst_buffer_get_meta(buffer, api_type);
// Print or process the overlay metadata
if (meta) {
handle_inference_meta(meta);
} else {
publish_max_bounding_box(nullptr, nullptr);
}
return GST_PAD_PROBE_OK;
}
int main(int argc, char *argv[]) {
GstElement *pipeline, *probe;
GstBus *bus;
GstMessage *msg;
// Initialize GStreamer
gst_init(&argc, &argv);
// Build the pipeline
pipeline = gst_parse_launch("rtspsrc location=rtsp://192.168.1.11:554/stream1 ! "
"rtph265depay ! h265parse ! omxh265dec ! videorate ! videoconvert ! "
"video/x-raw, framerate=30/1, format=NV12 ! tee name=t ! queue ! "
"vvas_xmultisrc kconfig=\"/opt/xilinx/kv260-smartcam/share/vvas/facedetect/preprocess.json\" ! queue ! "
"vvas_xfilter kernels-config=\"/opt/xilinx/kv260-smartcam/share/vvas/facedetect/aiinference.json\" ! "
"ima.sink_master vvas_xmetaaffixer name=ima ima.src_master ! queue name=probe max-size-buffers=1 leaky=2 ! fakesink "
"t. ! queue max-size-buffers=1 leaky=2 ! ima.sink_slave_0 ima.src_slave_0 ! queue ! "
"vvas_xfilter name=\"draw\" kernels-config=\"/opt/xilinx/kv260-smartcam/share/vvas/facedetect/drawresult.json\" ! queue ! "
"kmssink driver-name=xlnx plane-id=39 sync=false fullscreen-overlay=true", nullptr);
if (!pipeline) {
std::cerr << "Failed to create pipeline" << std::endl;
return -1;
}
// Get the vvas_xmetaaffixer element
probe = gst_bin_get_by_name(GST_BIN(pipeline), "probe");
if (!probe) {
std::cerr << "Failed to get probe element" << std::endl;
gst_object_unref(pipeline);
return -1;
}
// Attach a probe to the src pad of the probe element
GstPad *src_pad = gst_element_get_static_pad(probe, "src");
if (!src_pad) {
std::cerr << "Failed to get src pad from probe" << std::endl;
gst_object_unref(probe);
gst_object_unref(pipeline);
return -1;
}
gst_pad_add_probe(src_pad, GST_PAD_PROBE_TYPE_BUFFER, probe_callback, nullptr, nullptr);
gst_object_unref(src_pad);
// Start playing
gst_element_set_state(pipeline, GST_STATE_PLAYING);
// Wait until error or EOS
bus = gst_element_get_bus(pipeline);
msg = gst_bus_timed_pop_filtered(bus, GST_CLOCK_TIME_NONE,
static_cast<GstMessageType>(GST_MESSAGE_ERROR | GST_MESSAGE_EOS));
// Parse message
if (msg != nullptr) {
GError *err;
gchar *debug_info;
switch (GST_MESSAGE_TYPE(msg)) {
case GST_MESSAGE_ERROR:
gst_message_parse_error(msg, &err, &debug_info);
std::cerr << "Error received from element " << GST_OBJECT_NAME(msg->src) << ": " << err->message << std::endl;
std::cerr << "Debugging information: " << (debug_info ? debug_info : "none") << std::endl;
g_clear_error(&err);
g_free(debug_info);
break;
case GST_MESSAGE_EOS:
std::cout << "End-Of-Stream reached." << std::endl;
break;
default:
// We should not reach here because we only asked for ERRORs and EOS
std::cerr << "Unexpected message received." << std::endl;
break;
}
gst_message_unref(msg);
}
// Free resources
gst_object_unref(bus);
gst_element_set_state(pipeline, GST_STATE_NULL);
gst_object_unref(probe);
gst_object_unref(pipeline);
return 0;
}
Explanation
In this code:
1. Publish Function: The publish_max_bounding_box function is designed to publish the bounding box of the largest detected face. It prints the bounding box coordinates once per second to reduce output frequency and manage frame rate.
2. Handling Metadata: The handle_inference_meta function processes the metadata and identifies the largest bounding box by comparing widths of all detected faces. This function is crucial for selecting the face to track.
3. Probe Callback: The probe_callback function is attached to the probe pad and extracts the metadata from the buffer. It calls handle_inference_meta to process the metadata.
4. GStreamer Pipeline: The GStreamer pipeline is set up with a queue element having leaky=2 to handle frame skipping. This ensures the system does not get bogged down by high latency.
5. Execution: The code is compiled (in the Docker container) with the command:
g++ -o eagleeye eagleeye.cpp `pkg-config --cflags --libs gstreamer-1.0`
and run with:
./eagleeye
This will start the GStreamer pipeline, and the program will begin processing the frames, printing only the coordinates of the largest bounding box of detected faces to the console. If no faces are detected, it will print a corresponding message, all approximately once per second.
------------- No Face ---------------
Frame No: 176
=========== Face Detected ===========
Frame No: 201
Inference Frame Resolution:
width: 640, height: 360
Largest Bounding Box:
x: 99, y: 236
width: 23, height: 29
This section prepares us for integrating ROS2 for publishing the position of the largest detected face. We handle frame skipping and print out bounding box coordinates to ensure functionality before moving on to further processing.
NOTE
Ensure you save your source files in a safe location before exiting the Docker environment, or you will lose everything. The safe place is the /home/ubuntu folder on your Kria board, but it is not accessible from the Docker container. To access the home folder, first copy the needed files to the /tmp folder, which is mapped to the Ubuntu environment. Importantly, before you restart your Kria board, move these files from the /tmp folder to the home directory, as the /tmp directory is usually cleaned after a restart. You can also commit a Docker image for later use, as demonstrated in the next section when we install the ROS2 platform.
Preparing the ROS2 Environment in Docker
Problem Statement
To control the camera rotator based on the detected face’s bounding box coordinates, we need to send this information from our GStreamer application to a different process running in a ROS2 environment. Our current setup runs the GStreamer application in a Docker container, so we need to install ROS2 Humble within the Docker environment to facilitate this communication.
Installing ROS2 Humble in Docker
We’ll install the base version of ROS2 Humble inside the Docker container to save resources. The base version provides the necessary tools and libraries without the additional overhead of a full desktop installation.
Benefits of Modularity
One of the key benefits of ROS2 is its modularity. In our example, we already have ROS2 installed on the Kria board, but not in the Docker environment. We will use both ROS2 environments:
- Docker Environment: Runs the GStreamer pipeline and publishes the detected face’s relative position.
- Ubuntu Environment on Kria Board: Listens for the detected face data and controls the camera rotator.
This modularity offers several benefits:
1. Separation of Concerns: Each ROS2 node can be developed and run separately, facilitating easier debugging and maintenance.
2. Cross-Platform Communication: ROS2 nodes can communicate across different environments and machines, making it easier to distribute and manage tasks.
3. Future Extensions: This setup allows for future extensions, such as having a central computer controlling multiple rotating cameras, with each Kria board handling its own video stream.
Network Configuration for Docker
To ensure that the ROS2 node in Docker can communicate with the external world, we need to run the Docker container with the “--net=host” option. This creates a network bridge between the Docker container and the base Ubuntu environment on the Kria board, assigning the same IP address to the container as the board.
However, when the Docker container and the host share the same IP address, we encounter issues with ROS2 communication between two nodes, one in the Docker container and the other on the host. ROS2 mistakenly assumes that the nodes are on the same machine and attempts to use shared memory instead of UDP, which is not available between the Docker container and the host. This prevents communication. To resolve this, we need to start the Docker container without the "--net=host" option, forcing ROS2 to use UDP for communication with the host. This, however, prevents communication between nodes in the Docker container and a remote PC on the same local network.
In the following section, after installing ROS2 and starting to use it, we will run the Docker container without the "--net=host" option to enable communication between the container and the host Ubuntu on the Kria board.
Step-by-Step Instructions
We will use the same Docker container provided by AMD Xilinx in their Smartcam demo application as a basis, as it already contains everything needed for face detection, including VVAS with GStreamer, hardware acceleration, and DPU support on the FPGA. We have already successfully run our face extraction application on it.
Ensure the Smartcam firmware is loaded on the FPGA before starting the Docker container.
# Disable the desktop environment
sudo xmutil desktop_disable
# load smartcam firmware
sudo xmutil unloadapp
sudo xmutil loadapp kv260-smartcam
1. Start a new Docker Container:
- Start the Docker container for the Smartcam demo application:
docker run \
--rm \
--env="DISPLAY" \
-h "xlnx-docker" \
--env="XDG_SESSION_TYPE" \
--net=host \
--privileged \
--volume="$HOME/.Xauthority:/root/.Xauthority:rw" \
-v /tmp:/tmp \
-v /dev:/dev \
-v /sys:/sys \
-v /etc/vart.conf:/etc/vart.conf \
-v /lib/firmware/xilinx:/lib/firmware/xilinx \
-v /run:/run \
-it xilinx/smartcam:2022.1 bash
2. Install ROS2 Humble in Docker:
- Inside the running Docker container, install ROS2 Humble base version and colcon:
# Install ROS 2 humble base
# update the system first
apt update
# and install some needed tools first
apt install curl vim git
# get GPG key
curl -sSL https://raw.githubusercontent.com/ros/rosdistro/master/ros.key -o /usr/share/keyrings/ros-archive-keyring.gpg
# and install it
echo "deb [arch=arm64 signed-by=/usr/share/keyrings/ros-archive-keyring.gpg] http://packages.ros.org/ros2-testing/ubuntu jammy main" | tee /etc/apt/sources.list.d/ros2.list > /dev/null
# update the system again
apt update
# install ROS2 humble base version
apt install ros-humble-ros-base
# you’ll be asked for some location info during installation,
# enter the corresponding numbers
# install ros2 compilers
apt install ros-dev-tools
# setup the ROS2 environment
# edit ~/.bashrc startup script
vi ~/.bashrc
# and add the following at the end of file:
# ROS2 setup
source /opt/ros/humble/setup.bash
source /usr/share/colcon_argcomplete/hook/colcon-argcomplete.bash
# run the startup script
source ~/.bashrc
# check if everything works
ros2 --help
colcon build --help
3. Install PySerial in Docker:
- To communicate with the rotator, we need the PySerial library, which can be installed using the pip command:
pip3 install pyserial
pip3 install pymodbus
- We also install the PyModbus library at this stage, as it will be required later for reading data from the inclinometer.
4. Save the Docker image:
- Open a new SSH session to your Kria board while the Docker container is still running.
- Get the running container ID using:
docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
924f0290ec9d xilinx/smartcam:2022.1 "bash" 25 minutes ago Up 25 minutes ecstatic_brown
- Commit the Docker image to save it on the Kria's SD card (it'll take some time):
docker commit <container_id> eagle-eye-ros-humble:1.0
sha256:f87fcadfb59aa69b9e6ebb61133b0d76f7e06b971…..
docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
eagle-eye-ros-humble 1.0 f87fcadfb59a 49 seconds ago 1.96GB
By following these steps, you will have a Docker image with ROS2 Humble installed and ready to facilitate the communication between the GStreamer application and the ROS2 environment. This ensures a modular and scalable system setup, allowing for flexible development and deployment of various components in your application.
Each time you need to start the ROS2-based application, use the saved Docker image with the same parameters as before, but omit the "--net=host" option if you want to listen to ROS2 topic messages on the Kria Ubuntu system that are sent from the Docker container. Use the previously saved image with the following command:
docker run \
--rm \
--env="DISPLAY" \
-h "xlnx-docker" \
--env="XDG_SESSION_TYPE" \
--privileged \
--volume="$HOME/.Xauthority:/root/.Xauthority:rw" \
-v /tmp:/tmp \
-v /dev:/dev \
-v /sys:/sys \
-v /etc/vart.conf:/etc/vart.conf \
-v /lib/firmware/xilinx:/lib/firmware/xilinx \
-v /run:/run \
-it eagle-eye-ros-humble:1.0 bash
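Before creating the project-specific nodes, it is worth verifying that a node running inside the container can actually be heard on the Kria Ubuntu side. The following minimal talker is only a connectivity-check sketch of ours, not part of the project sources (the file name, node name, and chatter topic are arbitrary). Save it, for example, as /tmp/talker.py inside the container and run it with python3 /tmp/talker.py, then listen on the Kria host with ros2 topic echo /chatter.
#!/usr/bin/env python3
# Minimal connectivity check (hypothetical helper, not part of the project):
# publishes a std_msgs/String message once per second on the "chatter" topic.
import rclpy
from rclpy.node import Node
from std_msgs.msg import String
class Talker(Node):
    def __init__(self):
        super().__init__("docker_talker")
        self.pub_ = self.create_publisher(String, "chatter", 10)
        self.count_ = 0
        self.create_timer(1.0, self.tick)  # fire once per second
    def tick(self):
        msg = String()
        msg.data = f"hello from the container #{self.count_}"
        self.count_ += 1
        self.pub_.publish(msg)
def main():
    rclpy.init()
    node = Talker()
    try:
        rclpy.spin(node)
    except KeyboardInterrupt:
        pass
if __name__ == "__main__":
    main()
If the echo command on the host prints the messages, the UDP path between the container and the Kria Ubuntu system works, and the same mechanism will carry the face detection messages in the next section.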
Creating the First ROS2 Node and Running the GStreamer Pipeline
In this section, we will extend our application by creating a ROS2 node that integrates our GStreamer pipeline. This node will publish the largest detected bounding box coordinates on a ROS2 topic.
Steps to Set-Up the ROS2 Node
1. Run the Previously Saved Docker Image (if not already running)
Start your Docker container with the saved image that includes the ROS2 Humble installation. Omit the --net=host option if you want to listen to ROS2 topic messages on the Kria Ubuntu system.
2. Go to the Home directory
cd ~/
3. Create ROS2 Packages
We create the eagle_eye_pkg package and the eagle_eye_interfaces package.
To save some time, clone our GitHub repository and navigate to the eagle_eye_pkg package folder. Replace the existing gstreamer_pipeline.cpp file with the one provided below in the Node Code Explanation section or take it from the Attachment Section.
This replacement is necessary because we are still working within the Smartcam Docker container, not yet within the final one, and all paths to the models and VVAS libraries need to point to the correct file locations.
cd ~
git clone https://github.com/s59mz/eagle-eye-ai.git
cd eagle-eye-ai/ros2_ws/src/eagle_eye_pkg/src
# replace the current gstreamer_pipeline.cpp with the one below
rm gstreamer_pipeline.cpp
# add new content - see below
vi gstreamer_pipeline.cpp
4. Build the Project
Use colcon to build the project:
cd /root/eagle-eye-ai/ros2_ws
colcon build --packages-select eagle_eye_pkg eagle_eye_interfaces
5. Set Environment
source install/setup.bash
6. Run the ROS2 Node
ros2 run eagle_eye_pkg gstreamer_pipeline
7. Check the ROS2 Topic messages
By starting the Docker image in isolated network mode (without the --net=host option), we can listen to topic messages from the previously started ROS2 node directly on the Kria board. To do this, open a new SSH session to the Kria board.
Since ROS2 is already installed on the host Ubuntu too, we can listen to these messages by using the ROS2 command-line tool. However, before we do this, we need to install the eagle_eye_interfaces ROS2 package so that ROS2 can recognize the eagle_eye_interfaces/msg/FaceDetect messages.
To install the eagle_eye_interfaces, follow these steps:
- Open a new SSH session to the Kria board from the host PC.
- Then, execute the following commands:
# get the sources or copy them from the Docker container
git clone https://github.com/s59mz/eagle-eye-ai.git
cd eagle-eye-ai/ros2_ws
# build the interface package
colcon build --packages-select eagle_eye_interfaces
source install/setup.bash
# check if interface is installed correctly
ros2 interface show eagle_eye_interfaces/msg/FaceDetect
Now, we can listen for the metadata of the detected faces with the largest bounding box by running this command:
ros2 topic echo /face_detect
The output should appear as follows:
---
frame_width: 640
frame_height: 360
bbox_x: 346
bbox_y: 208
bbox_width: 17
bbox_height: 22
face_detected: true
---
NOTE
You can also listen to these topic messages on the host PC. However, in this case, the Docker container running the GStreamer pipeline should be started in network bridged mode using the --net=host option.
Save the Docker Image
Ensure you save the running Docker container to avoid losing all newly added ROS2 code. First, stop the running ROS2 nodes, but do not exit the Docker container yet. Instead, open another SSH session to the Kria board while the container is still running. Commit this container under the same Docker image name using the command:
docker commit <container_id> eagle-eye-ros-humble:1.0
Node Code Explanation
Here is the gstreamer_pipeline.cpp file from the eagle_eye_pkg, which defines the ROS2 node gstreamer_pipeline.
#include <gst/gst.h>
#include <rclcpp/rclcpp.hpp>
#include "eagle_eye_interfaces/msg/face_detect.hpp"
#include "eagle_eye_pkg/gstinferencemeta.h"
class GStreamerPipeline : public rclcpp::Node {
public:
GStreamerPipeline()
: Node("gstreamer_pipeline") {
publisher_ = this->create_publisher<eagle_eye_interfaces::msg::FaceDetect>("face_detect", 10);
// Initialize GStreamer
gst_init(nullptr, nullptr);
// Build the pipeline
pipeline_ = gst_parse_launch("rtspsrc location=rtsp://192.168.1.11:554/stream1 ! "
"rtph265depay ! h265parse ! omxh265dec ! videorate ! videoconvert ! "
"video/x-raw, framerate=30/1, format=NV12 ! tee name=t ! queue ! "
"vvas_xmultisrc kconfig=\"/opt/xilinx/kv260-smartcam/share/vvas/facedetect/preprocess.json\" ! queue ! "
"vvas_xfilter kernels-config=\"/opt/xilinx/kv260-smartcam/share/vvas/facedetect/aiinference.json\" ! "
"ima.sink_master vvas_xmetaaffixer name=ima ima.src_master ! queue name=probe max-size-buffers=1 leaky=2 ! fakesink "
"t. ! queue max-size-buffers=1 leaky=2 ! ima.sink_slave_0 ima.src_slave_0 ! queue ! "
"vvas_xfilter name=\"draw\" kernels-config=\"/opt/xilinx/kv260-smartcam/share/vvas/facedetect/drawresult.json\" ! queue ! "
"kmssink driver-name=xlnx plane-id=39 sync=false fullscreen-overlay=true", nullptr);
if (!pipeline_) {
RCLCPP_ERROR(this->get_logger(), "Failed to create pipeline");
rclcpp::shutdown();
return;
}
// Get the vvas_xmetaaffixer element
probe_ = gst_bin_get_by_name(GST_BIN(pipeline_), "probe");
if (!probe_) {
RCLCPP_ERROR(this->get_logger(), "Failed to get probe element");
gst_object_unref(pipeline_);
rclcpp::shutdown();
return;
}
// Attach a probe to the src pad of the probe element
GstPad *src_pad = gst_element_get_static_pad(probe_, "src");
if (!src_pad) {
RCLCPP_ERROR(this->get_logger(), "Failed to get src pad from probe");
gst_object_unref(probe_);
gst_object_unref(pipeline_);
rclcpp::shutdown();
return;
}
gst_pad_add_probe(src_pad, GST_PAD_PROBE_TYPE_BUFFER, probe_callback, this, nullptr);
gst_object_unref(src_pad);
// Start playing
gst_element_set_state(pipeline_, GST_STATE_PLAYING);
}
~GStreamerPipeline() {
// Free resources
gst_element_set_state(pipeline_, GST_STATE_NULL);
gst_object_unref(probe_);
gst_object_unref(pipeline_);
}
void publish_max_bounding_box(const BoundingBox *res_bbox, const BoundingBox *face_bbox) {
static unsigned int frame_count = 0;
// Publish only every other frame to keep the topic rate manageable
if ((frame_count++) % 2)
return;
auto msg = eagle_eye_interfaces::msg::FaceDetect();
if (face_bbox != nullptr) {
msg.frame_width = res_bbox->width;
msg.frame_height = res_bbox->height;
msg.bbox_x = face_bbox->x;
msg.bbox_y = face_bbox->y;
msg.bbox_width = face_bbox->width;
msg.bbox_height = face_bbox->height;
msg.face_detected = true;
} else {
msg.frame_width = 0;
msg.frame_height = 0;
msg.bbox_x = 0;
msg.bbox_y = 0;
msg.bbox_width = 0;
msg.bbox_height = 0;
msg.face_detected = false;
}
publisher_->publish(msg);
}
static void handle_inference_meta(GstMeta *meta, GStreamerPipeline *node) {
GstInferenceMeta *inference_meta = reinterpret_cast<GstInferenceMeta *>(meta);
if (inference_meta && inference_meta->prediction) {
GstInferencePrediction *prediction = inference_meta->prediction;
// Parent bbox holds the resolution of the inference frame
BoundingBox *parent_bbox = &prediction->bbox;
// Node that holds all detected faces (child nodes)
GNode *predictions = prediction->predictions;
if (predictions) {
GstInferencePrediction *largest_child_pred = nullptr;
guint max_width = 0;
for (GNode *node = predictions->children; node; node = node->next) {
GstInferencePrediction *child_pred = (GstInferencePrediction *) node->data;
if (child_pred->bbox.width > max_width) {
max_width = child_pred->bbox.width;
largest_child_pred = child_pred;
}
}
if (largest_child_pred) {
node->publish_max_bounding_box(parent_bbox, &largest_child_pred->bbox);
return;
}
}
}
node->publish_max_bounding_box(nullptr, nullptr);
}
static GstPadProbeReturn probe_callback(GstPad *pad, GstPadProbeInfo *info, gpointer user_data) {
if (!pad) {
return GST_PAD_PROBE_OK;
}
GStreamerPipeline *node = reinterpret_cast<GStreamerPipeline *>(user_data);
GstBuffer *buffer = GST_PAD_PROBE_INFO_BUFFER(info);
if (!buffer) {
node->publish_max_bounding_box(nullptr, nullptr);
return GST_PAD_PROBE_OK;
}
static GType api_type = g_type_from_name("GstVvasInferenceMetaAPI");
GstMeta *meta = gst_buffer_get_meta(buffer, api_type);
if (meta) {
handle_inference_meta(meta, node);
} else {
node->publish_max_bounding_box(nullptr, nullptr);
}
return GST_PAD_PROBE_OK;
}
private:
GstElement *pipeline_;
GstElement *probe_;
rclcpp::Publisher<eagle_eye_interfaces::msg::FaceDetect>::SharedPtr publisher_;
};
int main(int argc, char *argv[]) {
rclcpp::init(argc, argv);
auto node = std::make_shared<GStreamerPipeline>();
rclcpp::spin(node);
rclcpp::shutdown();
return 0;
}
And the Interface File: FaceDetect.msg:
# video frame resolution
int32 frame_width
int32 frame_height
# boundary box of detected face
int32 bbox_x
int32 bbox_y
int32 bbox_width
int32 bbox_height
# inference status
bool face_detected
This code sets up a ROS2 node named gstreamer_pipeline that uses the GStreamer pipeline to process video frames, detects faces, and publishes the largest bounding box coordinates to the face_detect topic. If no faces are detected, a message indicating this is published.
Detailed Description of the ROS2 Node Code
The ROS2 node GStreamerPipeline is a critical component of this project, designed to process video streams, detect faces, and publish the largest detected face's bounding box coordinates. Below is a detailed breakdown of the code and its functionality.
Header Inclusions and Namespace
- GStreamer (gst.h): Provides the necessary functions and data structures for the GStreamer multimedia framework.
- rclcpp (rclcpp.hpp): Includes the ROS2 C++ client library, allowing for the creation of ROS2 nodes.
- FaceDetect Message (face_detect.hpp): Defines the custom message type for publishing face detection results.
- Inference Metadata (gstinferencemeta.h): Declares the inference metadata structures (GstInferenceMeta, GstInferencePrediction, BoundingBox) used to read the face detection results.
Class Definition
class GStreamerPipeline : public rclcpp::Node
- GStreamerPipeline Class: Inherits from rclcpp::Node, making it a ROS2 node. It contains methods for initializing the GStreamer pipeline, handling inference metadata, and publishing detected face information.
Constructor
- ROS2 Publisher: Creates a ROS2 publisher that will publish messages of type FaceDetect on the topic face_detect.
- GStreamer Initialization: Initializes the GStreamer library.
- Pipeline Construction: Constructs the GStreamer pipeline using gst_parse_launch, specifying the various elements for decoding, converting, and processing the video stream.
- Error Handling: Checks if the pipeline and probe elements are successfully created. If not, logs an error and shuts down the node.
- Probe Attachment: Attaches a probe to the “src” pad of the “probe” element to inspect and handle buffers passing through it.
- Pipeline Start: Sets the pipeline to the playing state.
Destructor
- Resource Cleanup: Ensures that the GStreamer pipeline and elements are properly cleaned up when the node is destroyed.
Publishing Method
void GStreamerPipeline::publish_max_bounding_box(const BoundingBox *res_bbox, const BoundingBox *face_bbox)
- Frame Skipping (optional): Publishes messages only once per N frames to avoid overloading the system.
- Message Population: Fills the FaceDetect message with the bounding box coordinates and the frame resolution. If no face is detected, sets all values to zero and face_detected to false.
- Publishing: Publishes the message on the face_detect topic.
Inference Metadata Handling
void GStreamerPipeline::handle_inference_meta(GstMeta *meta, GStreamerPipeline *node)
- Extracts face detection results from the inference metadata. Finds the largest detected face based on the bounding box width and calls publish_max_bounding_box to publish the result.
Probe Callback
GstPadProbeReturn GStreamerPipeline::probe_callback(GstPad *pad, GstPadProbeInfo *info, gpointer user_data)
- Pad Probe Callback: Called for each buffer passing through the probe element. Extracts inference metadata and handles it, or publishes an empty message if no metadata is present.
Main Function
- ROS2 Initialization: Initializes the ROS2 client library.
- Node Creation: Creates an instance of the GStreamerPipeline node.
- Node Spinning: Keeps the node running, processing callbacks.
- ROS2 Shutdown: Cleans up and shuts down the ROS2 client library when the node stops.
This detailed explanation covers the structure and functionality of the GStreamerPipeline ROS2 node, highlighting its role in processing video streams and publishing face detection results.
Controlling the Camera Rotator
At the beginning of this project, we tested communication with the rotator using a USB to RS485 dongle and hardcoded Pelco-D protocol messages sent directly from the console. This testing phase highlighted two main challenges we need to address:
1. Building a ROS2 Node for Motor Control:
- We need to develop a ROS2 node capable of controlling the rotator motor via Pelco-D protocol messages. This node should accept input commands for pan and tilt directions and speeds, as well as stop commands.
- Additionally, we need to control the orange signaling lamp on the rotator using an auxiliary switch. The lamp should turn on when a face is detected and turn off when no faces are detected.
2. RS485 Communication Challenges:
- The current Kria platform from the Smartcam demo application does not yet support the RS485 PMOD module. As a workaround, we can use the host PC or the Kria board itself with a USB dongle for RS485 communication.
- In the future, we plan to demonstrate the full potential of the Kria board by utilizing its PMOD interfaces. This involves building a new hardware platform to support these interfaces and using a custom-made RS485 PMOD module. The PCB, schematic, and gerber files for this module are freely available on our GitHub. You can order the PMOD module (either as a bare PCB or fully assembled) from the JLCPCB website using the provided gerbers.zip file. In the final application, we will use this PMOD module for Pelco-D communication with the rotator.
To control the rotator motor, we will build a ROS2 node that listens for motor command messages and sends the appropriate Pelco-D protocol messages. The node will also control the signaling lamp based on face detection status.
Here are the message definitions for the rotator_controller ROS2 node:
- rotator_interfaces/msg/MotorCmd.msg:
# pan-left is positive
# pan-right is negative
# pan value is from -63 to +63, 255 is turbo
int32 pan_speed
# tilt up is positive
# tilt down is negative
# tilt value is from -63 to +63
int32 tilt_speed
# stop is both zero
- rotator_interfaces/msg/SwitchCmd.msg:
bool switch_on
Implementation Steps
1. Open a new SSH session to your Kria board or use your host PC with ROS2 Humble installed. Ensure you have installed the patched CH-340 USB to serial driver mentioned in the earlier sections, as the USB to RS485 dongle will not work properly without it. Connect the USB to RS485 dongle to the host PC or Kria board.
NOTE
The PMOD module should not be installed on the Kria board yet, because we don't have the right platform image.
If you have issues installing the patched CH340 driver on the Kria board, try the following:
# plug the RS485-USB dongle to Kria
# then unload the driver
sudo rmmod ch341
# unplug and plug again the RS485-USB dongle - very quickly!
# check if device is created successfully
ls /dev/ttyUSB*
# if not, then unload the ch341 driver again and re-plug the dongle quickly
# then try to turn the lamp on top of the camera on and off:
echo -en '\xff\x01\x00\x09\x00\x01\x0b' > /dev/ttyUSB0
echo -en '\xff\x01\x00\x0b\x00\x01\x0d' > /dev/ttyUSB0
# and repeat the process if it does not work. This driver is very unstable!
2. Go to the ROS2 workspace:
Navigate to the previously cloned Git repository on the Kria Ubuntu system, not within the Docker container, as the gstreamer_pipeline node is already present there and is not yet configured for running multiple nodes.
cd ~/eagle-eye-ai/ros2_ws
3. Build the rotator_py_pkg package:
colcon build --packages-select rotator_py_pkg rotator_interfaces
source install/setup.bash
4. Explore the rotator_controller ROS2 node in the rotator_py_pkg package:
- Here is a basic implementation of the rotator_controller node that listens for MotorCmd and SwitchCmd messages and sends the corresponding Pelco-D protocol messages to control the motor and signaling lamp. Please note that the details of the Pelco-D protocol are beyond the scope of this project.
File name: rotator_controller.py
#!/usr/bin/env python3
import rclpy
import serial
from rclpy.node import Node
from rotator_interfaces.msg import MotorCmd, SwitchCmd
class RotatorControllerNode(Node):
def __init__(self):
super().__init__("rotator_controller")
# Declare parameters for serial communication
self.declare_parameter("port", "/dev/ttyUSB0")
self.declare_parameter("baudrate", "9600")
self.declare_parameter("address", 1)
# Retrieve parameter values
self.port_ = self.get_parameter("port").value
self.baudrate_ = self.get_parameter("baudrate").value
self.address_ = self.get_parameter("address").value
# Open serial port for communication with the RS485 dongle
self.serial_ = self.open_serial_port(self.port_, self.baudrate_)
# Initialize Pelco-D message buffer
self.pelcod_message_ = self.create_pelco_d_message()
# Create subscribers for motor control and switch control topics
self.sub_motor_control = self.create_subscription(
MotorCmd, "motor_control", self.callback_motor_control, 10
)
self.sub_switch_control = self.create_subscription(
SwitchCmd, "switch_control", self.callback_switch_control, 10
)
self.get_logger().info("Node Created")
def callback_switch_control(self, switch_cmd: SwitchCmd):
# Update Pelco-D message based on switch control command
self.create_pelco_d_message()
self.update_switch(switch_cmd.switch_on)
self.update_checksum()
self.send_message()
def callback_motor_control(self, motor_cmd: MotorCmd):
# Update Pelco-D message based on motor control command
self.create_pelco_d_message()
self.update_pan_speed(motor_cmd.pan_speed)
self.update_tilt_speed(motor_cmd.tilt_speed)
self.update_checksum()
self.send_message()
def update_switch(self, switch_on: bool):
# Update Pelco-D message to control switch
if switch_on:
cmd2 = 0x09 # Command to turn switch on
else:
cmd2 = 0x0b # Command to turn switch off
self.pelcod_message_[3] = 0xff & cmd2
self.pelcod_message_[5] = 0x01 # Auxiliary command for switch control
def update_pan_speed(self, speed: int):
# Update Pelco-D message with pan speed command
cmd2 = self.pelcod_message_[3]
cmd2 = cmd2 & ~0x06
if speed > 0: # Pan left
cmd2 = cmd2 | 0x04
elif speed < 0: # Pan right
cmd2 = cmd2 | 0x02
speed *= -1 # Convert negative speed to positive for command
self.pelcod_message_[3] = 0xff & cmd2
if speed > 0x3F:
speed = 0xff # Limit speed if necessary
self.pelcod_message_[4] = 0xff & speed
def update_tilt_speed(self, speed: int):
# Update Pelco-D message with tilt speed command
cmd2 = self.pelcod_message_[3]
cmd2 = cmd2 & ~0x18
if speed > 0: # Tilt up
cmd2 = cmd2 | 0x08
elif speed < 0: # Tilt down
cmd2 = cmd2 | 0x10
speed *= -1 # Convert negative speed to positive for command
self.pelcod_message_[3] = 0xff & cmd2
if speed > 0x3F:
speed = 0x3F # Limit speed if necessary
self.pelcod_message_[5] = 0xff & speed
def update_checksum(self):
# Calculate and update checksum of Pelco-D message
checksum = 0
for i in range(1, 6):
checksum += self.pelcod_message_[i]
self.pelcod_message_[6] = 0xFF & checksum
def create_pelco_d_message(self):
# Initialize Pelco-D message structure
self.pelcod_message_ = bytearray([0xff, self.address_, 0x00, 0x00, 0x00, 0x00, 0x00])
def send_message(self):
# Send Pelco-D message via serial port
self.serial_.write(self.pelcod_message_)
def open_serial_port(self, port, baudrate=9600, timeout=1):
# Open serial port for communication
try:
ser = serial.Serial(port, baudrate=baudrate, timeout=timeout)
self.get_logger().info(f"Serial port {port} opened successfully.")
return ser
except serial.SerialException as e:
self.get_logger().error(f"Failed to open serial port {port}: {e}")
return None
def close_serial_port(self):
# Close serial port
self.serial_.close()
self.serial_ = None
self.pelcod_message_ = None
self.get_logger().info(f"Serial port {self.port_} is now closed.")
def main(args=None):
rclpy.init(args=args)
node = RotatorControllerNode()
try:
rclpy.spin(node)
except KeyboardInterrupt:
node.close_serial_port()
return
node.close_serial_port()
rclpy.shutdown()
if __name__ == "__main__":
main()
Rotator Controller Node description
This Python script implements a ROS2 node named RotatorControllerNode that controls a camera rotator using the Pelco-D protocol via a USB to RS485 dongle. Here’s a breakdown of its functionality:
1. Imports and Setup:
- Imports necessary libraries (rclpy for ROS2 and serial for serial communication).
- Imports message types (MotorCmd and SwitchCmd) from custom ROS2 interfaces.
2. Node Initialization (__init__):
- Initializes the node with the name “rotator_controller”.
- Declares ROS2 parameters (port, baudrate, address) for configuring the serial port and Pelco-D protocol.
- Retrieves parameter values and initializes serial communication (open_serial_port).
- Creates subscriptions (sub_motor_control, sub_switch_control) to the ROS2 topics (motor_control, switch_control) for receiving commands.
3. Callback Functions:
- callback_switch_control: Handles incoming messages from switch_control topic to turn the switch on/off based on SwitchCmd message.
- callback_motor_control: Handles incoming messages from motor_control topic to control pan and tilt movements based on MotorCmd message.
4. Pelco-D Protocol Functions:
- update_switch: Updates Pelco-D message buffer for controlling switch state (switch_on).
- update_pan_speed: Updates Pelco-D message buffer for setting pan speed (pan_speed).
- update_tilt_speed: Updates Pelco-D message buffer for setting tilt speed (tilt_speed).
- update_checksum: Calculates and updates checksum of the Pelco-D message buffer.
5. Message Creation (create_pelco_d_message):
- Initializes the Pelco-D message buffer with default values (sync byte, address, command bytes, data bytes, checksum).
6. Serial Communication:
- send_message: Sends the constructed Pelco-D message buffer over the serial port (serial_).
7. Serial Port Management:
- open_serial_port: Opens the serial port (port) with specified baudrate (baudrate) and timeout (timeout).
- close_serial_port: Closes the serial port and releases resources.
8. Main Functionality (main):
- Initializes the ROS2 node (RotatorControllerNode).
- Enters the ROS2 spin loop to process callbacks and maintain node activity.
- Handles keyboard interrupt (KeyboardInterrupt) to safely close the serial port (close_serial_port) before exiting.
This node enables communication between a ROS2-based system and a camera rotator via the Pelco-D protocol, facilitating control of pan/tilt movements and switch states based on incoming commands through ROS2 topics.
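To make the byte-level behaviour easier to verify, here is a small stand-alone sketch of ours (a hypothetical helper, not part of the repository) that rebuilds the "lamp on" frame we sent earlier from the console with echo and checks that a checksum computed the same way as in update_checksum matches the hard-coded bytes ff 01 00 09 00 01 0b:
#!/usr/bin/env python3
# Stand-alone Pelco-D sanity check (hypothetical helper, not project code).
def pelco_d_frame(address, cmd1, cmd2, data1, data2):
    # Byte layout: sync, address, command1, command2, data1, data2, checksum
    frame = bytearray([0xFF, address, cmd1, cmd2, data1, data2, 0x00])
    frame[6] = sum(frame[1:6]) & 0xFF  # checksum = low byte of the sum of bytes 1..5
    return frame
lamp_on = pelco_d_frame(address=0x01, cmd1=0x00, cmd2=0x09, data1=0x00, data2=0x01)
print(lamp_on.hex())  # prints: ff01000900010b
assert lamp_on == bytearray.fromhex("ff01000900010b")
# To push the frame through the USB-to-RS485 dongle (the port name is an assumption):
#   import serial
#   with serial.Serial("/dev/ttyUSB0", baudrate=9600, timeout=1) as ser:
#       ser.write(lamp_on)
If the printed bytes match the ones used in the earlier console test, the frame construction in rotator_controller.py is doing exactly what we expect.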
Test the Rotator Controller Node
After successfully building the node, you can test the Motor Controller as follows:
1. First, check the serial port on the host where the USB to RS485 dongle of the rotator is connected:
ls /dev/ttyU*
2. Run the rotator_controller node:
source install/setup.bash
ros2 run rotator_py_pkg rotator_controller --ros-args -p port:=/dev/ttyUSB0
Ensure that the correct /dev/ttyU* port is selected.
Then, open another SSH session to the Kria board and publish the corresponding topic message to the rotator_controller node by running the following commands:
cd ~/eagle-eye-ai/ros2_ws
source install/setup.bash
# check the interface
ros2 interface show rotator_interfaces/msg/MotorCmd
# lamp on
ros2 topic pub --once /switch_control rotator_interfaces/msg/SwitchCmd "{switch_on: true}"
# lamp off
ros2 topic pub --once /switch_control rotator_interfaces/msg/SwitchCmd "{switch_on: false}"
# pan left
ros2 topic pub --once /motor_control rotator_interfaces/msg/MotorCmd "{pan_speed: 10, tilt_speed: 0}"
# pan right
ros2 topic pub --once /motor_control rotator_interfaces/msg/MotorCmd "{pan_speed: -10, tilt_speed: 0}"
# tilt up
ros2 topic pub --once /motor_control rotator_interfaces/msg/MotorCmd "{pan_speed: 0, tilt_speed: 10}"
# tilt down
ros2 topic pub --once /motor_control rotator_interfaces/msg/MotorCmd "{pan_speed: 0, tilt_speed: -10}"
# motor stop
ros2 topic pub --once /motor_control rotator_interfaces/msg/MotorCmd "{pan_speed: 0, tilt_speed: 0}"
The rotator should start moving in the selected direction, and the signaling lamp can be turned on or off. Successfully testing these commands sent from the ROS2 topic means we are ready for the next step.
Controlling the Camera
Introduction
In this section, we will integrate the components we have built so far: the GStreamer pipeline for face detection and the rotator controller node for motor control via the Pelco-D protocol. The final step is to create a camera controller ROS2 node that bridges these components, enabling automatic face tracking.
Overview
1. GStreamer Pipeline Node: This node runs a face detection inference, extracts bounding box coordinates, and publishes them as messages on a ROS2 topic.
2. Rotator Controller Node: This node communicates with the rotator using the Pelco-D protocol, receiving commands via ROS2 messages to control the motor for panning and tilting, as well as to switch an auxiliary lamp on or off.
Camera Controller Node
To close the control loop, we need a camera controller ROS2 node that accepts bounding box data from the detected face and controls the rotator motor accordingly. This node will:
- Subscribe to the face_detect topic to receive bounding box messages.
- Process the bounding box data to determine the necessary motor commands.
- Publish motor control commands to the motor_control topic.
- Manage the state of the auxiliary lamp via the switch_control topic.
State Machine for Control
The CameraControllerNode operates as a state machine with two states: idle and face tracking. When a face is detected, it switches from the idle state to the tracking state and calculates the motor speed based on the bounding box position relative to the frame center. This ensures the camera tracks the detected face smoothly. To prevent overloading the RS-485 communication protocol, which is slower than the frame rate, the node uses a timer to publish motor commands at a configurable interval.
If the face is no longer detected, the node counts the number of consecutive empty frames. If this count exceeds a configurable timeout, it switches back to the idle state. The auxiliary lamp indicates the state: on in tracking mode and off in idle mode. Additionally, the node implements hysteresis to avoid excessive motor adjustments.
Camera Controller Node Implementation
Here's the Python code for the CameraControllerNode in the camera_controller.py file:
#!/usr/bin/env python3
import rclpy
from rclpy.node import Node
from rotator_interfaces.msg import MotorCmd, SwitchCmd
from eagle_eye_interfaces.msg import FaceDetect
class CameraControllerNode(Node):
def __init__(self):
super().__init__("camera_controller")
# Get configurable parameters
self.declare_parameter("updating_period", 0.2)
self.updating_period_ = self.get_parameter("updating_period").value
self.declare_parameter("frame_timeout", 10)
self.frame_timeout_ = self.get_parameter("frame_timeout").value
self.subscriber_ = self.create_subscription(
FaceDetect, "face_detect", self.callback_face_detect, 10
)
# Create publishers
self.motor_publisher_ = self.create_publisher(MotorCmd, "motor_control", 10)
self.switch_publisher_ = self.create_publisher(SwitchCmd, "switch_control", 10)
self.lamp_on_ = False
self.face_detected_ = False
self.updating_timer_ = None
# Face tracking state
self.face_tracking_ = False
self.frame_count_ = 0
# Motor speed definition
self.pan_speed_ = 0
self.tilt_speed_ = 0
self.motor_cmd_ = MotorCmd()
self.switch_cmd_ = SwitchCmd()
self.update_motor_speed()
self.switch_lamp(False)
self.get_logger().info("Node Created")
def callback_face_detect(self, msg: FaceDetect):
if msg.face_detected:
self.frame_count_ = 0
# Calculate face position relative to frame center
center_x = msg.frame_width / 2
center_y = msg.frame_height / 2
face_center_x = msg.bbox_x + msg.bbox_width / 2
face_center_y = msg.bbox_y + msg.bbox_height / 2
self.offset_y = face_center_y - center_y
self.offset_x = face_center_x - center_x
# Calculate required motor speed for tracking
if abs(self.offset_x) > msg.bbox_width / 2:
self.pan_speed_ = int(-self.offset_x / 2)
else:
# Face is centered enough in X
self.pan_speed_ = 0
if abs(self.offset_y) > msg.bbox_height / 2:
self.tilt_speed_ = int(-self.offset_y)
else:
# Face is centered enough in Y
self.tilt_speed_ = 0
if not self.face_tracking_:
self.start_face_tracking()
else:
if self.face_tracking_:
self.frame_count_ += 1
if self.frame_count_ > self.frame_timeout_:
self.stop_face_tracking()
def start_face_tracking(self):
self.face_tracking_ = True
# Start the motor immediately
self.update_motor_speed()
# Turn the lamp on
self.switch_lamp(True)
# Start updating timer
if self.updating_timer_ is None:
self.updating_timer_ = self.create_timer(self.updating_period_, self.update_motor_speed)
def stop_face_tracking(self):
self.face_tracking_ = False
# Stop the motor immediately
self.pan_speed_ = 0
self.tilt_speed_ = 0
self.update_motor_speed()
# Turn the lamp off
self.switch_lamp(False)
# Stop updating timer
if self.updating_timer_:
self.updating_timer_.cancel()
self.updating_timer_ = None
def update_motor_speed(self):
self.motor_cmd_.pan_speed = self.pan_speed_
self.motor_cmd_.tilt_speed = self.tilt_speed_
self.motor_publisher_.publish(self.motor_cmd_)
def switch_lamp(self, lamp_state):
self.switch_cmd_.switch_on = lamp_state
self.switch_publisher_.publish(self.switch_cmd_)
def main(args=None):
rclpy.init(args=args)
node = CameraControllerNode()
try:
rclpy.spin(node)
except KeyboardInterrupt:
return
if __name__ == "__main__":
main()
This node ensures smooth and efficient control of the camera’s pan and tilt functions, allowing it to track detected faces accurately. It uses ROS2 for communication and leverages the GStreamer pipeline’s output for face detection.
Camera Controller Code Description
The CameraControllerNode is a ROS2 node implemented in Python that manages the control of a camera rotator to track faces detected in a video stream. This node subscribes to face detection messages, processes the bounding box coordinates of detected faces, and publishes commands to control the rotator's motors. It operates as a state machine with two states: idle and face tracking.
Key Components
1. Initialization:
- The node is initialized by defining its name and parameters.
- Parameters include updating_period (the interval for sending motor commands) and frame_timeout (the number of frames to wait before switching back to idle if no face is detected).
- The node subscribes to the face_detect topic to receive face detection messages and sets up publishers for motor_control and switch_control topics to control the rotator and the signaling lamp.
2. Face Detection Callback:
- callback_face_detect: This method processes incoming face detection messages. It calculates the position of the detected face relative to the center of the frame and determines the required motor speeds to center the face in the frame.
- If a face is detected, the node transitions to the face tracking state. If no face is detected for a specified number of frames, it transitions back to the idle state.
3. State Management:
- Idle State: The node does nothing in this state except waiting for face detection messages. The motor speeds are set to zero, and the signaling lamp is turned off.
- Face Tracking State: The node calculates the motor speeds based on the face position and publishes these speeds at regular intervals defined by the updating_period. The signaling lamp is turned on in this state.
4. Motor Speed and Lamp Control:
- update_motor_speed: This method constructs and publishes motor control commands based on the calculated motor speeds.
- switch_lamp: This method constructs and publishes commands to turn the signaling lamp on or off.
5. Timers:
- A timer is used to periodically send motor control commands while in the face tracking state. This ensures the motor commands are sent at a lower frequency than the frame rate to match the communication speed of the RS-485 protocol.
Conclusion
The CameraControllerNode effectively closes the loop for a face tracking system by using ROS2 to integrate face detection with motor control. It transitions between idle and face tracking states based on the presence of detected faces and uses a timer to manage the frequency of motor commands. This approach ensures smooth and efficient control of the camera rotator, making the overall system responsive and reliable.
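Before closing the loop with the live GStreamer pipeline, the CameraControllerNode can also be exercised in isolation. The script below is only a test sketch of ours (the file name, node name, and bounding-box values are made up, not part of the repository): it publishes synthetic FaceDetect messages describing a face well to the left of a 640x360 frame, so a running camera_controller should switch the lamp on and command a pan movement to the left.
#!/usr/bin/env python3
# Hypothetical test helper (not in the repository): publishes synthetic
# FaceDetect messages so camera_controller can be tested without the pipeline.
import rclpy
from rclpy.node import Node
from eagle_eye_interfaces.msg import FaceDetect
class FakeFacePublisher(Node):
    def __init__(self):
        super().__init__("fake_face_publisher")
        self.pub_ = self.create_publisher(FaceDetect, "face_detect", 10)
        self.timer_ = self.create_timer(0.1, self.publish_face)  # ~10 messages per second
    def publish_face(self):
        msg = FaceDetect()
        msg.frame_width = 640
        msg.frame_height = 360
        msg.bbox_x = 50          # face center well left of the frame center
        msg.bbox_y = 160
        msg.bbox_width = 40
        msg.bbox_height = 40
        msg.face_detected = True
        self.pub_.publish(msg)
def main(args=None):
    rclpy.init(args=args)
    node = FakeFacePublisher()
    try:
        rclpy.spin(node)
    except KeyboardInterrupt:
        pass
if __name__ == "__main__":
    main()
Publishing messages with face_detected set to false (more than frame_timeout times in a row) should then switch the node back to the idle state and turn the lamp off, mirroring the behaviour described above.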
Integrating the Camera Controller
To fully integrate the CameraControllerNode and test the entire system, follow these steps:
Integrate the CameraControllerNode into the rotator_py_pkg (if you are not already using our code from GitHub):
1. Add the CameraControllerNode:
- Navigate to the rotator_py_pkg package, where the rotator controller is already installed.
- Add the camera_controller.py script to the package.
2. Update package.xml:
- Open the package.xml file and add the necessary dependencies for the camera controller. Ensure it includes dependencies for both the rotator interfaces and eagle eye interfaces.
<?xml version="1.0"?>
<?xml-model href="http://download.ros.org/schema/package_format3.xsd" schematypens="http://www.w3.org/2001/XMLSchema"?>
<package format="3">
<name>rotator_py_pkg</name>
<version>0.0.0</version>
<description>TODO: Package description</description>
<maintainer email="root@todo.todo">root</maintainer>
<license>TODO: License declaration</license>
<depend>rclpy</depend>
<depend>serial</depend>
<depend>rotator_interfaces</depend>
<depend>eagle_eye_interfaces</depend>
<test_depend>ament_copyright</test_depend>
<test_depend>ament_flake8</test_depend>
<test_depend>ament_pep257</test_depend>
<test_depend>python3-pytest</test_depend>
<export>
<build_type>ament_python</build_type>
</export>
</package>
3. Update setup.py:
- Open the setup.py file and ensure it includes the new script.
from setuptools import find_packages, setup
package_name = 'rotator_py_pkg'
setup(
name=package_name,
version='0.0.0',
packages=find_packages(exclude=['test']),
data_files=[
('share/ament_index/resource_index/packages',
['resource/' + package_name]),
('share/' + package_name, ['package.xml']),
],
install_requires=['setuptools'],
zip_safe=True,
maintainer='root',
maintainer_email='root@todo.todo',
description='TODO: Package description',
license='TODO: License declaration',
tests_require=['pytest'],
entry_points={
'console_scripts': [
"rotator_controller = rotator_py_pkg.rotator_controller:main",
"camera_controller = rotator_py_pkg.camera_controller:main"
],
},
)
4. Compile the Package:
- Compile the package using colcon build.
colcon build --packages-select rotator_py_pkg
5. Set Up the Environment:
- Source the environment setup script.
source install/setup.bash
Start the Whole System
Three nodes need to be run:
- gstreamer_pipeline in the eagle_eye_pkg on the Kria board in Docker environment,
- rotator_controller and camera_controller in the rotator_py_pkg on the remote host PC or directly on the Kria board.
All three created nodes should communicate with each other as depicted in the ROS2 node diagram below, taken from the RQt application.
To run the nodes and test the system, follow these steps:
1. Docker Setup on Kria Board:
Start the Docker container on the Kria board with the --net=host option, so that the nodes running in the container can communicate with the remote host PC where the rotator_controller and camera_controller nodes will run (see the alternative below for running everything on the Kria board).
docker run \
--rm \
--net=host \
-h "xlnx-docker" \
--env="XDG_SESSION_TYPE" \
--privileged \
--volume="$HOME/.Xauthority:/root/.Xauthority:rw" \
-v /tmp:/tmp \
-v /dev:/dev \
-v /sys:/sys \
-v /etc/vart.conf:/etc/vart.conf \
-v /lib/firmware/xilinx:/lib/firmware/xilinx \
-v /run:/run \
-it eagle-eye-ros-humble:1.0 bash
- start the gstreamer_pipeline node:
cd /root/eagle-eye-ai/ros2_ws
source install/setup.bash
ros2 run eagle_eye_pkg gstreamer_pipeline
- Alternatively, if running all nodes on the Kria board (host Ubuntu), start the Docker container without the --net=host option and ensure the USB to RS485 dongle is connected to the Kria board instead of the host PC.
2. Start Nodes on Host PC or Kria Ubuntu:
- Open two terminal windows or SSH sessions on the host PC or the Kria Ubuntu system, depending on how the Docker container was started (with or without the --net=host option).
- In the first terminal, start the rotator_controller node:
ros2 run rotator_py_pkg rotator_controller --ros-args -p port:=/dev/ttyUSB0
- In the second terminal, start the camera_controller node:
ros2 run rotator_py_pkg camera_controller
3. Check Node Communications:
Verify that all ROS2 nodes have started and can communicate with each other correctly using the RQt application on your host PC, where two nodes are already running.
- Install the RQt application from here
- Enter the command "rqt" in the terminal
- From the Plugins menu, select Introspection, then Node Graph
- Choose the option Nodes/Topics (active)
The displayed graph should resemble the one shown below:
Note about CH-340 Driver
Ensure the patched CH-340 driver is installed on the Kria board or the host PC (wherever the rotator_controller runs).
cd CH341SER
sudo make load
sudo rmmod ch341
lsmod | grep ch34
# plug & unplug the USB device
sudo dmesg | tail
Testing the Whole Setup for the First Time
The system should now be fully operational (without the inclinometer). When the camera detects a human face, it should start following it. The signaling lamp on top of the camera should turn on when a face is detected.
Move Face
Move your face around within the camera’s view and verify that the camera tracks the movement accurately, adjusting pan and tilt as needed.
Obstruct Face
Cover your face with an object such as a newspaper. After the specified timeout period, the signaling lamp should turn off, but the camera should remain in its last position.
Multiple Faces
- Introduce multiple faces into the camera’s view. The camera should always track the closest (largest) face. When the currently tracked face is no longer visible, the camera should switch to the next closest face.
- Repeat these tests several times to ensure the system functions as expected under various conditions.
Conclusion
By integrating the CameraControllerNode into the rotator_py_pkg and thoroughly testing the system, we have created a robust, closed-loop face tracking system. This system effectively combines face detection, motor control, and state management to maintain continuous tracking of detected faces. This prototype demonstrates the capability to follow moving faces and handle various scenarios such as occlusions and multiple targets, ensuring reliable performance.
Creating a New Platform with PMOD Hardware Interface Support for RS485 Module
In this section, we will create a new platform based on the existing Smartcam application demo platform, adding PMOD hardware interface support for the RS485 module. We will follow the official procedure using the Vitis Platform Flow as outlined in the AMD Xilinx tutorial.
This project uses the 2022.1 releases, so ensure you are using version 2022.1 for the tools, repositories, etc. We will base our work on the Smartcam platform kv260_ispMipiRx_vcu_DP from the AMD GitHub repository.
RS-485 PMOD Module
In our project, we will use a custom-made RS-485 PMOD module. The schematic, PCB layout, BOM, gerber, and placement files are freely available on our GitHub repository. You can order a fully assembled module or just a bare PCB directly from JLCPCB using the provided gerber files.
This module also includes two additional multipurpose LEDs and two buttons, which will be connected to the AXI GPIO IP block in our block design.
The module schematic diagram:
Prerequisites
Ensure you have installed Vitis and Vivado, both version 2022.1. Configure the environment with the following commands:
source /opt/Xilinx/Vitis/2022.1/settings64.sh
source /opt/Xilinx/Vivado/2022.1/settings64.sh
Obtaining the Platform
Since we are modifying the Smartcam platform kv260_ispMipiRx_vcu_DP, we first need to obtain the platform. Detailed instructions are available in the “Creating Vitis Platform” tutorial. Here are the specific commands to generate the platform:
1. Clone the Repository:
git clone --recursive --branch xlnx_rel_v2022.1 https://github.com/Xilinx/kria-vitis-platforms.git
2. Navigate to the Platform Directory:
cd kria-vitis-platforms/kv260/
We will refer to the above directory as $kv260-vitis, consistent with the terminology used in AMD tutorials.
3. Build the Platform
make platform PFM=kv260_ispMipiRx_vcu_DP
cd $kv260-vitis/platforms/vivado/kv260_ispMipiRx_vcu_DP/project/
The build process will take approximately 40 minutes.
The vivado project will be at:
$kv260-vitis/platforms/vivado/kv260_ispMipiRx_vcu_DP/project/kv260_ispMipiRx_vcu_DP.xpr
Open the .xpr project in Vivado.
Applying the Patch to UartLite IP in Vivado for Using the RS-485 PMOD Module
To enable the use of the RS-485 PMOD module with the UartLite IP in Vivado, apply the necessary patch. This is required because the RS-485 PMOD module uses the TX-enable signal for transmitting, which the current UartLite IP lacks. Follow these steps to apply the patch:
1. Locate the patch file:
- The patch file is located in the previously cloned repository:
kria-vitis-platforms/kr260/platforms/vivado/kr260_tsn_rs485pmod/ip/axi_uartlite_v2_0_rs485.patch
2. Copy the patch:
- Copy the patch to the Vivado UartLite IP directory:
cp kria-vitis-platforms/kr260/platforms/vivado/kr260_tsn_rs485pmod/ip/axi_uartlite_v2_0_rs485.patch /opt/Xilinx/Vivado/2022.1/data/ip/xilinx/axi_uartlite_v2_0
3. Go to Vivado IP directory:
- Navigate to the Vivado UartLite IP directory:
cd /opt/Xilinx/Vivado/2022.1/data/ip/xilinx/axi_uartlite_v2_0
4. Apply the patch:
- Apply the patch, starting with a dry run:
patch --dry-run -p1 < axi_uartlite_v2_0_rs485.patch
patch -p1 < axi_uartlite_v2_0_rs485.patch
Adding PMOD Interface to Block Design and Building Platform
In the previously built Vivado project, open the block design. We will make the following modifications to the original Smartcam design:
1. Remove the audio block:
- We don’t need it for our EagleEye-AI project. Also, delete all its associated external ports, as we will use the same PMOD pins for our RS-485 module.
2. Add the Axi_UartLite IP:
- This is to support the RS-485 PMOD module. Make the UART port external and name it pmod1_uart.
- Double-click on the axi_uartlite block and configure it as illustrated in the image below.
Note the setting of the AXI CLK Frequency parameter. It should be set to MANUAL with a value of 90, even though the actual AXI clock frequency is 100 MHz. This value was determined experimentally by measuring the UART output signal with an oscilloscope: the IP has some issues translating the input clock frequency into the baud rate. The baud rate itself is fixed at 9600 for our needs at design time and cannot be changed later by the software driver.
3. Add the GPIO block:
- This will control the additional two LEDs and two buttons available on our RS-485 module.
- Double-click on the axi_gpio block and configure it as shown in the image below. We have two output GPIO pins for driving LEDs and two input pins for reading buttons.
4. Connecting to AXI Interconnect block:
- Connect both newly added IP blocks to the same axi_ic_ctrl_100 IP block where the previously removed audio block was connected. Update the number of AXI master outputs to 5. Also, connect the AXI clock and reset signals from the AXI interconnect block to both newly added IP blocks.
5. Connecting the Interrupt Signals:
- Connect the interrupt signals to both newly added IP blocks and update the number of concatenated interrupts accordingly.
6. UART Rx-disable signal:
- This signal is necessary for our PMOD module and must be set to a fixed logic value of 0. Add a new Constant IP block, set the output value to 0, and make the port external. Name it pmod1_uart_rxn.
The final block design should resemble the screenshot below. Avoid using Connection Automation to prevent unintended changes.
7. Assign Memory Addresses
- For the newly added blocks, we need to assign memory addresses. To avoid any unintended changes, do not reassign all addresses at once. Instead, select one line at a time for each newly added block and automatically assign the memory address.
8. Update constraint file:
- Edit the pin.xdc file and remove all external ports associated with the deleted audio block and add the ports for the RS-485 module.
- The updated Pin.xdc file:
# (C) Copyright 2020 - 2022 Xilinx, Inc.
# SPDX-License-Identifier: Apache-2.0
#MIPI
set_property DIFF_TERM_ADV TERM_100 [get_ports {mipi_phy_if_clk_p}]
set_property DIFF_TERM_ADV TERM_100 [get_ports {mipi_phy_if_clk_n}]
set_property DIFF_TERM_ADV TERM_100 [get_ports {mipi_phy_if_data_p[*]}]
set_property DIFF_TERM_ADV TERM_100 [get_ports {mipi_phy_if_data_n[*]}]
#GPIO
#ISP AP1302_RST_B HDA02
set_property PACKAGE_PIN J11 [get_ports {ap1302_rst_b}]
set_property IOSTANDARD LVCMOS33 [get_ports {ap1302_rst_b}]
set_property SLEW SLOW [get_ports {ap1302_rst_b}]
set_property DRIVE 4 [get_ports {ap1302_rst_b}]
#ISP AP1302_STANDBY HDA03
set_property PACKAGE_PIN J10 [get_ports {ap1302_standby}]
set_property IOSTANDARD LVCMOS33 [get_ports {ap1302_standby}]
set_property SLEW SLOW [get_ports {ap1302_standby}]
set_property DRIVE 4 [get_ports {ap1302_standby}]
#Fan Speed Enable
set_property PACKAGE_PIN A12 [get_ports {fan_en_b}]
set_property IOSTANDARD LVCMOS33 [get_ports {fan_en_b}]
set_property SLEW SLOW [get_ports {fan_en_b}]
set_property DRIVE 4 [get_ports {fan_en_b}]
#I2C signals --> I2C switch 0--> ISP AP1302 + Sensor AR1335
set_property PACKAGE_PIN G11 [get_ports iic_scl_io]
set_property PACKAGE_PIN F10 [get_ports iic_sda_io]
set_property IOSTANDARD LVCMOS33 [get_ports iic_*]
set_property SLEW SLOW [get_ports iic_*]
set_property DRIVE 4 [get_ports iic_*]
#
## PMOD-1 RS-485 module pins
set_property PACKAGE_PIN H12 [get_ports pmod1_uart_rxn]; # PMOD-1 pin-1 Rx-disable
set_property PACKAGE_PIN B10 [get_ports pmod1_led_tri_o[0]]; # PMOD-1 pin-2 LED-1
set_property PACKAGE_PIN E10 [get_ports pmod1_uart_txd]; # PMOD-1 pin-3 Tx
set_property PACKAGE_PIN E12 [get_ports pmod1_led_tri_o[1]]; # PMOD-1 pin-4 LED-2
set_property PACKAGE_PIN D10 [get_ports pmod1_uart_rxd]; # PMOD-1 pin-5 Rx
set_property PACKAGE_PIN D11 [get_ports pmod1_btn_tri_i[0]]; # PMOD-1 pin-6 Btn-1
set_property PACKAGE_PIN C11 [get_ports pmod1_uart_txen]; # PMOD-1 pin-7 Tx-enable
set_property PACKAGE_PIN B11 [get_ports pmod1_btn_tri_i[1]]; # PMOD-1 pin-8 Btn-2
set_property IOSTANDARD LVCMOS33 [get_ports pmod1_*];
set_property SLEW SLOW [get_ports pmod1_*]
set_property DRIVE 4 [get_ports pmod1_*]
set_property BITSTREAM.CONFIG.OVERTEMPSHUTDOWN ENABLE [current_design]
- The Module pinout:
9. Platform Setup
- Go to Platform Setup tab - no changes are needed, but note that we are keeping the same ports for accelerator insertion in Vitis.
10. Validate Block Design
- Tools (top toolbar) -> Validate Design - click "No" when it asks if you want to auto-assign the unassigned address segments.
11. Generate HDL Wrapper
We need to re-generate the wrapper because we have renamed the PMOD output signals.
- Block Design -> sources -> Design Sources -> right click on kv260_ispMipiRx_vcu_DP_wrapper and select “Remove File from Project”:
- Block Design -> sources -> Design Sources -> right click on kv260_ispMipiRx_vcu_DP and select “Create HDL wrapper”:
- In the pop-up, select “Let Vivado manage wrapper and auto-update” and press OK.
- Double click the kv260_ispMipiRx_vcu_DP_wrapper file and you should see PMOD1 signals in the wrapper.
12. Generate Output Products
- Block Design -> sources -> Design Sources -> right click on kv260_ispMipiRx_vcu_DP_i and select “Generate Output Products”, and choose “out of context per IP” and click Generate:
13. Generate Bitstream
- Now we are ready to generate the bitstream: click Program and Debug -> Generate Bitstream. This process will take about 40 minutes.
- After the .bit file has been generated, we need to convert it to a .bit.bin file, the format that xmutil expects:
cd $kv260-vitis/platforms/vivado/kv260_ispMipiRx_vcu_DP/project/kv260_ispMipiRx_vcu_DP.runs/impl_1/
echo 'all:{kv260_ispMipiRx_vcu_DP_wrapper.bit}' > bootgen.bif
bootgen -w -arch zynqmp -process_bitstream bin -image bootgen.bif
mv kv260_ispMipiRx_vcu_DP_wrapper.bit.bin kr260-eagleeye-prelim.bit.bin
NOTE
The generated preliminary .bit.bin file does not yet include the integrated DPU and accelerators. We will integrate these components later when we compile the Overlay. However, before proceeding with the Overlay compilation, which can take several hours, we need to test the PMOD interface to ensure all connections are correct.
Export Block Design to TCL Script
After updating the block design and successfully generating the bitstream, we need to create a TCL script to recreate the block design. This script will be used later for building the Overlay. However, before proceeding, we must test the .bit.bin file on the real target to ensure the RS-485 interface functions correctly.
- To generate the block design TCL script, use the TCL console in Vivado with the block design open and enter the following command in the TCL window:
write_bd_tcl config_bd.tcl
This command generates a TCL script named config_bd.tcl. Move the script to a location where it can be easily found later. If you make any changes to the block design after testing on real hardware, repeat this process to update the script. However, if you encounter difficulties creating the block design, you can use our pre-tested TCL script available in our GitHub repository. To use this script in Vivado, delete the current block design and the HDL wrapper, then type the following command in the TCL console (assuming you have copied the script to the current directory of the TCL console):
source config_bd.tcl
Then, generate the bitstream following the steps described above.
Export Platform
We need the .xsa file to retrieve the GPIO and UartLite device data for updating the Device Tree Structure. Check the Include Bitstream option when exporting the hardware.
The exported platform file is located at:
$kv260-vitis/platforms/vivado/kv260_ispMipiRx_vcu_DP/project/kv260_ispMipiRx_vcu_DP_wrapper.xsa
Device Tree Structure
Creating Dtsi File
We will base our .dtsi file on the smartcam.dtsi file and replace the audio devices with the UartLite and GPIO devices.
- To get the smartcam.dtsi file:
wget https://raw.githubusercontent.com/Xilinx/kria-apps-firmware/xlnx_rel_v2022.1/boards/kv260/smartcam/kv260-smartcam.dtsi
To generate the .dtsi file from our exported platform file, we use the XSCT tools. This process extracts only the UartLite and GPIO device tree data. Follow these steps:
- Get the device-tree-xlnx repository from Xilinx GitHub repository:
git clone https://github.com/Xilinx/device-tree-xlnx
- Start xsct and generate dtsi
# start xsct
xsct
hsi open_hw_design $kv260-vitis/platforms/vivado/kv260_ispMipiRx_vcu_DP/project/kv260_ispMipiRx_vcu_DP_wrapper.xsa
hsi set_repo_path <path to device-tree-xlnx repository>
hsi create_sw_design device-tree -os device_tree -proc psu_cortexa53_0
hsi set_property CONFIG.dt_overlay true [hsi::get_os]
hsi set_property CONFIG.dt_zocl true [hsi::get_os]
hsi generate_target -dir eagle_eye_dtg_output
hsi close_hw_design kv260_ispMipiRx_vcu_DP_wrapper
exit
A folder named eagle_eye_dtg_output will be created; pl.dtsi is the .dtsi file we need for updating the original smartcam.dtsi file.
Open the eagle_eye_dtg_output/pl.dtsi file and locate the nodes for the GPIO and UartLite devices. They should resemble the following examples:
axi_gpio_0: gpio@80010000 {
#gpio-cells = <2>;
#interrupt-cells = <2>;
clock-names = "s_axi_aclk";
clocks = <&misc_clk_0>;
compatible = "xlnx,axi-gpio-2.0", "xlnx,xps-gpio-1.00.a";
gpio-controller ;
interrupt-controller ;
interrupt-names = "ip2intc_irpt";
interrupt-parent = <&gic>;
interrupts = <0 109 4>;
reg = <0x0 0x80010000 0x0 0x10000>;
xlnx,all-inputs = <0x0>;
xlnx,all-inputs-2 = <0x1>;
xlnx,all-outputs = <0x1>;
xlnx,all-outputs-2 = <0x0>;
xlnx,dout-default = <0x00000000>;
xlnx,dout-default-2 = <0x00000000>;
xlnx,gpio-width = <0x2>;
xlnx,gpio2-width = <0x2>;
xlnx,interrupt-present = <0x1>;
xlnx,is-dual = <0x1>;
xlnx,tri-default = <0xFFFFFFFF>;
xlnx,tri-default-2 = <0xFFFFFFFF>;
};
axi_uartlite_0: serial@80020000 {
clock-names = "s_axi_aclk";
clocks = <&misc_clk_0>;
compatible = "xlnx,axi-uartlite-2.0", "xlnx,xps-uartlite-1.00.a";
current-speed = <9600>;
device_type = "serial";
interrupt-names = "interrupt";
interrupt-parent = <&gic>;
interrupts = <0 108 1>;
port-number = <0>;
reg = <0x0 0x80020000 0x0 0x10000>;
xlnx,baudrate = <0x2580>;
xlnx,data-bits = <0x8>;
xlnx,odd-parity = <0x0>;
xlnx,s-axi-aclk-freq-hz-d = <0x5a>;
xlnx,txen-delay = <0x0>;
xlnx,use-parity = <0x0>;
};
- Copy these two nodes for the GPIO and UartLite devices. Then, open the original kv260-smartcam.dtsi file from Smartcam, locate the commented-out audio nodes, and replace them with the copied nodes. Save the updated file as kr260-eagle-eye.dtsi.
Creating Dtbo File
- Compile the kr260-eagle-eye.dtsi file:
dtc -@ -I dts -O dtb -o kr260-eagle-eye.dtbo kr260-eagle-eye.dtsi
The generated kr260-eagle-eye.dtbo file is our final device tree blob file that we will use in our application project. Save it in a separate location so it can be easily found later.
Loading the Firmware on Target
At this stage, we have updated the block design in Vivado, generated the bitstream, and compiled the device tree blob. To test the RS-485 PMOD interface on our Kria board, we need the following files:
- kr260-eagleeye-prelim.bit.bin: preliminary bitstream file without DPU
- kr260-eagle-eye.dtbo: the final device tree blob
- shell.json: metadata file needed for dfx-mgr
To create the shell.json file, create an empty file and copy the content below into it:
{
"shell_type" : "XRT_FLAT",
"num_slots" : "1"
}
Copy these three files to the Kria KR260 board. First, use the scp command on the host PC to copy them to the Ubuntu home directory, as you don’t have the necessary permissions to copy them directly to the final location.
# on host PC
scp kr260-eagleeye-prelim.bit.bin kr260-eagle-eye.dtbo shell.json ubuntu@<kria board ip>:/home/ubuntu
Then, SSH into the Kria board, create the /lib/firmware/xilinx/kr260-eagle-eye directory, and move these files to that location:
# on Kria board
sudo mkdir /lib/firmware/xilinx/kr260-eagle-eye
sudo mv *.bin *.dtbo *.json /lib/firmware/xilinx/kr260-eagle-eye
Then load the new firmware into the FPGA by following this procedure:
# to check if our new firmware is listed here
sudo xmutil listapps
sudo xmutil unloadapp
sudo xmutil loadapp kr260-eagle-eye
# to check if the firmware was loaded correctly
sudo xmutil listapps
If the device tree was generated correctly and the serial driver loaded successfully, you should see the new serial device /dev/ttyUL0:
ls /dev/ttyUL*
Testing the RS-485 PMOD Interface
We use the Linux echo command to send some simple hard-coded PELCO-D serial commands through the RS-485 PMOD module to the camera rotator to verify its functionality. This process is similar to what we did in a previous section using a host PC. This time, however, we utilize our custom-made RS-485 PMOD module connected to the Kria’s PMOD-1 interface.
Remember to connect the RS-485 A and B wires from the camera rotator to the RS-485 PMOD module. This module is supported only on the first PMOD interface, so ensure it is connected there.
To test the rotator communication, use these commands on the Kria board:
# lamp on
echo -en '\xff\x01\x00\x09\x00\x01\x0b' > /dev/ttyUL0
# lamp off
echo -en '\xff\x01\x00\x0b\x00\x01\x0d' > /dev/ttyUL0
# pan left
echo -en '\xff\x01\x00\x04\xff\x00\x04' > /dev/ttyUL0
# pan right
echo -en '\xff\x01\x00\x02\xff\x00\x02' > /dev/ttyUL0
# tilt up
echo -en '\xff\x01\x00\x08\x00\x3f\x48' > /dev/ttyUL0
# tilt down
echo -en '\xff\x01\x00\x10\x00\x3f\x50' > /dev/ttyUL0
# motors stop
echo -en '\xff\x01\x00\x00\x00\x00\x01' > /dev/ttyUL0
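Instead of hard-coding the frames, you can also compute them. Below is a minimal, illustrative Python sketch (not part of the project code; the file name and helper are hypothetical) that builds PELCO-D frames using the standard layout: sync byte 0xFF, address, two command bytes, two data bytes, and a checksum equal to the sum of bytes 2 through 6 modulo 256. This matches the frames above, e.g. the “lamp on” frame ends in 0x0b = 0x01 + 0x09 + 0x01.
# pelco_frames.py -- illustrative sketch only
import time

def pelco_d_frame(address, cmd1, cmd2, data1, data2):
    # checksum = sum of the address, command and data bytes, modulo 256
    checksum = (address + cmd1 + cmd2 + data1 + data2) % 256
    return bytes([0xFF, address, cmd1, cmd2, data1, data2, checksum])

if __name__ == "__main__":
    pan_left = pelco_d_frame(0x01, 0x00, 0x04, 0xFF, 0x00)  # same as ff 01 00 04 ff 00 04 above
    stop     = pelco_d_frame(0x01, 0x00, 0x00, 0x00, 0x00)  # same as ff 01 00 00 00 00 01 above
    with open("/dev/ttyUL0", "wb", buffering=0) as uart:
        uart.write(pan_left)
        time.sleep(1.0)   # pan left for one second
        uart.write(stop)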
Tips
- Use a USB to RS-485 dongle connected to the host PC as a sniffer to monitor the messages sent from the PMOD interface. Connect the A and B ports from the USB dongle in parallel with those from the PMOD module. Utilize a terminal program like minicom or a similar tool for this purpose.
- You can verify the timing diagram of the RS-485 interface by connecting an oscilloscope to the A and B lines. This will help ensure that the baud rate of 9600 is within specifications, with timings that should not deviate by more than 3%. Here is an example of what the oscillogram should resemble:
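As a reference for reading the oscillogram, the nominal timings at 9600 baud can be computed directly; a quick sketch, assuming the UartLite’s 8N1 framing (8 data bits, no parity, as configured in the device tree earlier):
# Nominal UART timings at 9600 baud and the +/-3% tolerance mentioned above.
baud = 9600
tolerance = 0.03

bit_time_us = 1e6 / baud              # one bit period in microseconds
frame_bits = 1 + 8 + 1                # start bit + 8 data bits + stop bit (8N1)
frame_time_ms = frame_bits * bit_time_us / 1000

print(f"bit period : {bit_time_us:.1f} us "
      f"(acceptable {bit_time_us * (1 - tolerance):.1f} - {bit_time_us * (1 + tolerance):.1f} us)")
print(f"byte frame : {frame_time_ms:.2f} ms")
# Prints roughly: bit period : 104.2 us (acceptable 101.0 - 107.3 us), byte frame : 1.04 ms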
To test the LEDs and buttons available on this PMOD module, we use the devmem2 application available on Ubuntu. This allows us to access the registers of the GPIO block and verify their functionality:
# install the devmem2 if not already
sudo apt update
sudo apt install devmem2
# to turn on red or blue led on rs485-pmod module
sudo devmem2 0x80010000 w 0x01
sudo devmem2 0x80010000 w 0x02
sudo devmem2 0x80010000 w 0x03
sudo devmem2 0x80010000 w 0x00
# read button states - they are active low
sudo devmem2 0x80010008 w
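If devmem2 is not available, the same registers can also be accessed from Python through /dev/mem, in the same way the LED and button controller nodes in the appendix do it. A minimal sketch (run with sudo), assuming the AXI GPIO block at 0x80010000 with the LED register at offset 0x0 and the button register at offset 0x8; the file name is just illustrative:
# gpio_poke.py -- illustrative equivalent of the devmem2 commands above
import mmap
import os
import struct

GPIO_BASE = 0x80010000  # AXI GPIO block: LEDs at offset 0x0, buttons at 0x8

fd = os.open("/dev/mem", os.O_RDWR | os.O_SYNC)
mem = mmap.mmap(fd, mmap.PAGESIZE, mmap.MAP_SHARED,
                mmap.PROT_READ | mmap.PROT_WRITE, offset=GPIO_BASE)

def write_reg(offset, value):
    mem[offset:offset + 4] = struct.pack("<I", value)

def read_reg(offset):
    return struct.unpack("<I", mem[offset:offset + 4])[0]

write_reg(0x0, 0x03)                      # both LEDs on
print(f"buttons: {read_reg(0x8):#x}")     # button states (active low)
write_reg(0x0, 0x00)                      # LEDs off

mem.close()
os.close(fd)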
If everything functions as expected, make sure you export the block design in Vivado to a TCL script (if you haven’t already) and store it in a safe location.
write_bd_tcl config_bd.tcl
Building the Overlay
After successfully testing the RS-485 PMOD interface, you can proceed to build the complete Overlay FPGA image, which includes the DPU for AI inference and the accelerators for preprocessing and drawing results. We will create a new .bit file using the Make utility within the same repository, kria-vitis-platforms/kv260, but this time in the overlays subdirectory instead of platforms as before.
To build the Overlay, follow these steps:
- Copy the previously generated config_bd.tcl script to the appropriate location in the platforms (not overlays) subfolder overwriting the existing one. This script will re-create the block design during the Overlay building process, allowing the DPU and accelerators to be added by the Vitis compiler. Alternatively, you can use the config_bd.tcl script from our GitHub repository.
cp config_bd.tcl $kv260-vitis/platforms/vivado/kv260_ispMipiRx_vcu_DP/scripts/config_bd.tcl
- Also, copy the previously updated pin.xdc constraint file from our project’s block design over the existing one in the separate xdc subfolder. This step is crucial; if missed, the original pin.xdc file from the separate xdc directory will overwrite our constraint data in the block design. Alternatively, you can use the pin.xdc file from our GitHub repository.
cp $kv260-vitis/platforms/vivado/kv260_ispMipiRx_vcu_DP/project/kv260_ispMipiRx_vcu_DP.srcs/constrs_1/imports/xdc/pin.xdc $kv260-vitis/platforms/vivado/kv260_ispMipiRx_vcu_DP/xdc/pin.xdc
- Now, in $kv260-vitis, do a make clean so that the scripts regenerate the platform with the updated artifacts when we call make overlay OVERLAY=smartcam in the next step.
cd $kv260-vitis
make clean
- Now we can run the makefile that was provided for smartcam to put the DPU overlay into this platform:
make overlay OVERLAY=smartcam
The build process takes about 4 hours. After the makefile completes, you should find the .bit and .xclbin files:
ls -l $kv260-vitis/overlays/examples/smartcam/binary_container_1/link/int/system.bit
ls -l $kv260-vitis/overlays/examples/smartcam/binary_container_1/dpu.xclbin
- Perform the following to convert system.bit to kr260-eagle-eye.bit.bin:
cd $kv260-vitis/overlays/examples/smartcam/binary_container_1/link/int/
echo 'all:{system.bit}' > bootgen.bif
bootgen -w -arch zynqmp -process_bitstream bin -image bootgen.bif
cd $kv260-vitis/
cp $kv260-vitis/overlays/examples/smartcam/binary_container_1/link/int/system.bit.bin ./kr260-eagle-eye.bit.bin
- Then copy and rename the dpu.xclbin file. We need it too, because our Eagle-Eye-AI application uses the accelerators and this file provides the metadata needed to access them.
cd $kv260-vitis/
cp $kv260-vitis/overlays/examples/smartcam/binary_container_1/dpu.xclbin ./kr260-eagle-eye.xclbin
Testing with DPU and Accelerators on Target
- Copy the kr260-eagle-eye.bit.bin and kr260-eagle-eye.xclbin files to the Kria board home directory. Then move them into the /lib/firmware/xilinx/kr260-eagle-eye directory using the sudo mv command. The new .bit.bin file should replace the previous one (without accelerators).
# on Kria target board
sudo xmutil unloadapp
sudo mv ~/kr260-eagle-eye.bit.bin /lib/firmware/xilinx/kr260-eagle-eye/
sudo mv ~/kr260-eagle-eye.xclbin /lib/firmware/xilinx/kr260-eagle-eye/
- Next, load the new firmware after disabling the desktop environment on the Kria board:
sudo xmutil desktop_disable
sudo xmutil loadapp kr260-eagle-eye
sudo xmutil listapps
Running the Eagle-Eye-AI Application in Docker
Run the previously saved Docker image, which already contains all the ROS2 nodes needed for running our Eagle-Eye-AI application.
docker run \
--rm \
--env="DISPLAY" \
-h "xlnx-docker" \
--env="XDG_SESSION_TYPE" \
--privileged \
--volume="$HOME/.Xauthority:/root/.Xauthority:rw" \
-v /tmp:/tmp \
-v /dev:/dev \
-v /sys:/sys \
-v /etc/vart.conf:/etc/vart.conf \
-v /lib/firmware/xilinx:/lib/firmware/xilinx \
-v /run:/run \
-it eagle-eye-ros-humble:1.0 bash
If you forgot to save the Docker image, you can retrieve all the necessary ROS2 code for running the Eagle-Eye-AI application from our GitHub repository. Clone the repository and build the code using the colcon build command:
# inside docker container
cd
git clone https://github.com/s59mz/eagle-eye-ai.git
cd eagle-eye-ai/ros2_ws
colcon build
source install/setup.bash
Before starting the GStreamer pipeline application, we need to update some kernel configuration .json files, since the .xclbin file location has changed. To identify these configuration files, we examine our GStreamer pipeline, which is as follows:
gst-launch-1.0 rtspsrc location=rtsp://192.168.1.11:554/stream1 ! rtph265depay ! h265parse ! omxh265dec ! video/x-raw, width=1920, height=1080 ! videoconvert ! video/x-raw, format=NV12 ! tee name=t ! queue ! vvas_xmultisrc kconfig="/opt/xilinx/kv260-smartcam/share/vvas/facedetect/preprocess.json" ! queue ! vvas_xfilter kernels-config="/opt/xilinx/kv260-smartcam/share/vvas/facedetect/aiinference.json" ! ima.sink_master vvas_xmetaaffixer name=ima ima.src_master ! fakesink t. ! queue max-size-buffers=1 leaky=2 ! ima.sink_slave_0 ima.src_slave_0 ! queue ! vvas_xfilter kernels-config="/opt/xilinx/kv260-smartcam/share/vvas/facedetect/drawresult.json" ! queue ! kmssink driver-name=xlnx plane-id=39 sync=false fullscreen-overlay=true
We use three kernel configuration files in our pipeline, all located in the /opt/xilinx/kv260-smartcam/share/vvas/facedetect directory:
- preprocess.json
- aiinference.json
- drawresult.json
Edit each of these .json files and update the “xclbin-location” parameter to point to our new kr260-eagle-eye.xclbin file.
cd /opt/xilinx/kv260-smartcam/share/vvas/facedetect
# update the parameter
# "xclbin-location":"/lib/firmware/xilinx/kr260-eagle-eye/kr260-eagle-eye.xclbin"
sudo vi preprocess.json
sudo vi aiinference.json
sudo vi drawresult.json
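If you prefer to script this change instead of editing each file by hand, here is an optional Python sketch (the file name is illustrative). It assumes the three files are plain JSON with xclbin-location as a top-level key, as in the Smartcam kernel configs; adjust the lookup if your files differ:
# patch_xclbin_location.py -- optional helper, equivalent to the manual edits above
import json
from pathlib import Path

CONFIG_DIR = Path("/opt/xilinx/kv260-smartcam/share/vvas/facedetect")
NEW_XCLBIN = "/lib/firmware/xilinx/kr260-eagle-eye/kr260-eagle-eye.xclbin"

for name in ("preprocess.json", "aiinference.json", "drawresult.json"):
    path = CONFIG_DIR / name
    config = json.loads(path.read_text())
    if "xclbin-location" in config:
        config["xclbin-location"] = NEW_XCLBIN          # point to the new overlay
    path.write_text(json.dumps(config, indent=4) + "\n")
    print(f"updated {path}")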
NOTE:
You should also verify the other paths in these .json kernel configuration files that reference VVAS libraries. For example, if you build your own Docker image instead of using the one from the Smartcam demo app, as described in the appendix section, you should rename the directories containing the VVAS libraries to be consistent with the application name. In this case, update the following parameters to point to the correct library and model paths, as demonstrated in our GitHub repository:
- xclbin-location
- vvas-library-repo
- model-path
Also, don’t forget to update the paths to these .json files in the GStreamer pipeline; otherwise, it won’t work.
Saving the Docker Image
Save the Docker image file, as we are almost ready to run our EagleEye-AI application. However, there’s one more step.
Running the Application
To run the application in the ROS2 environment, we need to start at least three nodes:
- gstreamer_pipeline: a C++ node that runs the GStreamer pipeline.
- rotator_controller: a node that controls the rotator’s motor via PELCO-D protocol messages.
- camera_controller: a node that connects the above two nodes and calculates the needed speed of both motors in the rotator.
Since we want to run the application in the Docker container using a single command shell, it is impractical to start each node individually from the command line. Instead, we will use a ROS2 launch.xml script to start all three nodes with the appropriate input parameters.
We already have this script in our GitHub repository. If you cloned the repository in the previous section, you already have it in your Docker container within the ROS2 package eagle_eye_bringup, named follow_cam.launch.xml. If not, copy the entire eagle_eye_bringup package to your ros2_ws workspace. Here is the content of the script:
<launch>
<arg name="camera_url" default="rtsp://192.168.1.11:554/stream1" />
<node pkg="eagle_eye_pkg" exec="gstreamer_pipeline" name="gstreamer_pipeline">
<param name="camera_url" value="$(var camera_url)" />
</node>
<node pkg="rotator_py_pkg" exec="rotator_controller" name="rotator_controller">
<param name="port" value="/dev/ttyUL0"/>
<param name="baudrate" value="9600"/>
<param name="address" value="1"/>
</node>
<node pkg="rotator_py_pkg" exec="camera_controller" name="camera_controller">
<param name="updating_period" value="0.2"/>
<param name="frame_timeout" value="5"/>
</node>
</launch>
Update the camera_url parameter with your camera’s URL if necessary, and then compile the launch package:
cd /root/eagle-eye-ai/ros2_ws
colcon build
source install/setup.bash
Launch the Eagle-Eye-AI
Finally, we have reached the end of this project tutorial!
To launch the Eagle-Eye-AI app, simply run the ROS2 launch script:
ros2 launch eagle_eye_bringup follow_cam.launch.xml
Optionally, you can launch the app with your camera URL as an input parameter:
ros2 launch eagle_eye_bringup follow_cam.launch.xml camera_url:=<camera_url>
Verify that all ROS2 nodes have started and can communicate with each other correctly using the RQt application on your host PC. The displayed graph should resemble the one shown below:
Test all the features as we did before when using the USB to RS-485 dongle from the host PC. Now, however, all the code runs on the Kria KR260 board and communicates with the rotator through our custom-made RS-485 PMOD module via Kria’s PMOD interface.
If you’ve made it through all the parts successfully, congratulations! It has been a very long journey. Have fun with your project and enjoy!
Feel free to comment on this blog.
Appendix: Camera Inclinometer Support
When building the hardware, we mounted the inclinometer inside the junction box on the camera mounting bracket, which rotates with the camera and measures the roll, pitch, and yaw angles in real time. The inclinometer communicates with the system over the same RS-485 lines as the rotator, but uses a different protocol: Modbus RTU.
In our Rotator Controller node, we implemented a simple timer loop that periodically (every 1 second) reads the inclinometer, using the PyModbus library to send Read Holding Registers commands, and publishes its data on the camera_orientation ROS2 topic.
In the GStreamer Pipeline Node, we listen for these topic messages and populate the inference metadata structure in the video stream, embedding the camera orientation status directly into the video stream displayed on the HDMI monitor.
The inclinometer device uses address 80, with the Holding Registers for angles starting at address 0x3d.
Detailed information about the Modbus protocol is beyond the scope of this project. For more details about the inclinometer, refer to the device’s datasheet available at this link.
The Modbus transaction messages below are similar to those captured by a sniffer device on a host PC:
# Read 12 Holding 16-bit registers, starting at offset address=0x34 on a device with a slave address=80
50 03 00 34 00 0c 09 80
# Response from inclinometer
50 03 18 07 f9 00 00 00 aa 00 00 00 00 00 00 f3 ac 05 8b 05 38 ff f5 c3 61 de 26 b0 c2
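To sanity-check a capture like the one above, the register payload can be decoded by hand. Here is an illustrative Python sketch; it skips CRC verification and reuses the same signed-register-to-degrees conversion as the rotator_controller.py code shown below. Since the request starts at register 0x34, the angle registers 0x3d-0x3f sit at indexes 9-11 of the returned block:
# decode_inclinometer_reply.py -- illustrative decoder for the capture above
response = bytes.fromhex(
    "50 03 18 07 f9 00 00 00 aa 00 00 00 00 00 00 f3 "
    "ac 05 8b 05 38 ff f5 c3 61 de 26 b0 c2"
)

byte_count = response[2]                          # 0x18 = 24 bytes = 12 registers
payload = response[3:3 + byte_count]              # skip address, function, byte count
registers = [int.from_bytes(payload[i:i + 2], "big")
             for i in range(0, byte_count, 2)]

# Angle registers 0x3d..0x3f (roll, pitch, yaw) are at indexes 9..11 here.
for idx in (9, 10, 11):
    raw = registers[idx]
    signed = raw - 65536 if raw > 32767 else raw
    print(f"reg 0x{0x34 + idx:02x}: raw=0x{raw:04x}  angle={signed / 32768.0 * 180.0:+.2f} deg")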
Since the Modbus and Pelco-D protocols share the same RS-485 line, we need to manage message transmission carefully to avoid collisions. To simplify the design, we wait for the Pelco-D transaction to complete by calling flush() before initiating a Modbus transaction. We also clear both RX and TX buffers to ensure we receive only the inclinometer’s response. After the Modbus transaction, we continue with Pelco-D transactions until the Inclinometer timer calls for another Modbus reading.
The code addition in the rotator_controller.py file is shown below:
.. Initialization ..

        # create camera orientation publisher
        self.pub_cam_orient_ = self.create_publisher(
            CameraOrientation, "camera_orientation", 10
        )
        # create inclinometer reading timer
        self.inclinometer_timer_ = self.create_timer(1.0, self.callback_inclinometer)  # read every 1000ms
        # Initialize Modbus client with pymodbus
        self.modbus_client_ = ModbusClient(method='rtu', port=self.port_,
            handle_local_echo=True, baudrate=self.baudrate_, timeout=1)
        if (self.serial_):
            self.modbus_client_.connect()

.. Callback ..

    # inclinometer timer callback
    def callback_inclinometer(self):
        # check if port is open
        if not self.serial_:
            return
        # Wait for pending Pelco-D output, then clear both TX and RX buffers
        self.serial_.flush()
        self.serial_.reset_output_buffer()
        self.serial_.reset_input_buffer()
        # Read Roll, Pitch and Yaw Holding registers at offset 0x3d, device address=80
        result = self.modbus_client_.read_holding_registers(0x3d, 3, slave=80)
        if result.isError():
            self.get_logger().info("Modbus Error: Read Holding registers failed")
            return
        value = [0] * 3
        # Handling negative numbers
        for i in range(0, 3):
            if (result.registers[i] > 32767):
                value[i] = result.registers[i] - 65536
            else:
                value[i] = result.registers[i]
        # convert values to degrees
        angle_degree = [value[i] / 32768.0 * 180.0 for i in range(0, 3)]
        # Yaw
        azimuth = -angle_degree[2]
        if azimuth < 0:
            azimuth += 360.0
        # Pitch
        elevation = -angle_degree[1]
        # Create message for publishing
        msg = CameraOrientation()
        msg.azimuth = azimuth
        msg.elevation = elevation
        # publish camera orientation
        self.pub_cam_orient_.publish(msg)
In the GStreamer Node, we moved the position of the probe from the VVAS Metaaffixer to the sink pad of the Draw VVAS_FILTER element. This allows us to perform two tasks simultaneously:
- Extracting the size and position of the bounding boxes of detected faces, as we were already doing before.
- Populating the Inference Metadata structures with the camera orientation data structure, enabling the VVAS Draw element to embed this data into the video stream.
Please take a look at the updated gstreamer_pipeline.cpp file on our GitHub for more details. Below is an excerpt of the code:
static void handle_inference_meta(GstMeta *meta, GStreamerPipeline *gsnode) {
..
// check if the largest boundary box even exists
if (largest_child_pred) {
// publish the largest one
gsnode->publish_max_bounding_box(parent_bbox, &largest_child_pred->bbox);
// Also, attach the inclinometer shared data struct to the
// Unused pointer of GstInferencePrediction struct
// So, the VVAS Draw Filter element can show camera orientation on the screen
largest_child_pred->reserved_1 = cam_orient;
{ // update shared structure with inclinometer data with guard lock
std::lock_guard<std::mutex> lock(gsnode->mutex_);
cam_orient->azimuth = gsnode->camera_orientation_.azimuth;
cam_orient->elevation = gsnode->camera_orientation_.elevation;
}
return;
This change ensures that both the face detection boundaries and the camera orientation status are accurately represented in the video stream.
NOTE
Because we moved the probe to a new element (draw) that has a higher video resolution, we may need to adjust the calculated rotator motor speed in the camera_controller.py file. This is necessary to ensure the camera movement remains stable.
        # calculate required motor speed for tracking
        if abs(self.offset_x) > msg.bbox_width / 2:
            self.pan_speed_ = int(-self.offset_x / 8)
        else:
            # face is centered enough in X
            self.pan_speed_ = 0

        if abs(self.offset_y) > msg.bbox_height / 4:
            self.tilt_speed_ = int(-self.offset_y)
        else:
            # face is centered enough in Y
            self.tilt_speed_ = 0
Displaying Status Text on the Screen
To render the camera orientation status text on the screen, we need to update the vvas_airender.cpp file taken from the Smartcam demo application. Below is how we accomplish this in the overlay_node_foreach function:
overlay_node_foreach (GNode * node, gpointer kpriv_ptr) {
..
/* Check whether the frame is NV12 or BGR and act accordingly */
if (frameinfo->inframe->props.fmt == VVAS_VFMT_Y_UV8_420) {
..
std::string camera_text = "";
// place a dynamic text on a screen
if (prediction && prediction->reserved_1) {
std::stringstream ss_azimuth, ss_elevation;
CameraOrientation * cam_orient = reinterpret_cast<CameraOrientation *>
(prediction->reserved_1);
// Format azimuth
ss_azimuth << std::setw(6) << std::setfill(' ') << std::fixed
<< std::setprecision(2) << cam_orient->azimuth;
// Format elevation if needed similarly
ss_elevation << std::setw(6) << std::setfill(' ') << std::fixed
<< std::setprecision(2) << cam_orient->elevation;
// Combine the formatted strings into a single text
camera_text = "Face Tracking: AZ:" + ss_azimuth.str() +
", EL:" + ss_elevation.str();
// Calculate text size
int baseLine;
Size text_size = getTextSize(camera_text, kpriv->font, kpriv->font_size,
kpriv->line_thickness, &baseLine);
// Get frame dimensions from properties
int frame_width = frameinfo->inframe->props.width;
int frame_height = frameinfo->inframe->props.height;
// Text position
int x = (frame_width - text_size.width) / 2;
int y = frame_height - baseLine - 30;
convert_rgb_to_yuv_clrs(kpriv->label_color, &yScalar, &uvScalar);
// Draw text on Y plane
putText (frameinfo->lumaImg, camera_text, cv::Point (x, y), kpriv->font,
kpriv->font_size, Scalar (yScalar), kpriv->line_thickness, 1);
// Draw text on UV plane
putText (frameinfo->chromaImg, camera_text, cv::Point (x / 2, y / 2),
kpriv->font, kpriv->font_size / 2, Scalar (uvScalar),
kpriv->line_thickness, 1);
}
This snippet ensures that the camera orientation status is rendered correctly on the screen, with the text displaying the azimuth and elevation values formatted to three digits and two decimal places.
NOTE
The vvas_airender.cpp library taken from the Smartcam demo application needs to be recompiled on the Kria board using the xilinx/kria-developer Docker image. It cannot be built on the host PC. We will address this in the next section when creating a Docker image for our application.
Appendix: Creating a Docker Image for the Eagle-Eye-AI Application
In this appendix, we’ll guide you through creating a Docker image that includes everything needed for our Eagle-Eye-AI application, except for the FPGA binary image, which will be covered in the next section when we create a Debian package.
To build the Docker image, we first clone the eagle-eye-ai repository from our GitHub. The directory structure in these Dockerfiles is based on the directory structure of this repository.
git clone https://github.com/s59mz/eagle-eye-ai.git
cd eagle-eye-ai
There are two Dockerfiles in this repository:
- ros2-humble-docker
- eagle-eye-docker
Dockerfile Overview
We’ve split the building process into two phases:
1. The first phase builds an image based on the official xilinx/kria-runtime image and adds ROS2, PySerial, PyModbus, and some utilities.
2. The second phase builds the application and libraries on the kria-developer image and then adds them to the phase-1 image, which is based on the kria-runtime image.
Phase 1: Building the ROS2-Humble Docker Image
This Dockerfile, named ros2-humble-docker, creates a base image with ROS2 and essential tools.
## Building based on kria-runtime docker image
FROM xilinx/kria-runtime:2022.1
# Set environment variables
ENV DEBIAN_FRONTEND=noninteractive
# Install Locales
RUN apt-get update && apt-get install --yes --no-install-recommends \
locales \
&& locale-gen en_US.UTF-8 \
&& update-locale LANG=en_US.UTF-8
# Install tools & dependencies
RUN apt-get install --yes --no-install-recommends \
vim \
curl \
ca-certificates \
git
# Add ROS2 apt repository
RUN sh -c 'curl -sSL https://raw.githubusercontent.com/ros/rosdistro/master/ros.key -o /usr/share/keyrings/ros-archive-keyring.gpg'
RUN sh -c 'echo "deb [arch=arm64 signed-by=/usr/share/keyrings/ros-archive-keyring.gpg] http://packages.ros.org/ros2-testing/ubuntu jammy main" | tee /etc/apt/sources.list.d/ros2.list > /dev/null'
# Install ROS2 packages
RUN apt-get update && apt-get install --yes --no-install-recommends \
ros-humble-ros-base \
ros-dev-tools
# Install PySerial and PyModbus
RUN pip3 install pyserial
RUN pip3 install pymodbus
# Source the ROS2 setup script
RUN echo "source /opt/ros/humble/setup.bash" >> ~/.bashrc
RUN echo "source /usr/share/colcon_argcomplete/hook/colcon-argcomplete.bash" >> ~/.bashrc
Phase 2: Building the Eagle-Eye-AI Docker Image
This Dockerfile, named eagle-eye-docker, builds our application, including the necessary VVAS libraries and ROS2 nodes.
The libraries, particularly the modified vvas_airender.cpp that displays inclinometer data on the screen, are built using the kria-developer Docker image.
The final application, however, is based on the kria-runtime Docker image. This approach conserves the significant resources needed for building, which the runtime environment does not require.
## Building based on kria-developer docker image
FROM xilinx/kria-developer:2022.1 AS build
WORKDIR /workspace
COPY eagle-eye-app eagle-eye-app
RUN cd eagle-eye-app ;\
mkdir -p build/install ;\
cd build ;\
cmake ../ -DCMAKE_INSTALL_PREFIX:PATH=/ ;\
make ;\
make DESTDIR=./install install
## Building based on ros2-humble image
FROM ros2-humble:1.0
# Copy from kria-developer built files
COPY --from=build /workspace/eagle-eye-app/build/install /
# Log build time
ARG BUILD_DATE
LABEL org.label-schema.build-date=$BUILD_DATE
# Install welcome screen
COPY scripts/bashrc /etc/bash.bashrc
RUN chmod a+rwx /etc/bash.bashrc
ADD scripts/welcome.sh /etc/
RUN echo $BUILD_DATE > /etc/BUILD_DATE.txt
# Set working directory
WORKDIR /root/ros2_ws
RUN echo "source /root/ros2_ws/install/setup.bash" >> ~/.bashrc
# Copy custom ROS2 code
COPY ros2_ws/src src
# Build the ROS2 workspace
RUN /bin/bash -c "source /opt/ros/humble/setup.bash && colcon build"
# Install application startup script
ADD scripts/run_eagle_eye_ai.sh .
RUN chmod +x run_eagle_eye_ai.sh
Building the Docker Images
Use the build.sh script to build the images.
# Define the names of the images
ROS_IMAGE="ros2-humble:1.0"
APP_IMAGE="eagle-eye-ai:1.0"
# Define the names of Dockerfiles
ROS_DOCKERFILE="ros2-humble-docker"
APP_DOCKERFILE="eagle-eye-docker"
# Check if ROS image exists
if ! docker image inspect $ROS_IMAGE > /dev/null 2>&1; then
echo "$ROS_IMAGE does not exist. Building $ROS_IMAGE..."
docker build --build-arg BUILD_DATE="$(date -u +'%Y/%m/%d %H:%M')" -f $ROS_DOCKERFILE . -t $ROS_IMAGE
else
echo "$ROS_IMAGE already exists."
fi
# Build app image
if docker image inspect $ROS_IMAGE > /dev/null 2>&1; then
echo "Building $APP_IMAGE..."
docker build --build-arg BUILD_DATE="$(date -u +'%Y/%m/%d %H:%M')" -f $APP_DOCKERFILE . -t $APP_IMAGE
else
echo "ERROR: Can't build the $ROS_IMAGE image"
fi
if docker image inspect $APP_IMAGE > /dev/null 2>&1 && docker image inspect $ROS_IMAGE > /dev/null 2>&1; then
echo "Both $ROS_IMAGE and $APP_IMAGE are now on the Kria board."
else
echo "ERROR: Can't build the $APP_IMAGE image"
fi
Running the Docker Container
Use the run.sh script to run the built image.
docker run \
--rm \
--net=host \
--env="DISPLAY" \
-h "xlnx-docker" \
--env="XDG_SESSION_TYPE" \
--privileged \
--volume="$HOME/.Xauthority:/root/.Xauthority:rw" \
-v /tmp:/tmp \
-v /dev:/dev \
-v /sys:/sys \
-v /etc/vart.conf:/etc/vart.conf \
-v /lib/firmware/xilinx:/lib/firmware/xilinx \
-v /run:/run \
-it eagle-eye-ai:1.0 bash
Running the Application Inside the Container
Once inside the running container, launch the application using the run_eagle_eye_ai.sh script.
# Default camera_url
default_camera_url="rtsp://192.168.1.11:554/stream1"
# Check if a camera_url parameter is passed
if [ -z "$1" ]; then
# No parameter passed, use the default camera_url
camera_url=$default_camera_url
else
# Parameter passed, use it as camera_url
camera_url=$1
fi
# Source the ROS2 setup script
source install/setup.bash
# Start the ROS2 launch file with the specified camera_url
ros2 launch eagle_eye_bringup follow_cam.launch.xml camera_url:=$camera_url
By following these steps, you can quickly build and run the Docker image for the Eagle-Eye-AI application. This ensures a consistent environment and simplifies deployment. For a more detailed explanation, you can follow the comprehensive tutorial.
Appendix: Creating an FPGA Firmware Debian Package
In this section, we will cover the steps to create a Debian package for the FPGA firmware needed for our Eagle-Eye-AI application. This Debian package will simplify the deployment of the firmware on the Kria board.
Directory Structure
The firmware-kr260-eagle-eye directory contains the necessary files and metadata for packaging the firmware. The structure is as follows:
firmware-kr260-eagle-eye/
├── DEBIAN/
│   └── control
└── lib/
    └── firmware/
        └── xilinx/
            └── kr260-eagle-eye/
                ├── kr260-eagle-eye.bit.bin
                ├── kr260-eagle-eye.xclbin
                ├── kr260-eagle-eye.dtbo
                ├── kr260-eagle-eye.dtbi
                └── shell.json
Control File
The DEBIAN/control file contains the package metadata. Here is an example:
Package: firmware-kr260-eagle-eye
Version: 1.0
Section: base
Priority: optional
Architecture: arm64
Maintainer: Your Name <youremail@example.com>
Description: FPGA firmware for Eagle-Eye-AI application on Kria KR260
 This package provides the FPGA firmware files required for running the Eagle-Eye-AI application on the Kria KR260 board.
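If you want to script the staging of this tree, here is a small optional Python sketch (the script name is illustrative). It assumes the firmware artifacts are in the current directory and writes the control file shown above, with the description continuation indented by one space as Debian expects:
# make_fw_package_tree.py -- optional helper that stages the package tree shown above
import shutil
from pathlib import Path

PKG_ROOT = Path("firmware-kr260-eagle-eye")
FW_DIR = PKG_ROOT / "lib/firmware/xilinx/kr260-eagle-eye"
SOURCE_FILES = [
    "kr260-eagle-eye.bit.bin",
    "kr260-eagle-eye.xclbin",
    "kr260-eagle-eye.dtbo",
    "kr260-eagle-eye.dtbi",
    "shell.json",
]

CONTROL = """\
Package: firmware-kr260-eagle-eye
Version: 1.0
Section: base
Priority: optional
Architecture: arm64
Maintainer: Your Name <youremail@example.com>
Description: FPGA firmware for Eagle-Eye-AI application on Kria KR260
 This package provides the FPGA firmware files required for running the
 Eagle-Eye-AI application on the Kria KR260 board.
"""

(PKG_ROOT / "DEBIAN").mkdir(parents=True, exist_ok=True)
FW_DIR.mkdir(parents=True, exist_ok=True)
(PKG_ROOT / "DEBIAN/control").write_text(CONTROL)
for name in SOURCE_FILES:
    shutil.copy(name, FW_DIR / name)       # copy artifacts into the staging tree
print(f"Staged {PKG_ROOT}/ -- now run: dpkg-deb --build {PKG_ROOT}")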
Building the Debian Package
To build the .deb package, navigate to the directory containing firmware-kr260-eagle-eye and run the following command:
dpkg-deb --build firmware-kr260-eagle-eye
This command will generate a firmware-kr260-eagle-eye.deb file in the current directory.
Installing the Debian Package
To install the firmware-kr260-eagle-eye.deb package on the Kria board, follow these steps:
1. Copy the firmware-kr260-eagle-eye.deb file to the /tmp directory on the Kria board.
2. Install the package using the following command:
sudo apt install /tmp/firmware-kr260-eagle-eye.deb
Verifying the Installation
After installation, verify that the firmware files have been copied to the appropriate directory:
ls /lib/firmware/xilinx/kr260-eagle-eye/
You should see the following files:
kr260-eagle-eye.bit.bin
kr260-eagle-eye.xclbin
kr260-eagle-eye.dtbo
kr260-eagle-eye.dtbi
shell.json
By creating and installing this Debian package, you can ensure that the FPGA firmware is properly deployed on the Kria KR260 board, making the setup process for the Eagle-Eye-AI application more streamlined and consistent.
Appendix: Creating LED and Button Control Nodes
The custom RS-485 PMOD module includes two integrated LEDs (Red and Blue) and two buttons on its PCB.
We have already added a GPIO AXI interface to our platform in the block design, enabling easy control through two registers:
- LED control register at address 0x80010000
- Push-button status register at address 0x80010008
LED Controller Node
First, we’ll create an LED ROS2 controller node that listens to MotorCmd messages on the motor_control topic and periodically updates the LEDs with the current motor status. When the pan motor is moving, the Red LED will be on, and when the tilt motor is moving, the Blue LED will be on.
Here is the led_controller.py ROS2 node code that can be added to the existing rotator_py_pkg package:
import rclpy
from rclpy.node import Node
from rotator_interfaces.msg import MotorCmd
import os
import mmap
class LedController(Node):

    def __init__(self):
        super().__init__('led_controller')

        # Initialize motor speeds
        self.pan_speed_ = 0
        self.tilt_speed_ = 0

        # Open /dev/mem for direct register access
        self.mem_fd_ = os.open('/dev/mem', os.O_RDWR | os.O_SYNC)
        self.mem_ = mmap.mmap(self.mem_fd_, mmap.PAGESIZE, mmap.MAP_SHARED,
            mmap.PROT_READ | mmap.PROT_WRITE, offset=0x80010000)

        # Create a subscriber to listen to /motor_control topic
        self.subscription_ = self.create_subscription(MotorCmd, '/motor_control', self.motor_control_callback, 10)

        # Create a timer to periodically update registers
        self.timer_ = self.create_timer(0.5, self.update_registers)

        self.get_logger().info('Node created.')

    def motor_control_callback(self, msg: MotorCmd):
        self.pan_speed_ = msg.pan_speed
        self.tilt_speed_ = msg.tilt_speed

    def update_registers(self):
        register_value = 0
        if self.pan_speed_ != 0:
            register_value |= 0x01  # Set bit0 if pan_speed is non-zero to turn on LED-1
        if self.tilt_speed_ != 0:
            register_value |= 0x02  # Set bit1 if tilt_speed is non-zero to turn on LED-2

        # Write to the register
        self.write_register(0x00, register_value)

    def write_register(self, offset, value):
        self.mem_.seek(offset)
        self.mem_.write_byte(value)

    def destroy_node(self):
        # Clean up resources here
        self.mem_.close()
        os.close(self.mem_fd_)
        super().destroy_node()


def main(args=None):
    rclpy.init(args=args)
    node = LedController()
    try:
        rclpy.spin(node)
    except KeyboardInterrupt:
        node.destroy_node()
    return


if __name__ == '__main__':
    main()
The node can be launched using the same follow_cam.launch.xml file in the eagle_eye_bringup package, which also launches our three main nodes.
Code Explanation
1. Initialization and Cleanup:
- self.mem_fd_ and self.mem_ are used to open and map the /dev/mem device.
- These are initialized in the __init__ method and closed in the destroy_node method to ensure proper resource management.
2. Subscription:
- The node subscribes to the /motor_control topic to receive MotorCmd messages.
- The motor_control_callback method updates the pan_speed and tilt_speed attributes based on the received message.
3. Timer:
- A timer is created to call the update_registers method every 0.5 seconds.
4. Register Update:
- The update_registers method checks the pan_speed and tilt_speed values.
- It sets bit0 of the register if pan_speed is non-zero and bit1 if tilt_speed is non-zero.
- The write_register method writes the value to the specified register offset.
This implementation ensures that the ROS 2 node properly interacts with the /dev/mem device, handles incoming messages, and updates the registers as specified.
Here is the line that should be added to the follow_cam.launch.xml file in the eagle_eye_bringup package to start the LED controller node:
<node pkg="rotator_py_pkg" exec="led_controller" name="led_controller"></node>
Push-Button Controller Node
For situations where manual control of the camera is needed to scan the environment for faces, the two push buttons on our custom-made RS-485 PMOD module can be utilized.
We’ll create a Push-Button ROS2 controller node that reads the GPIO register status of the button states and sends the corresponding MotorCmd messages on the motor_control topic to the rotator_controller node. When the left or right button is pressed and held, the pan camera motor will move left or right, respectively. If both buttons are pressed and held, the tilt camera motor will move up or down, depending on which button was pressed first. When both buttons are released, both motors will stop.
Here is the btn_controller.py ROS2 node code that can be added to the existing rotator_py_pkg package:
import rclpy
from rclpy.node import Node
from rotator_interfaces.msg import MotorCmd
import os
import mmap
class BtnController(Node):

    def __init__(self):
        super().__init__('btn_controller')

        # Open /dev/mem for direct register access
        self.mem_fd = os.open('/dev/mem', os.O_RDWR | os.O_SYNC)
        self.mem = mmap.mmap(self.mem_fd, mmap.PAGESIZE, mmap.MAP_SHARED,
            mmap.PROT_READ | mmap.PROT_WRITE, offset=0x80010000)

        # Publisher for /motor_control
        self.publisher_ = self.create_publisher(MotorCmd, '/motor_control', 10)

        # Initialize button states
        self.button1_state = False
        self.button2_state = False

        # Create a timer to periodically check the button states
        self.timer = self.create_timer(0.2, self.check_buttons)  # Check every 200ms

        self.get_logger().info('Node created.')

    def check_buttons(self):
        # Read button register value
        button_register = self.read_register(0x08)

        # Process button 1
        button1_pressed = ~button_register & 0x01
        if button1_pressed and not self.button1_state:
            self.button1_state = True
            self.handle_button1_press()
        elif not button1_pressed and self.button1_state:
            self.button1_state = False
            self.handle_button1_release()

        # Process button 2
        button2_pressed = ~button_register & 0x02
        if button2_pressed and not self.button2_state:
            self.button2_state = True
            self.handle_button2_press()
        elif not button2_pressed and self.button2_state:
            self.button2_state = False
            self.handle_button2_release()

    def handle_button1_press(self):
        if not self.button2_state:
            self.publish_motor_cmd(32, 0)   # Pan left
        else:
            self.publish_motor_cmd(0, 32)   # Both buttons detected, tilt up

    def handle_button1_release(self):
        self.publish_motor_cmd(0, 0)

    def handle_button2_press(self):
        if not self.button1_state:
            self.publish_motor_cmd(-32, 0)  # Pan right
        else:
            self.publish_motor_cmd(0, -32)  # Both buttons detected, tilt down

    def handle_button2_release(self):
        self.publish_motor_cmd(0, 0)

    def publish_motor_cmd(self, pan_speed, tilt_speed):
        msg = MotorCmd()
        msg.pan_speed = pan_speed
        msg.tilt_speed = tilt_speed
        self.publisher_.publish(msg)

    def read_register(self, offset):
        self.mem.seek(offset)
        return int.from_bytes(self.mem.read(1), byteorder='little')

    def destroy_node(self):
        # Clean up resources here
        self.mem.close()
        os.close(self.mem_fd)
        super().destroy_node()


def main(args=None):
    rclpy.init(args=args)
    node = BtnController()
    try:
        rclpy.spin(node)
    except KeyboardInterrupt:
        node.destroy_node()
    return


if __name__ == '__main__':
    main()
Code Explanation
This code defines a ROS2 node named BtnController that reads the states of two buttons on a custom RS-485 PMOD module via GPIO registers. The node publishes MotorCmd messages on the /motor_control topic to control the pan and tilt motors of a camera rotator. When a button is pressed, it commands the pan motor to move left or right. When both buttons are pressed, the tilt motor moves up or down based on which button was pressed first. The node continuously checks the button states and updates the motor commands accordingly.
The node can be launched using the same follow_cam.launch.xml file in the eagle_eye_bringup package, which also launches our three main nodes.
<node pkg="rotator_py_pkg" exec="btn_controller" name="btn_controller"></node>
NOTE
The camera cannot be moved using the buttons if face tracking is active because the gstreamer_pipeline node continuously sends messages on the same topic when a face is detected. The push buttons in this example code can only be used to scan the area when no faces are detected.
Conclusion
Throughout this tutorial, we have built a comprehensive and advanced AI-driven smart camera system centered around the KRIA™ KR260 Robotics Starter Kit. Leveraging the capabilities of this powerful hardware platform, we were able to integrate various technologies and protocols to create a versatile solution for real-time face detection and camera orientation monitoring.
By leveraging GStreamer and VVAS (Vitis Video Analytics SDK), we developed a sophisticated pipeline that processes video streams, applies AI inference using the DPU (Deep Learning Processing Unit) with a model compiled via the Vitis AI tool, and overlays critical information onto the video feed. We explored manipulating inference metadata and utilizing custom plugins to display camera orientation status directly on the screen.
Additionally, we delved into communication protocols such as Pelco-D for controlling the camera and Modbus RTU for reading data from an inclinometer. This required careful management of the shared RS-485 communication line to avoid collisions, demonstrating practical skills in integrating multiple protocols on a single communication bus.
One of the significant challenges we addressed was building a custom platform for the KRIA™ KR260 board to support PMOD peripherals, ensuring seamless interfacing with external sensors and controllers. This involved configuring and optimizing the hardware and software stack to meet the specific requirements of our application.
Through this project, we learned how to harness the full potential of the KRIA™ KR260 Robotics Starter Kit by combining AI, video analytics, and sensor integration. The project showcases the board’s versatility and power, making it an ideal platform for developing complex, real-time AI applications. This tutorial provides a solid foundation for anyone looking to explore the capabilities of the KRIA™ KR260 board and build their own AI-powered solutions.