The project name "Xiraffe" is a combination of "Xilinx" and "Giraffe".
The aim of the project, in very simple words, is to build an "animal observatory": a stabilized, pan-able camera mounted somewhere high (on a long mast or a tree-top), with a long focal length (a.k.a. zoom) lens, used to observe animals from a distance so that a wide area can be covered by a single unit. Such a device can be used by wildlife biologists to help study animal habits, migration patterns, etc.
The same device can also be useful for animal conservationists in places like animal sanctuaries, to detect and prevent illegal animal poaching. The device can be mounted somewhere high and can help keep a look-out over a wide area...much like a Giraffe!
In this post, I will try to document my journey in attempting to build such a device.
Background
I have been using Xilinx FPGAs for some time now, and have enjoyed using the Zynq-7000 SoCs in various projects. I am a big fan of the SoC concept and have found a good place for Zynq FPGAs in past projects, especially when working with imaging sensors. Lately, however, I have been wanting to explore the more recent Zynq UltraScale+ devices, which boast all kinds of fancy capabilities. This project seemed a good fit.
The board took its sweet time to reach my hands, but it finally did. My own experience with Xilinx devices is mostly with Vivado and SDK, and most of my designs have been bare-metal or FreeRTOS based, with none on PetaLinux. Since this project needed both the VCU and the DPU, I knew I would have to dabble in the Linux domain. It has been quite a learning curve, so I will attempt to document my journey in as much detail as I can.
The Kit
The ZCU104 kit included the ZCU104 board with its power supply, a 16GB SD card, a USB3 hub, and a USB3 camera. Everything came very well packed. The only other things I needed to get started were an HDMI cable, a DisplayPort cable, and a monitor. Thankfully I already had these with me.
The Getting-Started journey...
Since there was a lot to learn (on both the hardware and software side) before I could start on the specifics of the project, I embarked on the journey of familiarizing myself with this beast of a board and the available tools.
Step 1: Power up the board and run the factory image: I followed the instructions in the brochure included with the kit. This included a few peripheral tests, as well as verifying the various voltages on the board by looking at the LEDs.
Step 2: Try out PYNQ: I had been very excited about PYNQ, since Python makes trying out ideas much easier than C/C++ in general.
Setup was simple: download the image for the ZCU104 and follow the instructions at http://www.pynq.io/
I logged into the board and went through all the provided examples, which was quite interesting. Frankly, I was hoping that PYNQ would provide enough "power" to let me do everything I wanted in the project, at least for prototyping purposes. However, I soon realized that PYNQ did not support use of the DPU or the VCU, which were both needed for my project. I dug around on the internet and got the impression that I would need considerable skills to get these integrated and usable inside PYNQ, so I gave up on it for the time being and moved on.
By the way, I have recently come across nice posts here and here which show how to use the DPU with PYNQ. I plan to try them out soon. For using the VCU in PYNQ, I haven't found anything useful yet.
Step 3: Try the USB3 camera reVISION stack examples: While trying to look up some info on the included camera, I came across this link and decided to try out the reVISION examples on the ZCU104. I found the dense optical flow example especially fascinating, and it gave me good motivation to try something similar on the board in the future. I also got some hands-on experience with GStreamer pipelines on the board, which proved useful later on.
Step 4: Install Vitis: By this point I was convinced I needed to go the Vitis route to be able to use both the DPU and the VCU. And so, I got started by following the excellent guide (including video) by Adam Taylor at this link. In my case, since I did not have a dedicated Linux PC, I installed Vitis inside a VM (I used VMware, but VirtualBox or another hypervisor should work just as well). A couple of notes from my experience:
a) One *must* use a compatible version of Ubuntu. I initially tried with Ubuntu 18.04 and kept getting a strange error whereby the Vitis installation wouldn't move forward. Finally I created a new VM with 18.04.4 LTS and things worked fine.
b) When you create the VM, make sure to assign at least 120GB of space, because the Vitis installer doesn't proceed unless it sees all the free space it thinks it will need. I initially tried with 100GB assigned to the VM and it wouldn't proceed with the installation. In my case, I used the Linux utility "GParted" to extend the disk to 150GB, after increasing the allocated disk space in the virtual machine settings. I learned later that even this is not enough once you begin to install and use the other Xilinx tools (XRT, PetaLinux, Vitis-AI, etc.), and I had to resize my drive several times until it reached 200GB.
At this point, I would recommend allocating at least 200GB to the VM right from the start, to save yourself the repeated resizing.
Step 5: Follow along with Adam Taylor's webinar "Building Accelerated Applications with Vitis", available here.
I hit a minor snag at the end when the QEMU emulator wouldn't start. Looking carefully at the error message, I saw that it couldn't find the "netstat" command, so I opened a terminal window in the Ubuntu OS and installed net-tools, after which it started working fine:
sudo apt-get install net-tools
Note that the tutorial guides you up to integrating the DPU into your custom design. Using this DPU from Vitis-AI is not covered by this particular webinar.
Step 6: Vitis In-Depth Tutorials: I followed some of the Vitis In-Depth Tutorials to familiarize myself with the newer concepts in Vitis (I have mostly worked with SDK in the past), and to get familiar with the AI side of things, which I am very new to. These tutorials were a good start:
https://github.com/Xilinx/Vitis-In-Depth-Tutorial
Step 7: Build the IVAS TRD design: I came across Xilinx's recently published IVAS (Intelligent Video Analytics System) reference design (here), which seemed to be a good starting point for my project, since it incorporates the DPU for AI inference as well as the VCU for video encoding/decoding.
It took me several days to successfully recreate the design and run the Facedetect + ReID demo, but eventually I managed, and I was quite happy with the result. The design uses GStreamer to run the demo, which was both good and bad: good because I got to play around with GStreamer quite a bit, and bad because I felt that building my application as GStreamer filters (as done in the demo) was a bit beyond my current competency level. That said, I decided to use this image as my base platform and build my application on top of it.
Building the Hardware
One of the important aspects of this project is that when you're working in non-urban environments, you don't have rigid structures to mount cameras on. You would instead use, for example, a long pole hoisted with guy-lines, or perhaps a pole tied to the upper trunk of one of the taller trees in the area, to get a high vantage point and be able to observe larger areas. Either way, you will have a somewhat unstable mount for your camera, and hence an unstable video feed, which is not so good for image processing. I decided to address this by mounting the camera on a 3-axis brushless gimbal and (eventually) doing some video stabilization in the FPGA to remove the remaining motion artifacts.
The brushless gimbal should actually give two benefits:
1. Stabilize the video to a major extent.
2. Allow using the same motors to steer the camera, since we will need pan control in our system.
For the brushless gimbal part, I bought a cheap 3-axis brushless gimbal frame, which came fitted with three small motors. The design was not the best, but I didn't want to spend too much time on this part of the project yet.
I researched a good brushless gimbal controller to use in my design and came across the "STorM32 BGC" from OlliW. I ordered a V1.3 board to get started, which I received in two weeks. I tried it with a GoPro first, and it worked really well and was very easy to set up. My initial plan was to port the ARM Cortex-M3 code to a Cortex-M3 instantiated inside the FPGA, or to the Cortex-R5 inside the ZU7EV. However, I found out, to my dismay, that although the hardware for this particular gimbal board is open-source, the software is NOT. You can use the compiled binaries as you like, but the code is not open. I was somewhat disappointed, but had to move on. Luckily, I found that this gimbal controller allows controlling the position over a UART interface, so I decided to make use of that instead.
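To illustrate the idea, here's a minimal Python (pyserial) sketch of pushing angle setpoints to the controller over UART. Note that the frame builder and the serial device path below are just placeholders to show the structure; the actual command frame format (start byte, command id, payload, CRC) is defined in the STorM32 serial-protocol documentation.

import struct
import serial  # pyserial

def build_set_angles_frame(pitch_deg, roll_deg, yaw_deg):
    # PLACEHOLDER: a dummy header byte plus three little-endian floats.
    # Replace with the documented STorM32 "set angles" command frame.
    return b"\xfa" + struct.pack("<fff", pitch_deg, roll_deg, yaw_deg)

# The gimbal's UART would be wired to one of the UARTs on the board;
# the device path below is an assumption.
with serial.Serial("/dev/ttyPS1", 115200, timeout=0.1) as gimbal:
    gimbal.write(build_set_angles_frame(0.0, 0.0, 30.0))  # pan 30 degrees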
Since I needed a long focal length (a.k.a. "zoom") lens, I unscrewed the M12 lens on the included USB3 camera and replaced it with a 35mm C-mount lens, using a C-to-M12 adapter in between (links in the project BOM). This gave me the very narrow field of view I needed, along with much better image quality.
The next step was to mount this on my brushless gimbal. I designed and 3D-printed a bracket so that I could mount the camera on the gimbal setup.
I uploaded the firmware to the board and attempted to calibrate the gimbal. This is where I hit a major snag with my gimbal setup: the motors on the gimbal were not powerful enough to hold and move the camera+lens. I was able to improve the situation by adding some metal pieces and a handsome amount of hot glue (some of it visible in the images above) to the back side of the camera for balancing. However, as soon as I connected the USB-C cable to the camera, the gimbal could no longer hold or move the camera much: the USB cable was too heavy and stiff to allow the feeble motors to stabilize and move the camera freely. After trying several things, I concluded that I needed bigger brushless motors, and also needed to print my own, larger gimbal frame to be able to balance and move the camera properly.
At this point, due to the shortage of remaining time, I decided to put this part aside until I receive the bigger motors, and to focus on the software/FPGA side of things.
Building the Software
At the application level, my desired video processing pipeline was as follows:
1. Detect motion in the scene.
2. Run AI recognition on only the parts of the scene with movement.
3. Once objects of interest (animals, people, or vehicles, as per the job setup) are detected, track them inside the frame and in 3D space using the gimbal motion.
4. Log information (timestamps, motion paths, counts, etc.) in text/XML/JSON files. Optionally, generate alerts.
5. Compress (H.264/H.265) and save short video clips to disk, or stream live video over the network to basecamp, as required.
Before building the full system with all of the above, I needed to prototype and get individual parts working, first.
Motion Detection in Scene:
Once the camera is set to look in a certain direction, detect motion in the scene using image differencing, followed by some IIR filtering and blob analysis. There were two reasons why I wanted to have this stage before AI inference:
a. Reduce power consumption by not running the DPU all the time over each frame.
b. Increase detection range by using bigger input images. The memory and processing power requirements of image differencing and blob analysis are much smaller, so they can be run on full-sized camera frames. AI can then be run only on the parts with motion, scaled appropriately, for maximum detection performance.
I prototyped and fine-tuned the algorithm on my PC by writing a script in the Halcon language. The code can be found on my github page here. Note that I decided to use an IIR filter for the previous/reference image, as it gave me a better compromise between detecting motion and ignoring minor continuous motion like leaves moving in the wind. The algorithm can easily be translated to OpenCV, and operations like image differencing, erosion, and dilation can later be accelerated into an FPGA kernel using pre-built functions from the Vitis Vision Library.
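To illustrate the idea, here's a rough OpenCV (Python) sketch of the same approach, a minimal illustration rather than a direct translation of the Halcon script: keep an IIR-filtered reference image, difference the current frame against it, clean up the mask with erosion/dilation, and keep blobs above a size threshold as candidate regions for the DPU.

import cv2
import numpy as np

ALPHA = 0.05          # IIR coefficient: small alpha = slowly adapting reference
MIN_BLOB_AREA = 400   # ignore tiny blobs (sensor noise, small leaf motion)

cap = cv2.VideoCapture(0)             # any OpenCV-compatible source
ok, frame = cap.read()
if not ok:
    raise RuntimeError("no frames from source")
reference = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY).astype(np.float32)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY).astype(np.float32)
    cv2.accumulateWeighted(gray, reference, ALPHA)      # IIR-filtered reference
    diff = cv2.absdiff(gray, reference)
    _, mask = cv2.threshold(diff.astype(np.uint8), 25, 255, cv2.THRESH_BINARY)
    mask = cv2.dilate(cv2.erode(mask, kernel), kernel)  # clean up the mask
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)  # OpenCV 4.x
    rois = [cv2.boundingRect(c) for c in contours
            if cv2.contourArea(c) > MIN_BLOB_AREA]
    # 'rois' are the candidate regions that would be scaled and sent to the DPU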
Object Tracking:
For object tracking inside the frame, I used the code in this post as a starting point and modified it to my needs. Since this code is in OpenCV, it can be used directly inside my final application. The latest version of my tracking code can be found on my github page here.
Here's a short video of the code in action. I had good fun tracking birds as they went about their business strolling on roofs.
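To give an idea of the approach, here's a minimal sketch of the centroid-matching logic such trackers are built around (a simplified illustration, not the code from my repo): keep a dictionary of track IDs and last-known centroids, and greedily match each new detection to the nearest existing track within a distance threshold.

import numpy as np

class CentroidTracker:
    """Toy nearest-centroid tracker. Real trackers also keep tracks alive
    for a few frames after they disappear; this sketch drops them at once."""

    def __init__(self, max_distance=80.0):
        self.next_id = 0
        self.max_distance = max_distance
        self.tracks = {}  # track id -> (x, y) centroid from the last frame

    def update(self, centroids):
        """centroids: list of (x, y) detections for the current frame.
        Returns {track_id: (x, y)} for the current frame."""
        new_tracks = {}
        for cx, cy in centroids:
            best_id, best_dist = None, self.max_distance
            for tid, (tx, ty) in self.tracks.items():
                d = float(np.hypot(cx - tx, cy - ty))
                if d < best_dist and tid not in new_tracks:
                    best_id, best_dist = tid, d
            if best_id is None:          # no nearby track: start a new one
                best_id = self.next_id
                self.next_id += 1
            new_tracks[best_id] = (cx, cy)
        self.tracks = new_tracks
        return new_tracks

# usage: feed it the blob centroids from the motion-detection stage each frame
tracker = CentroidTracker()
ids = tracker.update([(100, 200), (400, 250)])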
AI Detection:
This brought me to the most interesting part of this project, and the part with the biggest learning curve. As I mentioned earlier in the "getting started" part of this post, by this point I had identified that the best option would be to build my application on top of the IVAS reference design, as it already incorporates the DPU as well as the VCU (both were required by the contest rules).
I started by following the build steps on this page. Before starting, I would highly recommend making sure you have enough disk space. For example, if you're running your host Ubuntu OS inside a VM like me, make sure you allocate at least 200GB to the VM. This will save you at least a day's worth of wait time if you have a modest i7 machine like mine. To increase the disk space allocation, first increase the allocation in the VM settings, then boot the OS and use the "GParted" utility to have the OS claim the newly assigned space.
On my Ubuntu host machine, where I had already installed the Vitis tools, I started by setting up a new directory under home and cloning the full repository. In a new terminal:
cd ~
mkdir IVAS_TRD
cd IVAS_TRD
git clone --recurse-submodules https://github.com/Xilinx/Vitis-In-Depth-Tutorial
cd ~/IVAS_TRD/Vitis-In-Depth-Tutorial/Runtime_and_System_Optimization/Design_Tutorials/02-ivas-ml
Source the Vitis and PetaLinux settings scripts:
source /tools/Xilinx/Vitis/2020.1/settings64.sh
source /tools/Xilinx/PetaLinux/2020.1/settings.sh
Before moving further, I applied the recommended patch related to the rootfs partition size:
sudo cp vitis_patch/mkfsImage.sh ${XILINX_VITIS}/scripts/vitis/util
Now we can make the provided zcu104_vcu platform
cd platform/dev/zcu104_vcu
make
This took a very long time on my modest VM running on an i7 machine. I also ran out of allocated space and had to start over, so make sure you don't repeat my mistake. However, in the end, the BIF file was somehow not getting created:
ERROR: Failed to generate BIF file, File "/home/saadtiwana/IVAS_TRD/Vitis-In-Depth-Tutorial/Runtime_and_System_Optimization/Design_Tutorials/02-ivas-ml/platform/dev/zcu104_vcu/petalinux/project-spec/hw-description/xilinx_zcu104_vcu_202010_1.bit" doesn't exist.
Makefile:52: recipe for target 'bootimage' failed
To work around this, I manually copied the file to the location the script expected, which solved my problem and enabled me to move on:
cp /home/$USER/IVAS_TRD/Vitis-In-Depth-Tutorial/Runtime_and_System_Optimization/Design_Tutorials/02-ivas-ml/platform/dev/zcu104_vcu/petalinux/components/plnx_workspace/device-tree/device-tree/xilinx_zcu104_vcu_202010_1.bit /home/$USER/IVAS_TRD/Vitis-In-Depth-Tutorial/Runtime_and_System_Optimization/Design_Tutorials/02-ivas-ml/platform/dev/zcu104_vcu/petalinux/project-spec/hw-description/
After the platform is built successfully, extract and install the sysroot:
cd ../../repo/xilinx_zcu104_vcu_202010_1/sysroot
./sdk.sh -d `pwd` -y
Now, we need to build the Vitis design. First, source the XRT setup script:
source /opt/xilinx/xrt/setup.sh
Next, apply the patch (to be applied ONE TIME ONLY!):
cd ~/IVAS_TRD/Vitis-In-Depth-Tutorial/Runtime_and_System_Optimization/Design_Tutorials/02-ivas-ml
cd hw_src/Vitis_Libraries
patch -p1 < ../vision_lib_area_resize_ii_fix.patch
And then build the design
cd ../..
cd hw_src
make
At this point, I realized I needed to install the Vitis-AI library before moving forward, and for that I needed PetaLinux installed first. For PetaLinux, I took guidance from this post and had it installed in no time.
Afterwards, I installed Vitis-AI by following instructions on this page.
Make a directory for the installation:
mkdir ~/petalinux_sdk
Afterwards, download the installation script (sdk-2020.1.0.0.sh) and execute it
./sdk-2020.1.0.0.sh
When prompted, I gave the following path for the installation, as per the instructions on the page: ~/petalinux_sdk
Once done, source the script
source ~/petalinux_sdk/environment-setup-aarch64-xilinx-linux
Download the Vitis-AI compressed file and extract it:
tar -xzvf vitis_ai_2020.1-r1.2.0.tar.gz -C ~/petalinux_sdk/sysroots/aarch64-xilinx-linux
At this stage, the Vitis-AI libraries were set up. Although not required for the IVAS design, I went ahead with the rest of the instructions on the page in order to try out the examples.
Now, I needed to install the Vitis-AI docker containers to be able to compile the densebox and reid models. To install these, I followed the simple steps here, which went without any issues (I used the pre-built docker images). I also found some nice resources for Vitis-AI v1.2 here that were quite helpful over the next few days.
Next, I downloaded the AI models from the AI Model Zoo.
To compile them, I followed the Vitis-AI guide. For example, for the densebox model: download the model from the model zoo and extract it to ~/Vitis-AI/AI-Model-Zoo/models/ (it can actually be put elsewhere too). Next, change to the quantized/Edge directory:
cd ~/Vitis-AI/AI-Model-Zoo/models/cf_densebox_wider_360_640_1.11G_1.2/quantized/Edge
Now, run the docker image, activate the conda environment and compile the model
. ~/Vitis-AI/docker_run.sh xilinx/vitis-ai:latest
conda activate vitis-ai-caffe
vai_c_caffe --prototxt ./deploy.prototxt --caffemodel ./deploy.caffemodel --arch /opt/vitis_ai/compiler/arch/DPUCZDX8G/ZCU104/arch.json --output_dir compiled_model --net_name densebox_640_360
The compiled .elf file will be under the "compiled_model" folder. Note that I was able to successfully compile TensorFlow models, but did not have the same luck with Caffe models. I asked for help here and here. For the purposes of the IVAS design, the Xilinx engineers were kind enough to provide me with the compiled .elf files for the models so that I could move forward.
To build the final SD card image, I started in a fresh terminal (highly recommended by Xilinx) and sourced the environment setup script:
cd ~/IVAS_TRD/Vitis-In-Depth-Tutorial/Runtime_and_System_Optimization/Design_Tutorials/02-ivas-ml/platform/repo/xilinx_zcu104_vcu_202010_1/sysroot
source ./environment-setup-aarch64-xilinx-linux
Next, I made a directory at the appropriate place
cd ~/IVAS_TRD/Vitis-In-Depth-Tutorial/Runtime_and_System_Optimization/Design_Tutorials/02-ivas-ml/
mkdir ml_models
and then copied the two .elf files into the ml_models folder. Now all I needed to do was run the final script to do the magic and create the SD card image:
./go.sh
Once this completed without errors, I burned the image at the following location to an SD card using Etcher:
hw_src/sd_card_zcu104/sd_card.img
After this, I followed the last few steps here until the end of the page and was able to, EVENTUALLY, run the sample applications.
I say "eventually" because it was not uneventful for me, infact it took me a whole 2 days of figuring out. Initially I tried running the applications and nothing would show up on the screen. I started troubleshooting, and reading up. One very useful resource to get the DisplayPort output working was the DP Linux driver page which has a lot of information on the Display port DRM driver.
One thing I found out was that my monitor has a 2560x1440 resolution, and to see the output on the screen, the images/video being sent to the monitor must be exactly the same resolution. After much reading and trying different things, I found a workaround to set the resolution to 1920x1080, but it also started the test pattern generator (42 is the id of the connector, from modetest -M xlnx):
modetest -M xlnx -s 42:1920x1080@RG16 &
Next, to work around the test pattern issue, I set the opacity of the terminal layer to zero:
modetest -M xlnx -w 39:alpha:0
A side-effect of this was that the terminal layer would disappear, but that was not an issue since I was logging into the board over the serial UART anyway, and I was more interested in the video itself. Now I could successfully run outputs at 1920x1080 resolution, which I tested using the following GStreamer pipeline with a video test source:
gst-launch-1.0 videotestsrc ! "video/x-raw, width=1920, height=1080" ! queue ! videoscale ! "video/x-raw, width=2560, height=1440" ! queue ! kmssink driver-name=xlnx sync=false
Note that the autovideosink also works fine with the kmssink driver
gst-launch-1.0 videotestsrc ! "video/x-raw, width=1920, height=1080, format=YUY2" ! queue ! autovideosink sync=false
After this, I managed to get a live video from the USB3 camera displayed on the screen:
gst-launch-1.0 v4l2src device=/dev/video0 ! "video/x-raw, width=1920, height=1080" ! videoconvert ! queue ! autovideosink
And finally, I could run the IVAS demo application:
gst-launch-1.0 -v \
filesrc location=/home/root/videos/dahua2.mp4 ! \
qtdemux ! \
h264parse ! \
omxh264dec internal-entropy-buffers=3 ! \
tee name=t0 t0.src_0 ! \
queue ! \
ivas_xm2m kconfig="/home/root/jsons/kernel_resize_bgr.json" ! \
ivas_xfilter kernels-config="/home/root/jsons/kernel_densebox_640_360.json" ! \
scalem0.sink_master ivas_xmetaaffixer name=scalem0 scalem0.src_master ! \
fakesink \
t0.src_1 ! \
scalem0.sink_slave_0 scalem0.src_slave_0 ! \
queue ! \
ivas_xfilter kernels-config="/home/root/jsons/kernel_crop.json" ! \
ivas_xfilter kernels-config="/home/root/jsons/kernel_reid.json" ! \
ivas_xfilter kernels-config="/home/root/jsons/kernel_swbbox.json" ! \
queue ! kmssink driver-name=xlnx sync=false
Bonus: To display the fps using any pipeline, you can change the last element to: fpsdisplaysink video-sink="autovideosink" sync=false text-overlay=true fullscreen-overlay=1
For example in the following pipeline, where I managed to get the same application working with live video:
gst-launch-1.0 -v v4l2src device=/dev/video0 io-mode=dmabuf ! "video/x-raw, width=1920, height=1080" ! videoconvert ! video/x-raw,format=NV12 ! tee name=t0 \
t0.src_0 ! \
queue ! \
ivas_xm2m kconfig="/home/root/jsons/kernel_resize_bgr.json" ! \
ivas_xfilter kernels-config="/home/root/jsons/kernel_densebox_640_360.json" ! \
scalem0.sink_master ivas_xmetaaffixer name=scalem0 scalem0.src_master ! \
fakesink \
t0.src_1 ! \
scalem0.sink_slave_0 scalem0.src_slave_0 ! \
queue ! \
ivas_xfilter kernels-config="/home/root/jsons/kernel_crop.json" ! \
ivas_xfilter kernels-config="/home/root/jsons/kernel_reid.json" ! \
ivas_xfilter kernels-config="/home/root/jsons/kernel_swbbox.json" ! \
queue ! fpsdisplaysink video-sink="kmssink driver-name=xlnx" sync=false text-overlay=true
IMPORTANT: For all pipelines, when using kmssink (the Xilinx DRM driver) it's very important to make sure that the resolution of the video going into the last element is the same as the resolution set on your monitor!
Video Compression:
At this point, I had video decoding working fine on the VCU using GStreamer, and wanted to try video encoding as well. This is where I hit a snag. I have written about it here on the Xilinx forums. I need to look into this further.
You can try it yourself using the simplest encode-decode pipeline I can think of:
gst-launch-1.0 videotestsrc ! "video/x-raw, width=1920, height=1080" ! omxh264enc ! h264parse ! omxh264dec ! queue max-size-bytes=0 ! autovideosink
At this point, I was wondering how to integrate the VCU into my C++/Python application, and I eventually figured out a very easy way to do it: by using GStreamer inside the application.
For example, to use the VCU to decode an H.264/H.265 file or RTSP stream, use the following pipeline (simplified for clarity) inside the OpenCV application:
<Compressed file/stream source> ! omxh264dec ! appsink
And to encode (compress) video frames from within the application, send them to the VCU using the following (simplified) GStreamer pipeline:
appsrc ! omxh264enc ! ....<udp sink?>
You can learn more by looking at code here, and also in the C++ files on my project github page. Hope it helps!
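To make this concrete, here's a minimal Python/OpenCV sketch of the idea, assuming OpenCV is built with GStreamer support. The file path, host address, and caps below are illustrative only (and remember I was still seeing issues with the encoder, as mentioned above): the VCU does the H.264 work inside the pipelines, and raw frames cross into and out of the application through appsink/appsrc.

import cv2

# Decode: compressed file -> VCU decoder -> raw BGR frames into the application
cap = cv2.VideoCapture(
    "filesrc location=/home/root/videos/test.mp4 ! qtdemux ! h264parse ! "
    "omxh264dec ! videoconvert ! video/x-raw,format=BGR ! appsink",
    cv2.CAP_GSTREAMER)

# Encode: raw frames from the application -> VCU encoder -> RTP over UDP
writer = cv2.VideoWriter(
    "appsrc ! videoconvert ! omxh264enc ! h264parse ! rtph264pay ! "
    "udpsink host=192.168.1.100 port=5000",
    cv2.CAP_GSTREAMER, 0, 30.0, (1920, 1080), True)

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # ... process 'frame' here ...
    writer.write(frame)   # frame size must match the VideoWriter size

cap.release()
writer.release()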
Bringing it all together
Now that I had all the building blocks figured out and tested, and with a few days left until the submission deadline, I started putting the pieces together.
I made a plan to build my application as follows:
- Build the main application in Python, running on top of the IVAS reference design image.
- Use a GStreamer pipeline with an "appsink" element to bring video into the OpenCV domain. This way, the video source can be switched quickly between live video from the camera and compressed video from a file or RTSP stream, decoded by the VCU (using the omxh264dec element in the GStreamer pipeline).
- Pre-process the video frames in OpenCV-Python.
- Use the Vitis-AI Python bindings to talk to the DPU.
- Post-process the data and the video frame. Generate overlay graphics using OpenCV.
- Display the frames with overlay graphics by pushing them into a GStreamer pipeline through an "appsrc" element and sending them to the DisplayPort using the "kmssink" element.
- Additionally, use a GStreamer "tee" element in the pipeline to send the final imagery to the VCU using the "omxh264enc" element and out over a UDP sink as an RTSP stream. (I had issues with the encoder driver here, as mentioned previously and here.)
A rough skeleton of this loop is sketched below.
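In this sketch, detect_motion(), run_dpu(), draw_overlays() and log_detections() are hypothetical placeholders for the stages listed above (the real DPU call goes through the Vitis-AI runner bindings), and the pipelines and resolutions are only illustrative.

import cv2

def detect_motion(frame):        # placeholder: see the motion-detection sketch earlier
    return [(0, 0, frame.shape[1], frame.shape[0])]

def run_dpu(crop):               # placeholder for the Vitis-AI runner call
    return []

def draw_overlays(frame, detections, offset):
    for (bx, by, bw, bh) in detections:
        cv2.rectangle(frame, (offset[0] + bx, offset[1] + by),
                      (offset[0] + bx + bw, offset[1] + by + bh), (0, 255, 0), 2)

def log_detections(detections):  # placeholder: timestamps, counts, alerts
    pass

# Input: live camera (or swap in a filesrc/rtspsrc + omxh264dec pipeline)
cap = cv2.VideoCapture(
    "v4l2src device=/dev/video0 ! video/x-raw,width=1920,height=1080 ! "
    "videoconvert ! video/x-raw,format=BGR ! appsink",
    cv2.CAP_GSTREAMER)

# Output: push annotated frames to the DisplayPort through kmssink
out = cv2.VideoWriter(
    "appsrc ! videoconvert ! kmssink driver-name=xlnx sync=false",
    cv2.CAP_GSTREAMER, 0, 30.0, (1920, 1080), True)

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    rois = detect_motion(frame)               # differencing + blob analysis
    for (x, y, w, h) in rois:
        crop = cv2.resize(frame[y:y + h, x:x + w], (640, 360))
        detections = run_dpu(crop)            # AI inference on moving regions only
        draw_overlays(frame, detections, offset=(x, y))
        log_detections(detections)
    out.write(frame)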
Since the IVAS example was purely GStreamer-based, with the DPU processing parts also built as GStreamer filters, it was not directly very useful for me. I started writing my application in Python by looking at C++ examples, which took me a lot of time, and I was having a lot of issues with post-processing the DPU output (you have to do several operations to determine the positions of bounding boxes, for example). Thankfully, I found a similar, excellent example from Avnet's Mario Bergeron on hackster. Since the post-processing (determining positions of bounding boxes) was the same, I used his code as a starting point and modified it to use GStreamer input, process frames using a densebox model compiled for the ZCU104, and output to the DisplayPort using GStreamer. It was great to finally get this working, as shown in the video clip below:
Note that in this video, I am using live video from the camera through a GStreamer pipeline. The camera has the 35mm lens mounted and is pointing at my cellphone screen, which is playing a random YouTube video with people in it. The code and instructions to run it can be found on my github link here.
Moving to the next step: since I actually needed an object detector instead of a face detector, I started modifying the code to use an SSD object detector instead. When trying to run the code, my application would hang (and time out) waiting on the DPU. Initially I suspected some issue with my application, so I looked through the examples to find a suitable one using the same object detector. I found a VART example from Xilinx using the same SSD model, although it was in C++. The original code uses a raw (webm) file for input, so I modified it to use GStreamer input and output, as can be seen at the beginning of the source file:
VideoCapture cap("v4l2src device=/dev/video0 io-mode=dmabuf ! video/x-raw, width=1920, height=1080, format=UYVY ! videoconvert ! appsink", CAP_GSTREAMER);
VideoWriter writer("appsrc ! queue ! autovideosink sync=false",
0, // fourcc
30, // fps
Size(1920, 1080),
true); // isColor
I first tested my input and output GStreamer pipelines, which worked well, then started modifying the rest of the code to use frames from GStreamer instead of frames from the video file. At this point, however, after compiling and running the application, I found that it too was hanging while waiting for the DPU to respond. After searching through the forums, the only possible cause I could think of was that the model .elf file I was using might have been compiled for a different DPU architecture than the DPU in my IVAS design. So I set about downloading the model from the model zoo and trying to recompile it on my host machine. This is where I ran into some issues that I had run into previously with the Caffe models as well. I have posted on the forums for help and am awaiting a reply. Thinking perhaps there is a versioning issue, I also tried running the quantization, but ran into some issues there too, which for now I am mostly blaming on my limited experience working with AI models. My modified code can be found on my github page.
With one day left until the submission deadline, I decided to pause things here and finish documenting the progress of the project, here on hackster.
Future plans
Since my idea is related to an actual project, I hope to continue working on it, with hopes of reusing this knowledge in other projects as well. Some main points:
- Get the model compilation issues sorted out and test the full system.
- Dig deeper into the technical details around AI models, especially on the post-processing side.
- Train my own model on animals, and perhaps even acquire some data of my own once travelling resumes.
- Implement video acceleration kernels in the PL (image resizing, pre-processing for the DPU, etc.) for faster end-to-end acceleration.
- Do video stabilization on the FPGA side, to augment the gimbal stabilization and get the smoothest possible video output.
When starting this project, I already knew time would be in short supply, considering the usual learning curve with FPGA projects. I tried to build step by step, dividing the project into smaller logical modules, prototyping and learning along the way. Despite running out of time before hitting all the goals, I am happy with the progress I managed to make. The time constraints also pushed me to learn things at a faster pace than my usual "exploring" pace. The project has helped me learn a lot of new things in the embedded domain, and has helped me identify areas where I need to invest more time (e.g., AI) in the future. I am thankful to Hackster and Xilinx for providing this learning opportunity. It truly has been a great experience.