The idea was to accelerate part of DOOM using the FPGA. We soon realized that we needed an OS capable of running on the ARM cores of the Zynq UltraScale+ MPSoC ZCU102 Evaluation Kit and, at the same time, able to manage HW accelerators on the FPGA side. The work was proposed as a final-year university project (the challenge was accepted by David Lima, here is his TFM) and eventually led to a conference publication.
While the paper focuses on the Design Space Exploration of the different HW/SW partitioning choices, the GitHub repositories contain all you need to replicate the work and propose your own DOOM version with new hardware accelerators.
The Zynq UltraScale+ chip on the board is a complex heterogeneous Multiprocessor System on Chip (MPSoC). It combines a quad-core Arm® Cortex®-A53, dual-core Cortex-R5F real-time processors, and a Mali™-400 MP2 graphics processing unit alongside Xilinx's 16nm FinFET+ programmable logic fabric.
The first thing needed is a Linux-based OS running on the quad-core Arm® Cortex®-A53. Instructions and scripts for creating a bootable SD card are provided in this first GitHub repo:
Instructions for creating the system.
The second step consists of profiling the running game in order to locate the candidate functions to be offloaded onto the FPGA. This frees the CPU to execute something else and, possibly, uses less energy to compute the same task (while also accelerating it!). The same repo includes some tips on profiling the game with gprof.
The function we chose to accelerate is called I_stretch2x in the original crispy-DOOM code. It basically re-arranges the game frame so that it can be shown at a bigger resolution. It is possible to use just one accelerator to perform the task. However, we decided to divide the input buffer into independent pieces of the image. This way, it is possible to use one accelerator per piece (there are no data dependencies among these pieces of the buffer). The idea is summarized in the following image: with 4 hardware accelerators, you would divide the image into 4 slices, which are then processed in parallel by the accelerators.
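The slicing idea can be sketched in plain C. This is only an illustration under stated assumptions: the names `slice_t`, `get_slice`, `stretch2x_slice` and `stretch2x_frame` are hypothetical, and the simple pixel doubling used here is a stand-in, not the actual I_stretch2x code from crispy-DOOM or the SDSoC-generated function. The frame is assumed to be an 8-bit buffer stored row by row.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor: a contiguous band of source rows. */
typedef struct {
    size_t first_row;  /* index of the first source row in the slice */
    size_t num_rows;   /* number of source rows in the slice */
} slice_t;

/* Split `height` rows into `n` nearly equal slices.
 * The remainder rows are handed out one each to the first slices. */
static slice_t get_slice(size_t height, size_t n, size_t i)
{
    size_t base = height / n;
    size_t extra = height % n;
    slice_t s;
    s.first_row = i * base + (i < extra ? i : extra);
    s.num_rows = base + (i < extra ? 1 : 0);
    return s;
}

/* Software stand-in for one accelerator (assumption: plain 2x pixel
 * doubling). It reads only the rows of its own slice and writes only
 * the matching output rows, so slices never touch each other's data. */
static void stretch2x_slice(const uint8_t *src, uint8_t *dst,
                            size_t width, slice_t s)
{
    size_t dst_width = 2 * width;
    for (size_t y = s.first_row; y < s.first_row + s.num_rows; y++) {
        const uint8_t *in = src + y * width;
        uint8_t *out0 = dst + (2 * y) * dst_width;  /* first copy of the row */
        uint8_t *out1 = out0 + dst_width;           /* second copy of the row */
        for (size_t x = 0; x < width; x++) {
            uint8_t p = in[x];
            out0[2 * x] = out0[2 * x + 1] = p;
            out1[2 * x] = out1[2 * x + 1] = p;
        }
    }
}

/* Process the frame slice by slice. Here the slices run one after the
 * other; on the FPGA each iteration would go to its own accelerator. */
static void stretch2x_frame(const uint8_t *src, uint8_t *dst,
                            size_t width, size_t height, size_t num_slices)
{
    for (size_t i = 0; i < num_slices; i++)
        stretch2x_slice(src, dst, width, get_slice(height, num_slices, i));
}
```

Calling `stretch2x_frame(src, dst, width, height, 4)` mirrors the 4-accelerator case: `dst` must hold `2 * width * 2 * height` bytes, and because each slice writes a disjoint band of output rows, the four calls could run in any order or at the same time.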
Then the HW accelerators are created using High-Level Synthesis with Vivado SDSoC 2018.1. The repo contains the source file to re-create the project.
Finally, SDSoC produces not only the hardware but also the SW functions that send/receive the data to/from the FPGA. We simply inserted these generated functions into the source code of the game. The instructions and the code are available in this GitHub repo:
crispy-DOOM v3.0 with hardware acceleration.
Following the instructions in the link, you will be able to play DOOM on the ZCU102 using HW accelerators!
Proposed homework for readers, just for fun:
- Run the game with hardware accelerators on different platforms (Pynq, Ultra96, ZynqBerry, etc.).
- Try to accelerate other functions: a new bitstream must be created and the new generated function must be called from the source code of the game.
Have Fun!