When it comes to a robot, everyone has a unique picture of what it should be and what it needs to be able to do. As diverse as these pictures are, the prevailing mindset is that a robot should understand its environment and interact with it based on some policies. This simple agenda carries many implications, and like any scientist facing a complex problem, the robotics engineer has to break the system down into layers of modules and abstractions to make it tractable. Traditionally, the problem breaks down into the following major modules:
- Actuation and control
- Perception and state estimation
- Knowledge representation and decision making
However, each of these modules must operate at a specific rate, and its results should be available to downstream subsystems as soon as possible. At one end of this spectrum, control and state estimation systems must run at high rates with minimal latency, while at the other end, higher-level systems such as decision making may tolerate lower throughput and higher latency. This diversity in timing requirements, alongside the computational complexity of each module, has direct implications for the design of the robot's embedded systems. In this blog, the first in the series, we will study this problem and present suitable embedded computing technologies for each of these subsystems. In future posts, we will provide example projects and tutorials covering each of these solutions.
Actuation and Control

At the lowest level, a robot has a mechanical structure that lets it interact with the environment and move around. These mechanical structures, together with their actuators and motion sensors, are modeled using differential equations. The goal of the control engineer is to use this model to synthesize appropriate control topologies that stabilize the structure and make it follow arbitrary motion and force trajectories. Since the natural frequencies of these dynamics are relatively high, the control loops must run at high rates to satisfy the Nyquist criterion; rates as high as 1000 Hz are typical.
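To make the rate requirement concrete, here is a minimal sketch (not taken from any specific robot) of a fixed-rate 1 kHz control loop in C++. The functions readJointVelocity and applyMotorTorque are hypothetical placeholders for the actual hardware interface, and the gain is purely illustrative.

```cpp
#include <chrono>
#include <cstdio>
#include <thread>

// Placeholder hardware hooks: on a real robot these would read an encoder and
// command a motor driver through memory-mapped peripherals or a fieldbus.
static double readJointVelocity() { return 0.0; }
static void applyMotorTorque(double /*torque*/) { /* write to the driver */ }

int main() {
    using clock = std::chrono::steady_clock;
    constexpr auto period = std::chrono::microseconds(1000);  // 1 kHz control rate
    constexpr double kd = 0.5;       // illustrative damping gain
    constexpr double target = 0.0;   // desired joint velocity

    auto next_wakeup = clock::now();
    for (int tick = 0; tick < 5000; ++tick) {      // run ~5 s worth of cycles
        const double error = target - readJointVelocity();
        applyMotorTorque(kd * error);

        if (tick % 1000 == 0)
            std::printf("tick %d\n", tick);

        // Sleeping until an absolute deadline keeps timing jitter from accumulating.
        next_wakeup += period;
        std::this_thread::sleep_until(next_wakeup);
    }
}
```

On a general-purpose operating system such a loop will still miss deadlines occasionally, which is exactly why the lower layers are usually pushed onto real-time hardware, as discussed below.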
Moreover, since computational latency introduces non-minimum-phase behavior, control engineers often rely on cascaded control topologies. The cascade structure kills two birds with one stone: the computation of each loop can be delegated to a separate computing element, and the inner control loops reduce the impact of model uncertainties through feedback.
For example, in RSL's ANYmal robot, the joints are actuated by specialized servo modules with internal torque/angle control systems. On one side, these servo modules attach to a communication network and accept force/angle setpoints; on the other, they control the internal brushless motors with a high-rate internal control loop, typically executed on a real-time microcontroller or DSP. At the next level up, outer-loop controllers govern the setpoints of these servo modules, which involves exploiting the kinematics and dynamics of the whole robot. These higher-level controllers run on high-performance single-board computers (SBCs) with a real-time operating system (e.g., a Linux kernel with the PREEMPT_RT patch).
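The cascade idea can be illustrated with a toy simulation: an outer position loop running at 100 Hz hands velocity setpoints to an inner loop running at 1 kHz. The one-mass plant model and the gains below are invented purely for illustration and are not meant to represent any particular joint module.

```cpp
#include <cstdio>

// Minimal cascade sketch: an outer position loop (100 Hz) produces a velocity
// setpoint that an inner velocity loop (1 kHz) tracks on a toy unit-inertia plant.
int main() {
    constexpr double dt_inner = 1e-3;  // inner loop period: 1 kHz
    constexpr int    ratio    = 10;    // outer loop runs every 10 inner ticks (100 Hz)
    constexpr double kp_pos = 5.0, kp_vel = 20.0;   // illustrative gains

    double pos = 0.0, vel = 0.0;       // toy plant state
    const double pos_ref = 1.0;        // desired position
    double vel_ref = 0.0;              // setpoint handed from outer to inner loop

    for (int tick = 0; tick < 2000; ++tick) {
        if (tick % ratio == 0)                           // outer loop (slow)
            vel_ref = kp_pos * (pos_ref - pos);

        const double torque = kp_vel * (vel_ref - vel);  // inner loop (fast)

        vel += torque * dt_inner;                        // integrate toy dynamics
        pos += vel * dt_inner;

        if (tick % 200 == 0)
            std::printf("t=%.2fs pos=%.3f\n", tick * dt_inner, pos);
    }
}
```

In a real robot the two loops would run on separate computing elements connected by a fieldbus, with the fast inner loop hiding motor-level uncertainties from the slower outer loop.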
Perception and Estimation

At the next level, the robot has to gather information about its environment: it has to see and make sense of the objects around it. The first step towards this goal is for the robot to know where it is in the world. Localization is the subject of broad research, and one of the most significant achievements in this field is Simultaneous Localization and Mapping (SLAM). SLAM combines information from sensors such as RGB-D cameras, laser scanners, joint encoders, and inertial sensors to create a map of the environment and localize the robot within it. The SLAM pipeline itself is an amalgam of heterogeneous sensors and processes that must fit together. For example, motion data from IMU sensors arrive at high rates and with low latency, while images from vision sensors are captured at considerably lower rates and with much higher delay.
On the other hand, the data captured by camera sensors is much more expensive to process, while the measurements from IMUs and joint encoders are less demanding. The computations for the image pipelines and their related 3D geometry optimizations often run on a high-level operating system such as Linux with a PREEMPT_RT-patched kernel, and to run at acceptable rates (~30-100 Hz) these algorithms require high-performance CPUs. Since the estimation latency of SLAM systems is relatively high (~100-300 ms), their output may not be directly usable as a feedback signal for the lower-level control systems. To address this problem, one can use low-latency sensors such as IMUs and joint encoders to compensate for the delay, while the vision pipeline prevents cumulative errors. This is known as sensor fusion for localization and state estimation. This portion of the perception process should be implemented on a real-time microcontroller or SBC to guarantee the determinism required by the control modules.
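As a rough illustration of this idea, the following toy 1-D example blends a high-rate, drifting inertial integration with a low-rate, drift-free pose correction. The blending constant, simulated bias, and motion profile are arbitrary, the SLAM latency itself is ignored, and a real system would typically use an EKF or a factor-graph smoother rather than this simple blend.

```cpp
#include <cmath>
#include <cstdio>

// Toy fusion sketch: integrate a biased "IMU" velocity at 1 kHz and correct the
// resulting drift with a drift-free "SLAM" position arriving at 10 Hz.
int main() {
    constexpr double kPi = 3.14159265358979;
    constexpr double dt = 1e-3;          // IMU rate: 1 kHz
    constexpr int slam_every = 100;      // SLAM pose arrives at 10 Hz
    constexpr double alpha = 0.9;        // trust in the dead-reckoned estimate

    double true_pos = 0.0, est_pos = 0.0;

    for (int k = 0; k < 5000; ++k) {
        const double vel = std::sin(2 * kPi * 0.5 * k * dt);  // simulated motion
        true_pos += vel * dt;

        // High-rate prediction: integrate the (biased) inertial measurement.
        est_pos += (vel + 0.05) * dt;    // +0.05 models an IMU bias -> drift

        // Low-rate correction: blend in the SLAM pose to bound the drift.
        if (k % slam_every == 0) {
            const double slam_pos = true_pos;  // stand-in for a SLAM result
            est_pos = alpha * est_pos + (1.0 - alpha) * slam_pos;
        }

        if (k % 1000 == 0)
            std::printf("t=%.1fs  error=%.4f\n", k * dt, est_pos - true_pos);
    }
}
```

Without the low-rate correction the error grows without bound; with it, the error stays bounded while the estimate remains available at the full 1 kHz rate needed by the controllers.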
Knowledge Representation and Decision Making

At the highest level, the decision- and policy-making subsystem uses the data gathered by the perception modules and provides the actuation setpoints that guide the robot through a task via the low-level control systems. This level is often computationally expensive, but it does not have to run at extremely high rates and can tolerate much higher latencies. While the perception and control modules stabilize the robot, the decision-making subsystem coordinates the high-level control signals. As such, this module is often implemented on a desktop-class operating system and requires high-performance computational resources.
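As a minimal sketch of what such a decision layer might look like, the toy state machine below runs at a low rate and emits waypoints for the lower layers to track. The states, waypoints, and tick counts are invented for illustration only.

```cpp
#include <cstdio>

// Tiny decision layer: a 10 Hz state machine that hands waypoints down to the
// perception/control stack. Real systems would use behavior trees, planners, etc.
enum class Mode { GoToPickup, Grasp, GoToDropoff, Done };

int main() {
    Mode mode = Mode::GoToPickup;
    double waypoint[2] = {0.0, 0.0};

    for (int tick = 0; tick < 50 && mode != Mode::Done; ++tick) {  // 10 Hz ticks
        switch (mode) {
            case Mode::GoToPickup:  waypoint[0] = 2.0; waypoint[1] = 1.0;
                                    if (tick == 20) mode = Mode::Grasp;       break;
            case Mode::Grasp:       /* command the gripper here */
                                    mode = Mode::GoToDropoff;                 break;
            case Mode::GoToDropoff: waypoint[0] = 0.0; waypoint[1] = 3.0;
                                    if (tick == 40) mode = Mode::Done;        break;
            case Mode::Done:                                                   break;
        }
        std::printf("tick %d: waypoint (%.1f, %.1f)\n", tick, waypoint[0], waypoint[1]);
    }
}
```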
AI: the End-to-End Approach

What we have covered so far is the classical, hierarchy-based approach to robotics, which integrates various modules, each dedicated to a specific group of tasks. With the advent of deep learning and massive datasets, researchers have attempted to tackle the robotics problem in an end-to-end manner, meaning that the raw sensor signals and actuator commands are attached directly to a deep neural network. Then, through a training/exploration process, the robot learns to accomplish the given tasks.
This paradigm poses an entirely new problem in terms of implementing the system and distributing the load across the available computational resources. Even though an end-to-end system can still leverage heterogeneous architectures for preprocessing the data, the deep neural network accounts for the majority of the required computational capacity. For this reason, companies such as Google, Nvidia, and Xilinx have begun to produce specialized processors designed explicitly for neural inference. These accelerators are meant to work alongside CPUs and provide the boost required for real-time neural network inference.
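The resulting software structure often looks like a small pipeline: the CPU preprocesses sensor data and hands tensors to the accelerator for inference. The sketch below mimics that split with two threads; preprocess() and runInference() are placeholders standing in for the actual preprocessing code and accelerator runtime, whatever those happen to be.

```cpp
#include <array>
#include <condition_variable>
#include <cstdio>
#include <mutex>
#include <queue>
#include <thread>

// Stand-in types and functions; a real system would produce image tensors and
// call into an accelerator runtime (DPU, GPU, TPU, ...) instead.
using Frame = std::array<float, 16>;

static Frame preprocess(int id) {           // CPU-side normalization/resizing
    Frame f{}; f[0] = static_cast<float>(id); return f;
}
static int runInference(const Frame& f) {   // placeholder for offloaded inference
    return static_cast<int>(f[0]) % 2;
}

int main() {
    std::queue<Frame> q;
    std::mutex m;
    std::condition_variable cv;
    bool done = false;

    std::thread producer([&] {              // CPU preprocessing thread
        for (int i = 0; i < 10; ++i) {
            Frame f = preprocess(i);
            { std::lock_guard<std::mutex> lk(m); q.push(f); }
            cv.notify_one();
        }
        { std::lock_guard<std::mutex> lk(m); done = true; }
        cv.notify_one();
    });

    std::thread consumer([&] {              // inference thread feeding the accelerator
        while (true) {
            std::unique_lock<std::mutex> lk(m);
            cv.wait(lk, [&] { return !q.empty() || done; });
            if (q.empty() && done) break;
            Frame f = q.front(); q.pop();
            lk.unlock();
            std::printf("class = %d\n", runInference(f));
        }
    });

    producer.join();
    consumer.join();
}
```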
Xilinx Zynq SoCs

Now one might ask: what computing platform could cover all of these paradigms? The answer is almost none. However, with the advent of the Xilinx Zynq family, we are offered a choice that provides unified processing elements (with slight complications in some cases, though). The Zynq UltraScale+ MPSoC family has three major processing components, which makes it an excellent choice for robotics. First, it has two Cortex-R5 real-time processing units that can handle the low-level, high-speed control tasks. Second, it has a quad-core Cortex-A53 application processor capable of running a Linux operating system. On top of that, it has an advanced FPGA fabric on the chip that can be configured into virtually any processing element for custom computations. All these subsystems are tightly interconnected through standard protocols capable of transferring massive amounts of data, and the FPGA fabric can host a wide variety of computational blocks for vision, DSP, linear algebra, and much more.
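As a taste of what the fabric can host, here is an HLS-style C++ sketch of a small FIR filter kernel of the kind one might synthesize with Xilinx's high-level synthesis tools. The pragmas (ignored by a regular compiler) request a pipelined implementation with the delay line kept in registers; sizes, coefficients, and the exact pragma set would need to be adapted to the actual toolchain.

```cpp
#include <cstdio>

const int kTaps = 8;
const int kSamples = 256;

// 8-tap FIR kernel: one output sample per clock cycle when fully pipelined.
void fir(const float in[kSamples], float out[kSamples], const float coeff[kTaps]) {
    float shift_reg[kTaps] = {0};
#pragma HLS ARRAY_PARTITION variable=shift_reg complete

    for (int n = 0; n < kSamples; ++n) {
#pragma HLS PIPELINE II=1
        for (int t = kTaps - 1; t > 0; --t)   // shift in the newest sample
            shift_reg[t] = shift_reg[t - 1];
        shift_reg[0] = in[n];

        float acc = 0.0f;
        for (int t = 0; t < kTaps; ++t)       // multiply-accumulate
            acc += shift_reg[t] * coeff[t];
        out[n] = acc;
    }
}

// Tiny testbench: feed an impulse and check that the coefficients come back out.
int main() {
    float in[kSamples] = {0}, out[kSamples] = {0};
    float coeff[kTaps] = {0.1f, 0.2f, 0.4f, 0.2f, 0.1f, 0.0f, 0.0f, 0.0f};
    in[0] = 1.0f;
    fir(in, out, coeff);
    for (int n = 0; n < kTaps; ++n)
        std::printf("out[%d] = %.2f\n", n, out[n]);
}
```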
Recently, Xilinx has developed the ML Suite for accelerating machine learning tasks on the FPGA. Even though these tools are still new, they point to a promising future for FPGA-accelerated AI.
Conclusion and Future Posts

In this blog, we explained the various computing layers in robotic systems and discussed the implications of deep learning approaches for the embedded computing platforms of robots. Finally, we introduced the Xilinx Zynq SoC as a unified solution to this problem.
In future posts of this series, we will provide quick projects and tutorials showing how Zynq FPGAs can be used both on their own and alongside other embedded systems such as the Jetson family of SBCs. The following topics will come first:
- Image processing acceleration using Zynq FPGAs
- Xilinx Ultrascale+ Cortex-R5 CPUs for Fusion and Control
- AI Acceleration using Xilinx FPGAs
- Xilinx Vitis and robotics