MicroZed Chronicles: Triple Modular Redundancy and MicroBlaze
One common theme throughout my career has been developing FPGAs and SoCs for high reliability applications. This has ranged from controlling nuclear reactors to satellites, defense, and automotive applications.
Of course, many standards guide the development of these solutions, for example IEC 61508, ISO 26262, and DO-178/DO-254.
Significant analysis and system-level design work is required before we start the FPGA design for these applications.
Within the programmable logic design itself, one commonly used technique is Triple Modular Redundancy (TMR).
At the simplest level, TMR implements three identical designs and takes a majority vote on the outputs. This voting masks a single failure in one of the three implementations.
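To make the majority vote concrete, here is a minimal behavioral sketch in Python. It is only a software model of the idea, not the Vivado IP, and the function name `majority_vote` and the 8-bit example values are my own assumptions for illustration.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority: each output bit takes the value
    that at least two of the three inputs agree on."""
    return (a & b) | (a & c) | (b & c)


# Three replicas compute the same 8-bit result (hypothetical value).
golden = 0b1011_0010

# One replica suffers a single bit flip on bit 3.
faulty = golden ^ 0b0000_1000

# The flipped bit is outvoted by the two healthy replicas.
voted = majority_vote(golden, faulty, golden)
print(f"voted = {voted:#010b}, fault masked = {voted == golden}")

# The same bit failing in two replicas defeats the vote.
double_fault = majority_vote(faulty, faulty, golden)
print(f"double fault masked = {double_fault == golden}")
```

Note the second case: the vote only protects against a failure in one replica at a time, which is why detection and recovery matter as well as masking.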
Depending upon how the failure occurs, it may or may not be correctable. For instance, corrupted memory data may be corrected if you use ECC or TMR, while an input stuck high or low cannot.
Rather helpfully, the Vivado IP Library contains a number of components that can help us implement TMR in our design.
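The difference between a correctable and a non-correctable failure can be sketched with a small Python model. Everything here is an assumption for illustration: the `scrub` step (writing the voted value back to all copies), the `sample` function, and the stuck bit position are hypothetical and do not describe how any particular IP block behaves.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority vote."""
    return (a & b) | (a & c) | (b & c)


def scrub(copies):
    """Hypothetical repair step: write the voted value back to all three copies."""
    voted = majority_vote(*copies)
    return [voted, voted, voted]


STUCK_MASK = 0x04  # hypothetical: bit 2 of replica 1's input is stuck high
GOOD = 0x5A        # hypothetical correct value


def sample(driven):
    """Read the three replicas; replica 1 always sees its stuck bit as 1."""
    return [driven[0], driven[1] | STUCK_MASK, driven[2]]


# Case 1: a soft error corrupts one stored copy. Scrubbing repairs it.
memory = [GOOD, GOOD ^ 0x10, GOOD]
memory = scrub(memory)
memory_repaired = memory == [GOOD, GOOD, GOOD]

# Case 2: a stuck-high input. The voter masks it, but the fault is
# physical, so it reappears on the next read even after a scrub.
before = sample([GOOD, GOOD, GOOD])
masked = majority_vote(*before) == GOOD
after = sample(scrub(before))
fault_persists = after[1] != GOOD

print(memory_repaired, masked, fault_persists)
```

In this model the voter hides both failures from the outside world, but only the memory upset can actually be repaired. The stuck input leaves the system running on two good replicas with no remaining margin.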
These are:
- TMR Voter: This implements a majority voter. The output value is the one that at least two of the three inputs agree on.
- TMR Comparator: This is used to identify faults on its input bus. It is also possible to use the comparator to check for voter errors. The difference between the two is that a TMR Voter outputs the value the majority voted for, while a TMR Comparator outputs the result of the comparison.
- TMR Manager: This manages the overall TMR system state, including fault detection and recovery.
- TMR SEM: This implements Soft Error Mitigation, ensuring that soft errors in the device configuration memory cannot impact the logic of the design.
- TMR Inject: This provides the ability to inject faults, so we can check that the fault detection and fault recovery logic is working correctly.
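The split between voting, comparing, and injecting can be illustrated with a short Python sketch. This is a behavioral model only. The function names `compare` and `inject_fault`, and the per-replica flag output, are hypothetical and do not reflect the actual ports of the Vivado IP blocks.

```python
from typing import List, Tuple


def majority_vote(a: int, b: int, c: int) -> int:
    """Voter: output the value two of the three inputs agree on (bitwise)."""
    return (a & b) | (a & c) | (b & c)


def compare(a: int, b: int, c: int) -> Tuple[bool, List[bool]]:
    """Comparator: report whether the inputs disagree, rather than a voted value.
    The per-replica flags are an illustrative extra, not a documented output."""
    voted = majority_vote(a, b, c)
    flags = [x != voted for x in (a, b, c)]
    return any(flags), flags


def inject_fault(value: int, bit: int) -> int:
    """Fault injection: flip one bit so the detection logic can be exercised."""
    return value ^ (1 << bit)


# All three replicas start in agreement (hypothetical value).
outputs = [0x3C, 0x3C, 0x3C]
clean_mismatch, _ = compare(*outputs)

# Deliberately corrupt replica 2 and check that the fault is both
# masked by the voter and reported by the comparator.
outputs[2] = inject_fault(outputs[2], bit=0)
voted = majority_vote(*outputs)
mismatch, flags = compare(*outputs)

print(f"voted={voted:#04x} mismatch={mismatch} flags={flags}")
```

The point of injecting a fault on purpose is that a detection path which never fires in normal operation cannot otherwise be shown to work.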
When we create our programmable logic solutions, we use many different interfaces, e.g. AXI, BRAM, Discrete, etc. We therefore need to be able to work with these different interfaces in our TMR solutions.
As such, when we instantiate the TMR Comparator and TMR Voter in our designs, we can select from a wide range of interface types.
Just as in non-safety designs, softcore processors like the MicroBlaze are often used for implementing functions which are better implemented by SW.
Obviously, for safety applications these processors also need to be reliable. One common method is to implement a TMR MicroBlaze solution, which makes use of the TMR blocks outlined above.
Implementing this solution may at first seem complex. Which TMR blocks we include depends on whether we wish to implement a fault-tolerant or a fail-safe design.
For a fault tolerant design, we include TMR voters, TMR Inject, and TMR SEM. A fail safe design will also include TMR Managers and TMR Comparators.
However, to get us started, Vivado provides an example design that makes a great starting point. This reference design is provided with the TMR Manager IP block.
We can open this reference design by right-clicking on a TMR Manager within a design and selecting Open IP Example Design.
This will create a new project, which implements a TMR MicroBlaze solution.
This example design can then be targeted to your development board. For this example, I targeted an Arty S7–50. Adapting the design to the Arty S7–50 requires updating the clocks and UART connections at the board level.
The architecture of the triplicated MicroBlaze solution is very interesting to explore.
Between the three MicroBlazes, the TMR SEM IP is used to mitigate soft errors, while the UART output is also voted on using a TMR Voter.
Internally, each MicroBlaze implements TMR voting on the data and instruction BRAMs, while also implementing comparison on the UART, AXI Lite, and Trace ports. Each comparison takes its inputs from all three MicroBlazes and feeds its result to a TMR Manager.
When I targeted the Arty S7–50 development board, the resource utilization of the final implementation was:
The next stage is to develop the SW application using SDK.
Within the create application project dialogue, you will notice there are three MicroBlazes available.
As all three processors are running the same application, we can develop the application SW for any one. For this example, I chose the first MicroBlaze.
As this example was just to demonstrate how to create the reference design and introduce the TMR IP components, I made a simple “hello world” application.
To run the simple application on the reference design once we have the ELF, we need to merge the ELF with the bitstream and program the FPGA.
I did this using Vivado and the Associate ELF Files dialog, available under the Tools menu.
When this bit file was downloaded to the Arty S7–50 board, the traditional hello world output appeared as expected.
This example introduces the TMR library components and demonstrates just how easy it is to create an initial TMR MicroBlaze solution based off the reference example design.
Of course, we can also use the TMR library elements provided by Vivado with our logic designs as well. This is also very useful in creating safe and secure solutions.
Learn more about designing FPGA for mission critical systems in the video below.
Further Reading on TMR and MicroBlaze can be found here
See My FPGA / SoC Projects: Adam Taylor on Hackster.io
Get the Code: ATaylorCEngFIET (Adam Taylor)
Access the MicroZed Chronicles Archives with over 250 articles on the Zynq / Zynq MPSoC updated weekly at MicroZed Chronicles.