We have looked at thermal imaging before; however, we did that from scratch using Vivado and software. In this project we are going to combine the FLIR Lepton with PYNQ. Doing so allows me to create a PYNQ overlay that fellow Lepton users can use to start accelerating their applications without needing to worry about how to interface with the Lepton sensor.
This is going to be a fun project, as we are going to create the following:
- VHDL module and test bench to interface with the Lepton
- PYNQ Overlay to control the lepton from Python
- Jupyter notebook to pull it all together
All of these will be available for download and use from my GitHub. It will also be fun because it is not often that I write a lot of VHDL for these projects.
FPGA Approach
The Lepton outputs its video in a very interesting manner: it uses SPI to create something called VoSPI, or video over SPI. This uses the SCLK, SS and MISO lines of a traditional SPI interface.
Over this SPI link the video is transferred in packets. Each packet contains a header, a CRC and eighty 14-bit pixels. For the Lepton 2 there are 60 packets per frame, one for each row.
To comply with export restrictions, the frame rate of the Lepton is limited to 9 Hz, which means it also outputs some frames that are to be discarded. These discard frames are identified in the packet header as non-valid, while valid packets carry a packet number between 0 and 59.
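The packet layout described above can be sketched in Python. This is only an illustration, assuming a 164-byte Lepton 2 packet (a 2-byte ID, a 2-byte CRC and 160 bytes of pixel data, all big-endian) and the same discard test the RTL uses (bits 11 to 8 of the ID all set). The function name parse_vospi_packet is hypothetical, and the CRC is not verified here.

```python
import struct

PIXELS_PER_PACKET = 80
# 2-byte ID + 2-byte CRC + 80 pixels carried in 16-bit words
PACKET_BYTES = 4 + 2 * PIXELS_PER_PACKET


def parse_vospi_packet(packet: bytes):
    """Return (line_number, pixels) for a valid packet, or None for a discard packet."""
    if len(packet) != PACKET_BYTES:
        raise ValueError(f"expected {PACKET_BYTES} bytes, got {len(packet)}")
    # The header is two big-endian 16-bit words: ID, then CRC (ignored in this sketch).
    packet_id, _crc = struct.unpack(">HH", packet[:4])
    # Same test as the RTL: bits 11..8 of the ID are all ones for a discard packet.
    if (packet_id >> 8) & 0xF == 0xF:
        return None
    line_number = packet_id & 0x0FFF
    words = struct.unpack(f">{PIXELS_PER_PACKET}H", packet[4:])
    # Each pixel is 14 bits wide, so mask off the two unused upper bits.
    return line_number, [w & 0x3FFF for w in words]
```

A model like this is handy for checking captured data in the notebook before trusting the hardware path.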
As the frame rate is so slow, we can use a simple interface that captures each 16-bit serial word on the MISO signal, determines whether it belongs to a discard packet and, if the packet is valid, reads it in.
Valid packets will be written into a dual-port RAM such that the RAM contains all 4800 pixels of the image in linearly increasing order.
We need to be careful when we develop the VHDL module to ensure the address increments on word boundaries and not byte boundaries. As each pixel is stored in a 32-bit word, the address increments by 4 for each new pixel.
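The address arithmetic can be made concrete with a short Python sketch. It assumes the image is stored line by line starting at byte address 0, with one pixel per 32-bit word, which is how the RTL fills the RAM. The helper name pixel_byte_address is hypothetical.

```python
WORD_BYTES = 4        # one pixel per 32-bit BRAM word
PIXELS_PER_LINE = 80
LINES = 60


def pixel_byte_address(line: int, column: int) -> int:
    """Byte address of a pixel, assuming line-major storage from address 0."""
    if not (0 <= line < LINES and 0 <= column < PIXELS_PER_LINE):
        raise ValueError("pixel coordinate outside the 80 x 60 image")
    return (line * PIXELS_PER_LINE + column) * WORD_BYTES
```

For example, the second pixel of the first line sits at byte address 4, and the last pixel of the image at 4799 * 4 = 19196.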
This dual-port RAM can be accessed from the PS using PYNQ to read out the image. We can then use PYNQ and Python to do more advanced processing, or even record the output.
The IP core can be reset if video synchronization is lost; however, it is designed such that discard packets will be detected and ignored.
We will therefore create two VHDL files: a source file, which will be used in the overlay, and a test bench, which is used to test the source file. This enables us to simulate the design and ensure we have the required functionality.
To ensure the correct behavior, the test bench will apply the following:
- Three Discard frames following the assertion of SS
- Sixty packets of 80 pixels, with valid header and CRC
This frame will repeat twice, to ensure the IP core works correctly once the first frame has been received.
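The same stimulus can be modelled in Python, for example to generate reference data for the simulation. This is a sketch under assumptions rather than the actual test bench: the discard packets are sent once after SS is asserted, pixel values are a simple running count, and the CRC field is a zero placeholder because the RTL does not check it. The names make_packet and make_stimulus are hypothetical.

```python
import struct

LINES = 60
PIXELS = 80
DISCARD_ID = 0x0FFF  # bits 11..8 set marks a discard packet


def make_packet(packet_id: int, pixels) -> bytes:
    """Build one 164-byte packet: ID, placeholder CRC, then 80 big-endian pixels."""
    return struct.pack(">HH", packet_id, 0) + struct.pack(f">{PIXELS}H", *pixels)


def make_stimulus(frames: int = 2, discards: int = 3):
    """Return the list of packets: discard packets first, then the frames."""
    packets = [make_packet(DISCARD_ID, [0] * PIXELS) for _ in range(discards)]
    for _ in range(frames):
        for line in range(LINES):
            base = line * PIXELS
            # Running count, so each pixel value equals its index in the frame.
            pixels = [(base + col) & 0x3FFF for col in range(PIXELS)]
            packets.append(make_packet(line, pixels))
    return packets
```

Using the pixel index as the pixel value makes it easy to spot an address or ordering error when the RAM contents are inspected.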
Running a simple test using a bare-metal approach shows that the RTL below functions as required.
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity lepton_if is generic(
lines : integer := 60;
pixels : integer := 80);
port(
clk : in std_logic;
reset : in std_logic;
sclk : out std_logic;
miso : in std_logic;
ss : out std_logic;
line_out : out std_logic_vector(7 downto 0);
line_val : out std_logic;
rstb : OUT STD_LOGIC;
enb : OUT STD_LOGIC;
web : OUT STD_LOGIC_VECTOR(3 DOWNTO 0);
addrb : OUT STD_LOGIC_VECTOR(31 DOWNTO 0);
dinb : OUT STD_LOGIC_VECTOR(31 DOWNTO 0));
--doutb : IN STD_LOGIC_VECTOR(31 DOWNTO 0));
end entity;
architecture rtl of lepton_if is
constant xfer_size : integer := 16;
constant hdr : std_logic_vector(7 downto 0):= x"ff";
constant total_pix : integer := 160;
type fsm is ( idle, sync, crc, packet, check);
signal current_state : fsm;
signal shift_reg: std_logic_vector(15 downto 0);
signal int_cs : std_logic:='1';
signal shift_count : integer range 0 to 31;
signal line_num : std_logic_vector(7 downto 0); --use this plus the number of bytes in a packet to form the address
signal packet_count : integer range 0 to 60;
signal frame_delay : integer range 0 to 63;
signal pixel_count : integer range 0 to 255;
signal addr_cnt : unsigned(31 downto 0);
signal valid : std_logic;
--dropping CS starts the video stream so will either get discard frame or frame zero
begin
--for test bench send in packets for all with count with line counter
shift_process: process(clk,reset)
begin
if reset = '1' then
shift_count <= 0;
shift_reg <= (others=>'0');
elsif rising_edge(clk) then
if int_cs = '0' then
shift_reg <= shift_reg(shift_reg'high-1 downto shift_reg'low) & miso;
if shift_count = 16 then
shift_count <= 1;
else
shift_count <= (shift_count + 1);
end if;
end if;
end if;
end process;
line_out <= line_num;
addrb <= std_logic_vector(addr_cnt);
sclk <= clk when int_cs = '0' else '1';
--ss <= int_cs;
ss_process:process(clk,reset)
begin
if reset = '1' then
ss <= '1';
elsif falling_edge(clk) then
ss <= int_cs;
end if;
end process;
cntrl_process : process(clk,reset)
begin
if reset = '1' then
rstb <='0';
enb <='0';
web <="0000";
line_val <= '0';
current_state <= idle;
frame_delay <= 0;
line_num <= (others=>'0');
dinb <= (others=>'0');
addr_cnt <= (others=>'0');
pixel_count <= 0;
valid <= '0';
elsif rising_edge(clk) then -- do state machines as have 16 clocks between each
rstb <='0';
enb <='0';
web <="0000";
line_val <= '0';
case current_state is
when idle =>
if frame_delay = 63 then
int_cs <= '0';
current_state <= sync;
frame_delay <= 0;
else
int_cs <= '1';
frame_delay <= frame_delay + 1;
end if;
when sync =>
if (shift_count = 16) and (shift_reg(11 downto 8) /= x"f") then -- not a discard packet
line_num <= shift_reg(7 downto 0);
if (shift_reg(7 downto 0) = x"00") then -- first line, reset RAM address
addr_cnt <= (others=>'0');
end if;
current_state <= crc;
line_val <= '1';
valid <= '1';
elsif (shift_count = 16) then
current_state <= crc;
valid <= '0';
end if;
when crc =>
if (shift_count = 16) then
current_state <= packet;
end if;
when packet => --write packets to memory block
if (shift_count = 16) then
pixel_count <= pixel_count + 1;
if valid = '1' then -- valid frame not corrupt
dinb <= x"0000"&shift_reg(15 downto 0);
enb <='1';
web <="1111";
addr_cnt <= addr_cnt + 4;
end if;
current_state <= check;
end if;
when check =>
if (pixel_count = 80) then --we have read in all the pixels in the current packet
current_state <= sync;
pixel_count <= 0;
else
current_state <= packet;
end if;
when others => null;
end case;
end if;
end process;
end architecture;
Vivado Build
With the RTL looking good, it needs to be integrated into a block diagram with the processor so we can create the overlay.
To do this I created a project targeting the PYNQ Z2 board and added the following IP blocks:
- Zynq 7000 processing system - configured for the PYNQ Z2
- AXI GPIO - Single bit output driving the reset to the lepton IP block
- Lepton IP block - the RTL code previously created
- Block RAM Controller
- Block RAM - implemented for use with a Block RAM Controller
- ILA for debugging the system bring up
- Constant block - Set high to provide the power to the FLIR Lepton
The completed block diagram looks as below.
To complete the design we need to define the constraints for the IO. The FLIR Lepton is designed to connect to the Arduino header.
set_property -dict { PACKAGE_PIN P16 IOSTANDARD LVCMOS33 } [get_ports { IIC_0_0_scl_io }]; #IO_L5P_T0_34 Sch=CK_IO0
set_property -dict { PACKAGE_PIN P15 IOSTANDARD LVCMOS33 } [get_ports { IIC_0_0_sda_io }]; #IO_L2N_T0_34 Sch=CK_IO1
set_property -dict { PACKAGE_PIN T16 IOSTANDARD LVCMOS33 } [get_ports { ss }]; #IO_L3P_T0_DQS_PUDC_B_34 Sch=CK_IO2
set_property -dict { PACKAGE_PIN N17 IOSTANDARD LVCMOS33 } [get_ports { sck }]; #IO_L3N_T0_DQS_34 Sch=CK_IO3
set_property -dict { PACKAGE_PIN P18 IOSTANDARD LVCMOS33 } [get_ports { miso }]; #IO_L10P_T1_34 Sch=CK_IO4
set_property -dict { PACKAGE_PIN Y13 IOSTANDARD LVCMOS33 } [get_ports { ck_ioa }]; #IO_L10P_T1_34 Sch=CK_IO4
set_property SLEW FAST [get_ports {sck}]
set_property SLEW FAST [get_ports {ss}]
set_property SLEW FAST [get_ports {miso}]
set_property DRIVE 16 [get_ports {sck}]
set_property DRIVE 16 [get_ports {ss}]
Once the implementation was completed, a simple test was performed from SDK using the ILA.
What I was looking for was a linear count of line numbers as they were received from the Lepton. If the count was not linear, or contained line numbers beyond the 60 lines of a frame, it would indicate a problem with the RTL module on real hardware.
Luckily, the ILA showed a correct linear count and only 60 lines per image, which is exactly what we want for the Lepton 2.
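If the line numbers are exported from the ILA, the check described above can also be automated. The sketch below assumes the capture is available as a plain list of integers that starts on line 0 and counts 0 to 59 within each frame. The function name check_line_sequence is hypothetical, and exporting the data from the ILA is not shown.

```python
def check_line_sequence(line_numbers, lines_per_frame=60):
    """Return the index of the first out-of-sequence entry, or None if the count is linear."""
    expected = 0
    for index, number in enumerate(line_numbers):
        if number != expected:
            return index
        # Wrap back to line 0 at the end of each frame.
        expected = (expected + 1) % lines_per_frame
    return None
```

A return value of None means every frame counted up cleanly and rolled over after the last line.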
The rollover between frames can be seen below. Note that discarded frames are not output on the counter.
With the module working as desired we are now able to create a PYNQ overlay for the PYNQ Z2.
Overlay Creation
To create an overlay we first need a GitHub repository. Once the repository is created, we need the following files:
- setup.py - Installation file for the overlay
- Lepton.bit - The design for the programmable logic
- Lepton.hwh - The hardware description of the programmable logic design
- __init__.py - initialization file
- Lepton.ipynb - Notebook to drive the lepton
This overlay can be installed by using the terminal and issuing the command:
sudo pip3 install --upgrade git+https://github.com/ATaylorCEngFIET/FLIR_LEPTON_PYNQ
Once the overlay is installed we can run through the commands in the notebook. These commands do the following:
- Configure the libraries and packages needed in the notebook
- Initialize the GPIO
- Initialize the I2C and configure the Lepton AGC
- Read the memory for the pixel values
- Arrange the pixel values in a numpy array
- Display the numpy array
Once these stages have executed, you will see a heat-sensitive output image appear in the notebook.
# Libraries and packages used in the notebook
from pynq.overlays.lepton3 import lepton3Overlay
from pynq import MMIO
import matplotlib.pyplot as plt
import scipy.ndimage
import matplotlib.image as mpimg
import numpy as np
import cv2
from PIL import Image
from smbus2 import SMBus, i2c_msg
from pynq.ps import Clocks

# Load the overlay and list the I2C buses and devices (notebook shell commands)
overlay = lepton3Overlay('lepton3.bit')
!i2cdetect -l
!i2cdetect -r -y 0

# I2C address of the Lepton
Sensor_addr = 0x2a

# Read two bytes from register 0x0002
write = i2c_msg.write(Sensor_addr, [0x00, 0x02])
read = i2c_msg.read(Sensor_addr, 2)
with SMBus(0) as bus:
    bus.i2c_rdwr(write, read)

# Configure the Lepton AGC: write 0x0001 to register 0x0008
write = i2c_msg.write(Sensor_addr, [0x00, 0x08,0x00,0x01])
with SMBus(0) as bus:
    bus.i2c_rdwr(write)

# Write 0x0000 to register 0x000a
write = i2c_msg.write(Sensor_addr, [0x00, 0x0a,0x00,0x00])
with SMBus(0) as bus:
    bus.i2c_rdwr(write)

# Write 0x0002 to register 0x0006
write = i2c_msg.write(Sensor_addr, [0x00, 0x06,0x00,0x02])
with SMBus(0) as bus:
    bus.i2c_rdwr(write)

# Write 0x0101 to register 0x0004
write = i2c_msg.write(Sensor_addr, [0x00, 0x04,0x01,0x01])
with SMBus(0) as bus:
    bus.i2c_rdwr(write)

# Write 0x0100 to register 0x0004 and read back four bytes
write = i2c_msg.write(Sensor_addr, [0x00, 0x04,0x01,0x00])
read = i2c_msg.read(Sensor_addr, 4)
with SMBus(0) as bus:
    bus.i2c_rdwr(write, read)
data = list(read)
data

# Write 0x0242 to register 0x0004
write = i2c_msg.write(Sensor_addr, [0x00, 0x04,0x02,0x42])
with SMBus(0) as bus:
    bus.i2c_rdwr(write)

# Read two bytes from register 0x0002 again
write = i2c_msg.write(Sensor_addr, [0x00, 0x02])
read = i2c_msg.read(Sensor_addr, 2)
with SMBus(0) as bus:
    bus.i2c_rdwr(write, read)
data = list(read)
data

# Assert the reset to the Lepton IP block through the AXI GPIO
gpio = overlay.axi_gpio_0
gpio.write(0x0,0x1)

# Map the block RAM behind the BRAM controller into Python
bram = overlay.ip_dict['axi_bram_ctrl_0']
IP_BASE_ADDRESS = bram['phys_addr']
IP_ADDRESS_RNGE = bram['addr_range']
bram = MMIO(IP_BASE_ADDRESS, IP_ADDRESS_RNGE)
print(hex(IP_BASE_ADDRESS))
print(IP_ADDRESS_RNGE)

# Release the reset, then read all 4800 pixels, one 32-bit word each
addr = 0
data = []
gpio.write(0x0,0x0)
for x in range(4800):
    data.append(bram.read(addr))
    addr = addr + 4

# Arrange the pixels as a 60 x 80 array and display the image
pixels = np.array(data)
pixels = np.reshape(pixels,(60, 80))
pixels
plt.imshow(pixels,cmap='gray')
plt.show()
The first image shows me working at my desk.
The second image has a cold can of soft drink in the image to show the scaling of the AGC.
Now that we can access the Lepton data within the PYNQ environment, we are able to start doing additional image processing on it, including scaling or saving images. We can also start recording video to the SD card using the capabilities of OpenCV.
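As a starting point for that kind of processing, the sketch below turns the 4800 words read from the block RAM into rows, stretches them to 8 bits and enlarges the image by pixel replication. It is written in plain Python so it runs anywhere; in the notebook the same steps would more naturally be done with numpy or OpenCV. The function names words_to_rows, normalise_8bit and upscale are hypothetical.

```python
def words_to_rows(words, width=80, height=60):
    """Mask each 32-bit word to its 14-bit pixel and split the list into rows."""
    if len(words) != width * height:
        raise ValueError("unexpected number of pixels")
    pixels = [w & 0x3FFF for w in words]
    return [pixels[r * width:(r + 1) * width] for r in range(height)]


def normalise_8bit(rows):
    """Stretch the pixel values linearly so the darkest is 0 and the brightest 255."""
    low = min(min(row) for row in rows)
    high = max(max(row) for row in rows)
    span = (high - low) or 1  # avoid dividing by zero on a flat image
    return [[(p - low) * 255 // span for p in row] for row in rows]


def upscale(rows, factor):
    """Enlarge the image by repeating every pixel factor times in both directions."""
    scaled = []
    for row in rows:
        wide = [p for p in row for _ in range(factor)]
        scaled.extend(list(wide) for _ in range(factor))
    return scaled
```

Pixel replication keeps the sketch simple; an interpolating resize would give a smoother result on such a small sensor.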
This overlay really opens up the Lepton to users of PYNQ.
The Overlay is here https://github.com/ATaylorCEngFIET/FLIR_LEPTON_PYNQ
See previous projects here.
Additional Information on Xilinx FPGA / SoC Development can be found weekly on MicroZed Chronicles.