Quality control processes, especially those involving visual inspection and repetitive counting, are time consuming and error prone when performed by humans. Simple presence sensors also do not provide a solution when the object to be inspected is made up of multiple components, each of which needs to be counted or measured. Food products, finished goods, and electronics manufacturing are examples of this type of scenario.
This project is a computer vision system for quality/quantity inspection of products manufactured on a conveyor belt. The setting is a hypothetical mass-production pizza factory, where a Jetson Nano with a camera detects and counts the toppings (pepperoni, mushroom, and paprika) on each pizza that passes by on the conveyor belt, to ensure the quantity of toppings meets a predefined quality standard. Speed, reliability, and cost efficiency are the goals of this project.
This project uses the MobileNet V2 neural network architecture, which can quickly detect objects and use them as a quality/quantity check for products on a running conveyor belt. A custom-trained MobileNet V2 object detection model that reports the number and coordinates of multiple specific objects is the basis of this system. The project explores the capability of the NVIDIA Jetson Nano's GPU to handle color (RGB) video at a higher resolution (320x320) than many other TinyML projects, while still maintaining a high inference speed. The machine learning model will be deployed with the TensorRT library, compiled with optimizations for the GPU, and set up via the Linux C++ SDK. Once the model can identify the different pizza toppings, an additional Python program will check each pizza for a standard quantity of pepperoni, mushrooms, and paprika. This project is a proof of concept that can be widely applied in product manufacturing and food production to perform quality checks based on a quantity requirement for parts in a product.
Bill of Materials

Note: This project is divided into two sections. In Section A (steps 1 to 5) we dive into the NVIDIA software development kit (SDK) and build the object detection model using Edge Impulse. In Section B (steps 6 to 8) we go through the hardware assembly, electronics wiring, controlling the actuators, and integrating everything together.

---------- SECTION A ----------

1. Prepare Data / Images
In this project we can use a camera (webcam) connected to a PC/laptop to capture the images for data collection, for ease of use. Take pictures of your pizza components from above, at slightly different angles and under different lighting conditions, to ensure that the model can work in varied conditions (and to prevent overfitting). When using the FOMO (Faster Objects, More Objects) algorithm, object size is a crucial factor in the model's performance. Keep the camera distance from the objects consistent, because significant differences in object size will confuse the algorithm.
Go to http://studio.edgeimpulse.com, log in or create an account, then create a new project. Choose the Images project option, then Classify Multiple Objects. In Dashboard > Project Info, choose Bounding Boxes as the labeling method and NVIDIA Jetson Nano as the target device. Then in Data acquisition, click on the Upload Data tab, choose the photo files you captured with your webcam or phone, choose Auto split, then click Begin upload.
Next, click on the Labeling queue tab, then drag a box around each object, label it, and save it. Repeat until all images are labeled.
Once you have the dataset ready, go to Create Impulse and, in the Image block, set 320 x 320 as the image width and height. Choose Fit shortest axis, then add Image as the processing block and Object Detection as the learning block.
In the Image parameters section, set the color depth to RGB, then press Save parameters. Click Generate features, then navigate to the Object Detection block setup using the left navigation. Leave the Neural Network training settings at their defaults (or compare with ours; in our case everything is quite balanced, so we leave them alone) and choose FOMO (MobileNet V2 0.35). Train the model by pressing the Start training button. You can follow the training progress in the log on the right.
If everything is OK, the training job will finish in a short while, and then we can test the model. Go to the Model Testing section and click Classify all. Our result is above 90%, so we can move on to the next step: Deployment.
Click on the Deployment tab, then search for NVIDIA TensorRT, select Float32, and click Build. This will build an NVIDIA TensorRT library for running inference, targeting the Jetson Nano's GPU. Once the build finishes and the file is downloaded, open the .zip file; then we're ready for model deployment with the Linux C++ SDK on the Jetson Nano side.
On the Jetson Nano, there are several things that need to be done. Flash the NVIDIA JetPack image, which can be downloaded from the NVIDIA Jetson website, to an SD card. Insert the SD card and power on the board, go through the setup process to finish the OS configuration, and connect the board to your local network. Then ssh in from your PC/laptop and install the Edge Impulse tooling via the terminal:
wget -q -O - https://cdn.edgeimpulse.com/firmware/linux/jetson.sh | bash
Then install Clang as a C++ compiler:
sudo apt install -y clang
Clone this GitHub repository and initialize its submodules:
git clone https://github.com/edgeimpulse/example-standalone-inferencing-linux
cd example-standalone-inferencing-linux && git submodule update --init --recursive
Then install OpenCV with dependencies:
sh build-opencv-linux.sh
Now make sure the contents of the TensorRT folder you downloaded from the Edge Impulse Studio have been unzipped and moved to the example-standalone-inferencing-linux directory. For constrained object detection, we need to edit the variables in the source/eim.cpp file with:
const char *model_type = "constrained_object_detection";
To build a specific model targeting the Jetson Nano GPU with TensorRT, using Clang:
APP_EIM=1 TARGET_JETSON_NANO=1 CC=clang CXX=clang++ make -j
The resulting model will be at ./build/model.eim.
If your Jetson Nano runs from a dedicated power supply, its performance can be maximized with this command:
sudo /usr/bin/jetson_clocks
Now the model is ready to run in a high-level language such as the Python program we'll use in the next step. To ensure the model works, we can run the Edge Impulse Linux Runner with the camera set up on the Jetson Nano and turn on the conveyor belt. You can see what the camera observes via your browser; the local IP address and port are shown when the Linux Runner is started. Run this command:
edge-impulse-linux-runner --model-file <your path of build model>/model.eim
The inferencing time is about 5ms, which is an incredibly fast detection speed.
For comparison, I also ran the Linux Runner with the CPU version of the model, downloaded via edge-impulse-linux-runner --download modelfile.eim, using the same command as above.
You can see the difference in inferencing time: almost six times faster when we compile for and run on the GPU. Impressive!
5. Build the Application

With the impressive live inferencing performance shown by the Linux Runner, we will now create a Python program that counts the number of toppings on each pizza, compares the count to the desired amount, and outputs OK or BAD depending on whether the number of toppings is correct.
Because we'll use Python, we need to install the Edge Impulse Python SDK and clone the repository with the provided examples. Follow the steps at https://docs.edgeimpulse.com/docs/edge-impulse-for-linux/linux-python-sdk to install the Python SDK. Once the SDK is installed, be sure to also git clone https://github.com/edgeimpulse/linux-sdk-python so that you have the examples locally.
The program we made (topping.py) is Python code that detects each type of topping, counts the toppings on each passing pizza, and checks whether the number of toppings is correct.
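As a rough illustration (not the actual project code) of how per-frame topping counts can be read with the Edge Impulse Linux Python SDK, a sketch could look like the following; the label names come from this project, while the camera index and the printing of counts are assumptions:

#!/usr/bin/env python3
# Illustrative sketch only: count topping detections per camera frame with the
# Edge Impulse Linux Python SDK. Camera index and output handling are assumptions.
import sys
from edge_impulse_linux.image import ImageImpulseRunner

MODEL_PATH = sys.argv[1]                  # e.g. the GPU build at ./build/model.eim
LABELS = ("pepperoni", "mushroom", "paprika")
CAMERA_ID = 0                             # assumed camera index

with ImageImpulseRunner(MODEL_PATH) as runner:
    runner.init()
    # classifier() grabs frames from the camera and yields (result, image) pairs
    for res, img in runner.classifier(CAMERA_ID):
        counts = {label: 0 for label in LABELS}
        for box in res["result"].get("bounding_boxes", []):
            if box["label"] in counts:
                counts[box["label"]] += 1
        print(counts)                     # per-frame counts, fed into the check described below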
As a pizza moves past the camera, our program takes the object detection output from the model file (model.eim) as a stream of per-frame counts, for example:

0 0 2 3 3 1 0 1 3 3 3 2 0 0 0 2 3 3 2 0 0 2 5 5 1 0 0 2 3 3 1 0 0 1 2 2 0 0

It treats 0 as the sequence separator and records the peak value in each non-zero sequence. As an example, if the correct number of toppings on a pizza (per quality control standards) is 3, a 0 is a separator, and anything other than 3 is bad, then the recorded peaks 0 3 0 3 0 3 0 5 0 3 0 2 0 give:

OK OK OK BAD OK BAD
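A minimal sketch of this peak-per-run check (again an illustration rather than the actual topping.py; the expected count of 3 is taken from the example above):

# Illustrative sketch of the separator/peak logic described above.
# Zeros separate pizzas; the peak count inside each non-zero run is compared
# against the expected number of toppings.
EXPECTED = 3    # quality standard used in the example above

def judge_pizzas(counts, expected=EXPECTED):
    results = []
    peak = 0
    for c in counts:
        if c == 0:
            if peak > 0:    # a pizza has finished passing the camera
                results.append("OK" if peak == expected else "BAD")
            peak = 0
        else:
            peak = max(peak, c)    # keep the highest count seen for this pizza
    if peak > 0:    # handle a final pizza with no trailing zero
        results.append("OK" if peak == expected else "BAD")
    return results

frames = [0, 0, 2, 3, 3, 1, 0, 1, 3, 3, 3, 2, 0, 0, 0, 2, 3, 3, 2, 0,
          0, 2, 5, 5, 1, 0, 0, 2, 3, 3, 1, 0, 0, 1, 2, 2, 0, 0]
print(judge_pizzas(frames))    # -> ['OK', 'OK', 'OK', 'BAD', 'OK', 'BAD']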
Our Python program (topping.py) can be downloaded in the "Code" section of this Hackster project page.
To run the program, use the following command along with the path where the model.eim file is located. Be sure to use the one built for the GPU, in case you have both models on the Jetson Nano:
python3 topping.py ~/build/model.eim
To see the process in action, check our demo video:
After the conveyor belt system is assembled (Section B), you can modify topping.py to control the servo and LEDs; check topping2.py in the "Code" section.
---------- SECTION B ----------

- Press the 688zz bearing into the hole of each 3D-printed leg using a mallet.
- Insert the shorter ø8mm rod into one cylinder (use a mallet), leaving around 5mm exposed on both sides.
- Insert the longer ø8mm rod into the other cylinder, leaving 5mm on one side and letting the rest of the rod protrude on the other side.
- Connect the protruding rod from that cylinder to the bearing on each leg.
- Mount the legs to the aluminium extrusion.
- Screw the motor to the motor mount and connect it to the aluminium extrusion.
- Screw the GT2 timing pulleys to the longer protruded rod and to the
- Connect the motor to a 6-12V power source to use the conveyor belt.
Pro Micro (A2) <---> Jetson Nano (GPIO 7)
Pro Micro (A1) <---> Jetson Nano (GPIO 11)
Pro Micro (Vin/RAW) <---> Jetson Nano (5V)
Pro Micro (GND) <---> Jetson Nano (GND)
Pro Micro (5) <---> Red LED
Pro Micro (6) <---> Green LED
Pro Micro (GND) <---> LED cathode
Pro Micro (3) <---> Servo signal
Pro Micro (Vin/RAW) <---> Servo voltage
Pro Micro (GND) <---> Servo GND
8. Program the microcontroller to control the actuators

In this project we will use SparkFun's Qwiic Pro Micro (though other microcontrollers should work as well). In our Jetson Nano Python code we use two pins (GPIO 7 and 11), which are set HIGH by default; when a bad pizza is detected, GPIO 11 is set LOW, and when an OK/good pizza is detected, GPIO 7 is set LOW.
The job of the microcontroller is to read the GPIO 7 and 11 from the Jetson Nano and control the actuators depending on the state of those GPIO pins.
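As a simplified sketch of the Jetson Nano side of this signaling, using the Jetson.GPIO Python library; the BOARD pin numbering and the pulse length are assumptions here, and the actual implementation is in topping2.py:

# Illustrative sketch of the Jetson Nano -> Pro Micro signaling described above.
# Pins 7 and 11 (BOARD numbering assumed) idle HIGH; a LOW pulse on pin 7 signals
# an OK pizza and a LOW pulse on pin 11 signals a bad one.
import time
import Jetson.GPIO as GPIO

OK_PIN, BAD_PIN = 7, 11

GPIO.setmode(GPIO.BOARD)
GPIO.setup(OK_PIN, GPIO.OUT, initial=GPIO.HIGH)
GPIO.setup(BAD_PIN, GPIO.OUT, initial=GPIO.HIGH)

def signal_result(ok, pulse_s=0.5):
    pin = OK_PIN if ok else BAD_PIN
    GPIO.output(pin, GPIO.LOW)     # Pro Micro reacts to the LOW level on A2/A1
    time.sleep(pulse_s)            # pulse length is an assumption
    GPIO.output(pin, GPIO.HIGH)    # back to the idle HIGH state

signal_result(True)     # example: one good pizza...
signal_result(False)    # ...followed by one bad pizza
GPIO.cleanup()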
Full code for both the Sparkfun Pro Micro and Jetson Nano python code can be downloaded in the "Code" section of this Hackster project page.
Conclusion

We have successfully implemented an object detection computer vision model targeting the NVIDIA Jetson Nano's GPU in a food/product manufacturing setting. MobileNet V2 object detection with RGB color at 320x320 resolution is handled accurately by the Jetson Nano's GPU, with an inference time of only about 5ms. This means it could be applied with higher resolutions for more complex objects, faster conveyor belts, and higher-speed cameras (>100 fps). Embedding TensorRT models in high-level languages such as Python makes them easy to apply to specific use cases and provides the ability to control lights and servos for automation and manufacturing systems as well.