Jallson Suryo
Published © GPL3+

Livestock / Wildlife Counting from Drone with FOMO algorithm

This project demonstrates how embedded machine learning can be used to count objects quickly and efficiently.

Intermediate · Full instructions provided · 3 hours · 3,400

Things used in this project

Hardware components

Raspberry Pi 3 Model B+
or a Pi 4 Model B
×1
Raspberry Pi Camera Module
×1
Webcam, Logitech® HD Pro
As an alternative to the Pi Camera, I use a Logitech C270
×1
Drone
A drone is not actually needed for this simulation stage
×1

Software apps and online services

Edge Impulse Studio
Raspberry Pi Raspbian
You can use the Buster (Debian 10) or Bullseye (Debian 11) version
Your Favorite IDE
I use Sublime Text
Terminal App

Hand tools and fabrication machines

Rubber ducks and turtles

Story


Code

count_moving_ducks.py

Python
A Python program that runs your object detection model on a camera stream and accumulates the number of newly detected objects frame by frame to produce a running count.
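The core idea: detections in the current frame are matched against detections carried over from the previous frame, and only the detections that cannot be matched to anything already seen are added to the running total. The sketch below is a deliberately simplified illustration of that idea using a greedy nearest-neighbour matcher and made-up data (MATCH_RADIUS and the sample frames are hypothetical, chosen for the example); the full script that follows does the matching more robustly, with a minimum-cost bipartite matching and a confidence decay for briefly missed detections.

# Simplified sketch (not the full script): count objects across frames by matching each
# new detection to the nearest unmatched detection of the same label from the previous
# frame; anything left unmatched is treated as a newly seen object.
from math import hypot

MATCH_RADIUS = 17  # hypothetical: max centroid distance (pixels) to call two detections the same object

def update_counts(prev_objects, curr_objects, counts):
    # Greedy nearest-neighbour matching between consecutive frames.
    unmatched_prev = list(prev_objects)
    for obj in curr_objects:
        best, best_d = None, MATCH_RADIUS
        for p in unmatched_prev:
            if p['label'] != obj['label']:
                continue
            d = hypot(p['x'] - obj['x'], p['y'] - obj['y'])
            if d < best_d:
                best, best_d = p, d
        if best is not None:
            unmatched_prev.remove(best)   # matched: same object seen again, not counted
        else:
            counts[obj['label']] += 1     # unmatched: a new object has entered the frame
    return curr_objects                   # becomes prev_objects for the next frame

# Hypothetical centroids from three consecutive frames of a moving camera
frames = [
    [{'label': 'duck', 'x': 10, 'y': 40}, {'label': 'turtle', 'x': 60, 'y': 70}],
    [{'label': 'duck', 'x': 14, 'y': 38}, {'label': 'turtle', 'x': 63, 'y': 66},
     {'label': 'duck', 'x': 80, 'y': 20}],   # a second duck appears
    [{'label': 'duck', 'x': 18, 'y': 36}, {'label': 'duck', 'x': 77, 'y': 23}],
]

counts = {'duck': 0, 'turtle': 0}
prev = []
for frame in frames:
    prev = update_counts(prev, frame, counts)
print(counts)  # {'duck': 2, 'turtle': 1}

The full listing below is launched the same way as Edge Impulse's classify.py example, e.g. python3 count_moving_ducks.py <path_to_model.eim>, with a camera port ID as a second argument when more than one camera is present.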
#!/usr/bin/env python
'''
	Author: Jallson Suryo & Nicholas Patrick
	Date: 2022-07-25
	License: CC0
	Source: Edge Impulse Python SDK example file (classify.py) -- modified
	Description: Program to count livestock or wildlife from a drone (moving camera) using an
		Edge Impulse FOMO-trained model.
'''

import device_patches     # Device-specific patches for Jetson Nano (must be imported before cv2)

from math import inf, sqrt
from queue import Queue
import cv2
import os
import sys, getopt
import signal
import time
from edge_impulse_linux.image import ImageImpulseRunner

runner = None
# if you don't want to see a camera preview, set this to False
show_camera = True
if (sys.platform == 'linux' and not os.environ.get('DISPLAY')):
	show_camera = False

def now():
	return round(time.time() * 1000)

def get_webcams():
	port_ids = []
	for port in range(5):
		print("Looking for a camera in port %s:" %port)
		camera = cv2.VideoCapture(port)
		if camera.isOpened():
			ret = camera.read()[0]
			if ret:
				backendName = camera.getBackendName()
				w = camera.get(3)
				h = camera.get(4)
				print("Camera %s (%s x %s) found in port %s " % (backendName, h, w, port))
				port_ids.append(port)
			camera.release()
	return port_ids

def sigint_handler(sig, frame):
	print('Interrupted')
	if (runner):
		runner.stop()
	sys.exit(0)

signal.signal(signal.SIGINT, sigint_handler)

def help():
	print('python count_moving_ducks.py <path_to_model.eim> <Camera port ID, only required when more than 1 camera is present>')

def main(argv):
	try:
		opts, args = getopt.getopt(argv, "h", ["help"])
	except getopt.GetoptError:
		help()
		sys.exit(2)

	for opt, arg in opts:
		if opt in ('-h', '--help'):
			help()
			sys.exit()

	if len(args) == 0:
		help()
		sys.exit(2)

	model = args[0]

	dir_path = os.path.dirname(os.path.realpath(__file__))
	modelfile = os.path.join(dir_path, model)

	print('MODEL: ' + modelfile)

	with ImageImpulseRunner(modelfile) as runner:
		try:
			model_info = runner.init()
			print('Loaded runner for "' + model_info['project']['owner'] + ' / ' + model_info['project']['name'] + '"')
			labels = model_info['model_parameters']['labels']
			if len(args)>= 2:
				videoCaptureDeviceId = int(args[1])
			else:
				port_ids = get_webcams()
				if len(port_ids) == 0:
					raise Exception('Cannot find any webcams')
				if len(args)<= 1 and len(port_ids)> 1:
					raise Exception("Multiple cameras found. Add the camera port ID as a second argument to use to this script")
				videoCaptureDeviceId = int(port_ids[0])

			camera = cv2.VideoCapture(videoCaptureDeviceId)
			ret = camera.read()[0]
			if ret:
				backendName = camera.getBackendName()
				w = camera.get(3)
				h = camera.get(4)
				print("Camera %s (%s x %s) in port %s selected." %(backendName,h,w, videoCaptureDeviceId))
				camera.release()
			else:
				raise Exception("Couldn't initialize selected camera.")

			HEIGHT = 96
			WIDTH = 96

			next_frame_start_time = 0
			prev_frame_objects = []
			cumulative_counts = {'duck' : 0, 'turtle' : 0}

			# iterate through frames
			for res, img in runner.classifier(videoCaptureDeviceId):
				# print('classification runner response', res)

				if "classification" in res["result"].keys():
					print('Result (%d ms.) ' % (res['timing']['dsp'] + res['timing']['classification']), end='')
					for label in labels:
						score = res['result']['classification'][label]
						print('%s: %.2f\t' % (label, score), end='')
					print('', flush=True)

				elif "bounding_boxes" in res["result"].keys():
					curr_frame_objects = res["result"]["bounding_boxes"]
					m, n = len(prev_frame_objects), len(curr_frame_objects)
					print('Found %d bounding boxes (%d ms.)' % (n, res['timing']['dsp'] + res['timing']['classification']))
					# iterate through identified objects
					for bb in curr_frame_objects:
						print('\t%s (%.2f): x=%d y=%d w=%d h=%d' % (bb['label'], bb['value'], bb['x'], bb['y'], bb['width'], bb['height']))
						img = cv2.rectangle(img, (bb['x'], bb['y']), (bb['x'] + bb['width'], bb['y'] + bb['height']), (255, 0, 0), 1)

					# Pairs objects seen in both the previous frame and the current frame.
					# To get a good pairing, each potential pair is given a cost. The problem
					# then transforms into minimum cost maximum cardinality bipartite matching.
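					# A matched pair means the same animal was seen in both frames, so it is not
					# counted again; current detections left unmatched are treated as newly seen
					# and increment cumulative_counts further below.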

					# populate table
					def get_c(a0, a1):
						# computes cost of pairs. A cost of inf implies no edge.
						A, B = sqrt(HEIGHT ** 2 + WIDTH ** 2) / 8, 5
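						# A sets the distance scale (a fraction of the frame diagonal) beyond which a
						# pair gets non-positive similarity and is rejected with infinite cost;
						# B controls how sharply the similarity falls off with distance.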
						if a0['label'] != a1['label']: return inf
						d2 = (a0['x'] - a1['x']) ** 2 + (a0['y'] - a1['y']) ** 2
						dn4 = d2 ** -2 if d2 else 10**20
						val = a0['value'] * a1['value'] * (((1 + B) * dn4) / (dn4 + A ** -4) - B)
						return inf if val <= 0 else 1 - val
					match_c = [[get_c(i, j) for j in curr_frame_objects] for i in prev_frame_objects]

					# solves the matching problem in O(V^2E) by repeatedly finding augmenting paths
					# using shortest path faster algorithm (SPFA).
					# A modified Hungarian algorithm could also have been used.
					# 0..m-1: prev, left
					# m..m+n-1: this, right
					# m+n: source
					# m+n+1: sink
					source, sink, V = m + n, m + n + 1, m + n + 2
					matched = [-1] * (m + n + 2)
					adjLis = [[] for i in range(m)] + [[(sink, 0)] for _ in range(n)] + [[(i, 0) for i in range(m)], []]
					#        left                     right                              source                     sink
					for i in range(m):
						for j in range(n):
							if match_c[i][j] != inf:
								adjLis[i].append((j + m, match_c[i][j]))

					# finds augmenting paths until no more are found.
					while True:
						# SPFA
						distance = [inf] * V
						distance[source] = 0
						parent = [-1] * V
						Q, inQ = Queue(), [False] * V
						Q.put(source); inQ[source] = True
						while not Q.empty():
							u = Q.get(); inQ[u] = False
							for v, w in adjLis[u]:
								if u < m and matched[u] == v: continue
								if u == source and matched[v] != -1: continue
								if distance[u] + w < distance[v]:
									distance[v] = distance[u] + w
									parent[v] = u
									if not inQ[v]: Q.put(v); inQ[v] = True
						aug = parent[sink]
						if aug == -1: break
						# augment the shortest path
						while aug != source:
							v = aug
							aug = parent[aug]
							u = aug
							aug = parent[aug]
							adjLis[v] = [(u, -match_c[u][v - m])]
							matched[u], matched[v] = v, u

					# updating cumulative_counts by the unmatched new objects
					for i in range(n):
						if matched[m + i] == -1:
							cumulative_counts[curr_frame_objects[i]['label']] += 1

					# preparing prev_frame_objects for the next frame
					next_prev_frame_objects = curr_frame_objects
					# considering objects that became invisible (false negative) for a few frames.
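					# Each missed frame multiplies an object's confidence by 0.7, and objects below
					# 0.35 are dropped, so even a fully confident detection is carried for at most
					# two consecutive missed frames before being forgotten.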
					for i in range(m):
						if matched[i] != -1: continue
						prev_frame_objects[i]['value'] *= 0.7
						if prev_frame_objects[i]['value'] >= 0.35: 
							next_prev_frame_objects.append(prev_frame_objects[i])
					prev_frame_objects = next_prev_frame_objects

					print("current cumulative_counts:\n %d ducks, %d turtles" % (cumulative_counts['duck'], cumulative_counts['turtle']))

				if (show_camera):
					cv2.imshow('edgeimpulse', cv2.cvtColor(img, cv2.COLOR_RGB2BGR))
					if cv2.waitKey(1) == ord('q'):
						break

				if (next_frame_start_time > now()):
					time.sleep((next_frame_start_time - now()) / 1000)
				# operates at a maximum of 5fps
				next_frame_start_time = now() + 200
		finally:
			if (runner):
				runner.stop()

if __name__ == "__main__":
	main(sys.argv[1:])

Credits

Jallson Suryo

Tech integrator for schools. Also works as a maker; his activities include disassembling, fixing, and making things.
Thanks to Nicholas Patrick.
