Before proceeding, you can have a look at the demo video to understand what the project is about:
Problem Statement: The bicycle is a cheap and ecological means of transport, and it is also a healthy option. This is why the number of cyclists in cities has increased in recent years, but so has the accident rate. A study confirms that these incidents are caused by a combination of inadequate infrastructure and risky behaviour on the part of drivers and cyclists.
Bicycles may be the most eco-friendly mode of transportation and can help ease the choking of our cities. They may also be a viable way to free up our roads from the burgeoning number of vehicles, a cost-effective alternative to bloating fuel costs, and a good way to replace a sedentary lifestyle with a healthier one. But despite all its positives, the humble bicycle, invented 200 years ago, has today become one of the most dangerous vehicles, putting its riders at great risk. On India's roads, 16 fatalities take place every hour, and cyclists are among the most vulnerable road users, alongside two-wheeler riders and pedestrians. The Ministry of Road Transport and Highways' 2015 data report states that vulnerable road users account for 46.3 per cent of total fatalities. Yet another report, the Analysis of Global Road Safety 2015 by the SaveLIFE Foundation, found that road traffic deaths among pedestrians, cyclists and motorcyclists make up almost half of all road deaths across the world. According to the Transport Research Wing (TRW) of the Ministry of Road Transport and Highways, 25,435 cyclists were killed in the five years between 2011 and 2015.
Over the last thirty years, the number of traffic accidents has decreased considerably in both the European Union and Spain. However, accidents involving cyclists have not followed the same trend and have undergone a systematic increase.
Proof of this is that from 2007 to 2016 in Spain, 47,574 cyclists were involved in such accidents, both minor and serious, and 656 were killed. More specifically, in 2016, of the 1,810 people who died in traffic accidents in Spain, 67 were cyclists. In addition, 7,371 were injured to a greater or lesser degree.
"Given the characteristics of the vehicle and the little use made of passive safety measures, cyclists are, along with pedestrians, the most vulnerable road users to serious injuries in the event of an accident, " Sinc has been told by Sergio Alejandro Useche, a researcher at INTRAS, the Traffic Research and Road Safety Institute of the University of Valencia. 70.7% of accidents and 67.4% of injuries or deaths of victims happen in urban centres, compared to rural roads, where 29.3% of accidents and 32.6% of the victims have been recorded. 47.2% of serious injuries to cyclists occur on conventional urban roads. Today, US roadways are dominated by automobiles; inefficient, aggressive, modes of human transport. Unfortunately, bikers are considered second-class citizens as they attempt to share roadways with motorists. In fact, this has been the situation for most of the lifetime of the bicycle. In 1896, the first automobile accident ever to occur in the U.S. took place in New York City between an automobile and a bicycle, and proved fatal for the cyclist.
- Vehicle Speed:
Speed was found to be a major factor in around 10% of all accidents and 30% of the fatal ones. The speed of the vehicles involved in a crash is the single most important factor in determining the severity of injuries. There are two distinct factors to consider: not only are higher speeds responsible for an increased rate of accidents, injuries and deaths, but so are large speed differences. Roads with high speed variance are more unpredictable, since they increase the number of encounters and overtaking manoeuvres. Consequently, reducing speed limits may sometimes only lower the vehicles' average speed and not its variance. At the core of the danger posed by high vehicle speeds are the increases in braking distance and in the kinetic energy transferred from the vehicle to the cyclist. Since both grow with the square of the velocity, the chance of avoiding or surviving a crash falls off quadratically. From a biological perspective, the human body can only withstand the transfer of a limited amount of kinetic energy in a crash, and this amount varies across body parts, age groups and gender. Even in the best-designed car, this limit can be exceeded once the vehicle travels faster than 30 km/h. Studies also show that if a car travels slower than 30 km/h, the probability of a pedestrian surviving a crash is higher than 90%. When hit by a car at 45 km/h, the chance of surviving drops to 50%; put differently, as the speed of a car rises from 30 km/h to 50 km/h, the probability of surviving a crash decreases by a factor of 8. Speed was considered the second most relevant factor. In the case of an on-road cycle lane next to a low-speed-limit road, the risk factors related to parallel traffic were considered negligible, regardless of the width of the lane.
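To make the quadratic relationship concrete, here is a back-of-the-envelope calculation of my own, using the standard kinetic-energy formula E = 1/2 m v^2 (the numbers below are illustrative, not taken from the studies cited above):

def kinetic_energy_joules(mass_kg, speed_kmh):
    v = speed_kmh / 3.6                      # convert km/h to m/s
    return 0.5 * mass_kg * v ** 2

# a ~1500 kg car at 30 km/h vs 50 km/h (the mass cancels out in the ratio)
ratio = kinetic_energy_joules(1500, 50) / kinetic_energy_joules(1500, 30)
print(round(ratio, 2))   # ~2.78, so going from 30 to 50 km/h nearly triples the energy to be dissipated

Braking distance grows with the same square-of-speed factor, which is why small speed increases translate into disproportionately worse outcomes for cyclists.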
- Lorries and Other Large Vehicles:
In recent years, economic development and consumer demand have been increasing, and so has the number of trucks in cities. As cycling has followed the same trend, the number of encounters between the two has significantly increased. As an example, in New York City, 15% of bicycle networks overlap with 11% of truck networks. The increased number of encounters has contributed to higher accident and mortality rates involving trucks. Truck-bicycle accidents usually have more severe consequences than any other type of accident. In some EU countries, 30% of all cycling fatalities are associated with trucks. Studies over the past two decades have identified trucks as the most common vehicle category involved in cyclist deaths in London.
- Bend Visibility:
Several sources identify bends as a risk factor. Bends and intersections are often considered jointly as posing similar risks to the cyclist. From the cyclist's perspective, low visibility turns otherwise ordinary situations into risky ones, such as the sudden presence of pedestrians or intrusive vegetation. From the driver's perspective, it can make cyclists go unnoticed and, consequently, leave them vulnerable. There is no clear statistical data showing how bends affect cyclists' accident, injury or fatality rates.
- Pedestrians:
Among all age groups, pedestrian fatalities most often occur in children younger than 14 years old, compared with adults aged between 15 and 64 or 65 and over. In terms of gender, men are at greater risk than women. [34] For these reasons, locations with a higher concentration of people matching these criteria (e.g. school areas) are at additional risk. Nevertheless, in car-free zones, accidents between pedestrians and cyclists are extremely rare and almost never serious. This was considered the least important of the risk factors.
The most unsafe situation was the absence of a cycle lane; second, high speed limits with a narrow on-road lane; third, high speed limits with a wider lane; fourth, low speed limits regardless of whether the on-road lane was narrow or wide. Finally, a physically separated lane was considered the safest scenario. The next two sections describe the tools used to capture objects and structures from imagery.
Sustainable Development Goals:
This project aims to contribute to the community under the goal "By 2030, significantly reduce the number of deaths and the number of people affected and substantially decrease the direct economic losses relative to global gross domestic product caused by disasters, including water-related disasters, with a focus on protecting the poor and people in vulnerable situations", by developing a resilient and reliable framework that helps prevent bicycle accidents during commutes, thereby significantly reducing one of the most frequent forms of on-road accidents.
Project Idea and Implementation: All such factors contribute to cyclist fatalities every year. To address this, we propose an aggregate system, HapticCV.
I propose to use on-cycle object detection and image segmentation methods to estimate traffic flow behind the rider, where commuters have no vision. Inference and classification results from the rear are then sent to the Neosensory Buzz for an "immediate" response about the motion of the vehicle behind, its distance from the cycle (using spatial perception), and the vehicle type (car, truck, motorcycle, pedestrian, etc.).
Hardware we'll use:
Our solution is a cumulative integration of spatial perception and haptic stimulus, just as the title suggests. For that reason, bicycle alerts have to be produced in real time, from object detection and depth perception through to haptic stimulus, with low latency. So I chose a device called the OAK-D, short for OpenCV AI Kit with Depth (thanks to Brandon Gilles for sending this over :)), and thanks to the Neosensory team for sending over the Neosensory Buzz, which delivers real-time haptics about the spatial position of objects behind the cycle.
1) The OpenCV Spatial AI Kit OAK-D
OAK-D is a device that performs monocular neural inference fused with stereo depth, as well as stereo neural inference.
DepthAI is the embedded spatial AI platform built around Myriad X - a complete ecosystem of custom hardware, firmware, software, and AI training. It combines neural inference, depth vision, and feature tracking into an easy-to-use, works-in-30-seconds solution.
DepthAI offloads AI, depth vision and more - processed direct from built-in cameras - freeing your host to process application-specific data.
DepthAI gives you power of AI, depth, and tracking in a single device with a simple, easy-to-use API, written in Python and C++.
Best of all, it is modular (System on Module) and built on MIT-licensed open-source hardware, making it possible to add these Spatial AI/CV superpowers to real commercial products.
2) Neosensory Buzz:
Buzz is intended for “deaf and hard-of-hearing individuals, musicians, app developers, and others who want to create unique sensory experiences.”
"When we become fluent in a language, learn to ride a bike, or refine our golf swing, we form associations with patterns of information from our physical world. Buzz is a wearable device that captures sound and translates it into vibrational patterns on the skin. With practice, these associations become automatic and a new sense is born. You'll experience the difference on the first day you wear it. And it only gets better from there."
Accidents are unexpected events; they happen in split seconds. The human brain reacts fastest to haptic, or tactile, stimulus. This is where the Neosensory Buzz plays an important role.
Let's say you prefer using a speaker for surrounding awareness on a bicycle rather than the Buzz. Think about it: how quickly could that alert you in time to prevent an accident? There are countless sounds in the environment when you're commuting on a cycle, and your brain needs time to process all that information before it can alert you. Scrapping that idea, let's say you prefer to get notifications on your mobile while cycling. How convenient is it to pay attention to two places at once?
Preferably, you'd opt for the Buzz. Haptic stimulus provides a quick, real-time understanding of the rear view, with the Buzz motor intensity indicating the depth of the vehicle while your focus stays on the lane. So, with all those clear benefits, I decided to build this project further.
What's the latency between an object being detected and the message being sent to the Neosensory Buzz?
Practically negligible, and that is exactly what we want: the car behind the cycle should be detected immediately and the signal sent straight to the Neosensory Buzz, so that cyclists know in real time what is behind them. Doing the maths, the latency comes to about 0.03 s for the car to be detected, ~50 ms for the command to reach the Neosensory Buzz via Bluetooth, and ~50 ms for the Buzz to produce a continuous vibration on the skin and for the skin to feel it. So it comes down to ~130 ms of latency, i.e. just over a tenth of a second!
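As a quick sanity check on that arithmetic (my own breakdown, using the figures quoted above):

detect_s = 1 / 30      # ~0.033 s for one frame of 30 FPS face detection
ble_s = 0.050          # ~50 ms for the command to reach the Buzz over Bluetooth LE
haptic_s = 0.050       # ~50 ms for the Buzz motors to fire and the skin to register it
print(round(detect_s + ble_s + haptic_s, 3))   # ~0.133 s end to end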
Data for the Latency - (Based on Observations and Records)
- 30 FPS face detection demo: 0.03 seconds to detect a face approaching from behind, and less than 50 ms for the Buzz to receive the ping and send an impulse to the skin.
- 15-17 FPS pedestrian and vehicle detection demo: 0.06 seconds to detect a pedestrian or vehicle, and 50-100 ms for the Buzz to receive the ping and send a haptic response.
- 20 FPS vehicle detection demo: 0.05 seconds to detect a vehicle, and less than 50 ms for the Buzz to receive the ping and send a haptic response.
Now on to the actual implementation -
1) Identifying pedestrian mobility behind the rider, and sending real-time alerts to the Buzz:
With pedestrians right behind you as well as ahead of you on a heavy-traffic road, it is usually not easy to stop while keeping track of the varying depths and velocities of the pedestrians behind you. Cyclists often glance backwards while riding and... might end up in a crash with the person in front. It is quite common to end up in such a situation; it happens to me frequently while cycling.
So, the first object detection and spatial perception implementation is about understanding the mobility of pedestrians behind the rider and alerting the Buzz in real time.
This begins with running pedestrian detection and face detection models on the OAK-D to accurately locate the people behind the cycle. Remember, we aim for as few false negatives as possible, so that a potential threat from behind is not missed.
The Geometry behind Calculation of Depth and Disparity from Stereo Images on OAK-D:
By tracking the displacement of points between the two images, the distance of those points from the camera can be determined. The different disparities between points in the two images of a stereo pair are the result of parallax. When a point in the scene is projected onto the image planes of two horizontally displaced cameras, the closer the point is to the camera baseline, the greater the difference in its relative location on the two image planes. Stereo matching aims to identify the corresponding points and retrieve their displacement to reconstruct the geometry of the scene as a depth map.
A similar approach works behind the software of the OAK-D device, developed by the Luxonis team. Here's a small example demonstrating how it works -
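As a supplementary illustration (my own sketch, not the Luxonis implementation): for a rectified stereo pair, depth follows the pinhole relation depth = focal_length_px x baseline / disparity_px. The 73.5 deg HFOV and 1280 px frame width come from the camera specs quoted later in this write-up, and the ~7.5 cm baseline is the OAK-D's published stereo baseline; treat the numbers as approximate.

import math

def focal_length_px(image_width_px, hfov_deg):
    # Approximate focal length in pixels from the image width and horizontal field of view
    return image_width_px / (2 * math.tan(math.radians(hfov_deg) / 2))

def depth_from_disparity_m(disparity_px, baseline_m=0.075, image_width_px=1280, hfov_deg=73.5):
    # Pinhole stereo relation: depth = f * B / d (rectified cameras, disparity in pixels)
    f = focal_length_px(image_width_px, hfov_deg)
    return f * baseline_m / disparity_px

print(round(depth_from_disparity_m(32), 2))   # a 32 px disparity corresponds to roughly 2 m of depth

The smaller the disparity, the farther the point: beyond a certain distance the disparity falls below one pixel, which is why stereo depth has a practical maximum range.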
Connecting and Setting up OAK-D with Neosensory Buzz -
The OAK-D needs a host device for OS support and for initialising the scripts, examples and commands that are computed on the board. This host could be an ESP32, a laptop, or even a Raspberry Pi. Personally, I have always adored the compact form of the Raspberry Pi, so I chose a Raspberry Pi 4B as the host device for the OAK-D. The Raspberry Pi has a built-in Bluetooth system, perfect for connecting to the Neosensory Buzz. Connecting the OAK-D to the Neosensory Buzz via the Neosensory Python SDK was pretty straightforward, and apart from a few pitfalls while setting up, the overall process was quite smooth!
For the OAK-D, I will be using the depthai-python Gen2 API, an open-source end-to-end API for enabling and integrating on-board processing and depth calculation (a minimal pipeline sketch follows the links below) -
- https://github.com/luxonis/depthai-experiments
- https://github.com/luxonis/depthai
- https://github.com/luxonis/depthai-python
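To give a feel for what the Gen2 API looks like, here is a minimal sketch of a spatial detection pipeline. This is my own illustration based on the depthai-python 2.x examples; exact node and method names can vary slightly between releases, and the blob path is a placeholder, not a file shipped with this project.

import depthai as dai

pipeline = dai.Pipeline()

# Colour camera that feeds the neural network
cam = pipeline.create(dai.node.ColorCamera)
cam.setPreviewSize(300, 300)
cam.setInterleaved(False)

# Left/right mono cameras that feed the stereo depth node
mono_left = pipeline.create(dai.node.MonoCamera)
mono_right = pipeline.create(dai.node.MonoCamera)
mono_left.setBoardSocket(dai.CameraBoardSocket.LEFT)
mono_right.setBoardSocket(dai.CameraBoardSocket.RIGHT)

stereo = pipeline.create(dai.node.StereoDepth)
stereo.setDepthAlign(dai.CameraBoardSocket.RGB)   # align depth to the RGB preview
mono_left.out.link(stereo.left)
mono_right.out.link(stereo.right)

# Spatial detection network: fuses SSD detections with depth to give X/Y/Z per object
nn = pipeline.create(dai.node.MobileNetSpatialDetectionNetwork)
nn.setBlobPath("models/face-detection-retail-0004.blob")   # placeholder path
nn.setConfidenceThreshold(0.5)
cam.preview.link(nn.input)
stereo.depth.link(nn.inputDepth)

xout = pipeline.create(dai.node.XLinkOut)
xout.setStreamName("detections")
nn.out.link(xout.input)

with dai.Device(pipeline) as device:
    q = device.getOutputQueue(name="detections", maxSize=4, blocking=False)
    while True:
        for det in q.get().detections:
            # spatialCoordinates are reported in millimetres, relative to the camera
            print(det.label, det.spatialCoordinates.x, det.spatialCoordinates.z)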
To set up the OAK-D from scratch, follow the quick guide here
We'll go ahead and connect the buzz with the OAK-D to see if it all goes well! -
Head over to https://github.com/neosensory/neosensory-sdk-for-python and follow the instructions to install and set up the Neosensory Python library on your Raspberry Pi.
Steps -
$ git clone https://github.com/neosensory/neosensory-sdk-for-python.git
Create a virtual environment on your Raspberry Pi and run the following steps from inside it. The virtual environment keeps the newly installed libraries from interfering with the system-wide packages.
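For example (standard venv commands; the name venv2 just matches the prompt shown in the output further below, any name works):

$ python3 -m venv venv2
$ source venv2/bin/activate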
$ python3 setup.py develop
$ pip3 install bleak==0.10.0
For troubleshooting, follow - https://github.com/neosensory/neosensory-sdk-for-python/issues/2
And bingo, you've set everything up! Now simply pair your Buzz with the Raspberry Pi via Bluetooth and run one of the example scripts to see if it works -
To put Buzz into pairing mode, hold down the (+) and (-) buttons until the LEDs flash blue.
- https://github.com/neosensory/neosensory-sdk-for-python/blob/master/examples/buzz_a_buzz.py
- https://github.com/neosensory/neosensory-sdk-for-python/blob/master/examples/buzz_a_buzz_illusion.py
If you see output like this in your terminal, you're doing well -
(venv2) pi@raspberrypi:~/neosensory-sdk-for-python/examples $ python3 buzz_a_buzz.py
Found a Buzz! D8:4A:B7:23:4A:40: BuzzD84AB7234A40
Address substring: D8:4A:B7:23:4A:40
Connection State: True
To get started with pairing and connecting your buzz with the OAK-D, run this demo script - https://github.com/dhruvsheth-ai/HapticCV/blob/main/test-a-buzz/buzz_with_oak.py
To run the script -
All the models have been uploaded to the models folder, to be used throughout the project.
$ git clone https://github.com/dhruvsheth-ai/HapticCV.git
$ cd test-a-buzz
$ python3 buzz_with_oak.py
After a couple of seconds, both devices are calibrated and perform inference in sync. This demo simply connects the OAK-D and the Buzz through a script, so that they are calibrated for the demos that follow. Press `q` or Ctrl+C to stop the inference.
Computer Vision models we'll be using - We'll be using three different models depending on the use case. The first, face-detection, is for densely populated pedestrian routes, to avoid accidents with pedestrians. The second, pedestrian-vehicle-detector, alerts on both pedestrians and vehicles and is the most robust of the three. The third, vehicle-detection, is built specifically to alert on cars, trucks, vans or lorries while cycling on highways, to tackle the root cause of accidents. Users can choose which model to run depending on their route, which makes the project practical and robust.
TL;DR? Here's a flow chart to sum it up!
- 1) face-detection-retail-0004 Model, which is a state-of-the-art face detection model that performs inference at high FPS, with low energy consumption suited for our use-case.
This is a pretrained OpenVINO model, an optimised SSD-style detector designed to run on Edge AI devices.
Convolutional Architecture of the Model we'll be using -
While many lightweight detectors use a MobileNet backbone with depth-wise convolutions to cut the cost of the 3x3 convolution blocks, this face detector is based on SqueezeNet light (half-channels) as a backbone with a single SSD head, for indoor/outdoor scenes shot by a front-facing camera. The backbone consists of fire modules to reduce the number of computations. The single SSD head from the 1/16-scale feature map has nine clustered prior boxes.
More info here
These images show how the face detection model accurately detects my face. The model runs on the OAK-D, and the left preview shows the depth map of the corresponding RGB image. Bounding boxes are resized and overlaid on the depth map as well, for a better view of the detected object in the spatial plane. The X, Y and Z coordinates are later used to decide which Buzz motor should vibrate, as shown ahead.
The pedestrian detection and Buzz alerting demo works in the following manner -
The OAK-D retrieves spatial information for each detected object in metres, and I then use this value to drive the vibration intensity of the motors on the Buzz. So, as a person approaches the bicycle, the intensity of the motor on the Buzz increases, and as they move away, the intensity decreases. The formula used to map the Buzz motor intensity from 1 to 255 is as follows -
Here, int(detection.spatialCoordinates.z) is in millimetres. To simplify things a bit, I simply mapped the motor intensity to certain depth thresholds (as also mentioned in the demo section ahead). The OAK-D cannot compute depth for distances of 0.35 metres or less, so this mapping is valid for all depth values the device reports. Further, the X coordinate of the object is also considered when mapping motor intensity on the Buzz: the X coordinates of detected objects are divided into four quadrants, each corresponding to one motor on the Buzz. This looks something like this -
# Xframe is the detection's X spatial coordinate (millimetres); ZDepth is the 1-255
# motor intensity derived from its Z (depth) coordinate.
for detection in detections:
    # First quadrant of the frame -> first Buzz motor
    if -1800 < Xframe < -300:
        cv2.putText(frame, "motor", (50, 50), cv2.FONT_HERSHEY_TRIPLEX, 0.5, color)
        await send_vibration_frame([ZDepth, 0, 0, 0])
        await send_vibration_frame([ZDepth, 0, 0, 0])  # the same frame is sent twice before stopping
        await stop_vibration_frame()
    # Second quadrant -> second motor
    if -300 < Xframe < 0:
        await send_vibration_frame([0, ZDepth, 0, 0])
        await send_vibration_frame([0, ZDepth, 0, 0])
        await stop_vibration_frame()
    # Third quadrant -> third motor
    if 0 < Xframe < 300:
        await send_vibration_frame([0, 0, ZDepth, 0])
        await send_vibration_frame([0, 0, ZDepth, 0])
        await stop_vibration_frame()
    # Fourth quadrant -> fourth motor
    if 300 < Xframe < 1800:
        await send_vibration_frame([0, 0, 0, ZDepth])
        await send_vibration_frame([0, 0, 0, ZDepth])
        await stop_vibration_frame()
Here, the corresponding motor on buzz is activated when the X coordinate is within the range of the quadrant as seen below.
The depth map divided into quadrants, to make the explanation easier to follow.
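For completeness, here is a minimal sketch of how a ZDepth value like the one used above can be derived from a detection's Z coordinate. The cut-offs mirror the "Thresholds and Intensity Correlation" list in the roadtest section; the helper name is mine and the exact mapping in the repo may differ.

def map_depth_to_intensity(z_mm):
    # detection.spatialCoordinates.z is reported in millimetres
    z_m = z_mm / 1000.0
    if z_m <= 2.8:       # anything closer than 2.8 m (including the 0.35 m minimum) buzzes at full strength
        return 255
    if z_m <= 3.7:
        return 170
    if z_m <= 5.0:
        return 120
    if z_m <= 7.0:
        return 80
    return 40            # far-away objects produce only a faint vibration

# e.g. ZDepth = map_depth_to_intensity(int(detection.spatialCoordinates.z))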
Demo video for face detection with motor vibration on the Buzz. The video shows how the Buzz vibrates according to the X and Z coordinates of the detection: it records the sound of the Buzz motors captured by a microphone, along with the left-to-right movement of the vibration. Use headphones to better make out which motor vibrates for each detection and how intense the vibration is.
Replicating the Demo:
It's pretty straightforward to replicate this demo if you have an OAK-D and a Buzz around. Best part: it's open source!
Two steps and you're done!
$ git clone https://github.com/dhruvsheth-ai/HapticCV.git
$ cd HapticCV/collisions/face-detection/
Now simply run the demo. Also, press and hold the two buttons on the Buzz to connect it; after that it will calibrate itself!
$ python3 buzz_spatial_face.py
Next up, the second Computer Vision model we'll be using for the project.
- pedestrian-and-vehicle-detector-adas, a low-latency, high-accuracy model for detecting pedestrians and vehicles on the road, trained on an ADAS dataset for a more specific use case.
How is it different from face-detection? This model features an additional class and is trained on an ADAS (Advanced Driver Assistance Systems) dataset, which includes vehicles and automobiles. face-detection is meant for situations where the area you're cycling through is crowded and you're more worried about pedestrian accidents than automobile accidents, whereas this model is more robust and covers a wider range of cases. So, if you're in a region where the accident rate is high with automobiles as well as pedestrians, the pedestrian-and-vehicle model suits best!
Convolutional Architecture of the Model -
The pedestrian and vehicle detection network is based on MobileNet v1.0 + SSD. MobileNet is a type of convolutional neural network designed for mobile and embedded vision applications. It is based on a streamlined architecture that uses depthwise separable convolutions to build lightweight deep neural networks with low latency on mobile and embedded devices.
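To see why depthwise separable convolutions are lighter, here is a quick parameter count of my own (ignoring biases):

def standard_conv_params(c_in, c_out, k=3):
    return k * k * c_in * c_out

def depthwise_separable_params(c_in, c_out, k=3):
    return k * k * c_in + c_in * c_out   # depthwise 3x3 followed by pointwise 1x1

c_in, c_out = 256, 256
print(standard_conv_params(c_in, c_out))        # 589824 parameters
print(depthwise_separable_params(c_in, c_out))  # 67840 parameters, roughly 8.7x fewer

This is the main reason MobileNet-style detectors can keep up double-digit frame rates on an embedded device like the OAK-D.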
Specifications of the Model -
Tested on challenging internal datasets with 1,001 pedestrians and 12,585 vehicles to detect.
An example of the pedestrian and vehicle adas model inference results on an example image -
So, this definitely demonstrates how accurate the model is! It runs at around 14-15 FPS, compared to 30 FPS for the face-detection model, but as I said, it depends on the use case: this model does additional work compared to the first one while still keeping high accuracy.
Here's the setup I am using for my project. A power bank rated at 5 V and at least 2.4 A is used to power the Raspberry Pi and the OAK-D simultaneously. The Neosensory Buzz is connected to the Raspberry Pi via Bluetooth.
In case you're wondering how the Buzz sounds on a microphone, here is an example (this one plays the motors from left to right on the Buzz):
Demo: https://voca.ro/18aMCHjZq3ti
Here's how you can hear the motor intensities of the Neosensory Buzz in the demo Videos. Feel the Buzz!
In case you're wondering how the intensity of the Buzz motors changes with depth, here's a high-level overview -
Here's an example of person detection demo:
If the person is closer than a specified threshold, the detection turns red, indicating that this might lead to a collision.
A yellow detection indicates that the person is somewhat beyond the given threshold but may cause an accident if they come any closer. Similarly, a green detection means everything's safe: enjoy your ride! These detections are converted into increasing motor intensities on the Buzz, so the closer the person is, the higher the frequency and intensity of the vibration. This can be heard in the video below.
Here's a demo video with the sound of the Neosensory Buzz motors vibrating:
There we go! This demo shows off the capabilities of the project as well!
Replicating the Demo:
2 steps and done!
$ cd collision-pedestrians/
$ python3 main.py
The next demo requires some road testing, so the examples for this model were captured on-cycle!
- vehicle-detection-adas demo to detect different types of vehicles such as cars, trucks, lorries and vans; it is also useful for motorcycle detection. This demo can be used on highways or in areas prone to automobile-cycle accidents.
How is this model different from the other two? It doesn't detect pedestrians; instead it detects the most prominent threat, vehicles. Trucks and cars have always been the source of most cycle accidents, many of which take place at crossroads, flyovers or roundabouts, and sometimes even near pavements. This model also helps the cyclist navigate better in heavy traffic and understand the rear view through haptic impulses.
Imagine the situation to be something similar to this.
The field of view (FOV) of the OAK-D is quite large, allowing the camera to observe vehicles in multiple directions behind the cycle.
- Field of view of the RGB camera - 68.7938 deg
- Field of view of the mono cameras - 73.5 deg (over a frame nearly 1280 pixels wide)
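As a rough illustration of what that field of view buys you (my own back-of-the-envelope calculation using the mono-camera HFOV above):

import math

def horizontal_coverage_m(distance_m, hfov_deg):
    # Width of the scene covered at a given distance for a camera with the given horizontal FOV
    return 2 * distance_m * math.tan(math.radians(hfov_deg) / 2)

print(round(horizontal_coverage_m(5.0, 73.5), 1))   # ~7.5 m of road width visible 5 m behind the cycle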
Field of View of Cameras on ADAS cars looks something similar to this-
This is a vehicle detection network based on an SSD framework with a tuned MobileNet v1 as the feature extractor. Since it uses the MobileNet v1 architecture, its convolutional architecture is the same as the second model's: a streamlined design using depthwise separable convolutions to build lightweight deep neural networks with low latency on mobile and embedded devices.
Tested on a challenging internal dataset with 3,000 images and 12,585 vehicles to detect.
Example -
Replicating the Demo:
Just as all other demos, replicating the demo here, is pretty straightforward.
$ cd collision-vehicles/
$ python3 buzz_sptial_car.py
That's it? Yes! This runs the demo for you, once you have cloned the repo following the instructions in the previous section.
Since vehicle detection wasn't possible indoors like the other two demos, I finally decided to go out and do a road test! The car detection demo was part of that actual road test, and I have picked a few examples from it to show the accuracy of the model running in real time on the OAK-D:
Note: These are actual images of real-time vehicle detection and depth estimation on OAK-D from Roadtest on roads of Mumbai.
So these are all the demos for face detection, vehicle detection and person detection.
After creating the Examples for these models using Depthai and Neosensory API, I was ready to assemble this on my cycle and get ready for a roadtest. So, here we go!
So, here's the OAK-D and Raspberry Pi with the 2.4 A power bank mounted on the cycle, up and running in real time! This was connected to the Neosensory Buzz (not visible in the frame, since I was wearing it).
After I set up the OAK-D and Raspberry Pi on my bicycle, I took the examples I had designed out for a road test. I configured my Raspberry Pi to run the buzz_spatial_person.py Python script each time it boots up.
To replicate the process of auto-starting a Python script on every boot, follow this; it is the documentation I followed to do the same.
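One common way to do this (an assumption on my part; the guide linked above may use a different mechanism, such as rc.local or a systemd service) is a crontab @reboot entry. The paths below are examples and need to be adjusted to your own setup:

$ crontab -e
# then append a line along these lines:
@reboot sleep 30 && cd /home/pi/HapticCV && /home/pi/venv2/bin/python3 collision-pedestrians/buzz_spatial_person.py >> /home/pi/hapticcv.log 2>&1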
This process is fairly simple: as soon as the Pi boots, the script begins to run and you'll notice a `click` sound from the OAK-D once it's initiated. That's when you press and hold the + and - buttons on the Neosensory Buzz so it connects and calibrates with the OAK-D. And bingo, we have real-time object detection and depth estimation mapped to the Buzz! Since it's not possible to watch the visualisation of the detected objects while riding a cycle, for demonstration purposes I took the OAK-D with me in a car and recorded visualisations to demonstrate the Buzz's capabilities. Feel the Buzz!
Roadtest and Demo: I tested the object detection and the mapping of spatial coordinates onto the Neosensory Buzz initially within my housing complex and later out on the main roads.
This demo includes three videos: first within my complex, then the drive to the gas station, and later the ride back. Due to a glitch with the microphone, I wasn't able to record the Buzz motor intensity in the first round, but don't fret! I managed to sort it out in the second round and captured the intensity mapping on the Buzz corresponding to the depth of each object.
Here are the Roadtests -
In this video you can observe that initially you can hear the Buzz, but later you can't (as I said, a microphone glitch; apologies).
This was the demo within my complex and the moment I first hit the roads. The accuracy with which objects were detected was quite impressive, and the real-time estimation of spatial location, with the exact distance of the vehicle in metres, made the demo a success. The closest object, i.e. the one with the least depth, is the one used when mapping intensities onto the Buzz. This ensures the rider is alerted about the nearest object rather than one far away. You might also notice that in some cases objects are detected but the Buzz intensity isn't very distinguishable. This is because I set certain alert thresholds, so that far-away objects aren't treated as threats to the bicyclist. So, the pedestrians/cars that appear to be missed are the ones that are quite far away.
Thresholds and Intensity Correlation -
- Depth 0.1 m to 2.8 m - Intensity = 255
- Depth 2.8 m to 3.7 m - Intensity = 170
- Depth 3.7 m to 5.0 m - Intensity = 120
- Depth 5.0 m to 7.0 m - Intensity = 80
- Otherwise, Intensity = 40
The demo above also includes audio of the Buzz motor intensity recorded with a microphone. This is the demo on the roads, on the return trip.
Just for fun, I also tried to retrieve a velocity map of the vehicles, also known as optical flow. Despite getting a successful map, I couldn't manage to map that data onto the Buzz, so I'll definitely keep this for a future implementation!
There's plenty of room for improving bicycle assistance systems using a Spatial AI and tactile tech stack. This project was inspired by Brandon Gilles's (CEO of Luxonis) efforts on Commute Guardian, software and hardware to prevent bicycle accidents using Spatial AI. HapticCV contributes an open-source software system, using DepthAI and the Neosensory Buzz, that improves bicycle safety with real-time alerts through haptic stimulus: not just detecting objects but also alerting the rider. Building further on this stream of bicycle safety in Spatial AI, I'll be working on integrating velocity estimation as well, to offer alerts based on the speed of the vehicle in addition to its depth, so that the rider is well aware of potential overtaking from either side. There are plenty of opportunities in this field, and I look forward to improving the types of Buzz patterns and varying the intensity with depth and velocity so they are easier to distinguish and give a better understanding of the rear view. I also aim to integrate person tracking and vehicle tracking, to track the depth and velocity of only a particular type of object, or of an object that exceeds a certain velocity threshold. I've already scripted this using the DepthAI Gen2 API; however, due to time constraints, I wasn't able to implement and test it with the Neosensory Buzz.
Keeping this in mind, if I manage to develop this further, I also plan to submit a research proposal on ResearchGate, so that fellows like me who are keen on CV and haptic stimulus can explore and contribute to this open-source network. Final Words: Feel the Future!