Published October 13, 2020 © GPL3+

Distracted Drivers Detection and Classification using Vision

We detect distracted behaviour of car drivers using cameras, and use OpenVINO to speed up inference.

Intermediate. Full instructions provided. 2 hours.

Things used in this project

Software apps and online services

Intel OpenVINO™ toolkit

TensorFlow

Story

Around 3,700 people are killed globally every day in road traffic crashes involving cars, buses, motorcycles, bicycles, trucks, or pedestrians. Many of these crashes are caused by driver distraction: the driver is checking a phone, eating, or looking somewhere else. To address this, I created a deep learning model that uses cameras on the road to predict and classify whether a driver is distracted while driving. To speed up inference, I used Intel's OpenVINO toolkit, which made prediction more than 2 times faster.

The aim is to help reduce road accidents, by an estimated 20%, by alerting drivers before a hazard develops.

All code was written and tested on Google Colab, so it is easy to try out.

Data

The dataset is Kaggle's State Farm Distracted Driver dataset. It contains 10 classes for prediction:

safe driving
texting - right
talking on the phone - right
texting - left
talking on the phone - left
operating the radio
drinking
reaching behind
hair and makeup
talking to passenger
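
The ten classes above can be kept in a small lookup table so that a predicted index is turned into a readable label. This is a minimal sketch: the `c0` to `c9` keys follow the folder names used by the Kaggle dataset, in the same order as the list above, while the names `CLASS_NAMES` and `label_for` are made up for this illustration.

```python
# Class codes as used by the Kaggle dataset folders, in the order listed above.
CLASS_NAMES = {
    "c0": "safe driving",
    "c1": "texting - right",
    "c2": "talking on the phone - right",
    "c3": "texting - left",
    "c4": "talking on the phone - left",
    "c5": "operating the radio",
    "c6": "drinking",
    "c7": "reaching behind",
    "c8": "hair and makeup",
    "c9": "talking to passenger",
}


def label_for(index):
    """Map a predicted class index (0-9) to its human-readable label."""
    return CLASS_NAMES[f"c{index}"]


print(label_for(0))  # safe driving
```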

Algorithm

The model is a simple CNN built from convolutional and max-pooling layers, followed by fully connected layers.

Model: "sequential" 
_________________________________________________________________ 
Layer (type)                 Output Shape              Param #    
================================================================= 
Conv_1 (Conv2D)              (None, 128, 128, 32)      2432       
_________________________________________________________________ 
Pool_1 (MaxPooling2D)        (None, 64, 64, 32)        0          
_________________________________________________________________ 
Conv_2 (Conv2D)              (None, 64, 64, 64)        51264      
_________________________________________________________________ 
Pool_2 (MaxPooling2D)        (None, 32, 32, 64)        0          
_________________________________________________________________ 
Conv_3 (Conv2D)              (None, 32, 32, 128)       204928     
_________________________________________________________________ 
Pool_3 (MaxPooling2D)        (None, 16, 16, 128)       0          
_________________________________________________________________ 
Conv_4 (Conv2D)              (None, 16, 16, 256)       819456     
_________________________________________________________________ 
Pool_4 (MaxPooling2D)        (None, 8, 8, 256)         0          
_________________________________________________________________ 
flatten (Flatten)            (None, 16384)             0          
_________________________________________________________________ 
fc_1 (Dense)                 (None, 1024)              16778240   
_________________________________________________________________ 
dropout (Dropout)            (None, 1024)              0          
_________________________________________________________________ 
fc_2 (Dense)                 (None, 512)               524800     
_________________________________________________________________ 
fc_3 (Dense)                 (None, 10)                5130       
=================================================================
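
The summary above is enough to reconstruct a plausible Keras definition. The sketch below is not the original training code. The 5x5 kernels and the 128x128x3 input follow from the parameter counts (for example, 2432 = 32 x (5 x 5 x 3 + 1)), and "same" padding follows from the unchanged spatial size after each convolution. The ReLU and softmax activations and the dropout rate of 0.5 are assumptions, since the summary does not show them.

```python
import tensorflow as tf
from tensorflow.keras import layers


def build_model(num_classes=10):
    """Rebuild a CNN whose shapes and parameter counts match the summary."""
    return tf.keras.Sequential(
        [
            tf.keras.Input(shape=(128, 128, 3)),
            layers.Conv2D(32, 5, padding="same", activation="relu", name="Conv_1"),
            layers.MaxPooling2D(2, name="Pool_1"),
            layers.Conv2D(64, 5, padding="same", activation="relu", name="Conv_2"),
            layers.MaxPooling2D(2, name="Pool_2"),
            layers.Conv2D(128, 5, padding="same", activation="relu", name="Conv_3"),
            layers.MaxPooling2D(2, name="Pool_3"),
            layers.Conv2D(256, 5, padding="same", activation="relu", name="Conv_4"),
            layers.MaxPooling2D(2, name="Pool_4"),
            layers.Flatten(name="flatten"),
            layers.Dense(1024, activation="relu", name="fc_1"),
            layers.Dropout(0.5, name="dropout"),  # rate is an assumption
            layers.Dense(512, activation="relu", name="fc_2"),
            layers.Dense(num_classes, activation="softmax", name="fc_3"),
        ],
        name="sequential",
    )


model = build_model()
model.summary()
```

Adding up the Param # column gives 18,386,250 trainable parameters, most of them (16,778,240) in `fc_1`.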

Training

The model was trained for 10 epochs and reached an accuracy of 99%.

Inference on a normal CPU (without OpenVINO):

Inference was timed on 50 images from the dataset's test data.

Results:

Time taken per image: 0.0551 seconds
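
A per-image figure like this can be obtained by timing a loop over the test images and dividing by the number of images. The helper below is one possible way to do that, not necessarily how the number above was measured. `predict_fn` is a placeholder for whatever runs a single prediction, and the warm-up call is an assumption added so that one-time setup cost is not counted.

```python
import time


def mean_latency(predict_fn, images, warmup=1):
    """Return the mean wall-clock seconds per image for predict_fn."""
    # Warm-up runs are excluded from the measurement.
    for image in images[:warmup]:
        predict_fn(image)

    start = time.perf_counter()
    for image in images:
        predict_fn(image)
    elapsed = time.perf_counter() - start
    return elapsed / len(images)


# Stand-in workload so the sketch runs without a model or dataset.
fake_images = list(range(50))
seconds_per_image = mean_latency(lambda image: image * 2, fake_images)
print(f"Time taken per image: {seconds_per_image:.6f} seconds")
```

With a Keras model, `predict_fn` could be something like `lambda image: model.predict(image[None, ...])`, where `image` is a preprocessed 128x128x3 array.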

Inference using the OpenVINO toolkit:

Installation:

Windows - link

Linux - link

OpenVINO was applied to the final trained model.

Steps:

1. Converting the .h5 model to a .pb file (frozen model).

2. Converting the .pb file (frozen model) to the .xml and .bin files required by OpenVINO.

3. Running the final inference model using OpenVINO's Python libraries.
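
The three steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code. It assumes TensorFlow 2.x for freezing, the Model Optimizer script `mo_tf.py` from an OpenVINO 2020.4-era install (its location depends on the installation), and the `openvino.inference_engine` Python API from the same release. All file names and function names here are hypothetical. TensorFlow and OpenVINO are imported inside the functions so that the helpers which do not need them still work on a machine without either package.

```python
import numpy as np


def freeze_keras_model(h5_path, out_dir, pb_name="frozen_model.pb"):
    """Step 1: load a Keras .h5 model and write it as a frozen .pb graph."""
    import tensorflow as tf
    from tensorflow.python.framework.convert_to_constants import (
        convert_variables_to_constants_v2,
    )

    model = tf.keras.models.load_model(h5_path)
    spec = tf.TensorSpec(model.inputs[0].shape, model.inputs[0].dtype)
    # training=False keeps dropout disabled in the exported graph.
    concrete = tf.function(lambda x: model(x, training=False)).get_concrete_function(spec)
    frozen = convert_variables_to_constants_v2(concrete)
    tf.io.write_graph(frozen.graph, out_dir, pb_name, as_text=False)


def model_optimizer_command(pb_path, out_dir, input_shape=(1, 128, 128, 3),
                            mo_script="mo_tf.py"):
    """Step 2: build the Model Optimizer command that produces .xml and .bin."""
    shape = "[" + ",".join(str(dim) for dim in input_shape) + "]"
    return ["python3", mo_script, "--input_model", pb_path,
            "--input_shape", shape, "--output_dir", out_dir]


def to_nchw(image_hwc):
    """Turn one HxWxC image into a 1xCxHxW float32 batch.

    Model Optimizer converts TensorFlow models to NCHW layout by default.
    Any scaling must match what was used in training (not specified here).
    """
    return np.transpose(image_hwc, (2, 0, 1))[np.newaxis].astype(np.float32)


def infer_openvino(xml_path, bin_path, image_hwc, device="CPU"):
    """Step 3: run one image through the converted model."""
    from openvino.inference_engine import IECore

    ie = IECore()
    net = ie.read_network(model=xml_path, weights=bin_path)
    input_name = next(iter(net.input_info))
    output_name = next(iter(net.outputs))
    exec_net = ie.load_network(network=net, device_name=device)
    result = exec_net.infer(inputs={input_name: to_nchw(image_hwc)})
    return result[output_name]


print(" ".join(model_optimizer_command("frozen/frozen_model.pb", "ir")))
```

The command list returned by `model_optimizer_command` can be passed to `subprocess.run`. In a real pipeline, the network should be loaded once and reused for every image, so that loading time does not end up in the per-image measurement.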

Inference was again timed on 50 images from the dataset's test data.

Results:

Time taken per image: 0.0258 seconds

Conclusions:

OpenVINO sped up inference by a factor of about 2.1 compared with normal inference (0.0258 seconds versus 0.0551 seconds per image). The model can be deployed to embedded devices in cameras used by traffic departments.
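
The speed-up follows directly from the two per-image timings reported above. As a quick check of the arithmetic, using only those two numbers:

```python
cpu_seconds = 0.0551       # per image, normal CPU inference
openvino_seconds = 0.0258  # per image, OpenVINO inference

speedup = cpu_seconds / openvino_seconds
cpu_rate = 1 / cpu_seconds
openvino_rate = 1 / openvino_seconds

print(f"Speed-up: {speedup:.2f}x")                      # Speed-up: 2.14x
print(f"Normal CPU: {cpu_rate:.1f} images/second")      # Normal CPU: 18.1 images/second
print(f"OpenVINO:   {openvino_rate:.1f} images/second") # OpenVINO:   38.8 images/second
```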