This project was my first step into Artificial Intelligence. Facial recognition has many applications in the real world. For this project, I have developed a face recognition system integrated with Google Home.
STEP 1: Install Libraries
We first need to install the following libraries for the code to work:
- OpenCV:
pip install opencv-python
to get video input and save an image when a face is detected.
- face_recognition:
pip install face_recognition
to detect and identify faces from the video input.
- PyChromecast:
pip install pychromecast
to cast the audio file to the Google Home, notifying the user of the detected face.
- pickle:
comes with Python's standard library, so it needs no pip install; used to save and load the names in our dataset and their face encodings.
- gTTS:
pip install gtts
to generate and save the audio files that announce which face was detected.
Install each of these libraries by opening a command prompt window and typing in the commands mentioned above. For example, to install the OpenCV library, type pip install opencv-python in the command prompt window.
Note: Installing the face_recognition library is a bit tricky, because it depends on dlib, which has to be compiled on Windows. Open a command prompt window on a machine where Visual Studio (not VS Code) and its C++ build tools are installed, and type in the command. You may also have to install the cmake library as a dependency first.
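If the plain install still fails, this sequence often works (assuming Visual Studio's C++ build tools are already installed; dlib may take several minutes to compile):
pip install cmake
pip install dlib
pip install face_recognition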
The other modules we need all come with Python's standard library, so they don't need to be installed:
- smtplib
- for connecting to an SMTP server to send email
- email.message
- for composing the email that notifies the user of a face detection
- http.server
- for setting up a simple HTTP server to serve local audio files to the Google Home
- multiprocessing
- for running the HTTP server in parallel with the rest of the code, instead of starting it manually
STEP 2: Gather the Training Data
For any AI machine to work, we need 2 components:
- Data: Training and testing data
- Proper algorithms to drive the AI machine.
We need to make a directory to store the images/training data of the people whose faces we want the program to recognize. It is recommended to have at least 5 images of each person for a more accurate result; in AI, the more data you have, the higher the accuracy of the system. Make a subfolder for each person and put their images in there, as sketched below.
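As a sketch, the image directory could look like this (the names and file counts are hypothetical; the underscore suffix lets the code in Step 3 map several images to one person):

known_faces/
    Jake/
        Jake_1.jpg
        Jake_2.jpg
        Jake_3.jpg
    Amy/
        Amy_1.jpg
        Amy_2.jpg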
STEP 3: Code to Generate Face Data and Corresponding Audio Files
Here, we will generate the face data for each image in the directory where we have stored the images and save that data. We will also generate and save an audio file for each person, which will notify the user of the person detected. For example, if the detected name is Jake, the audio file will say "Jake's at the door".
Note: To get the name of each person whose picture is in the known directory, we need to save each picture under that person's name (an underscore suffix like Jake_1.jpg also works, since the code strips it).
import pickle
import face_recognition
import os
import gtts

training_encodings = []
Names = []
image_dir = r" *Path to directory of known images* "

# Walk through every image file in the known-images directory
for root, dirs, files in os.walk(image_dir):
    for file in files:
        path = os.path.join(root, file)
        name = os.path.splitext(file)[0]
        person = face_recognition.load_image_file(path)
        encoding = face_recognition.face_encodings(person)[0]
        training_encodings.append(encoding)
        # Files named like "Jake_1.jpg" all map to the name "Jake"
        if "_" in name:
            name = name.split("_")[0]
        Names.append(name)
        print(name)

# Save the names and encodings so the main program can load them later
with open('train.pkl', 'wb') as f:
    pickle.dump(Names, f)
    pickle.dump(training_encodings, f)
print(Names)

# Generate one notification audio file per person
for name in Names:
    msg = name + " is at the door"
    tts = gtts.gTTS(msg, lang="en", tld="co.uk")
    url = name + ".mp3"
    tts.save(url)
- First, we import all the necessary libraries
- Next, we declare variables for storing the names and their corresponding face encodings. You can print out the encoding list to see the output... it is pretty interesting to see your face encoding (see the snippet after this list). We also need a variable to store the location of the image directory.
- Next, we walk through each file in the image directory and save the encodings of the face of each person. Since I am integrating this project with Google Home, I also generate the customized audio files for each user. These will notify the main user of the face detected. (Check steps 5 and 6.)
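If you are curious what an encoding actually is: each one that face_recognition produces is a 128-dimensional NumPy array. A quick way to peek at one after running the loop above:

print(len(training_encodings[0]))   # 128 numbers per face
print(training_encodings[0][:5])    # the first five values of the first encoding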
This is the simple part... now on to the main code !!
STEP 4: The Main Code !!

import smtplib
from email.message import EmailMessage
import face_recognition
import cv2
import pickle
import pychromecast
import http.server
import multiprocessing

train_encodings = []
Names = []
scale_factor = 0.25

# Connect to the Google Home on the local network
cast = pychromecast.Chromecast(" *Your Google home's local IP address* ")
cast.wait()
print(cast.name)
mc = cast.media_controller

speak = True
mail = True
nameOld = ''

def caster(url):
    # Cast the audio file at url to the Google Home, once per newly detected face
    global speak
    if speak == True:
        mc.play_media(url, 'audio/mp3')
        mc.block_until_active()
        mc.pause()
        mc.play()
        speak = False

def email_alert(subject, body, to):
    # Email the snapshot saved as login.jpg, once per run
    global mail
    if mail == True:
        msg = EmailMessage()
        msg.set_content(body)
        msg['subject'] = subject
        msg['to'] = to
        user = " *Your email ID* "
        msg['from'] = user
        # An app password generated under your Google account's 2-step verification
        password = " *Your Google account's app password* "
        server = smtplib.SMTP('smtp.gmail.com', 587)
        server.starttls()
        server.login(user, password)
        with open(r'login.jpg', 'rb') as f:
            image_data = f.read()
            image_name = f.name
            image_type = image_name.split(".")[1]
        msg.add_attachment(image_data, maintype='image', subtype=image_type, filename=image_name)
        server.send_message(msg)
        server.quit()
        mail = False

def web_server():
    # Serve the current directory (including the .mp3 files) over HTTP on port 8000
    httpd = http.server.HTTPServer(server_address=('', 8000), RequestHandlerClass=http.server.SimpleHTTPRequestHandler)
    httpd.serve_forever(poll_interval=0.5)

# Load the names and encodings saved in step 3
with open('train.pkl', 'rb') as f:
    Names = pickle.load(f)
    train_encodings = pickle.load(f)
print(Names)

cam = cv2.VideoCapture(0)
font = cv2.FONT_HERSHEY_SIMPLEX

if __name__ == '__main__':
    # Start the HTTP server once, as a background process
    p = multiprocessing.Process(target=web_server, args=())
    p.daemon = True
    p.start()
    while True:
        r, img = cam.read()
        # Detect on a quarter-size frame to speed things up
        img_small = cv2.resize(img, (0, 0), fx=scale_factor, fy=scale_factor)
        img_rgb = cv2.cvtColor(img_small, cv2.COLOR_BGR2RGB)
        # model='cnn' is accurate but slow without a GPU; 'hog' is a faster CPU-only alternative
        face_positions = face_recognition.face_locations(img_rgb, model='cnn')
        if not face_positions:
            continue
        all_encodings = face_recognition.face_encodings(img_rgb, face_positions)
        for (top, right, bottom, left), face_encoding in zip(face_positions, all_encodings):
            name = "Unknown person"
            matches = face_recognition.compare_faces(train_encodings, face_encoding)
            if True in matches:
                image_index = matches.index(True)
                name = Names[image_index]
                url = "http:// *Your computer's local IP address* :8000/" + name + ".mp3"
                print(url)
                # Announce only when a different face shows up
                if name != nameOld:
                    speak = True
                    nameOld = name
                caster(url)
            email = "Log in by: " + name
            # Scale the face coordinates back up to the full-size frame
            top = int(top // scale_factor)
            left = int(left // scale_factor)
            bottom = int(bottom // scale_factor)
            right = int(right // scale_factor)
            cv2.rectangle(img, (left, top), (right, bottom), (0, 255, 0), 2)
            cv2.putText(img, name, (left, top), font, 0.75, (0, 255, 0), thickness=2)
            cv2.imwrite("login.jpg", img)
            email_alert("Update", email, " *To email address* ")
In this piece of code, we are doing 3 main things:
- Loading the training data into the code using the pickle.load command.
- Acquiring data from the camera and comparing it with the known faces using the face_recognition.compare_faces(train_encodings, face_encoding) command. We save this output in a variable I'm calling matches. It is a list of True/False values, one for each face in our database, telling us whether the face from the camera matched that person (see the sketch after this list).
- Lastly, if we do find a match, we give the corresponding result. I have chosen to integrate this program with Google Home, which will announce the name(s) detected.
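To make the matches list concrete, here is a minimal, self-contained sketch of how it maps back to a name (the sample names and values are made up):

Names = ["Jake", "Amy", "Sam"]      # hypothetical known names
matches = [False, True, False]      # sample output of compare_faces
name = "Unknown person"
if True in matches:
    name = Names[matches.index(True)]   # index of the first match
print(name)                             # -> Amy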
STEP 5: Integrating With Google Home
*This is an extra part of the main code. You can customize your output however you want it.
I wanted to integrate this code with Google Home. The goal was to make the Home notify the user of the person detected by the camera. There are multiple libraries out there that can cast an audio file to a Google device or make it say something; the process remains the same. I considered using the googlehomepush library to do this, but it wasn't compatible with the gtts version I had. So, I decided to use an alternative library: pychromecast. As the name suggests, this library can cast an audio/video file to any cast-enabled device. I have used the gtts library to generate the audio files with a customized message for each user. For instance, if the name detected is "Jake", the automated voice says "Jake's at the door". You can customize the message for each user, or keep a generic message for all users. The googlehomepush library does much the same, except it stores the generated audio files in a cache.
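If you want a different phrase per person, one simple way is to map names to messages before generating the files (the greetings dictionary below is hypothetical; any name-to-phrase mapping works):

import gtts
greetings = {"Jake": "Jake's at the door", "Amy": "Welcome home, Amy"}  # hypothetical messages
for name, msg in greetings.items():
    tts = gtts.gTTS(msg, lang="en", tld="co.uk")
    tts.save(name + ".mp3")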
import pychromecast

cast = pychromecast.Chromecast(" *Your Google home's local IP address* ")
cast.wait()
mc = cast.media_controller

speak = True  # defined in the main code; included here so the snippet stands alone

def caster(url):
    # Cast the audio file at url to the Google Home, once per newly detected face
    global speak
    if speak == True:
        mc.play_media(url, 'audio/mp3')
        mc.block_until_active()
        mc.pause()
        mc.play()
        speak = False
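For example, assuming a (hypothetical) computer IP of 192.168.1.23 and a user named Jake, a call would look like this:

caster("http://192.168.1.23:8000/Jake.mp3")  # hypothetical IP address and file name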
The url variable is normally the path to the audio file. However, local files need to be served over an HTTP server before they can be cast to the Google Home. You can create one by typing python -m http.server in a command prompt window. Note that this needs to be done in the directory where your program and the audio files are. In the code, the URL of the audio file to be cast takes the form http://*Your computer's IP Address*:8000/*Name of file*. The 8000 is there because 8000 is the default port the server serves on. For example, with a (hypothetical) computer IP of 192.168.1.23, Jake's file would be cast from http://192.168.1.23:8000/Jake.mp3.
STEP 6: Auto-Starting the HTTP Server
This is the final step of this project. It would be pretty annoying to open a command prompt window and start an HTTP server each time before you run the program. So, we can create the server each time the program runs with this piece of code:
import http.server
import multiprocessing

def web_server():
    # Serve the current directory (where the .mp3 files live) on port 8000
    httpd = http.server.HTTPServer(server_address=('', 8000), RequestHandlerClass=http.server.SimpleHTTPRequestHandler)
    httpd.serve_forever(poll_interval=0.5)

if __name__ == '__main__':
    # Run the server in a background process so the main loop can continue
    p = multiprocessing.Process(target=web_server, args=(), daemon=True)
    p.start()
For the code and the server to run continuously, we need to run them in parallel. This is why we have used the multiprocessing library.
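To verify that the background server is actually reachable, you can fetch one of the audio files from the same machine (the file name here is hypothetical):

import urllib.request
resp = urllib.request.urlopen("http://localhost:8000/Jake.mp3")  # hypothetical file name
print(resp.status)  # 200 means the file can be served for casting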
That's it!! You now have a face recognition system integrated with Google Home !