Title: Facial Expression Detection and Object Tracking for Educational AI
Problem Statement: This project aims to develop an AI that helps students in real time by tracking their facial expressions and the objects visible in the webcam feed. The AI then provides the necessary help immediately and limits distractions caused by those objects.
Why: As a high school student, I often found it difficult to get help when I needed it. I wanted to create an intelligent system that could automatically provide me with help whenever it detected that I was facing trouble.
How: I do not yet have a working prototype of this AI model; I am still training it, and its accuracy is not yet where I want it to be. The model is trained on a dataset of faces and household objects to detect and recognize what appears in the webcam feed, and it will then perform relevant actions to help the student.
Future Work:
- Collect a larger dataset of facial expressions and objects.
- Improve the accuracy of the AI.
- Develop a mobile app or web application to deploy the AI.
- Take steps to protect the privacy of the students.
Note: I will provide the Jupyter Notebook (.ipynb) that I am currently working on in the code section.
Extras: I have not yet used the OpenCV library during the model training phase, but I plan to do so in the final submission. OpenCV is a popular computer vision library used for a variety of tasks, including image processing, object detection, and video analysis. I plan to use OpenCV to improve my model's performance, both by enriching its input features and by handling the real-time video pipeline.
In the final submission, I will use OpenCV to:
- Pre-process the images, such as resizing them and converting them to grayscale (see the sketch below).
- Detect objects in the images.
- Classify the objects.
I believe that using OpenCV will improve the performance of my model and make it more competitive in the OpenCV2023 competition.
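To illustrate the planned pre-processing step, here is a minimal sketch using standard OpenCV calls (cv2.cvtColor and cv2.resize); the file name sample.jpg is a placeholder, not a file from this project.

import cv2

# Minimal pre-processing sketch: convert a frame to grayscale and resize it
# to the model's 48x48 input, mirroring the FER2013 format.
# 'sample.jpg' is a placeholder path.
img = cv2.imread("sample.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
resized = cv2.resize(gray, (48, 48))
print(resized.shape)  # (48, 48)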
education-ai.ipynb
To use it, download the file as .ipynb, then run each cell in the notebook's interactive interface. Alternatively, convert it to a Python file with nbconvert (jupyter nbconvert --to script education-ai.ipynb), or see https://code.visualstudio.com/docs/python/jupyter-support-py for running it in VS Code.
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"# libraries\n",
"import numpy as np\n",
"import pandas as pd\n",
"import matplotlib.pyplot as plt"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'\\nUse print if the data is found to be missing\\n'"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# preparing the data\n",
"data = pd.read_csv('fer2013.csv')\n",
"train = data.loc[data['Usage'] == 'Training']\n",
"test = data.loc[data['Usage'] == 'PrivateTest']\n",
"\n",
"'''\n",
"Use print if the data is found to be missing\n",
"'''"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 1e+03 ns, sys: 0 ns, total: 1e+03 ns\n",
"Wall time: 3.1 s\n",
"Preprocessing Done\n",
"Number of Training Samples: 28709\n",
"Number of Test Samples: 3589\n",
"Training Label Counts: [3995 436 4097 7215 4830 3171 4965]\n",
"Training Data Shape: (28709, 48, 48, 1)\n",
"Training Label Shape: (28709,)\n",
"Training Label Counts: [491 55 528 879 594 416 626]\n"
]
}
],
"source": [
"# prepare training data\n",
"%time\n",
"import warnings\n",
"warnings.filterwarnings(\"ignore\")\n",
"\n",
"width, height = 48, 48\n",
"\n",
"train_datapoints = train['pixels'].tolist()\n",
"\n",
"# getting features for training\n",
"X_train = []\n",
"for xseq in train_datapoints:\n",
" xx = [int(xp) for xp in xseq.split(' ')]\n",
" xx = np.asarray(xx).reshape(width, height)\n",
" X_train.append(xx.astype('float32'))\n",
"\n",
"X_train = np.asarray(X_train)\n",
"X_train = np.expand_dims(X_train, -1)\n",
"\n",
"test_datapoints = test['pixels'].tolist()\n",
"\n",
"# getting features for testing\n",
"X_test = []\n",
"for xseq in test_datapoints:\n",
" xx = [int(xp) for xp in xseq.split(' ')]\n",
" xx = np.asarray(xx).reshape(width, height)\n",
" X_test.append(xx.astype('float32'))\n",
"\n",
"X_test = np.asarray(X_test)\n",
"X_test = np.expand_dims(X_test, -1)\n",
"\n",
"# getting labels for training\n",
"y_train = train['emotion']\n",
"y_train = np.asarray(y_train)\n",
"\n",
"# getting labels for testing\n",
"y_test = test['emotion']\n",
"y_test = np.asarray(y_test)\n",
"\n",
"print(\"Preprocessing Done\")\n",
"print(\"Number of Training Samples: \", X_train.shape[0])\n",
"print(\"Number of Test Samples: \", X_test.shape[0])\n",
"print(\"Training Label Counts: \", np.bincount(y_train))\n",
"print(\"Training Data Shape: \", X_train.shape)\n",
"print(\"Training Label Shape: \", y_train.shape)\n",
"print(\"Training Label Counts: \", np.bincount(y_test))"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"# visualise training data\n",
"\n",
"visualise = False\n",
"\n",
"if visualise:\n",
"\n",
" # Mapping emotion label with expression\n",
" labels = ['Angry', 'Disgust', 'Fear', 'Happy', 'Sad', 'Surprise', 'Neutral']\n",
"\n",
" # Create a figure with 4 rows and 5 columns\n",
" width = 10\n",
" height = 10\n",
" fig = plt.figure(figsize=(10, 15))\n",
" columns = 4\n",
" rows = 5\n",
"\n",
" import random\n",
"\n",
" # For each row and column, randomly select an image from the dataset and display it\n",
" for i in range(1, columns*rows +1):\n",
" # img = np.random.randint(10, size=(h,w))\n",
" fig.add_subplot(rows, columns, i)\n",
" random_factor = random.randint(1,len(X_train))\n",
" plt.imshow(X_train[random_factor].reshape(48,48), cmap = 'gray')\n",
" plt.title(labels[y_train[random_factor]])\n",
" plt.show()\n"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Model: \"sequential\"\n",
"_________________________________________________________________\n",
" Layer (type) Output Shape Param # \n",
"=================================================================\n",
" conv2d (Conv2D) (None, 46, 46, 128) 1280 \n",
" \n",
" max_pooling2d (MaxPooling2D (None, 23, 23, 128) 0 \n",
" ) \n",
" \n",
" dropout (Dropout) (None, 23, 23, 128) 0 \n",
" \n",
" conv2d_1 (Conv2D) (None, 19, 19, 256) 819456 \n",
" \n",
" max_pooling2d_1 (MaxPooling (None, 9, 9, 256) 0 \n",
" 2D) \n",
" \n",
" dropout_1 (Dropout) (None, 9, 9, 256) 0 \n",
" \n",
" conv2d_2 (Conv2D) (None, 7, 7, 512) 1180160 \n",
" \n",
" max_pooling2d_2 (MaxPooling (None, 3, 3, 512) 0 \n",
" 2D) \n",
" \n",
" dropout_2 (Dropout) (None, 3, 3, 512) 0 \n",
" \n",
" conv2d_3 (Conv2D) (None, 1, 1, 1024) 4719616 \n",
" \n",
" dropout_3 (Dropout) (None, 1, 1, 1024) 0 \n",
" \n",
" flatten (Flatten) (None, 1024) 0 \n",
" \n",
" dense (Dense) (None, 256) 262400 \n",
" \n",
" dropout_4 (Dropout) (None, 256) 0 \n",
" \n",
" dense_1 (Dense) (None, 512) 131584 \n",
" \n",
" dropout_5 (Dropout) (None, 512) 0 \n",
" \n",
" dense_2 (Dense) (None, 7) 3591 \n",
" \n",
"=================================================================\n",
"Total params: 7,118,087\n",
"Trainable params: 7,118,087\n",
"Non-trainable params: 0\n",
"_________________________________________________________________\n"
]
}
],
"source": [
"# Normalise the data\n",
"X_train /= 255.0\n",
"X_train -= np.mean(X_train, axis=0)\n",
"X_train /= np.std(X_train, axis=0)\n",
"X_test /= 255.0\n",
"X_test -= np.mean(X_test, axis=0)\n",
"X_test /= np.std(X_test, axis=0)\n",
"\n",
"import tensorflow as tf\n",
"\n",
"# Convolutional layer 1\n",
"model = tf.keras.models.Sequential([\n",
" tf.keras.layers.Conv2D(128, (3, 3), activation='relu', input_shape=(48, 48, 1)),\n",
" tf.keras.layers.MaxPooling2D(2, 2),\n",
" tf.keras.layers.Dropout(0.25),\n",
"])\n",
"\n",
"# Convolutional layer 2\n",
"model.add(tf.keras.layers.Conv2D(256, (5, 5), activation='relu'))\n",
"model.add(tf.keras.layers.MaxPooling2D(2, 2))\n",
"model.add(tf.keras.layers.Dropout(0.25))\n",
"\n",
"# Convolutional layer 3\n",
"model.add(tf.keras.layers.Conv2D(512, (3, 3), activation='relu'))\n",
"model.add(tf.keras.layers.MaxPooling2D(2, 2))\n",
"model.add(tf.keras.layers.Dropout(0.25))\n",
"\n",
"# Convolutional layer 4\n",
"model.add(tf.keras.layers.Conv2D(1024, (3, 3), activation='relu'))\n",
"# model.add(tf.keras.layers.MaxPooling2D(2, 2))\n",
"model.add(tf.keras.layers.Dropout(0.25))\n",
"\n",
"# Compile the model\n",
"model.add(tf.keras.layers.Flatten())\n",
"model.add(tf.keras.layers.Dense(256, activation='relu'))\n",
"model.add(tf.keras.layers.Dropout(0.25))\n",
"model.add(tf.keras.layers.Dense(512, activation='relu'))\n",
"model.add(tf.keras.layers.Dropout(0.25))\n",
"model.add(tf.keras.layers.Dense(7, activation='softmax'))\n",
"\n",
"# Compile the model\n",
"model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])\n",
"model.summary()"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Epoch 1/100\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"2023-08-24 17:23:30.528140: W tensorflow/tsl/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"898/898 [==============================] - 401s 446ms/step - loss: 1.7023 - accuracy: 0.3076 - val_loss: 1.5260 - val_accuracy: 0.4032\n",
"Epoch 2/100\n",
"898/898 [==============================] - 411s 457ms/step - loss: 1.5013 - accuracy: 0.4121 - val_loss: 1.4014 - val_accuracy: 0.4609\n",
"Epoch 3/100\n",
"898/898 [==============================] - 413s 459ms/step - loss: 1.4200 - accuracy: 0.4492 - val_loss: 1.3395 - val_accuracy: 0.4806\n",
"Epoch 4/100\n",
"898/898 [==============================] - 414s 461ms/step - loss: 1.3729 - accuracy: 0.4686 - val_loss: 1.3020 - val_accuracy: 0.5032\n",
"Epoch 5/100\n",
"898/898 [==============================] - 414s 461ms/step - loss: 1.3317 - accuracy: 0.4889 - val_loss: 1.2610 - val_accuracy: 0.5230\n",
"Epoch 6/100\n",
"588/898 [==================>...........] - ETA: 2:20 - loss: 1.3050 - accuracy: 0.5041"
]
}
],
"source": [
"# Train the model with the training=True flag.\n",
"model.fit(X_train, y_train, epochs=100, validation_data=(X_test, y_test))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Plot the accuracy and loss\n",
"plt.plot(history.history['accuracy'])\n",
"plt.plot(history.history['val_accuracy'])\n",
"plt.title('Model Accuracy')\n",
"plt.ylabel('Accuracy')\n",
"plt.xlabel('Epoch')\n",
"plt.legend(['Training', 'Testing'], loc='upper left')\n",
"plt.savefig(\"initial_accuracy.png\")\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Save the model\n",
"model.save(\"initial_model.h5\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "machine-learning",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.16"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}
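For the final submission, the saved model will need to be wired to a live webcam. Below is a minimal sketch of how that could look with OpenCV; it assumes the initial_model.h5 file saved above and OpenCV's bundled Haar cascade for face detection, and it omits the per-dataset mean/std normalisation used in training, so it is an illustration rather than the finished pipeline.

import cv2
import numpy as np
import tensorflow as tf

LABELS = ['Angry', 'Disgust', 'Fear', 'Happy', 'Sad', 'Surprise', 'Neutral']

# Load the model trained in the notebook and OpenCV's bundled face detector.
model = tf.keras.models.load_model("initial_model.h5")
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

cap = cv2.VideoCapture(0)  # default webcam
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Detect faces, then classify each 48x48 grayscale crop.
    for (x, y, w, h) in cascade.detectMultiScale(gray, scaleFactor=1.3,
                                                 minNeighbors=5):
        face = cv2.resize(gray[y:y+h, x:x+w], (48, 48)).astype("float32") / 255.0
        pred = model.predict(face.reshape(1, 48, 48, 1), verbose=0)
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv2.putText(frame, LABELS[int(np.argmax(pred))], (x, y - 10),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 255, 0), 2)
    cv2.imshow("expression", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):  # press q to quit
        break
cap.release()
cv2.destroyAllWindows()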