Image classification using Machine Learning could have many useful applications for daily tasks and has the potential to make our lives easier.
Convolutional Neural Networks (CNNs) are one of the most common techniques for image classification, but they can sometimes behave like black boxes: we do not really know which characteristics of the images are triggering the responses. Thus, there is a need to improve our understanding of how the model operates. This greater understanding can lead to better model accuracy and also help avoid biases in the datasets.
Regarding biases in CNN models and their importance, there is a very interesting urban legend involving the US Army and a tank detection program that can be read here: https://pyimagesearch.com/2020/03/09/grad-cam-visualize-class-activation-maps-with-keras-tensorflow-and-deep-learning/
Grad-CAM can help us shine some light into the black box by showing which pixels of the image the model found important when picking the class.
This project will:
- Develop Machine Learning programs to perform image classification on cutlery
- Implement Grad-CAM programs to highlight the areas of the image that were the most influential for image classification
You can choose between training your own model or taking one that has already been trained on Edge Impulse following this link. If you take the Edge Impulse route you can go directly to the Running Grad-CAM step.
The Machine Learning model will output one of four possible categories:
- Background
- Fork
- Knife
- Spoon
Grad-CAM will help us by highlighting which pixels of the image the model found important when making the classification.
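As a quick reference, this is a sketch of the formulation from the original Grad-CAM paper by Selvaraju et al. (note that the code used later in this project takes the absolute value instead of the ReLU):

% Grad-CAM weights: average the gradients of the class score y^c
% over each feature map A^k of the last convolutional layer
\alpha_k^c = \frac{1}{Z} \sum_i \sum_j \frac{\partial y^c}{\partial A_{ij}^k}
% Heatmap: weighted sum of the feature maps, passed through a ReLU
L_{\text{Grad-CAM}}^c = \mathrm{ReLU}\Big(\sum_k \alpha_k^c A^k\Big)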
What You'll Need
- Optional: Edge Impulse account (edgeimpulse.com)
The first step in every ML project is to get the data, which in this case means images. The bigger and less biased the dataset is, the better the model will perform.
For this project I picked up my mobile phone and took 75 pictures of each of the 4 categories (300 images in total). In my case, each image has a size of 4000x2250 pixels and will later be shrunk to lighten the training load.
Once you have finished taking the pictures, download them to your computer.
It's always good practice to tag the pictures with their class name and number them to make the following steps easier, for example as in the sketch below.
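A minimal renaming sketch (not part of the original project files; it assumes every .jpg in the current folder belongs to the same class and that you set CLASS_NAME by hand before running):

import os
CLASS_NAME = "fork"  # hypothetical example, change per class
# Number the photos of one class, e.g. "fork 001.jpg", "fork 002.jpg", ...
jpgs = sorted(f for f in os.listdir() if f.lower().endswith(".jpg"))
for i, old_name in enumerate(jpgs, start=1):
    new_name = "{} {:03d}.jpg".format(CLASS_NAME, i)
    os.rename(old_name, new_name)
    print(old_name, "->", new_name)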
Sampling biases
It's important to be extremely careful while taking the samples, because a biased dataset could seriously affect model performance in its real-world application. In this case, there are a few biases that I considered acceptable:
- All the images have the same background
- I only used one sort of fork, knife and spoon
- Photographs were taken from the top
The model will perform poorly with a different background, another spoon design, or images taken from the side, for example.
Data manipulation
Once you have your images tagged on your computer, you can run JPG Image editor - Resizing images.py to reduce their size by ten times (in my case from 4000x2250 to 400x225) and convert them to grayscale.
Note 1: I'm assuming that the images are in JPG format
Note 2: Save and run this JPG Image editor - Resizing images.py file in a folder that only contains the images to be shrunk.
import os
from PIL import Image
from PIL import ImageOps

# Go through every .jpg file in the current folder
for filename in os.listdir():
    name = filename[:-4]
    ext = filename[-4:]
    if ext.lower() == ".jpg":
        orig = Image.open(filename)
        width, height = orig.size
        print(filename, width, height)
        newsize = (width // 10, height // 10)  # width and height reduced 10 times
        orig = orig.resize(newsize)
        raw = ImageOps.grayscale(orig)         # convert to grayscale (single channel)
        fn = "raw " + name + ext               # prefix the new file so the original stays unchanged
        raw.save(fn)
Once the code has run you should have the same images but at a smaller size. Note that a new file was created for each image while the originals remain unchanged.
We'll continue by grouping the images of the same class into different folders (see the sketch after this paragraph). This will make the following step easier to run and to understand.
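A short script like this one can do the grouping automatically (again a minimal sketch, not one of the project files; it assumes the resized filenames follow the pattern produced earlier, e.g. "raw fork 001.jpg", with the class name as the second word):

import os
import shutil

# Move each resized image into a folder named after its class.
for f in os.listdir():
    if f.lower().endswith(".jpg") and f.startswith("raw "):
        class_name = f.split(" ")[1]            # e.g. "fork"
        os.makedirs(class_name, exist_ok=True)  # create the class folder if needed
        shutil.move(f, os.path.join(class_name, f))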
Generating a dataset with image information
It's time to generate an Excel file where we will store the image location in one column and the image class name in a second column.
Images location and classes.py will go through the folders that are in the same path and record each image's location and its class according to its folder name. After it has gone through all the images it will generate an Excel file called Image Classification database.xlsx.
import os
import pandas as pd

rows = []  # Collect one row per image: file path and its category (taken from the folder name)
cd = os.getcwd()
for fd in os.listdir(cd):
    folderpath = os.path.join(cd, fd)
    # Only walk the class folders, skipping scripts, saved models and spreadsheets
    if os.path.isdir(folderpath):
        for fl in os.listdir(folderpath):
            rows.append({'filename': os.path.join(folderpath, fl), 'category': fd})

df = pd.DataFrame(rows).dropna()
print("Number of images = {}".format(len(df)))
print(df['category'].value_counts())  # Quick look at the dataframe
with pd.ExcelWriter('Image Classification database.xlsx', mode='w') as writer:
    df.to_excel(writer)
This excel file, along with the images, will be used to train the Machine Learning model.
In this step we are going to write our own code to train our own CNN model. As a reference I'm using the following code from Kaggle: https://www.kaggle.com/code/roy2004/cnn-waste-classification-from-jpg-op-3
Model summary:
- Goes through a dataset that stores the path to each image and its category
- Definition and training of the model
- Comparison of the predicted classes with the actual ones
- Saving the model in a .h5 file that we will use later on for running Grad-CAM
import os
from random import randint
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib
from PIL import Image
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator, load_img
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Sequential # Importing models from Keras
from tensorflow.keras.callbacks import EarlyStopping, LearningRateScheduler
from tensorflow.keras.layers import Dense, InputLayer, Dropout, Conv1D, Conv2D, Flatten, Reshape, MaxPooling1D, MaxPooling2D, BatchNormalization, TimeDistributed
from tensorflow.keras.optimizers import Adam
df = pd.read_excel('Image Classification database.xlsx', index_col=0) # Import a table with location to specific images and its category
df['category'].value_counts().plot.bar()
print (df['category'].value_counts())
df_train=df.sample(frac=0.8,replace=False) # Randomly sample 80% of data from dataframe for training. Replace = False to prevent repeat sampling
df_valid=df.drop(df_train.index.values) # The rest 20% is the validation images
df_train['category'].value_counts().plot.bar()
print (df_train['category'].value_counts())
df_valid['category'].value_counts().plot.bar()
print(df_valid['category'].value_counts())
#Image.open(random.choice(df_train['filename'])).show()
FAST_RUN = False # True if you want to quickly test your model (training for 3 epochs). False for a full train (100 epochs).
epochs = 3 if FAST_RUN else 100
IMAGE_WIDTH = 400 # Enter the width and height of images
IMAGE_HEIGHT = 225
IMAGE_SIZE = (IMAGE_WIDTH, IMAGE_HEIGHT)
IMAGE_CHANNELS = 1 # 3 if RGB. 1 if Grayscale
batch_size = 32
d = 0.1 # Dropout rate
# Use when training on pre-trained weights
START_EPOCH = 0 # if fresh train, enter 0
Transfer = False
Pretrained_Link = os.getcwd() + "/model.h5"
#rdm = randint(0,len(df_train['filename']))
#sample = df_train['filename'].iloc[rdm]
#pic = Image.open(sample)
#pic.show()
classes_values = ["background", "fork", "knife", "spoon" ]
classes = len(classes_values)
# Create Keras Sequential Model
model = Sequential()
model.add(Conv2D(32, kernel_size=3, activation='relu', kernel_constraint=tf.keras.constraints.MaxNorm(1), padding='same'))
model.add(MaxPooling2D(pool_size=2, strides=2, padding='same'))
model.add(Conv2D(16, kernel_size=3, activation='relu', kernel_constraint=tf.keras.constraints.MaxNorm(1), padding='same', name = "last_conv2d"))
model.add(MaxPooling2D(pool_size=2, strides=2, padding='same'))
model.add(Flatten())
model.add(Dropout(0.25))
model.add(Dense(classes, activation='softmax', name='y_pred'))
if Transfer:
    model.load_weights(Pretrained_Link)
opt = Adam(learning_rate=0.0005, beta_1=0.9, beta_2=0.999)
model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])
#model.summary()
earlystop = EarlyStopping(patience=10,restore_best_weights=True)
LR_START = .001 # Learning rate (LR) schedule for TPU, GPU and CPU
LR_MIN = 1e-6
LR_EXP_DECAY = .94
# Define a Learning Rate function on epoch that will decrease exponentially.
def lrfn(epoch):
    lr = (LR_START - LR_MIN) * LR_EXP_DECAY ** (epoch + START_EPOCH) + LR_MIN
    return lr
lr_callback = LearningRateScheduler(lrfn, verbose=True)
rng = [i for i in range(START_EPOCH, epochs + START_EPOCH)] # Visualize the change in learning rate
y = [lrfn(x) for x in rng]
#plt.plot(rng, y)
#plt.show()
print("Learning rate schedule: {:.3g} to {:.3g}".format(y[0], y[-1]))
total_train = df_train.shape[0] # Total number of images for training
total_validate = df_valid.shape[0] # Total number of images for validation
print("Training: {}, Validation: {}".format(total_train,total_validate))
train_datagen = ImageDataGenerator(rotation_range=15, rescale=1./255, shear_range=0.1, horizontal_flip=True, vertical_flip=True)
train_generator = train_datagen.flow_from_dataframe(df_train, "", x_col='filename', y_col='category', target_size=IMAGE_SIZE, class_mode='categorical', batch_size=batch_size, color_mode = "grayscale") # According to the dataframe, pull images one by one from image directory
validation_datagen = ImageDataGenerator(rescale=1./255) # Validation doesn't need much data Augmentation
validation_generator = validation_datagen.flow_from_dataframe(df_valid, "", x_col='filename', y_col='category', target_size=IMAGE_SIZE, class_mode='categorical', batch_size=batch_size, color_mode = "grayscale") # According to the dataframe, pull images one by one from image directory
history = model.fit(train_generator, batch_size=batch_size, epochs=epochs, validation_data=validation_generator, validation_steps=total_validate//batch_size, steps_per_epoch=total_train//batch_size, callbacks=[earlystop, lr_callback])
model.save("model.h5") # Save Model in h5 format (old one). New models are saved as SaveModel
#model.save("model_raw") # Save Model
test_df = df.sample(frac = 0.3) # Randomly select 30% of data (sampled from the full dataframe, so it can include training images)
nb_samples = test_df.shape[0] # Number of testing samples
test_gen = ImageDataGenerator(rescale=1./255) # Test generator in the same fashion of the train/validation generators
test_generator = test_gen.flow_from_dataframe(test_df, "", x_col='filename', y_col='category', class_mode=None, target_size=IMAGE_SIZE, batch_size=batch_size, shuffle=False, color_mode = "grayscale")
predict = model.predict(test_generator, steps=int(np.ceil(nb_samples/batch_size))) # predict_generator is deprecated; model.predict works with generators
test_df['pred_category'] = np.argmax(predict, axis=-1)
label_map = dict((v,k) for k,v in train_generator.class_indices.items())
test_df['pred_category'] = test_df['pred_category'].replace(label_map)
test_df["background"] = predict[:, [0]]
test_df["fork"] = predict[:, [1]]
test_df["knife"] = predict[:, [2]]
test_df["spoon"] = predict[:, [3]]
submission_df = test_df.copy()
with pd.ExcelWriter('Summary.xlsx', mode='w') as writer:
    submission_df.to_excel(writer)
Please note that IMAGE_WIDTH = 400, IMAGE_HEIGHT = 225 and IMAGE_CHANNELS = 1 match my image specifications (IMAGE_CHANNELS is 1 because the images are grayscale; if you are using colour images, IMAGE_CHANNELS should be 3). If you are unsure about your own values, a quick check is shown below.
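A small sketch for that check ("raw fork 001.jpg" is just a hypothetical filename, use one of your own resized files):

from PIL import Image

# Confirm the values for IMAGE_WIDTH, IMAGE_HEIGHT and IMAGE_CHANNELS
img = Image.open("raw fork 001.jpg")
print(img.size)   # (width, height), e.g. (400, 225)
print(img.mode)   # "L" = grayscale (1 channel), "RGB" = colour (3 channels)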
Once the model has finished training (this might take a while depending on the size of your image dataset, image dimensions, colour channels, batch size, learning rate, etc.) it should create the .h5 file that we need for running Grad-CAM.
In this case, the CNN model's performance is really poor (accuracy is around 50%). This might be explained by the small dataset and the image size, which don't allow the model to quickly recognize patterns.
It's really interesting to mention that in a previous project where I used Edge Impulse Transfer Learning on the exact same image dataset, the model's accuracy was above 90%. Thus, transfer learning has a huge positive impact on model performance; a rough sketch of what such a model looks like in Keras is shown below.
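This is only an illustration of the idea, not the Edge Impulse implementation; the MobileNetV2 base, the 96x96 RGB input size and the head layers are all assumptions:

import tensorflow as tf
from tensorflow.keras import layers, models

# Minimal transfer-learning sketch: a MobileNetV2 base pre-trained on ImageNet,
# frozen, with a small classification head on top.
# Note: MobileNetV2 expects 3-channel (RGB) inputs, unlike the grayscale pipeline above.
base = tf.keras.applications.MobileNetV2(input_shape=(96, 96, 3),  # assumed input size
                                         include_top=False,
                                         weights='imagenet')
base.trainable = False                       # keep the pre-trained features frozen
model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dropout(0.1),
    layers.Dense(4, activation='softmax')    # background, fork, knife, spoon
])
model.compile(optimizer=tf.keras.optimizers.Adam(0.0005),
              loss='categorical_crossentropy', metrics=['accuracy'])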
Finally, we are ready to run Grad-CAM to understand how the CNN is reasoning.
As a guide I'll be using the following code that I took from GitHub. It's important to mention that this Grad-CAM code only works for CNN projects and does not currently work for Transfer Learning projects created on Edge Impulse. That's why we are running Grad-CAM on this lower-accuracy project instead of on the better one.
import PIL
import cv2
import numpy as np
import tensorflow as tf
import os
from tensorflow import keras
from keras import activations, layers, models, backend
from skimage.transform import resize
import matplotlib.pyplot as plt
LABELS = ["background", "fork", "knife", "spoon"] # Labels
IMAGE_PATH = r"your\image\path" # Change this based on your image sample
TRUE_LABEL = "yourimageclass"
# If you wrote your own model, these should match your image size
# If you are importing from Edge Impulse go to Image resolution (Edge Impulse project > Impulse design > Image data)
WIDTH = 400
HEIGHT = 225
true_idx = LABELS.index(TRUE_LABEL) # Find index of true label in label list
model = tf.keras.models.load_model("model.h5") # Load model file
model.summary()
img = PIL.Image.open(IMAGE_PATH) # Load image
img = img.convert('L') # Convert the image to grayscale
img = np.asarray(img) # Convert the image to a Numpy array
img = resize(img, (WIDTH, HEIGHT), anti_aliasing=True) # Resize the image and normalize the values (to be between 0.0 and 1.0); the channel dimension is added below
print("Actual label:", TRUE_LABEL) # Show the ground-truth label
plt.imshow(img, cmap='gray', vmin=0.0, vmax=1.0) # Display image (make sure we're looking at the right thing)
plt.show()
# The Keras model expects images in a 4D array with dimensions (sample, height, width, channel)
img_0 = img.reshape(img.shape + (1,)) # Add extra dimension to the image (placeholder for color channels)
images = np.array([img_0]) # Keras expects more than one image (in Numpy array), so convert image(s) to such array
print(images.shape) # Print dimensions of inference input
preds = model.predict(images) # Inference
# Print out predictions
for i, pred in enumerate(preds[0]):
    print(LABELS[i] + ": " + str(pred))
model.layers[-1].activation = None # For either algorithm, we need to remove the Softmax activation function of the last layer
# Based on: https://github.com/keisen/tf-keras-vis/blob/master/tf_keras_vis/saliency.py
def get_saliency_map(img_array, model, class_idx):
    img_tensor = tf.convert_to_tensor(img_array) # Gradient calculation requires input to be a tensor
    # Do a forward pass of model with image and track the computations on the "tape"
    with tf.GradientTape(watch_accessed_variables=False, persistent=True) as tape:
        tape.watch(img_tensor) # Compute (non-softmax) outputs of model with given image
        outputs = model(img_tensor, training=False)
        score = outputs[:, class_idx] # Get score (predicted value) of the requested class
    grads = tape.gradient(score, img_tensor) # Compute gradients of the score with respect to the input image
    grads_disp = [np.max(g, axis=-1) for g in grads] # Finds max value in each color channel of the gradient (should be grayscale for this demo)
    grad_disp = grads_disp[0] # There should be only one gradient heatmap for this demo
    grad_disp = tf.abs(grad_disp) # The absolute value of the gradient shows the effect of change at each pixel. Source: https://christophm.github.io/interpretable-ml-book/pixel-attribution.html
    heatmap_min = np.min(grad_disp) # Normalize to between 0 and 1 (use epsilon, a very small float, to prevent divide-by-zero error)
    heatmap_max = np.max(grad_disp)
    heatmap = (grad_disp - heatmap_min) / (heatmap_max - heatmap_min + tf.keras.backend.epsilon())
    return heatmap.numpy()
saliency_map = get_saliency_map(images, model, true_idx) # Generate saliency map for the given input image
plt.imshow(saliency_map, cmap='magma', vmin=0.0, vmax=1.0) # Draw map
plt.show()
idx = 0 # Overlay the saliency map on top of the original input image
ax = plt.subplot()
ax.imshow(images[idx,:,:,0], cmap='gray', vmin=0.0, vmax=1.0)
ax.imshow(saliency_map, cmap='magma', alpha=0.25)
plt.show()
### This function comes from https://keras.io/examples/vision/grad_cam/
def make_gradcam_heatmap(img_array, model, last_conv_layer_name, pred_index=None):
    # First, we create a model that maps the input image to the activations of the last conv layer as well as the output predictions
    grad_model = tf.keras.models.Model([model.inputs], [model.get_layer(last_conv_layer_name).output, model.output])
    # Then, we compute the gradient of the top predicted class for our input image with respect to the activations of the last conv layer
    with tf.GradientTape() as tape:
        last_conv_layer_output, preds = grad_model(img_array)
        if pred_index is None:
            pred_index = tf.argmax(preds[0])
        class_channel = preds[:, pred_index]
    grads = tape.gradient(class_channel, last_conv_layer_output) # This is the gradient of the output neuron (top predicted or chosen) with regard to the output feature map of the last conv layer
    pooled_grads = tf.reduce_mean(grads, axis=(0, 1, 2)) # This is a vector where each entry is the mean intensity of the gradient over a specific feature map channel
    # We multiply each channel in the feature map array by "how important this channel is" with regard to the top predicted class, then sum all the channels to obtain the heatmap class activation
    last_conv_layer_output = last_conv_layer_output[0]
    heatmap = last_conv_layer_output @ pooled_grads[..., tf.newaxis]
    heatmap = tf.squeeze(heatmap)
    heatmap = tf.abs(heatmap) # The absolute value of the gradient shows the effect of change at each pixel. Source: https://christophm.github.io/interpretable-ml-book/pixel-attribution.html
    heatmap_min = np.min(heatmap) # Normalize to between 0 and 1 (use epsilon, a very small float, to prevent divide-by-zero error)
    heatmap_max = np.max(heatmap)
    heatmap = (heatmap - heatmap_min) / (heatmap_max - heatmap_min + tf.keras.backend.epsilon())
    return heatmap.numpy()
# We need to tell Grad-CAM where to find the last convolution layer
#for layer in model.layers:
# print(layer, layer.name) # Print out the layers in the model
last_conv_layer = None # Go backwards through the model to find the last convolution layer
for layer in reversed(model.layers):
    if 'conv' in layer.name:
        last_conv_layer = layer.name
        break
# Give a warning if the last convolution layer could not be found
if last_conv_layer is not None:
    print("Last convolution layer found:", last_conv_layer)
else:
    print("ERROR: Last convolution layer could not be found. Do not continue.")
heatmap = make_gradcam_heatmap(images, model, last_conv_layer) # Generate class activation heatmap
plt.imshow(heatmap, cmap='magma', vmin=0.0, vmax=1.0) # Draw map
plt.show()
# Overlay the saliency map on top of the original input image
big_heatmap = cv2.resize(heatmap, dsize=(HEIGHT, WIDTH), interpolation=cv2.INTER_CUBIC) # The heatmap is a lot smaller than the original image, so we upsample it
idx = 0 # Draw original image with heatmap superimposed over it
ax = plt.subplot()
ax.imshow(images[idx,:,:,0], cmap='gray', vmin=0.0, vmax=1.0)
ax.imshow(big_heatmap, cmap='magma', alpha=0.25)
plt.show()
You'll need to modify IMAGE_PATH = r"your\image\path" to the path of the image you are using for testing, and TRUE_LABEL = "yourimageclass" to that image's class.
The same code should also work for the Edge Impulse .h5 file that you can download from the project's Dashboard.
Keep in mind that if you are using an Edge Impulse project you might need to change the model's file name and the image size in Grad-CAM for CNN model - h5.py; one way to check what the downloaded model expects is sketched below.
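A small sketch for that check (it assumes the file is named model.h5 and sits in the working directory):

import tensorflow as tf

# Inspect a downloaded .h5 to see the input size it expects before setting WIDTH and HEIGHT
model = tf.keras.models.load_model("model.h5")
print(model.input_shape)  # e.g. (None, rows, cols, channels)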
As we can see, the model's poor accuracy also shows in how evenly the predicted percentages are spread between the classes. Just one of the images has a confidence above 90%.
In some images we can see that Grad-CAM gives importance to the shadow of my hands: the model fails to identify the object and instead relates the shadow to the class. This is a good example of how a biased dataset can lead to inaccurate model predictions in the real world.
With these samples and insights we now have a much deeper understanding of how the model is making its decisions, and that allows us to re-train it.
If we re-train the model with new images that take this learning into account, we should see an improvement in accuracy, because we will have helped it to better separate the classes and to better identify the features that make each utensil what it is.
Summary
We've been able to develop a CNN Machine Learning model for image classification.
Then we took its .h5 file and used Grad-CAM to shine some light on how the model makes its decisions.
This better understanding of the model's behaviour can later help us improve its performance, for example by taking more pictures for the cases where it made wrong classifications.