The introduction of the ESP32-CAM has opened up possibilities for adding a camera to existing IoT projects. After exploring embedded ML with Edge Impulse on the RP2040 in my previous project, I was intrigued by the potential for object detection on the ESP32-CAM. Given that the ESP32-S features 520 KB of RAM and a CPU frequency of up to 240 MHz, it seemed promising that a simple object detection model could be deployed on the board.
In this project, the ESP32-CAM is set up to recognize a human hand and a fist against a white background. If a hand is detected, the LED brightens; if a fist is detected, the LED dims.
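The hand/fist behavior boils down to a small piece of decision logic. The sketch below is a hypothetical helper (not the actual project code) that maps a detected label to a new 8-bit PWM duty cycle for the LED; on the real board the returned value would be passed to the ESP32's LEDC PWM peripheral.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: map the detected label to a new LED duty cycle.
// "hand" steps the brightness up, "fist" steps it down, and any other
// label leaves it unchanged. Duty is an 8-bit PWM value (0-255).
int nextLedDuty(const std::string& label, int duty, int step = 32) {
    if (label == "hand") return std::min(255, duty + step);
    if (label == "fist") return std::max(0, duty - step);
    return duty;
}
```

Clamping to the 0-255 range keeps repeated detections from wrapping the duty cycle around.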
Steps involved
The overall process moves from gathering the data -> training the model -> deploying the model, using a mix of offline methods and online tools. The initial and final steps are done offline, while the model training is handled by Edge Impulse.
A dataset can be provided to Edge Impulse either by connecting directly from a smartphone (the easiest option) or by uploading files from the local PC. In this case, I chose the latter, capturing the samples with the ESP32-CAM itself, since the same camera sensor will be used for inference.
Acquisition from ESP CAM
Although gathering sample images from the ESP32-CAM may seem tedious, I created a sketch that hosts a webpage allowing the user to capture photos. The user can decide whether an image is good enough to be saved to the SD card attached to the board. The images can later be retrieved from the SD card and uploaded to Edge Impulse.
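One small detail such a sketch has to handle is giving each saved photo a unique path in the SD card's root directory. The helper below is a guess at one simple scheme (sequentially numbered JPEG names); the actual sketch may name its files differently.

```cpp
#include <cstdio>
#include <string>

// Hypothetical naming scheme for photos saved to the SD card root:
// /img_000.jpg, /img_001.jpg, ... The index would be incremented
// each time the user chooses to keep a captured frame.
std::string imagePath(int index) {
    char buf[16];
    std::snprintf(buf, sizeof(buf), "/img_%03d.jpg", index);
    return std::string(buf);
}
```

Zero-padding the index keeps the files sorted in capture order when listed on the PC.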
Working on this task helped me better understand the file systems and memory architecture of the ESP32-S, and taught me how to transfer files from the flash memory to the SD card.
Steps to take Image Samples
After uploading the sketch, check the local IP address in the Serial Monitor once the connection is established. Entering this IP in a web browser brings up a basic web interface for capturing, viewing, and saving images.
Loading Images to Edge Impulse
This step is fairly straightforward: the files are manually uploaded to Edge Impulse from the local computer.
Step 1: Select the "Upload data" button to upload files manually.
Step 2: Connect the SD card to the computer and upload the files. The data acquisition sketch saves the images in the root directory, so uploading them as a folder is not possible.
Once the samples are collected, the images must be labeled so the model knows what to categorize.
Labels are created by drawing a bounding box around the desired region.
The labels used in this project are "hand" and "fist".
I collected about 40 samples for each label and kept the training/test split at 80/20. The samples in the test set are used later to validate the model after training.
- Creating the Impulse
On the "create impulse" tab of Edge Impulse, I set the image width and height as small as possible, i.e. 48 x 48 pixels. The main reason is that a lower resolution needs less processing time and gives a faster response, although accuracy will be affected. Additionally, I added the existing image processing block and the "object detection (images)" learning block.
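Since the camera delivers frames larger than 48 x 48, each frame has to be scaled down before inference. The function below is an illustrative nearest-neighbour downscale of a grayscale buffer to the model's input size; it is a sketch of the kind of resizing the Edge Impulse SDK performs internally, not the SDK's actual code.

```cpp
#include <cstdint>
#include <vector>

// Nearest-neighbour resize of a w x h grayscale image down to the
// 48 x 48 input resolution chosen for the impulse. Each output pixel
// samples the nearest source pixel; fast, at the cost of some detail.
std::vector<uint8_t> resizeTo48(const std::vector<uint8_t>& src, int w, int h) {
    const int W = 48, H = 48;
    std::vector<uint8_t> dst(W * H);
    for (int y = 0; y < H; ++y)
        for (int x = 0; x < W; ++x)
            dst[y * W + x] = src[(y * h / H) * w + (x * w / W)];
    return dst;
}
```

For a 96 x 96 source this simply keeps every second pixel in each direction.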
- Extracting Image features
This is where features are extracted for each label. It is necessary to set the color depth to "grayscale" to minimize the processing load on the ESP32.
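Going to grayscale cuts the input from three channels to one. As an illustration of that reduction, the helper below converts an RGB888 pixel to 8-bit grayscale using the standard ITU-R BT.601 luma weights; Edge Impulse's grayscale color depth performs an equivalent single-channel reduction, though its exact coefficients are not shown here.

```cpp
#include <cstdint>

// Convert one RGB888 pixel to 8-bit grayscale with the classic
// BT.601 luma weights (0.299 R + 0.587 G + 0.114 B), done in
// integer arithmetic to suit a microcontroller.
uint8_t rgbToGray(uint8_t r, uint8_t g, uint8_t b) {
    return uint8_t((299u * r + 587u * g + 114u * b) / 1000u);
}
```

Dropping two of the three channels also shrinks the feature buffer by two-thirds, which matters within the ESP32's 520 KB of RAM.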
Upon generating the features, this is the spectrum I obtained. The farther apart the features of each label are, the easier the classification. In this case, some features sit close together in the middle, which can cause detection errors.
- Configuring the Neural Network
This is the workspace where we feed the parameters to the neural network. The number of training cycles refers to the number of epochs used to train the network; more cycles take more time to train the model.
The learning rate defines how fast the neural network adapts. This value should be chosen carefully, as a rate that is too high can make training unstable and the results ambiguous.
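The effect of the learning rate can be seen on a toy problem. The sketch below runs plain gradient descent on f(w) = w², which stands in for a real loss surface: a modest rate converges smoothly toward the minimum, while a rate that is too large (greater than 1 for this function) makes each update overshoot and diverge.

```cpp
#include <cmath>

// Toy illustration of the learning rate: gradient descent on
// f(w) = w^2, whose gradient is 2w. Each epoch takes one step
// of size lr against the gradient and returns the final weight.
double descend(double w, double lr, int epochs) {
    for (int i = 0; i < epochs; ++i)
        w -= lr * 2.0 * w;
    return w;
}
```

With lr = 0.1 the weight shrinks by a factor of 0.8 per epoch; with lr = 1.5 it doubles in magnitude each epoch, which is the instability the warning above refers to.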
For the ESP32-CAM, the model is set to FOMO (Faster Objects, More Objects) MobileNetV2 0.1 because it is one of the optimized models that the ESP32 can support.
To check the performance of the model, certain samples are used as a validation set. Among the datasets used to train the model, roughly 20% of the samples were reserved for validation and obtaining the F1 score.
I experimented with these parameters to achieve maximum performance. Upon initially setting the training cycles to 30 and the learning rate to 0.005, I was getting an F1 score of 86%. However, upon changing these to 60 and 0.001, the score jumped by 5%.
The log also mentioned that a TensorFlow Lite model was created.
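At inference time, a FOMO model reports a list of detected objects with their grid coordinates and confidence scores. The snippet below uses a simplified stand-in for that result list (the real Edge Impulse SDK exposes a similar array in its result struct, with different type names) and keeps only detections above a confidence threshold, which is the usual first step before acting on the output.

```cpp
#include <string>
#include <vector>

// Simplified stand-in for the bounding boxes a FOMO model returns.
// The actual Edge Impulse SDK types differ; this only models the idea.
struct Detection {
    std::string label;   // e.g. "hand" or "fist"
    float confidence;    // 0.0 - 1.0
    int x, y, w, h;      // box in model-input (48x48) coordinates
};

// Keep only detections at or above the confidence threshold.
std::vector<Detection> filterDetections(const std::vector<Detection>& in,
                                        float threshold = 0.5f) {
    std::vector<Detection> out;
    for (const auto& d : in)
        if (d.confidence >= threshold) out.push_back(d);
    return out;
}
```

A surviving "hand" or "fist" detection is what would drive the LED brightness change described earlier.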
4. Model Testing
During data acquisition, the training and test sets were divided into approximately 80% and 20%, respectively. In this step, the model is tested on roughly 24 samples of hands and fists that it has not seen before. This gives an accurate picture of the model's performance on real-world data.
Out of the 24 test samples, 4 failed, giving an overall accuracy of 83%. One way to improve the accuracy would be to collect additional training samples. The training cycles and learning rate can also be tweaked for better performance.
5. Deployment
Edge Impulse is quite flexible for model deployment. Options are available to export the model as firmware, as a library, for use in a browser, and more.
Since the ESP32 is already supported in the Arduino ecosystem, I used the Arduino library option for this project. The library must be downloaded and installed in the Arduino IDE.
As the ESP32-CAM does not include a built-in programmer, an external one has to be used. I used an FTDI programmer (USB to TTL) to flash the code. The IO0 pin must be connected to GND to put the ESP32-CAM into programming mode.
Deploying an ML model on the ESP32-CAM was successful, even though I focused solely on classifying the images. Upon reviewing the code, I realized that localization is also possible, as the model returns the coordinates of the recognized objects.
I also played around with the ESP32's CPU performance by varying the clock frequency. At a lower frequency, i.e. 80 MHz, the inference time was around 300 ms. Setting the frequency to 240 MHz brought a significant reduction in inference time, to around 130 ms.
The initial compilation gave me an error related to the ESP32 firmware code.
c:\users\....\packages\esp32\tools\xtensa-esp32-elf-gcc\gcc8_4_0-esp-2021r2-patch3\xtensa-esp32-elf\include\c++\8.4.0\system_error:39:10: fatal error: bits/error_constants.h: No such file or directory
#include <bits/error_constants.h>
I noticed that the header file mentioned above was missing from the ESP32 Arduino core directory. I searched several forums until I found a solution on GitHub (link here). Adding the header file with the contents given at the link solved the problem.