The Chaihuo Makerspace showcases a wide array of innovative products and projects. However, the absence of front desk personnel has resulted in a lack of personalized introductions and guidance for incoming visitors. To address this issue, I have developed an intelligent voice recognition system that serves as an interactive tour guide. This system is designed to provide detailed explanations of the products and projects on display, as well as to offer navigational assistance to guests exploring the Chaihuo Makerspace.
The core of the system is the XIAO ESP32S3 microcontroller, which is integrated with the Edge Impulse platform to facilitate advanced speech recognition capabilities. When visitors issue voice commands, the system promptly recognizes them and executes the appropriate responses. By implementing this intelligent voice guide, the Chaihuo Makerspace can significantly enhance the visitor experience.
Introduction to Edge Impulse
Edge Impulse is an innovative platform tailored for the rapid development of machine learning models for edge devices and embedded systems. It arms developers with a robust toolkit and services that simplify the creation, training, and deployment of these models, all without necessitating an in-depth knowledge of machine learning theory.
The platform is equipped with user-friendly data collection utilities that streamline the process of gathering data from diverse sensors and devices. This data is then effortlessly uploaded to the Edge Impulse platform for efficient management and labeling. Advanced preprocessing and feature extraction algorithms are also at hand, automatically converting raw data into meaningful features that are essential for training accurate models.
Once a model is fully trained, Edge Impulse simplifies deployment to a spectrum of edge devices and embedded systems, including popular options like Arduino, Raspberry Pi, and various microcontrollers. Deployment methods are flexible, with options to generate optimized C++ code, binaries, or tailored SDKs.
One of Edge Impulse's standout features is its accessibility. The platform's intuitive graphical interface and guided workflow empower even novices in machine learning to quickly achieve proficiency and craft high-caliber models. A wealth of tutorials, sample projects, and a supportive community further facilitate learning and knowledge exchange. Edge Impulse's seamless integration with numerous hardware platforms and sensor ecosystems also accelerates the deployment of machine learning capabilities on edge devices.
In summary, Edge Impulse is a formidable platform that demolishes the entry barriers to machine learning, enabling developers of all levels to efficiently create and deploy sophisticated intelligent applications on edge devices. It stands as a versatile ally for both novices and seasoned professionals aiming to forge ahead in the realms of IoT and embedded intelligence.
XIAO ESP32S3 Sense Introduction
Features:
Powerful MCU board: integrated ESP32S3 32-bit dual-core Xtensa processor running at up to 240 MHz, multiple development ports, Arduino / MicroPython support
Advanced features: detachable OV2640 camera sensor with 1600×1200 resolution, compatible with the OV5640 camera sensor, plus an additional onboard digital microphone
Large memory for more possibilities: 8MB PSRAM and 8MB flash on board, plus an SD card slot supporting up to 32GB of external FAT storage
Outstanding RF performance: supports 2.4GHz Wi-Fi and BLE dual wireless communication, with 100m+ range when connected to the U.FL antenna
Thumb-sized compact design: 21 × 17.5mm in XIAO's classic form factor, for space-constrained projects such as wearables
Capture (local) audio data
Step 1. Save recorded sound samples as .wav audio files to a microSD card.
To save .wav audio files with the onboard SD card reader, we first need to enable the XIAO's PSRAM (in the Arduino IDE, set Tools > PSRAM to "OPI PSRAM").
Then compile and upload Sketch 1 to the XIAO ESP32S3.
After uploading the code to the XIAO, collect samples of the keywords (hello and others). You can also capture noise and other words. The Serial Monitor will prompt you for the label to be recorded.
Send the label (for example, hello). The program will wait for another command: rec.
The program will record a new sample every time the command rec is sent. The files are saved as hello.1.wav, hello.2.wav, hello.3.wav, and so on, until a new label (for example, stop) is sent. From then on, sending rec for each new sample saves files as stop.1.wav, stop.2.wav, stop.3.wav, and so on.
Ultimately, we will get the saved files on the SD card.
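For reference, here is a condensed sketch of what Sketch 1 does, written against the ESP_I2S and SD APIs shipped with recent arduino-esp32 cores. The microphone pins (42/41), the SD chip-select pin (21), and the recordWAV() call are assumptions taken from the XIAO ESP32S3 Sense documentation, so treat this as a sketch rather than a drop-in replacement for Sketch 1.

```cpp
// Condensed voice-sample recorder: send a label (e.g. "hello"), then send
// "rec" once per sample; each sample is saved as <label>.<n>.wav on the SD card.
#include "ESP_I2S.h"
#include "FS.h"
#include "SD.h"

I2SClass I2S;
String label = "";
int counter = 0;

void setup() {
  Serial.begin(115200);
  while (!Serial);

  // PDM microphone pins on the XIAO ESP32S3 Sense (assumed: clock = 42, data = 41)
  I2S.setPinsPdmRx(42, 41);
  if (!I2S.begin(I2S_MODE_PDM_RX, 16000, I2S_DATA_BIT_WIDTH_16BIT, I2S_SLOT_MODE_MONO)) {
    Serial.println("Failed to initialize I2S");
    while (1);
  }
  if (!SD.begin(21)) {  // SD chip-select pin on the XIAO ESP32S3 Sense expansion board
    Serial.println("Failed to mount the SD card");
    while (1);
  }
  Serial.println("Send a label (e.g. hello), then send 'rec' for each sample");
}

void loop() {
  if (!Serial.available()) return;
  String cmd = Serial.readStringUntil('\n');
  cmd.trim();

  if (cmd == "rec") {
    if (label == "") {
      Serial.println("Send a label first");
      return;
    }
    size_t wav_size = 0;
    // Record 10 seconds of audio and get back a complete WAV file in memory
    uint8_t *wav = I2S.recordWAV(10, &wav_size);
    String path = "/" + label + "." + String(++counter) + ".wav";
    File f = SD.open(path.c_str(), FILE_WRITE);
    if (f && f.write(wav, wav_size) == wav_size) {
      Serial.println("Saved " + path);
    } else {
      Serial.println("Write failed for " + path);
    }
    if (f) f.close();
    free(wav);
  } else if (cmd.length() > 0) {
    label = cmd;    // a new label restarts the sample counter
    counter = 0;
    Serial.println("Label set to '" + label + "'. Send 'rec' to record a sample");
  }
}
```

Open the Serial Monitor at 115200 baud, type a label, then send rec for each sample; the resulting label.N.wav files appear in the root of the SD card.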
Training data acquisition
Step 2. Uploading collected sound data
When the raw dataset has been defined and collected, we create a new project in Edge Impulse. Once the project is created, select the Upload Existing Data tool in the Data Acquisition section and choose the files to be uploaded.
Upload them to the Studio (you can have the data split automatically between train and test). Repeat for all classes and all raw data.
All data in the dataset must be 1 second long, but the samples recorded in the previous section are 10 seconds long and must be split into 1-second samples to be compatible. Click the three dots next to the sample name and select Split sample.
Once inside the tool, split the data into 1-second segments, adding or removing segments as necessary.
This procedure should be repeated for all samples.
Step 3. Creating Impulse (Pre-Process / Model definition)
An impulse takes raw data, uses signal processing to extract features, and then uses a learning block to classify new data.
First, we take the data points with a 1-second window, augmenting the data by sliding that window every 500 ms. Note that the Zero-pad data option is set; this is important to fill samples shorter than 1 second with zeros (in some cases, I reduced the 1000 ms window in the split tool to avoid noise and spikes).
Each 1-second audio sample is pre-processed and converted into an image (for example, 13 x 49 x 1). We will use MFCC, which extracts features from audio signals using Mel-Frequency Cepstral Coefficients and works very well for human voice.
Next, we select Keras for classification, which builds our model from scratch, performing image classification with a Convolutional Neural Network.
Step 4. Pre-Processing (MFCC)
The next step is to create the images to be used for training in the next phase. We can keep the default parameter values or take advantage of the DSP Autotune parameters option, which we will do.
Step 5. Model Design and Training
We will use a Convolutional Neural Network (CNN) model. The basic architecture consists of two blocks of Conv1D + MaxPooling (with 8 and 16 filters, respectively) and a 0.25 dropout. The last layer, after flattening, has four neurons, one for each class.
For hyperparameters, we will use a learning rate of 0.005 and train the model for 100 epochs. We will also include data augmentation, such as adding some noise. The result looks OK.
Deploying to XIAO ESP32S3 Sense
Step 6. Deploying to XIAO ESP32S3 Sense
1. After training is complete, click the Deployment option on the left-hand side.
2. Click the search text box and select Arduino library from the pop-up menu.
3. Click the Build button at the bottom to generate and download the library file.
4. After a short wait, a window will pop up indicating that the Arduino library has been generated, and a .zip library file will be downloaded automatically.
5. Add this library to the Arduino IDE (Sketch > Include Library > Add .ZIP Library...).
We can use Sketch 2 to test the model.
The idea of this sketch is that the onboard LED turns ON whenever the keyword HELLO is detected. In the same way, instead of turning on an LED, this could act as a trigger for an external device, as we saw in the introduction.
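As a rough guide to what Sketch 2 contains, the snippet below shows the keyword-trigger logic built on the exported Edge Impulse Arduino library. The header name, the 0.8 confidence threshold, and the LED handling are assumptions, and the microphone-capture step is left to the esp32 microphone example bundled with the exported library.

```cpp
// The header name below is a placeholder: the exported Arduino library is
// named after your Edge Impulse project (e.g. <project-name>_inferencing.h).
#include <your_project_inferencing.h>

#define LED_PIN LED_BUILTIN   // on the XIAO ESP32S3 the user LED is active-low

static int16_t sample_buffer[EI_CLASSIFIER_RAW_SAMPLE_COUNT]; // 1 s of 16 kHz audio

// Callback the classifier uses to pull audio data out of our buffer
static int get_audio_data(size_t offset, size_t length, float *out_ptr) {
  numpy::int16_to_float(&sample_buffer[offset], out_ptr, length);
  return 0;
}

void setup() {
  Serial.begin(115200);
  pinMode(LED_PIN, OUTPUT);
  digitalWrite(LED_PIN, HIGH);  // LED off (active-low)
}

void loop() {
  // Fill sample_buffer from the on-board microphone here; the
  // "esp32 microphone" example bundled with the exported library
  // shows the PDM capture code for this step.

  signal_t signal;
  signal.total_length = EI_CLASSIFIER_RAW_SAMPLE_COUNT;
  signal.get_data = &get_audio_data;

  ei_impulse_result_t result = {0};
  if (run_classifier(&signal, &result, false) != EI_IMPULSE_OK) {
    return;
  }

  // Switch the LED on only when "hello" is detected with high confidence
  bool hello_detected = false;
  for (size_t ix = 0; ix < EI_CLASSIFIER_LABEL_COUNT; ix++) {
    if (strcmp(result.classification[ix].label, "hello") == 0 &&
        result.classification[ix].value > 0.8f) {
      hello_detected = true;
    }
  }
  digitalWrite(LED_PIN, hello_detected ? LOW : HIGH);  // LOW = LED on
}
```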
Grove MP3 module
Reference code is provided to test whether the MP3 module is working correctly and whether the files on the TF card are correct. The library we need can be downloaded from https://github.com/Seeed-Studio/Seeed_Serial_MP3_Player.
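In case the original reference code is not at hand, a minimal test along the same lines might look like this. It assumes the Grove MP3 V2.0 (KT403A chip) and the library's KT403A class; the UART pins and method names are assumptions and should be checked against the examples bundled with the library (newer module revisions use the WT2003S classes instead).

```cpp
// Minimal MP3 module check: if you hear the first file on the TF card,
// the module, the wiring, and the card contents are all OK.
#include "KT403A_Player.h"

KT403A<HardwareSerial> Mp3Player;

void setup() {
  Serial.begin(115200);
  // Grove MP3 module on the XIAO's hardware UART (assumed wiring: RX = D7, TX = D6)
  Serial1.begin(9600, SERIAL_8N1, D7, D6);
  Mp3Player.init(Serial1);

  Mp3Player.SetVolume(0x1E);       // volume range is roughly 0x00-0x1E
  Mp3Player.SpecifyMusicPlay(1);   // play the first track on the TF card
  Serial.println("Playing track 1");
}

void loop() {
  // Nothing to do here; playback was started in setup()
}
```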
If an error like the following occurs:

```
fatal error: circular_queue.h: No such file or directory
#include <circular_queue.h>
         ^~~~~~~~~~~~~~~~~~
```
You might need to remove the EspSoftwareSerial library via the Library Manager and install version 8.1.0 instead.
Since the volume of the module's AUX audio output cannot be adjusted and is very low, we need to add an amplifier.
Button Control
In noisy environments, the speech recognition system may be disturbed, reducing recognition accuracy. To improve the user experience and system reliability, we can introduce a button control mechanism so that the user can easily manage audio playback with a physical button, even in noisy surroundings. This design not only increases the interactivity of the system, but also ensures that the user can accurately control music playback even with high background noise. By combining button control with speech recognition, we create a more flexible and user-friendly voice playback system. See Sketch 3 for reference; a minimal version of the button logic is shown below.
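The following is a minimal sketch of that button logic with simple debouncing. The button pin is an assumption (a push button wired between D1 and GND), and the comment marks where Sketch 3 would send the actual play/pause command to the MP3 module.

```cpp
// Debounced push button that toggles playback on each press
const int BUTTON_PIN = D1;             // assumed wiring: button between D1 and GND
const unsigned long DEBOUNCE_MS = 50;

bool playing = false;
int lastReading = HIGH;
int stableState = HIGH;
unsigned long lastChange = 0;

void setup() {
  Serial.begin(115200);
  pinMode(BUTTON_PIN, INPUT_PULLUP);   // pressed = LOW
}

void loop() {
  int reading = digitalRead(BUTTON_PIN);
  if (reading != lastReading) {
    lastChange = millis();             // input changed: restart the debounce timer
    lastReading = reading;
  }
  if ((millis() - lastChange) > DEBOUNCE_MS && reading != stableState) {
    stableState = reading;
    if (stableState == LOW) {          // a debounced button press
      playing = !playing;
      Serial.println(playing ? "Resume playback" : "Pause playback");
      // here Sketch 3 would send the MP3 module's play/pause command
    }
  }
}
```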
Multi-threaded Control
Multithreading is a technique that enables concurrent execution in a program. With multithreading, a program can perform multiple tasks at the same time, improving its efficiency and responsiveness. In the button control scenario, if the button logic is embedded directly in the main loop, the button signal is received with a delay, because speech recognition spends a noticeable amount of time recording audio, so the user has to hold the button down for the press to be captured. To solve this problem, we can use multithreading to receive the button signal.
Specifically, we run the reception and processing of the button signal in a separate thread. When the button is pressed, this independent thread responds immediately and executes the corresponding logic, without interference from the speech recognition task in the main loop. In this way, we achieve a fast response to the button signal and improve the user experience.
In conclusion, applying multithreading to button control effectively solves the delayed reception of button signals caused by the speech recognition task, and improves the program's response speed and user experience. Sketch 4 shows how to use multithreaded control on the XIAO ESP32S3; a condensed version appears below.
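Here is a condensed sketch of that approach using a FreeRTOS task, which the ESP32 Arduino core exposes directly. The button pin, stack size, and core assignment are assumptions; the point is that the button task keeps polling while loop() is busy with the slow speech-recognition work.

```cpp
// Button handling in its own FreeRTOS task, so it stays responsive while
// loop() is blocked by audio capture and inference.
const int BUTTON_PIN = D1;             // assumed wiring: button between D1 and GND
volatile bool playing = false;

// Independent task: polls the button every 10 ms regardless of what loop() does
void buttonTask(void *param) {
  int lastState = HIGH;
  for (;;) {
    int state = digitalRead(BUTTON_PIN);
    if (state == LOW && lastState == HIGH) {   // falling edge = button pressed
      playing = !playing;
      Serial.println(playing ? "Resume playback" : "Pause playback");
    }
    lastState = state;
    vTaskDelay(pdMS_TO_TICKS(10));             // yield so other tasks can run
  }
}

void setup() {
  Serial.begin(115200);
  pinMode(BUTTON_PIN, INPUT_PULLUP);
  // Pin the button task to core 0; loop() (and the classifier) runs on core 1
  xTaskCreatePinnedToCore(buttonTask, "button", 4096, NULL, 1, NULL, 0);
}

void loop() {
  // The blocking audio capture + inference from Sketch 2 would run here;
  // the button task above keeps reacting even while this takes a long time.
  delay(1000);  // stand-in for the slow speech-recognition work
}
```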
PIR Digital Sensor
In the final program design, we had to take full account of the working habits and needs of the space's long-term members, to avoid frequent voice announcements interfering with their concentration and efficiency. At the same time, since the project requires the hardware to run for long periods, the continuous accumulation of heat could lead to premature damage to the device and even affect the stability and reliability of the whole project. To achieve the dual goals of saving energy and prolonging device life, we enable the device's sleep mode so that it enters a low-power state during non-working hours, effectively reducing energy consumption and extending the device's lifespan.
The key issue, however, is how to wake the device instantly when needed, to keep the project running smoothly and preserve the members' experience. To this end, we use PIR motion sensing to automatically wake the XIAO ESP32S3 when someone is nearby, realizing intelligent wake-up. This design ensures an immediate response from the device while avoiding unnecessary energy waste, striking a balance between efficiency and energy saving. See Sketch 5 for reference; a minimal wake-up example follows.
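A minimal version of that wake-up behavior, using the ESP32's ext0 deep-sleep wake-up, might look like the following. The PIR pin (GPIO2, i.e. D1) and the awake period are assumptions; the PIR output must be wired to an RTC-capable GPIO for this wake-up source to work.

```cpp
// Sketch of the PIR wake-up: the XIAO sleeps until the PIR output goes HIGH.
#include "esp_sleep.h"

#define PIR_PIN GPIO_NUM_2          // PIR sensor output wired to GPIO2 (D1), assumed
#define AWAKE_TIME_MS 60000         // stay awake for one minute after motion

void setup() {
  Serial.begin(115200);
  pinMode(PIR_PIN, INPUT);

  if (esp_sleep_get_wakeup_cause() == ESP_SLEEP_WAKEUP_EXT0) {
    Serial.println("Woken up by the PIR sensor - visitor nearby");
    // here the voice-guide logic (Sketch 6) would run for AWAKE_TIME_MS
    delay(AWAKE_TIME_MS);
  }

  Serial.println("No motion - going to deep sleep");
  esp_sleep_enable_ext0_wakeup(PIR_PIN, 1);   // wake when the PIR output is HIGH
  esp_deep_sleep_start();                     // execution stops here until wake-up
}

void loop() {
  // never reached: the board restarts setup() after every deep-sleep wake-up
}
```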
Circuit Diagram
The final Sketch 6 combines and iterates on the code above.
In Summary
The project's development was marked by several challenges, predominantly stemming from my initial lack of familiarity with the hardware components. This learning curve inevitably prolonged the project timeline. Moreover, I observed that speech recognition and image recognition have distinct processing demands, which can result in noticeable latency if executed in a single-threaded manner. To address this, I explored the use of multi-threaded processing to optimize system performance. Multithreading enables concurrent processing of multiple tasks, enhancing the control system's responsiveness and ensuring a more fluid and intuitive user interaction.
In bringing this project to fruition, I chose the XIAO ESP32S3 as the core hardware platform. This microcontroller offers formidable processing power and a rich set of peripheral interfaces, making it highly suitable for sophisticated intelligent speech recognition tasks. To empower the system with the capabilities of an intelligent speech guide, I leveraged a speech model trained using the Edge Impulse platform. This model is designed to accurately recognize specific voice commands and execute the appropriate actions in response, thereby delivering the intended interactive functionality.