FASTEST! Speech to Text Conversion using ESP32 Board

Published January 7, 2025 © GPL3+

FASTEST! Speech to Text Conversion using ESP32

ESP32 + INMP441 mic + SD card + Deepgram API = Speech-to-Text in just 3 seconds! Record, store, and transcribe instantly. Try it now!

Intermediate • Full instructions provided • 30 minutes • 1,386

Things used in this project

Hardware components

Espressif ESP32 Dev Board

INMP441 MEMS Microphone Module

Micro SD card Module

8GB SD Card

Story

FASTEST! Speech to Text Conversion using ESP32 Board

Speech-to-text technology is a game-changer for a wide range of projects. From enabling hands-free control in smart homes to creating accessible solutions for individuals with disabilities, the ability to convert spoken words into text opens endless possibilities. Whether you're building voice-activated automation, transcribing notes on the go, or integrating voice recognition into chatbots, speech-to-text can simplify user interactions and bring your ideas to life. With the ESP32 Dev Board and Deepgram Speech-to-Text API, you can achieve this seamlessly and efficiently, making it a must-have feature for innovative IoT projects.

Why Choose the ESP32 for Speech-to-Text?

The ESP32 is a versatile microcontroller with built-in Wi-Fi and Bluetooth, making it ideal for IoT applications. Its dual-core processor and memory are enough to record audio, write it to an SD card, and upload it over Wi-Fi. The actual recognition runs in the cloud: by leveraging the Deepgram Speech-to-Text API, we get a transcript back within a few seconds while keeping the ESP32’s processing demands minimal.

How It Works

The ESP32 captures audio input through the INMP441 microphone and stores the recorded audio on an SD card. The stored audio file is then read from the SD card and sent to the Deepgram Speech-to-Text API. The API processes the audio data and returns the transcribed text, which can then be used for various applications like home automation, note-taking, or even chatbot interactions.
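
For the stored file to be accepted as audio, the raw samples on the SD card are usually wrapped in a WAV container. The following is a minimal sketch of building the standard 44-byte PCM WAV header in plain C++. The recording format (16 kHz, 16-bit, mono) and the names `makeWavHeader` and `putLE` are assumptions for illustration; the project's actual code in the repository may differ.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Assumed recording format (not stated in the article): 16 kHz, 16-bit, mono PCM.
constexpr uint32_t kSampleRate    = 16000;
constexpr uint16_t kBitsPerSample = 16;
constexpr uint16_t kChannels      = 1;

using WavHeader = std::array<uint8_t, 44>;

// Store `value` as a little-endian integer of `bytes` width at offset `pos`.
static void putLE(WavHeader& h, std::size_t pos, uint32_t value, int bytes) {
    for (int i = 0; i < bytes; ++i) {
        h[pos + i] = static_cast<uint8_t>((value >> (8 * i)) & 0xFF);
    }
}

// Build the canonical 44-byte header for `dataBytes` bytes of PCM samples.
WavHeader makeWavHeader(uint32_t dataBytes) {
    const uint16_t blockAlign = kChannels * kBitsPerSample / 8;  // bytes per sample frame
    const uint32_t byteRate   = kSampleRate * blockAlign;        // bytes per second

    WavHeader h{};
    std::memcpy(&h[0], "RIFF", 4);
    putLE(h, 4, 36 + dataBytes, 4);   // total file size minus the first 8 bytes
    std::memcpy(&h[8], "WAVE", 4);
    std::memcpy(&h[12], "fmt ", 4);
    putLE(h, 16, 16, 4);              // size of the fmt chunk for PCM
    putLE(h, 20, 1, 2);               // audio format 1 = uncompressed PCM
    putLE(h, 22, kChannels, 2);
    putLE(h, 24, kSampleRate, 4);
    putLE(h, 28, byteRate, 4);
    putLE(h, 32, blockAlign, 2);
    putLE(h, 34, kBitsPerSample, 2);
    std::memcpy(&h[36], "data", 4);
    putLE(h, 40, dataBytes, 4);       // size of the sample data that follows
    return h;
}
```

Under these assumptions a 3-second clip holds 3 × 16000 × 2 = 96000 bytes of samples, so `makeWavHeader(96000)` would be written to the file first, followed by the samples read from the microphone.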

Hardware Setup

Connect the INMP441 Microphone:

Connect the I2S pins (WS, SD, and SCK) of the INMP441 to the corresponding pins on the ESP32 Dev Board.
Ensure proper power and ground connections.

I2S MIC -> ESP32
GND -> GND
VDD -> 3.3V
SD -> D35
SCK -> D33
WS -> D22
L/R -> 3.3V

Connect the SD Card Module:

Connect the SD card module to the SPI pins of the ESP32 (MOSI, MISO, SCK, and CS).
Insert the 8GB SD card into the module.

SD Card Module -> ESP32
GND -> GND
Vcc -> VIn
MISO -> D19
MOSI -> D23
SCK -> D18
CS -> D5
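
The two wiring tables above can be captured as constants so the pin choices live in one place. This is only a sketch: it assumes the board's "D" labels map directly to GPIO numbers (D35 is GPIO 35, and so on), and the struct and function names are made up for illustration. It also encodes one fact worth knowing about the original ESP32: GPIO 34 to 39 are input-only, which is fine for the microphone's data line but would not work for a clock or chip-select line.

```cpp
#include <cstdint>

// GPIO numbers copied from the wiring tables (assumes Dxx == GPIO xx).
struct MicPins { int ws, sck, sd; };
struct SdPins  { int miso, mosi, sck, cs; };

constexpr MicPins kMic{22, 33, 35};     // INMP441: WS, SCK, SD
constexpr SdPins  kSd{19, 23, 18, 5};   // SD module: MISO, MOSI, SCK, CS

// On the original ESP32, GPIO 34-39 can only be used as inputs.
constexpr bool isInputOnly(int gpio) { return gpio >= 34 && gpio <= 39; }

// The ESP32 drives the mic clocks (WS, SCK) and the SD lines MOSI, SCK, CS,
// so none of those may sit on an input-only pin. Mic SD and MISO are inputs.
constexpr bool wiringIsValid(const MicPins& m, const SdPins& s) {
    return !isInputOnly(m.ws) && !isInputOnly(m.sck) &&
           !isInputOnly(s.mosi) && !isInputOnly(s.sck) && !isInputOnly(s.cs);
}

// Fails at compile time if a future pin change breaks the rule above.
static_assert(wiringIsValid(kMic, kSd), "an output signal is on an input-only GPIO");
```

If the microphone's WS and SD wires were swapped (WS on 35), the `static_assert` would stop the build instead of leaving you with a silent microphone.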

Assembled Hardware on PCB (we added an additional pushbutton for trigger)

How to Create a Deepgram API Key

To use the Deepgram Speech-to-Text API, you need an API key. Follow these steps to create one:

Sign Up for a Deepgram Account:

Visit Deepgram's website and create a free account.
A new account comes with $200 in free credit.
Once you are logged in, click "Create API Key".
Give the key a name, then copy the generated API key and save it somewhere safe.
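
To show where the saved key ends up, here is a hedged sketch of the HTTP request head for uploading a WAV file to Deepgram's pre-recorded endpoint (`/v1/listen` on `api.deepgram.com`), which expects the key in an `Authorization: Token ...` header. The function name and the query parameters chosen here (`model=nova-2`, `smart_format=true`) are assumptions for illustration, not necessarily what the project's code uses.

```cpp
#include <cstddef>
#include <string>

// Build the request line and headers for a WAV upload to Deepgram.
// `wavBytes` is the size of the whole file, including its 44-byte header.
std::string buildDeepgramRequestHead(const std::string& apiKey, std::size_t wavBytes) {
    std::string head;
    // Query parameters are illustrative assumptions; adjust to your needs.
    head += "POST /v1/listen?model=nova-2&smart_format=true HTTP/1.1\r\n";
    head += "Host: api.deepgram.com\r\n";
    head += "Authorization: Token " + apiKey + "\r\n";   // the key created above
    head += "Content-Type: audio/wav\r\n";
    head += "Content-Length: " + std::to_string(wavBytes) + "\r\n";
    head += "Connection: close\r\n";
    head += "\r\n";                                       // blank line ends the headers
    return head;
}
```

On the ESP32 this head would be sent over a TLS connection to port 443, followed by the file's bytes streamed from the SD card. Keep the key out of any code you publish.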

Install Required Libraries

Open the Arduino IDE and make sure the following are installed:

ESP32 Core for Arduino (installed via the Boards Manager, version 3.4.0)
HTTPClient, for sending HTTP requests to the Deepgram API (built in; it ships with the ESP32 core)
ArduinoJson, for parsing JSON responses from the API (install it from the Library Manager)
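
Deepgram's reply is a JSON document with the recognized text at `results.channels[0].alternatives[0].transcript`. On the ESP32, ArduinoJson is the right tool for navigating that path. Purely to illustrate which field matters, the sketch below is a dependency-free stand-in that does a naive string search for the first `"transcript"` value; the function name is hypothetical and this is not how ArduinoJson works internally.

```cpp
#include <cstddef>
#include <string>

// Return the first "transcript" string found in `json`, or "" if none.
// Naive by design: it does not validate the JSON, and only the escapes
// \" and \\ are decoded correctly (other escapes lose their backslash).
std::string extractTranscript(const std::string& json) {
    const std::string key = "\"transcript\"";
    std::size_t pos = json.find(key);
    if (pos == std::string::npos) return "";

    pos = json.find(':', pos + key.size());   // separator after the key
    if (pos == std::string::npos) return "";

    pos = json.find('"', pos);                // opening quote of the value
    if (pos == std::string::npos) return "";

    std::string out;
    for (std::size_t i = pos + 1; i < json.size(); ++i) {
        const char c = json[i];
        if (c == '\\' && i + 1 < json.size()) {
            out += json[++i];                 // keep the escaped character
        } else if (c == '"') {
            return out;                       // closing quote: done
        } else {
            out += c;
        }
    }
    return "";                                // unterminated string
}
```

An empty result can mean either a malformed reply or that no speech was detected, so it is worth checking the HTTP status code as well before acting on the text.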

Program the ESP32

The code files for this project are already uploaded to the GitHub repository attached to this article.

Detailed Tutorial Video

The complete step-by-step process of doing Speech-to-Text with Deepgram is explained in our tutorial video. Refer to it to clear up any doubts and to see a practical working demo.

#HappyMaking

Code

Credits

Sachin Soni

28 projects • 90 followers

Maker, YouTuber, Educator, & E-commerce Entrepreneur, empowering creators with tutorials, projects, & high-quality electronics components
