Published November 10, 2024 © GPL3+

Smart Voice Controlled Bluetooth Speaker Using ESP32

A DIY Bluetooth speaker using ESP32 with built-in voice recognition that lets you control music playback and volume using voice commands.

Intermediate • Work in progress • 2 hours

Things used in this project

Hardware components

DFRobot Gravity: Offline Language Learning Voice Recognition Sensor

DFRobot FireBeetle ESP32 IOT Microcontroller (Supports Wi-Fi & Bluetooth)

DFRobot MAX98357 I2S Amplifier Module

DFRobot Stereo Enclosed Speaker - 3W 8Ω

Software apps and online services

Arduino IDE

Hand tools and fabrication machines

Soldering iron (generic)

Solder Wire, Lead Free

Story

Ever found yourself with messy hands while cooking, deep in a project, working out, or singing in the shower, wishing you could control your music without touching anything? That's exactly why I built this voice-controlled speaker.

While smart speakers like Amazon Echo, Google Home, and Apple HomePod have transformed how we interact with music, they all depend on internet connectivity and cloud processing for voice control: no connection, no music control.

This project takes a different approach by creating a smart speaker that processes voice commands completely offline using DFRobot's Offline Language Learning Voice Recognition Sensor. The ESP32 microcontroller does double duty, handling Bluetooth audio streaming and voice commands at the same time, and the MAX98357A I2S amplifier drives the speaker.

What sets this project apart is its independence and simplicity. Once programmed, it works like any Bluetooth speaker but responds to natural voice commands like "play music," "stop playing," or "volume up" without needing apps or an internet connection. Voice recognition runs entirely on the device, which keeps response times short and your audio private.

Hardware Required

ESP32 Development Board
DFRobot DF2301Q Voice Recognition Module
DFRobot MAX98357A I2S Audio Amplifier
Speaker (8Ω recommended)
Power Supply (5V)
Connecting Wires
Project Box/Enclosure (optional)

Pin Connections

Voice Recognition Module (DF2301Q)

TX (module) → GPIO16 (ESP32 Serial2 RX)
RX (module) → GPIO17 (ESP32 Serial2 TX)
VCC → 5V
GND → GND

Audio Amplifier (MAX98357A)

BCLK → GPIO25
LRCLK → GPIO26
DIN → GPIO14
VCC → 5V
GND → GND
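
The wiring above can also be captured in one place in the sketch, so the pin numbers are not scattered across constructor and begin() calls. This is a minimal sketch under stated assumptions: the constant names and the pinsAreDistinct() helper are illustrative and are not part of either DFRobot library.

```cpp
#include <cstddef>
#include <cstdint>

// Pin numbers copied from the wiring tables above.
// The names are hypothetical; they do not come from the DFRobot libraries.
constexpr uint8_t kVoiceRxPin  = 16;  // passed as the rx argument of the voice sensor constructor
constexpr uint8_t kVoiceTxPin  = 17;  // passed as the tx argument of the voice sensor constructor
constexpr uint8_t kI2sBclkPin  = 25;  // MAX98357A BCLK
constexpr uint8_t kI2sLrclkPin = 26;  // MAX98357A LRCLK
constexpr uint8_t kI2sDinPin   = 14;  // MAX98357A DIN

// Returns true when no GPIO number is assigned to two signals.
bool pinsAreDistinct() {
  const uint8_t pins[] = {kVoiceRxPin, kVoiceTxPin, kI2sBclkPin,
                          kI2sLrclkPin, kI2sDinPin};
  const std::size_t count = sizeof(pins) / sizeof(pins[0]);
  for (std::size_t i = 0; i < count; ++i) {
    for (std::size_t j = i + 1; j < count; ++j) {
      if (pins[i] == pins[j]) return false;
    }
  }
  return true;
}
```

Keeping the numbers together makes it easier to spot a clash if you later move a signal to a different GPIO.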

Software Dependencies

Install the two required libraries:

1. DFRobot_DF2301Q for the voice recognition module

2. DFRobot_MAX98357A for the amplifier module

Both are included at the top of the sketch:

#include <DFRobot_MAX98357A.h>
#include "DFRobot_DF2301Q.h"

How It Works

Voice Recognition Communication

The DF2301Q voice recognition module communicates with the ESP32 over UART. The module also supports I2C, but UART was chosen because it is simple to wire and to program. The connection requires just two data pins (TX and RX) plus power and ground.

To learn more about the module and how to use it, see DFRobot's documentation for the sensor.

// Configure voice recognition sensor on Serial2 for ESP32
DFRobot_DF2301Q_UART DF2301Q(/*hardSerial =*/&Serial2, /*rx =*/16, /*tx =*/17);

When the module recognizes a voice command, it sends a corresponding command ID (CMDID) through the serial connection. Each command has a unique ID that triggers specific actions:

// Voice command IDs 
const uint8_t CMD_PLAY = 92;
const uint8_t CMD_STOP = 93;
const uint8_t CMD_PREVIOUS = 94;
const uint8_t CMD_NEXT = 95;
const uint8_t CMD_REPEAT = 96;
const uint8_t CMD_VOLUME_UP = 97;
const uint8_t CMD_VOLUME_DOWN = 98;
const uint8_t CMD_VOLUME_MAX = 99;
const uint8_t CMD_VOLUME_MIN = 100;
const uint8_t CMD_VOLUME_MID = 101;

The main loop continuously monitors for command IDs:

void loop() {
  uint8_t commandID = DF2301Q.getCMDID();
  
  if (commandID != 0) {
    Serial.print("Received command ID: ");
    Serial.println(commandID);
    
    switch (commandID) {
      case CMD_VOLUME_UP:
        if (currentVolume < 9) {
          currentVolume++;
          amplifier.setVolume(currentVolume);
        }
        break;
        // Other cases...
    }
  }
}
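
The volume handling can also be pulled out of the switch into a pure function, which makes the limits easy to check without any hardware. This is a sketch under assumptions: nextVolume() and clampVolume() are hypothetical helpers, the 0 to 9 range and the preset levels (1 for minimum, 5 for medium) are the ones used in this project's code, and the command IDs are written as numbers so the block stands alone.

```cpp
#include <cstdint>

constexpr int kMinVolume = 0;  // lowest level the sketch steps down to
constexpr int kMaxVolume = 9;  // highest level the sketch steps up to

// Keeps a volume value inside [kMinVolume, kMaxVolume].
int clampVolume(int volume) {
  if (volume < kMinVolume) return kMinVolume;
  if (volume > kMaxVolume) return kMaxVolume;
  return volume;
}

// Returns the volume that should be active after a voice command.
// Commands that are not volume related leave the volume unchanged.
int nextVolume(int current, uint8_t commandID) {
  switch (commandID) {
    case 97:  return clampVolume(current + 1);  // CMD_VOLUME_UP
    case 98:  return clampVolume(current - 1);  // CMD_VOLUME_DOWN
    case 99:  return kMaxVolume;                // CMD_VOLUME_MAX
    case 100: return 1;                         // CMD_VOLUME_MIN (project uses 1, not 0)
    case 101: return 5;                         // CMD_VOLUME_MID
    default:  return current;
  }
}
```

With a helper like this, the loop would only need to call amplifier.setVolume() when nextVolume() returns a value different from the current one.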

Audio System

The MAX98357A amplifier connects to the ESP32 via I2S (Inter-IC Sound), a dedicated digital audio interface. The audio stays digital all the way from the Bluetooth stream to the amplifier, which performs the digital-to-analog conversion itself before driving the speaker. The ESP32 handles Bluetooth A2DP (Advanced Audio Distribution Profile) for streaming audio from your devices.

Software Setup

Install Required Libraries

DFRobot_MAX98357A
DFRobot_DF2301Q

Arduino IDE Settings

Board: ESP32 Dev Module
Upload Speed: 115200
Flash Frequency: 80MHz
CPU Frequency: 240MHz

Upload the Code

Open the provided code in Arduino IDE
Select the correct port
Upload to your ESP32

Initial Configuration

The setup function initializes both the voice recognition module and amplifier:

void setup() {
  Serial.begin(115200);  // Serial monitor for status messages

  // Initialize voice recognition sensor
  while (!DF2301Q.begin()) {
    Serial.println("Voice sensor initialization failed!");
    delay(3000);
  }

  // Initialize amplifier
  while (!amplifier.begin("Nick Smart Speaker", GPIO_NUM_25, GPIO_NUM_26, GPIO_NUM_14)) {
    Serial.println("Amplifier initialization failed!");
    delay(3000);
  }

  // Configure voice module settings
  DF2301Q.settingCMD(DF2301Q_UART_MSG_CMD_SET_MUTE, 0);  // Unmute
  DF2301Q.settingCMD(DF2301Q_UART_MSG_CMD_SET_VOLUME, 10); // Set recognition volume
  DF2301Q.settingCMD(DF2301Q_UART_MSG_CMD_SET_WAKE_TIME, 10); // Wake time in seconds
}
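
The while loops in setup() retry forever, so a wiring fault leaves the board stuck printing the same message. One possible alternative is a bounded retry, sketched below. retryInit() is a hypothetical helper, not part of either library; the initialization step and the wait step are passed in as callables so the logic can be exercised without hardware.

```cpp
// Calls init() up to maxAttempts times and stops at the first success.
// wait() runs between attempts, but not after the final failed attempt.
// Returns true if any attempt succeeded.
template <typename InitFn, typename WaitFn>
bool retryInit(InitFn init, WaitFn wait, int maxAttempts) {
  for (int attempt = 1; attempt <= maxAttempts; ++attempt) {
    if (init()) {
      return true;
    }
    if (attempt < maxAttempts) {
      wait();
    }
  }
  return false;
}
```

In the sketch this might be used as retryInit([] { return DF2301Q.begin(); }, [] { delay(3000); }, 5), with the return value deciding whether to continue or report a permanent failure. The attempt count of 5 is only an example.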

Voice Commands

The system recognizes these commands:

"Play Music" - Start playback
"Stop" - Stop playback
"Next Track" - Skip to next track
"Previous Track" - Go to previous track
"Volume Up" - Increase volume
"Volume Down" - Decrease volume
"Change Volume to Maximum" - Set volume to maximum
"Change Volume to Minimum" - Set volume to minimum
"Change Volume to Medium" - Set volume to middle level

Troubleshooting

Voice Recognition Issues

Ensure you're speaking clearly and within 1 meter of the device
Check if TX/RX pins are correctly connected
Verify Serial2 initialization in code
Check serial monitor for command ID feedback

Audio Issues

Verify I2S pin connections
Check speaker connections and impedance
Ensure Bluetooth device is properly paired
Monitor serial output for initialization success

Connection Problems

Reset both ESP32 and Bluetooth device
Check power supply stability
Verify all ground connections
Monitor serial output for debugging information
Listen for the startup sound, which plays once initialization succeeds

Operation Guide

Power on the device
Wait for the initialization confirmation
The device will appear as "Nick Smart Speaker" in your Bluetooth settings
Pair with your device
Use voice commands to control playback and volume

Future Enhancements

Implement playlist control
Add the ability to play music from an SD card
Add LED indicators for visual feedback
Develop a mobile app for additional control

Credits

Special thanks to DFRobot for providing the components used in this project.

Contribution and Collaboration

Want to help make this project even better? Join in! Whether you have ideas for new features, improvements, or just want to collaborate, your contributions are welcome. Feel free to fork the project, make changes, and submit them. Let’s build something awesome together!

GitHub link: https://github.com/tech-nickk/Smart-Voice-controlled-Bluetooth-Speaker

Don't forget to leave a like.

Thank you :)

Code

Smart Voice Controlled Speaker

#include <DFRobot_MAX98357A.h>
#include "DFRobot_DF2301Q.h"

// Create amplifier instance
DFRobot_MAX98357A amplifier;

// Configure voice recognition sensor (Serial2 on ESP32, Serial1 on other boards)
#if defined(ESP32)
  DFRobot_DF2301Q_UART DF2301Q(/*hardSerial =*/&Serial2, /*rx =*/16, /*tx =*/17);
#else
  DFRobot_DF2301Q_UART DF2301Q(/*hardSerial =*/&Serial1);
#endif



// Voice command IDs 
const uint8_t CMD_PLAY = 92;
const uint8_t CMD_STOP = 93;
const uint8_t CMD_PREVIOUS = 94;
const uint8_t CMD_NEXT = 95;
const uint8_t CMD_REPEAT = 96;
const uint8_t CMD_VOLUME_UP = 97;
const uint8_t CMD_VOLUME_DOWN = 98;
const uint8_t CMD_VOLUME_MAX = 99;
const uint8_t CMD_VOLUME_MIN = 100;
const uint8_t CMD_VOLUME_MID = 101;


// Current volume level
int currentVolume = 5;

void setup() {
  Serial.begin(115200);

  // Initialize voice recognition sensor
  while (!DF2301Q.begin()) {
    Serial.println("Voice sensor initialization failed!");
    delay(3000);
  }
  Serial.println("Voice sensor initialized successfully!");

  // Initialize amplifier
  while (!amplifier.begin("Nick Smart Speaker", GPIO_NUM_25, GPIO_NUM_26, GPIO_NUM_14)) {
    Serial.println("Amplifier initialization failed!");
    delay(3000);
  }
  Serial.println("Amplifier initialized successfully!");

  // Set initial volume
  amplifier.setVolume(currentVolume);

  // Initial voice module settings
  DF2301Q.settingCMD(DF2301Q_UART_MSG_CMD_SET_MUTE, 0);  // Unmute
  DF2301Q.settingCMD(DF2301Q_UART_MSG_CMD_SET_VOLUME, 10); // Set voice recognition volume
  DF2301Q.settingCMD(DF2301Q_UART_MSG_CMD_SET_WAKE_TIME, 10); // Wake time in seconds
  
  // Play startup sound
  DF2301Q.playByCMDID(23);  // You can change this ID to any appropriate sound
}

void loop() {
  // Get voice command ID
  uint8_t commandID = DF2301Q.getCMDID();
  
  // Process voice commands
  if (commandID != 0) {
    Serial.print("Received command ID: ");
    Serial.println(commandID);
    
    // Execute command based on ID
    switch (commandID) {
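      case CMD_PLAY:
        Serial.println("Command: Play");
        // A passthrough command models a button: send the press, then the release
        esp_avrc_ct_send_passthrough_cmd(0, ESP_AVRC_PT_CMD_PLAY, ESP_AVRC_PT_CMD_STATE_PRESSED);
        esp_avrc_ct_send_passthrough_cmd(0, ESP_AVRC_PT_CMD_PLAY, ESP_AVRC_PT_CMD_STATE_RELEASED);
        break;
        
      case CMD_STOP:
        Serial.println("Command: Stop");
        esp_avrc_ct_send_passthrough_cmd(0, ESP_AVRC_PT_CMD_STOP, ESP_AVRC_PT_CMD_STATE_PRESSED);
        esp_avrc_ct_send_passthrough_cmd(0, ESP_AVRC_PT_CMD_STOP, ESP_AVRC_PT_CMD_STATE_RELEASED);
        break;
        
      case CMD_NEXT:
        Serial.println("Command: Next Track");
        esp_avrc_ct_send_passthrough_cmd(0, ESP_AVRC_PT_CMD_FORWARD, ESP_AVRC_PT_CMD_STATE_PRESSED);
        esp_avrc_ct_send_passthrough_cmd(0, ESP_AVRC_PT_CMD_FORWARD, ESP_AVRC_PT_CMD_STATE_RELEASED);
        break;
        
      case CMD_PREVIOUS:
        Serial.println("Command: Previous Track");
        esp_avrc_ct_send_passthrough_cmd(0, ESP_AVRC_PT_CMD_BACKWARD, ESP_AVRC_PT_CMD_STATE_PRESSED);
        esp_avrc_ct_send_passthrough_cmd(0, ESP_AVRC_PT_CMD_BACKWARD, ESP_AVRC_PT_CMD_STATE_RELEASED);
        break;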
        
      case CMD_VOLUME_UP:
        if (currentVolume < 9) {
          currentVolume++;
          amplifier.setVolume(currentVolume);
          Serial.print("Volume increased to: ");
          Serial.println(currentVolume);
        }
        break;
        
      case CMD_VOLUME_DOWN:
        if (currentVolume > 0) {
          currentVolume--;
          amplifier.setVolume(currentVolume);
          Serial.print("Volume decreased to: ");
          Serial.println(currentVolume);
        }
        break;

      case CMD_VOLUME_MAX:
        if (currentVolume < 9) {
          currentVolume = 9;
          amplifier.setVolume(currentVolume);
          Serial.print("Volume set to maximum: ");
          Serial.println(currentVolume);
        }
        break;

      case CMD_VOLUME_MIN:
        currentVolume = 1;  // Lowest audible level; 0 is reached only via "Volume Down"
        amplifier.setVolume(currentVolume);
        Serial.print("Volume set to minimum: ");
        Serial.println(currentVolume);
        break;

      case CMD_VOLUME_MID:
        currentVolume = 5;
        amplifier.setVolume(currentVolume);
        Serial.print("Volume set to medium: ");
        Serial.println(currentVolume);
        break;
    }
  }
  
  delay(100);  // Small delay to prevent overwhelming the system
}

Credits

Nickson Kiprotich


IoT, Robotics & TinyML Enthusiast
