This page is incomplete, edits in progress!

Published August 2, 2023

Implement Voice-UI Applications on MaaXBoard RT

Hands-on guidance on how to implement custom voice-controlled applications using NXP Maestro framework VIT

Things used in this project

Hardware components

Tria Technologies MaaXBoard RT

Story

Use the documents in the following download for the full procedure: http://avnet.me/maaxRT-voice-lab

Overview

To rapidly implement custom applications on the MaaXBoard RT development board, users have access to some high-value resources from NXP and Avnet:

1) MCUXpresso SDK for RT1170-EVK (from NXP). Available by searching for “RT1170-EVK” on the NXP SDK builder site at: https://mcuxpresso.nxp.com/en/select

2) Reference designs for MaaXBoard RT (from Avnet). A number of system-level, multi-threaded FreeRTOS demo implementations are provided. Examples of these are listed below:

Reference designs summary page: http://avnet.me/maaxRT-demo-apps
Out-of-box GUI demo: http://avnet.me/maaxRT-gui-demo http://avnet.me/maaxRT-gui-guider-demo
Wi-Fi webserver sensor demo: http://avnet.me/maaxRT-wifi-webserver-demo
BLE health thermometer demo: http://avnet.me/maaxRT-ble-ht
TensorFlow Lite object-recognition demo: http://avnet.me/maaxRT-run-tf
VIT voice-UI complex system demo: http://avnet.me/maaxRT-voice-maestro-demo
VIT voice-UI basic control demo: http://avnet.me/maaxRT-voice-control

The two Avnet VIT-based voice-UI applications differ significantly in their level of complexity. The goal of this App Note is to provide the following:

a) A brief description and demo of the voice-UI complex system demo (using pre-compiled binaries)

b) A step-by-step procedure to customize, build and run the voice-UI basic control demo (using the MCUXpresso IDE)

VIT Voice-UI Complex System Demo

This headless application is partitioned into multiple FreeRTOS tasks on the M7 and M4 cores of the RT1176.

M7-based voice processing, USB MSD, MP3 audio decoding, HTTP webserver and Wi-Fi networking
M4-based I2C sensor monitoring (requires inexpensive add-on hardware)

PI 2 Click Shield / HAT (MikroE $8.00)

6DOF IMU 3 Click board (MikroE $7.00)

LightRanger 8 Click board (MikroE $12.00)

Functions in the M7 application

Key functions in the M7 and M4 applications

Local Voice UI (uses VIT voice-recognition function in the NXP Maestro audio framework)

· Uses 1 to 3 of the 4 onboard PDM microphones
· Local playback control of MP3 audio files from USB thumb-drive
· Local control of the board’s GPIOs (RGB LEDs)

Remote Web-UI (smartphone browser-based UI). An HTTP webserver provides GUI webpages (via 802.11ac Wi-Fi soft-AP or client connection) for:

· Navigating the playlist of MP3 files on the USB storage
· Remote status & control of board GPIO (RGB LEDs)
· Wi-Fi scanning and configuration
· Display of sensor measurements (6-axis IMU and Range sensor, streamed using websockets)

USB MSD storage access (FAT32 based file system with MP3 audio files)

MP3 Audio File Player (uses the 3rd-party Helix MP3 decoder, not the MP3 decoder within the Maestro framework)

Webserver view of the MP3 media player

M7 processor core utilization

Tabled below is typical utilization of the M7 processor core by the different FreeRTOS tasks:

VIT Custom Wake-Word and Voice Commands

A custom wake-word plus a set of 12 voice commands has been predefined for this application, using NXP’s web-based VIT text-to-speech voice-modelling tool at https://vit.nxp.com/

Use of NXP Maestro Audio Framework

The NXP Maestro software framework supports multiple options for “audio source” and “audio sink” devices. This reference design, however, continuously listens for voice commands, so its use of Maestro is limited to:

Audio source = Microphone(s)
Voice processing = VIT
Audio sink = Audio speaker (via Codec)

Functions in the M4 application

The M4 FreeRTOS application continuously samples sensor measurements (via I2C) from two MikroE Click sensor boards:

· 6DOF IMU 3 Click (NXP FXOS8700CQ motion sensor 6-axis IMU)

· LightRanger 8 Click (ST VL53L3CX Time-of-Flight sensor)

Programming the M7 and M4 binaries into Flash Memory

TBD

VIT Voice-UI Basic Control Demo

This simpler Cortex-M7 FreeRTOS application is the one that will be built and run from the MCUXpresso IDE.

It supports voice-control of the RGB LED outputs, as well as audio record and playback functions.

Recorded samples are also processed in real time with a dynamic range compression algorithm and triangular dithering, to minimize audio clipping and loss of audio quality during conversion from 24-bit to 16-bit.
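The 24-bit to 16-bit step can be illustrated with a minimal sketch of triangular (TPDF) dithering. This is not the demo's source code and not the Miniaudio implementation; the function names and the small pseudo-random generator are hypothetical, and the input is assumed to be a signed 24-bit sample already sign-extended into an int32_t.

```c
#include <stdint.h>

/* Tiny linear congruential generator so the sketch has no dependencies.
   Hypothetical helper, not taken from the demo sources. */
static uint32_t lcg_state = 1u;

static int32_t rand_u8(void)
{
    lcg_state = lcg_state * 1664525u + 1013904223u;
    return (int32_t)(lcg_state >> 24);   /* upper 8 bits: 0..255 */
}

/* Convert one signed 24-bit sample (held in an int32_t) to 16 bits.
   The difference of two uniform values has a triangular distribution
   spanning about +/-1 LSB of the 16-bit output (1 LSB = 256 input units). */
static int16_t pcm24_to_pcm16_tpdf(int32_t sample24)
{
    int32_t a = rand_u8();
    int32_t b = rand_u8();
    int32_t v = sample24 + (a - b);      /* add dither in the 24-bit domain */

    /* Divide by 256 with rounding to nearest (half away from zero). */
    int32_t q = (v >= 0) ? (v + 128) / 256 : -((-v + 128) / 256);

    /* Saturate instead of wrapping, which avoids clipping artifacts. */
    if (q > INT16_MAX) q = INT16_MAX;
    if (q < INT16_MIN) q = INT16_MIN;
    return (int16_t)q;
}
```

Calling pcm24_to_pcm16_tpdf() on each recorded sample trades a small amount of added noise for the removal of correlated quantization distortion. A constant input that sits between two 16-bit levels averages out to its true value over many samples instead of always rounding the same way.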

The VIT “wake-word”, plus the set of voice commands and actions implemented on the board in response to these commands, are all fully customizable. For convenience, the application is provided with a default wake-word plus 8 voice commands that control the following on the board:

VIT_Model version : v5.4.0

WakeWord supported : " HEY AVNET "

Voice Commands supported
Cmd_Id : Cmd_Name
0    : UNKNOWN
1    : PLAY SAMPLE
2    : RECORD
3    : PLAY RECORD
4    : LED RED
5    : LED GREEN
6    : LED BLUE
7    : LED OFF
8    : PLAY COMPRESSED

Note: The SAI peripheral is configured as (sample_rate: 16 kHz, bit_width: 16-bit/32-bit, channel: mono). The PDM peripheral is configured to read a single-channel microphone as (sample_rate: 16 kHz, bit_width: 32-bit (24-bit + 8-bit padding)).

Description of Commands

PLAY SAMPLE - Plays back the demo audio (channel: 1, bit_width: 16, sample_rate: 16 kHz), which is stored as a byte array in sample_mono.h in the project source folder.
RECORD - Records the microphone PCM data into a buffer placed in the SDRAM memory region:
    __attribute__ ((section(".secSdram"))) uint8_t pcmBuffer[PCM_SIZE] = {[0 ... PCM_SIZE-1] = 0x00};
PCM_SIZE is defined in main.h. The record duration is ~6.4 sec. The beep_mono.h audio is used to signal the beginning and end of the recording.
PLAY RECORD - Plays back the PCM signal previously stored in pcmBuffer. Note: SDRAM is volatile memory, so the buffer is blank after a power cycle.
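The relationship between the buffer size and the record duration can be sketched as below. The actual value of PCM_SIZE lives in main.h and is not shown here, so the numbers are assumptions: the sketch assumes the recording is stored as 16 kHz, mono, 16-bit PCM, and the constant and function names are hypothetical.

```c
#include <stdint.h>

/* Assumed storage format of the recording (not confirmed by main.h). */
#define SAMPLE_RATE_HZ    16000u
#define BYTES_PER_SAMPLE  2u
#define NUM_CHANNELS      1u

/* Bytes needed to hold duration_ms milliseconds of audio. */
static uint32_t pcm_bytes_for_ms(uint32_t duration_ms)
{
    uint64_t bytes_per_sec =
        (uint64_t)SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * NUM_CHANNELS;
    return (uint32_t)(bytes_per_sec * duration_ms / 1000u);
}

/* Duration in milliseconds that a buffer of size_bytes can hold. */
static uint32_t pcm_ms_for_bytes(uint32_t size_bytes)
{
    uint64_t bytes_per_sec =
        (uint64_t)SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * NUM_CHANNELS;
    return (uint32_t)((uint64_t)size_bytes * 1000u / bytes_per_sec);
}
```

Under these assumptions, a ~6.4 sec recording corresponds to pcm_bytes_for_ms(6400) = 204800 bytes. If samples were stored at 32 bits instead, the same duration would need twice that.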

To create a custom beep prompt and audio sample, the following downloadable tools can be used:

Use Audacity to convert any audio format to a 16 kHz mono WAV file.
Use wavToCode to generate a C array. Note: Windows OS only.

The application executes two FreeRTOS tasks:

Playback task - plays the sample/recorded audio stored in SDRAM/flash.
Voice task - runs VIT voice recognition, using the custom wake-word and 8 voice commands.

These tasks communicate through a FreeRTOS queue. The "Voice task" is the producer and the "Playback task" is the consumer.

Queue data has the following structure in main.h:

typedef struct _queue_command
{
    uint8_t command_type;
    uint8_t taskId;
    uint8_t buffer[24];
} queue_command_t;
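As a minimal host-side sketch of how a producer might fill one queue item before sending it: the struct is copied from above, while the numeric values of PLAYER_CMD_PLAY and PLAYER_CMD_RECORD and the helper make_play_command() are hypothetical and not taken from the project sources.

```c
#include <stdint.h>
#include <string.h>

typedef struct _queue_command
{
    uint8_t command_type;
    uint8_t taskId;
    uint8_t buffer[24];
} queue_command_t;

/* Assumed values; the real definitions are in the project headers. */
enum { PLAYER_CMD_PLAY = 0, PLAYER_CMD_RECORD = 1 };

/* Build a "play" command. buffer[0] selects which audio to play. */
static queue_command_t make_play_command(uint8_t task_id, uint8_t audio_source)
{
    queue_command_t cmd;
    memset(&cmd, 0, sizeof cmd);          /* clear stale payload bytes */
    cmd.command_type = PLAYER_CMD_PLAY;
    cmd.taskId       = task_id;
    cmd.buffer[0]    = audio_source;
    return cmd;
}
```

Because a FreeRTOS queue copies items by value, the queue would be created with an item size of sizeof(queue_command_t), which is 26 bytes for this struct. Clearing the item first keeps leftover bytes from an earlier command from reaching the consumer.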

The "voice task" handles voice commands as shown below and communicates with the "playback task" via the queue. For more detail, refer to source/vit_proc.c, line 363.

/* Please enter your custom code in here. */
switch(VoiceCommand.Cmd_Id)
{
    case CMD_PLAY_SAMPLE:                      // 1
        voice_command.command_type = PLAYER_CMD_PLAY;
        voice_command.buffer[0] = 0;
        xQueueSend(*player_commandQ, (void *) &voice_command, 10);
        break;
    case CMD_RECORD:                           // 2
        voice_command.command_type = PLAYER_CMD_RECORD;
        xQueueSend(*player_commandQ, (void *) &voice_command, 10);
        break;
    case CMD_PLAY_RECORD:                      // 3
        voice_command.command_type = PLAYER_CMD_PLAY;
        voice_command.buffer[0] = 2;
        xQueueSend(*player_commandQ, (void *) &voice_command, 10);
        break;
    case CMD_LED_RED:                          // 4 LED RED
        set_led(RED);
        break;
    case CMD_LED_GREEN:                        // 5 LED GREEN
        set_led(GREEN);
        break;
    case CMD_LED_BLUE:                         // 6 LED BLUE
        set_led(BLUE);
        break;
    case CMD_LED_OFF:                          // 7 LED OFF
        set_led(BLACK);
        break;
    case CMD_PLAY_COMPRESSED:                  // 8
        voice_command.command_type = PLAYER_CMD_PLAY;
        voice_command.buffer[0] = 3;
        xQueueSend(*player_commandQ, (void *) &voice_command, 10);
        break;
    case 9:                                    // 9
        break;
    case 10:                                   // 10
        break;
    case 11:                                   // 11
        break;
    case 12:                                   // 12
        break;
    default:
        break;
}
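When customizing the command set, it can help to log each detected command by name. The following is a small sketch of a lookup table built from the command IDs and names listed earlier; the enum type name and the function voice_cmd_name() are hypothetical and not part of the project sources.

```c
#include <stddef.h>

/* Command IDs as listed for the default VIT model. */
typedef enum
{
    CMD_UNKNOWN         = 0,
    CMD_PLAY_SAMPLE     = 1,
    CMD_RECORD          = 2,
    CMD_PLAY_RECORD     = 3,
    CMD_LED_RED         = 4,
    CMD_LED_GREEN       = 5,
    CMD_LED_BLUE        = 6,
    CMD_LED_OFF         = 7,
    CMD_PLAY_COMPRESSED = 8
} voice_cmd_id_t;

/* Names indexed by command ID. */
static const char *const cmd_names[] = {
    "UNKNOWN", "PLAY SAMPLE", "RECORD", "PLAY RECORD",
    "LED RED", "LED GREEN", "LED BLUE", "LED OFF", "PLAY COMPRESSED"
};

/* Return a printable name; IDs outside the table map to "UNKNOWN". */
static const char *voice_cmd_name(unsigned cmd_id)
{
    size_t count = sizeof cmd_names / sizeof cmd_names[0];
    return (cmd_id < count) ? cmd_names[cmd_id] : cmd_names[CMD_UNKNOWN];
}
```

The bounds check means the placeholder IDs 9 to 12 report "UNKNOWN" until names are added to the table for them.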

The "playback task" waits for queue data. Once data is received, it processes the data and plays the requested audio (see audio_player.c, line 334).

/* Wait up to 100 ticks for a command from the voice task. */
xResult = xQueueReceive(*player_commandQ, &(audio_recvd_cmd), 100);

if (xResult == pdTRUE)
{
    switch(audio_recvd_cmd.command_type)
    {
        case PLAYER_CMD_PLAY:
            PRINTF("[Audio]  Playing recorded data\r\n");
            play_music(audio_recvd_cmd.buffer[0]);   /* buffer[0] selects the audio to play */
            PRINTF("[Audio]  *** Player stopped ***\r\n");
            break;
        case PLAYER_CMD_RECORD:
            PRINTF("[Audio]  Recording data\r\n");
            play_music(1);                           /* played before recording starts */
            enable_record(true);
            /* Block until notified, or time out after 12000 ms. */
            xResult = xTaskNotifyWait(pdFALSE, 0xffffffff, &ulNotifiedValue, 12000/portTICK_PERIOD_MS);
            if (xResult == pdTRUE)
            {
                PRINTF("[Audio]  Record success\r\n");
                enable_record(false);
                play_music(1);                       /* played again after recording ends */
            }
            else
            {
                PRINTF("[Audio!]  Record error\r\n");
            }
            break;
        case 2:
            /* No handler; falls through to default. */
        default:
            PRINTF("[Audio]  Unknown command received from voice task");
            break;
    }
}

Reference links:

VIT - Creating custom voice command model.

Miniaudio - 24bit to 16bit conversion using dithering algorithm.

OpenAudio_ArduinoLibrary - dynamic range compression algorithm.

Peter Fenn

Director, Avnet (Advanced Applications Group)
