There are many benefits to running an LLM locally: privacy first and foremost, the ability to scale without heavy costs, and fine control over the API. The Jetson MagnaMirror attempts to deliver just that by taking an old concept and iterating on it a little to provide an awesome "Magic Mirror" experience.
What is MagicMirror²
MagicMirror² is an open source project that provides software to run a hidden assistant interface that lives within the confines of a mirror. The project uses an acrylic sheet which, when nothing is displayed behind it, appears on the surface like any other mirror. The key is that once light from the screen is enabled, the user interface shines through and the user is able to interact further.
Jetson Orin AGX
The Jetson Orin AGX is a powerful developer kit for running local LLMs, generative AI, and other compute-heavy workloads.
There are already a lot of great guides for configuring a Jetson Orin AGX, including https://www.hackster.io/shahizat/getting-started-with-ai-on-nvidia-jetson-agx-orin-dev-kit-5a55b5, which can be referenced for further background.
Update and install the associated NVIDIA JetPack package:
sudo apt update
sudo apt install nvidia-jetpack
Enable max performance mode and set max frequency for the clocks:
sudo nvpmodel -m 0
sudo jetson_clocks
NVMe SSD
The first thing to keep in mind when dealing with the Jetson Orin AGX is that in order to run machine learning models of moderate size you will need an NVMe SSD installed. You won't even be able to fully install Riva, one of the requirements, without it, as it takes up a significant amount of space.
It's outside the scope of this guide and more of a system setup issue, but for completeness' sake the first steps are:
- Get and install an SSD on the device
- After formatting and preparing your SSD, make sure to change Docker's data location to use the SSD for containers
In my case I used the following commands (where /mnt/storage is my NVMe mount):
sudo vim /etc/docker/daemon.json
Update "data-root" to point to a folder for your docker. In my case I updated it to: "/mnt/storage/docker":
{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "default-runtime": "nvidia",
    "data-root": "/mnt/storage/docker"
}
Make sure to restart Docker afterwards:
sudo systemctl restart docker
I believe I also had to change permissions to get things fully working (this was in my notes, but if everything works for you, feel free to skip it):
chown -R root:root /mnt/storage/docker
chmod 701 /mnt/storage/docker
Hardware
In addition to the Jetson Orin AGX, a microphone, a speaker, and the monitor, you will need a frame and an acrylic reflective sheet. I've included links to the ones we used for this project to make things easier for those following along.
Outside of the frame setup, which we cover in more detail further on, there isn't much to do aside from plugging in the associated cords for the screen and inputs.
ngc-cli
The ngc-cli tool is needed next. The UI provides a download link if you're logged in on the following page: https://org.ngc.nvidia.com/setup/installers/cli. Make sure to add the CLI to your path, create an API key, and log in.
Riva
With ngc available you can download and install Riva:
ngc config set
ngc registry resource download-version "nvidia/riva/riva_quickstart_arm64:2.14.0"
Once downloaded you can cd into that directory and test Riva by starting it up:
cd riva_quickstart_arm64_v2.14.0
bash riva_init.sh
bash riva_start.sh
You should see a response like the following (it will take a while the first time as it needs to download models):
Riva Speech already running. Skipping...
Riva server is ready...
Use this container terminal to run applications:
Testing TTS
You can test TTS by running the following command, as per their examples:
riva_tts_client --voice_name=English-US.Female-1 --text="Hello, this is a speech synthesizer." --audio_file=/opt/riva/wav/output.wav
You can then copy this out of the container to your Downloads folder like so:
docker cp riva-speech:/opt/riva/wav/output.wav ~/Downloads/output.wav
As I was using a remote session for testing with Riva, I then copied this to my local machine (run from my local home directory so it ended up in ~/Downloads):
scp user@192.168.1.100:~/Downloads/output.wav ./Downloads
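If you'd rather script TTS than use the CLI client, something like the following sketch should work from the host, assuming the nvidia-riva-client Python package is installed and Riva is listening on localhost:50051 (the default); verify the parameters against your client version:
import wave
import riva.client

# connect to the local Riva server (default gRPC port)
auth = riva.client.Auth(uri="localhost:50051")
tts = riva.client.SpeechSynthesisService(auth)

sample_rate = 44100
resp = tts.synthesize(
    text="Hello, this is a speech synthesizer.",
    voice_name="English-US.Female-1",
    sample_rate_hz=sample_rate,
)

# resp.audio is raw 16-bit mono PCM, so wrap it in a WAV container
with wave.open("output.wav", "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(sample_rate)
    out.writeframes(resp.audio)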
Testing ASR
It can be helpful to use an audio device of your own choosing when testing ASR.
First, follow this guide for getting your Jetson Orin AGX set up with Python audio: https://jetsonhacks.com/2023/08/07/speech-ai-on-nvidia-jetson-tutorial/
The following guide can be used to further test with that installed: https://github.com/dusty-nv/jetson-containers/tree/master/packages/audio/riva-client#list-audio-devices
For example:
./run.sh --workdir /opt/riva/python-clients $(./autotag riva-client:python) \
python3 scripts/list_audio_devices.py
In my case I see the output:
AUDIO DEVICES:
0: HD Pro Webcam C920: USB Audio (hw:0,0) (inputs=2 outputs=0 sampleRate=32000)
This indicates my webcam is set up and able to receive audio at a sample rate of 32000 Hz.
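If you want to sanity-check your devices outside the container, a minimal sketch using the sounddevice package (an assumption on my part; install it with pip) gives similar information:
import sounddevice as sd

# print every device that can act as an input, with its default sample rate
for idx, dev in enumerate(sd.query_devices()):
    if dev["max_input_channels"] > 0:
        print(f"{idx}: {dev['name']} "
              f"(inputs={dev['max_input_channels']}, "
              f"sampleRate={int(dev['default_samplerate'])})")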
I'll go ahead and use that with the example test they provided to transcribe audio from my microphone:
./run.sh --workdir /opt/riva/python-clients $(./autotag riva-client:python) \
python3 scripts/asr/transcribe_mic.py --input-device=0 --sample-rate-hz=32000
I can see the response as I speak in the terminal:
## i'm testing my microphone now and it's working this time
## when i had an issue previously what happened was my microphone would completely freeze and i'd be unable to even control c out of the console
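For reference, the transcribe_mic.py script boils down to roughly the following. This is a simplified sketch based on the nvidia-riva-client examples, so treat the exact parameters as assumptions to verify against your client version:
import riva.client
import riva.client.audio_io

# connect to the local Riva server
auth = riva.client.Auth(uri="localhost:50051")
asr = riva.client.ASRService(auth)

config = riva.client.StreamingRecognitionConfig(
    config=riva.client.RecognitionConfig(
        encoding=riva.client.AudioEncoding.LINEAR_PCM,
        language_code="en-US",
        sample_rate_hertz=32000,  # match your microphone's sample rate
        max_alternatives=1,
    ),
    interim_results=True,
)

# device 0 is my webcam microphone from the listing above
with riva.client.audio_io.MicrophoneStream(rate=32000, chunk=3200, device=0) as mic:
    responses = asr.streaming_response_generator(
        audio_chunks=mic, streaming_config=config
    )
    riva.client.print_streaming(responses=responses, show_intermediate=True)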
Side note: Microphone issues
I ran into quite a few issues finding a compatible microphone for inferencing. One Logitech camera I owned, which I had previously used with a Google Coral Dev Board Mini, worked for video inferencing on the device, but nothing I tried would get its audio input working.
Note that when I used an invalid microphone the ASR script would simply freeze and become unusable, while the working microphone just worked without any effort. One of my microphones did work for one line but then froze immediately after the first prompt, leading me to waste quite a bit of time debugging it thinking I was close. Luckily we had a spare old webcam that did work which I could use here. In the future I'll look for a more attractive USB microphone for the setup.
MagicMirror Setup
MagicMirror's setup is fairly straightforward on the Orin AGX. Install Node.js and then follow the associated guide: https://docs.magicmirror.builders/getting-started/installation.html#manual-installation
Here is the full list of commands I ran to do so on my Orin AGX:
sudo apt-get install curl
curl -sL https://deb.nodesource.com/setup_18.x | sudo -E bash -
sudo apt-get install nodejs -y
git clone https://github.com/MagicMirrorOrg/MagicMirror
cd MagicMirror/
npm run install-mm
cp config/config.js.sample config/config.js
npm run start
After that you'll see your linked screen taken over by the MagicMirror install. You can use the configuration file you copied earlier to adjust your settings and prepare your install for normal use. The MagicMirror documentation includes further details on configuration: https://docs.magicmirror.builders/configuration/introduction.html
Jetson MagnaMirror Module
The last element of this project, and the key component that ties everything together, is the module I've created to interface with the Llama model, using Riva for voice recognition.
First, fetch my repository onto your device:
https://github.com/Cosmic-Bee/MMM-JetsonMagnaMirror
You'll need to place most of the repository under your modules folder in MagicMirror. Please refer to their documentation on module installation in case of any issues, but in general, as long as the folder exists under modules it can be found once configured.
Then, in your mirror's configuration file, enable it and select the portion of the mirror where you want it to show up:
{
    module: "MMM-MagnaMirror",
    position: "top_right",
},
Pairing Bluetooth Audio
Initially I had hoped to use my monitor's connected speakers, but I found the audio to be very choppy when running the ASR logic (the basic demo reading out what it's working on, so nothing special on my part). To deal with this issue I had to switch gears and use a different audio approach. I checked my home for any USB speakers but found none, and then remembered the Orin AGX supports Bluetooth, so it should likely support audio too.
The Jetson Orin AGX ships with Bluetooth in a limited configuration where it can't be used for audio. With a quick modification to one configuration file, some updates, and installation of the audio plumbing, you'll be able to get it working like I did. In the end this was a happy issue to debug: I realize now that the monitor's built-in speakers would have been blocked by the frame anyway, so I would have needed to get the audio out some other way regardless.
sudo vim /lib/systemd/system/bluetooth.service.d/nv-bluetooth-service.conf
Adjust the line:
ExecStart=/usr/lib/bluetooth/bluetoothd -d --noplugin=audio,a2dp,avrcp
To be instead:
ExecStart=/usr/lib/bluetooth/bluetoothd -d
sudo apt-get install pulseaudio-module-bluetooth
sudo reboot
After this I was able to attach my Bluetooth audio device by searching for Bluetooth in the Jetson UI and configuring the now-available speaker. Once it was configured I used the sound settings to change the default output device.
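To confirm audio actually routes to the new default speaker, a quick tone test works. This sketch assumes numpy and sounddevice are installed:
import numpy as np
import sounddevice as sd

# two seconds of a quiet 440 Hz sine wave on the system default output
sample_rate = 44100
t = np.linspace(0, 2.0, int(sample_rate * 2.0), endpoint=False)
sd.play(0.2 * np.sin(2 * np.pi * 440.0 * t), samplerate=sample_rate)
sd.wait()  # block until playback finishes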
Running the text-generation-webui
After your initial startup you can specify which model you want the web UI to use. From inside the web UI you can also download additional models. Start it up now and download a model from within.
Starting the text-generation-webui:
./run.sh --workdir /opt/text-generation-webui $(./autotag text-generation-webui:1.7) \
python3 server.py --listen --verbose --api \
--model-dir=/data/models/text-generation-webui --model=TheBloke_Llama-2-7b-Chat-GPTQ \
--loader=llamacpp --n-gpu-layers=128 --n_ctx=4096 --n_batch=4096 \
--threads=$(($(nproc) - 2))
This provides an API that serves as the basis of our chat. Riva provides speech recognition, which is converted into a prompt; a pass is made over the prompt to determine whether the 'wake word' was said; and if so, the prompt (sans wake word) is sent to the text-generation-webui API to produce a response block in the chat. This is then shown on the mirror, and once there are enough messages they are scrolled away. Messages that scroll out of view are dropped, as I currently don't provide a means to scroll back.
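As a rough illustration of that pass, here is a minimal sketch. The endpoint and payload follow the legacy text-generation-webui blocking API that the --api flag exposed around this release, so verify them against your version; the wake words are the ones the module listens for:
import requests

# wake words checked against each transcript (lowercased)
WAKE_WORDS = ("other me", "mirror mirror on the wall")
API_URL = "http://localhost:5000/api/v1/generate"  # legacy blocking API

def handle_transcript(transcript: str):
    """Return the model's reply if a wake word was heard, else None."""
    lowered = transcript.lower()
    for wake in WAKE_WORDS:
        if wake in lowered:
            # everything after the wake word becomes the prompt
            prompt = lowered.split(wake, 1)[1].strip()
            payload = {"prompt": prompt, "max_new_tokens": 200}
            reply = requests.post(API_URL, json=payload, timeout=60).json()
            return reply["results"][0]["text"]
    return None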
With the API up, the next step is to run the MagicMirror install itself. You will need to wait a few moments for the text-generation-webui to finish loading so its API is available, but once it is, the MagicMirror install can be started with `npm run start` from its download location.
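One way to avoid guessing when the API is ready is to poll it before launching MagicMirror. A small sketch, again assuming the legacy API's port and model endpoint:
import time
import requests

def wait_for_api(url="http://localhost:5000/api/v1/model", timeout=300):
    # poll until the endpoint answers or we give up
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            if requests.get(url, timeout=5).ok:
                return True
        except requests.RequestException:
            pass
        time.sleep(5)
    return False

if wait_for_api():
    print("text-generation-webui API is up; safe to start MagicMirror")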
In addition to the MagicMirror install, the Python script that runs the Riva wake word processing and passes messages to the MagicMirror install will need to be run.
That can be run via the following (note: it looks for a USB microphone as device 0 and has the sample rate predefined as 32000):
python3 scripts/magna-mirror.py
This assumes you have downloaded the script to that location relative to your current directory (I placed mine in my home directory under a subdirectory called 'scripts', hence the path above).
With that running you can test ASR by saying the wake words "Other Me" or "Mirror Mirror On the Wall", both of which will send the prompt to the local text-generation-webui API for further processing. The rest should happen automatically, with responses appearing on the screen. Additional models can be used and further adjustments can be made to the module to improve it. It's of course not as fully fleshed out as llamaspeak, but it's a nice start for a module interfacing with the Jetson Orin AGX generative APIs.
Next Steps
There are several further steps that could be taken for this project. Primarily, additional work could be done to support a webcam as an image input for all sorts of fun local vision tasks. I could imagine trying on different outfits or getting advice about a weird mark (although it's probably best not to do that via a machine learning model but to visit a real doctor -- for now at least).
From a hardware perspective I would like to do something better for the microphone input. Either something fancy with a red button for triggering the voice commands (to avoid needing to run inference all the time), or perhaps just something hidden as part of the frame, but some addition there would be welcome.
I also need to monitor the heat situation with the frame. My wife added a set of buffers around the corners to offset the device so the airflow would not be constricted, but I'm not sure how this will hold up over long-term use. We do keep the Jetson Orin AGX outside of the enclosure, though, which keeps the frame light and avoids trapping the device in a potentially hot spot.