Seeed Studio's Local Voice Chatbot Puts a Speech-Recognizing LLaMa-2 LLM on Your NVIDIA Jetson
Providing it has 16GB or more of RAM available, mind you.
Seeed Studio has announced the launch of the Local Voice Chatbot, an NVIDIA Riva- and LLaMa-2-based large language model (LLM) chatbot with voice recognition capabilities — running entirely locally on NVIDIA Jetson devices, including the company's own reComputer range.
"In a world where artificial intelligence is evolving at an inventive pace, the mode of human-computer interaction has taken a revolutionary turn towards voice interaction. This shift is particularly evident in smart homes, personal assistants, and customer service support, where the demand for seamless and responsive voice chatbots is on the rise," claims Seeed Studio's Kunzang Cheki.
"However, the reliance on cloud-based solutions has brought about concerns related to data privacy and network latency. In response to these challenges, we present an innovative Local Voice Chatbot project that operates locally, addressing privacy issues and ensuring swift responses."
The Seeed Local Voice Chatbot builds atop two existing projects: NVIDIA's Riva, a hardware-accelerated automatic speech recognition (ASR) and speech synthesis engine, and Meta AI's LLaMa-2 LLM. The idea is simple: speech is picked up by a microphone and converted to text by Riva's ASR; the text is fed to LLaMa-2, which generates a plausible text-based response; and the response is then fed through the Riva text-to-speech engine to render it audible.
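The three-stage flow described above can be sketched in a few lines of Python. This is a minimal illustration rather than Seeed Studio's implementation: the `VoicePipeline` class and the `transcribe`, `generate`, and `synthesize` callables are hypothetical stand-ins for Riva's ASR, LLaMa-2, and Riva's text-to-speech engine respectively, and no real Riva or LLaMa-2 API is called.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    """Chains speech-to-text, text generation, and text-to-speech stages.

    All three callables are hypothetical placeholders; a real deployment
    would wrap an ASR engine, an LLM, and a TTS engine behind them.
    """
    transcribe: Callable[[bytes], str]   # audio in -> recognized text
    generate: Callable[[str], str]       # prompt text -> response text
    synthesize: Callable[[str], bytes]   # response text -> audio out

    def respond(self, audio_in: bytes) -> bytes:
        text = self.transcribe(audio_in)
        if not text.strip():
            # Nothing was recognized, so there is nothing to answer.
            return b""
        reply = self.generate(text)
        return self.synthesize(reply)


# Stub stages so the sketch runs without any speech or LLM software.
def fake_asr(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(prompt: str) -> str:
    return f"You said: {prompt}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = VoicePipeline(fake_asr, fake_llm, fake_tts)
print(pipeline.respond(b"hello"))
```

Keeping each stage behind a plain callable means any one of them could, in principle, be swapped for a real engine without touching the other two.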
"Traditional voice chatbots heavily depend on cloud computing services, raising valid concerns about data privacy and network latency. Our project focuses on deploying a voice chatbot that operates entirely [on local] hardware, mitigating privacy concerns and offering a faster response time," Cheki claims. "The overall architecture ensures a secure, private and fast-responding voice interaction system without relying on cloud services, addressing data privacy and network latency concerns."
Running everything locally does come at a cost, of course: while the software itself is compatible with any model of NVIDIA Jetson, the memory-hungry LLM won't work properly on anything with less than 16GB of RAM — meaning the pocket-friendly Jetson Nano range is shut out of the project. "I completed all experiments using [a] Jetson AGX Orin 32GB H01 Kit," Cheki notes.
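For anyone unsure whether a given board clears the 16GB bar, a rough check can be sketched in Python by reading the `MemTotal` line that Linux exposes in `/proc/meminfo`. The function names, the 10 percent tolerance, and the sample values below are assumptions for illustration, not part of Seeed Studio's project; the tolerance is there because the total a Linux system reports is typically a little lower than the module's nominal capacity.

```python
import re


def mem_total_gib(meminfo_text: str) -> float:
    """Return the MemTotal value from /proc/meminfo-style text, in GiB."""
    match = re.search(r"^MemTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemTotal line not found")
    return int(match.group(1)) / (1024 ** 2)  # kB -> GiB


def meets_requirement(meminfo_text: str,
                      required_gb: float = 16.0,
                      tolerance: float = 0.1) -> bool:
    """True if reported memory is within `tolerance` of `required_gb` or above.

    The tolerance is an assumed allowance for memory the system reserves.
    """
    return mem_total_gib(meminfo_text) >= required_gb * (1 - tolerance)


# Illustrative, made-up readings for nominal 8GB, 16GB, and 32GB devices.
sample_8gb = "MemTotal:        7800000 kB\nMemFree:         1000000 kB\n"
sample_16gb = "MemTotal:       15800000 kB\nMemFree:         9000000 kB\n"
sample_32gb = "MemTotal:       31000000 kB\nMemFree:        20000000 kB\n"

for name, text in [("8GB", sample_8gb), ("16GB", sample_16gb), ("32GB", sample_32gb)]:
    print(name, meets_requirement(text))
```

On an actual device, the same check could be run against the live file with `meets_requirement(open("/proc/meminfo").read())`.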
The project is documented in full on the Seeed Studio wiki.