The "Assistant to the Minutes of the Meeting" project by Seeed Studio Application Engineers is an advanced, fully local AI-powered tool designed to improve the process of recording and managing meeting minutes. By keeping all data within the company’s internal network, the tool ensures enhanced data privacy and security, significantly reducing the risk of data breaches or unauthorized access. This fully local implementation also helps companies retain complete control and ownership over their data.
The tool leverages cutting-edge AI capabilities, including speech recognition and natural language processing (NLP). These technologies enable accurate transcription of spoken words into written text, as well as the organization and understanding of meeting content. The AI can identify key points, decisions, and action items, making it easier to generate concise and comprehensive meeting minutes. Additionally, the system allows for efficient searching and retrieval of specific discussions or decisions, aiding in the referencing of past meetings and tracking the progress of action items.
Block Diagram

The block diagram of the complete structure:
Note: Please follow the instructions below to install the prerequisite services.
Prerequisite-1. NVIDIA® Riva -- ASR service

NVIDIA® Riva is a suite of GPU-accelerated microservices designed to handle all your speech and translation needs. It's like having a Swiss Army knife for building real-time conversational AI pipelines. Riva includes:
- Automatic Speech Recognition (ASR): Converts spoken language into text.
- Text-to-Speech (TTS): Turns written text into natural-sounding speech.
- Neural Machine Translation (NMT): Translates text from one language to another with impressive accuracy.
And the best part? It’s deployable at the edge, meaning you can run these services locally on your own hardware for super low latency and enhanced privacy.
Here are the Wiki steps on how to deploy Riva on reComputer Jetson:
https://wiki.seeedstudio.com/Local_Voice_Chatbot/#install-riva-server
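Once the Riva server is running, transcription can be driven from Python. The sketch below is a minimal illustration, not the project's actual code: it assumes the server listens at the default `localhost:50051`, that the `nvidia-riva-client` package is installed, and that you have a `meeting.wav` recording. The file name, server address, and helper names are assumptions for illustration.

```python
# Sketch: offline transcription against a local Riva ASR server.
# Assumptions: server at localhost:50051, `pip install nvidia-riva-client`.

def transcript_from_response(response):
    """Join the top alternative of each recognition result into one string."""
    return " ".join(
        result.alternatives[0].transcript.strip()
        for result in response.results
        if result.alternatives
    )

def transcribe_file(path, server="localhost:50051"):
    """Send a WAV file to the local Riva server and return the transcript."""
    import riva.client  # imported lazily; needs the nvidia-riva-client package

    auth = riva.client.Auth(uri=server)
    asr = riva.client.ASRService(auth)
    config = riva.client.RecognitionConfig(
        language_code="en-US",
        max_alternatives=1,
        enable_automatic_punctuation=True,
    )
    with open(path, "rb") as f:
        audio = f.read()
    return transcript_from_response(asr.offline_recognize(audio, config))
```

For a meeting-minutes workflow, `transcribe_file("meeting.wav")` would produce the raw text that the later summarization step works on.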
Prerequisite-2. Ollama -- Local LLM

What is Ollama?
Ollama is a popular LLM tool designed for simplicity and efficiency. It comes with a built-in model library featuring pre-quantized weights, which means the heavy lifting of model optimization is already done for you. Under the hood, Ollama uses llama.cpp for inference, ensuring you get top-notch performance. Plus, the Ollama container is compiled with CUDA support, leveraging the power of GPUs to accelerate your AI tasks.
Why Choose Ollama?
Here are a few reasons why Ollama stands out:
- Ease of Use: Ollama is designed to be beginner-friendly, making it easy to get started with LLMs.
- Pre-Quantized Models: The included model library comes with pre-quantized weights, saving you time and effort in optimizing the models.
- GPU Acceleration: With CUDA support, Ollama can harness the power of your NVIDIA GPU to speed up inference.
- Efficient Inference: By using llama.cpp, Ollama ensures efficient and fast model inference.
Here is the tutorial on how to deploy Ollama on reComputer Jetson:
https://www.jetson-ai-lab.com/tutorial_ollama.html
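With Ollama running, a transcript can be summarized through its REST API on the default port 11434. This is a hedged sketch, not the project's actual code: the model name `llama3` and the prompt wording are assumptions, and the request only succeeds if an Ollama server is actually running with that model pulled.

```python
# Sketch: draft meeting minutes from a transcript via Ollama's /api/generate
# endpoint. Assumptions: Ollama running on localhost:11434, model "llama3" pulled.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default endpoint

def build_generate_request(model, prompt, stream=False):
    """Build the JSON body for Ollama's /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": stream}

def summarize_minutes(transcript, model="llama3"):
    """Ask a local Ollama model to extract key points, decisions, action items."""
    body = build_generate_request(
        model,
        "Summarize the key points, decisions, and action items "
        "from this meeting transcript:\n\n" + transcript,
    )
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:  # requires a live Ollama server
        return json.load(resp)["response"]
```

Setting `"stream": False` makes Ollama return one complete JSON object instead of a stream of partial chunks, which keeps the client side simple.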
Prerequisite-3. LlamaIndex -- RAG

LLMs like GPT-4 are trained on vast amounts of data, giving them a broad understanding of language and knowledge. However, they don’t have access to your specific data out of the box. This is where RAG (Retrieval-Augmented Generation) comes into play. RAG enhances LLMs by integrating your unique data, making the AI responses more relevant and tailored to your needs.
How Does RAG Work?
Here’s a simple breakdown of the RAG process:
- Data Indexing: Your data is loaded and organized into an index. Think of this index as a highly efficient library catalog that the LLM can quickly reference.
- User Queries: When a user asks a question, the query is matched against the index to find the most relevant pieces of information.
- Context Generation: The filtered data from the index is combined with the user’s query to provide context.
- LLM Response: This context, along with the original query, is fed into the LLM. The LLM then generates a response that is not only informed by its training but also tailored to your specific data.
Here are the Wiki steps on how to deploy LlamaIndex on reComputer Jetson:
https://wiki.seeedstudio.com/Local_RAG_based_on_Jetson_with_LlamaIndex/
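The four RAG steps above can be sketched with a toy keyword-overlap retriever. This is purely illustrative: it is not LlamaIndex (which uses vector embeddings), and all function names and sample documents are made up for the example.

```python
# Toy illustration of the RAG pipeline: index -> retrieve -> build context.
# Real deployments (e.g. LlamaIndex) rank by embedding similarity, not word overlap.

def build_index(documents):
    """Step 1 (Data Indexing): map each document to its set of lowercase words."""
    return [(doc, set(doc.lower().split())) for doc in documents]

def retrieve(index, query, top_k=1):
    """Step 2 (User Queries): rank documents by word overlap with the query."""
    query_words = set(query.lower().split())
    ranked = sorted(index, key=lambda item: len(item[1] & query_words), reverse=True)
    return [doc for doc, _ in ranked[:top_k]]

def build_prompt(index, query):
    """Steps 3-4: combine retrieved context with the query for the LLM."""
    context = "\n".join(retrieve(index, query))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

index = build_index([
    "The budget decision was approved on Monday.",
    "Action item: send the report to the team.",
])
print(retrieve(index, "what was the budget decision"))
```

The resulting prompt from `build_prompt` is what would be fed to the local LLM in the last step, so the answer is grounded in your own meeting records rather than only the model's training data.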
Meeting Assistant app

With all the basic dependencies set up, you can start using the "Assistant to the Minutes of the Meeting" app.
Step 1: Clone the AI Meeting Assistant repository from
https://github.com/Seeed-Projects/Assistant-to-the-Minutes-of-the-Meeting
git clone https://github.com/Seeed-Projects/Assistant-to-the-Minutes-of-the-Meeting.git
Step 2:
Step 3:
Overall, the "Assistant to the Minutes of the Meeting" project offers a secure, efficient, and intelligent solution for managing meeting minutes. By automating the minute-taking process, it reduces the time and effort required, freeing employees to focus on more strategic activities and enhancing overall productivity. The fully local implementation ensures that sensitive information remains protected and within the company’s control, addressing critical concerns around data privacy and security.