In this project I'm going to walk you through making a magical mirror using a Raspberry Pi and Gemini. I'm specifically using the Gemini Live API, which is currently in preview, so things may need to be updated later if you are viewing this tutorial a while after it's published, but that's OK! Have fun with it, and I hope you learn something useful.
This project is also using the new JavaScript/TypeScript SDK, so you can find the documentation here for expanding on the project.
Initial Hardware Setup
This project is based on this magic mirror project. While I stuck very closely to how they built the mirror, I did do a few things differently, which I'll cover here. That said, how you build your mirror has a very good chance of being different due to monitor sizes, materials and tools available, and a few other factors, so this is more an explanation around how I built *my* mirror, and hopefully it's a helpful addition to the original (and great) tutorial.
Here's a list of materials that I used for this project (I'll link to them on Amazon, though depending on when you read this, the links may no longer work, so I'll try my best to describe the items as well). I do want to mention that I'm not associated with *any* of the things I bought or list here; they just happened to be what I picked up or already had on hand to make this project work - if you have something you think would work better, absolutely use it and let others know in the comments.
- A Raspberry Pi. I used a 3B because I have an obscene number of them, and I figured if I can make this project work with a device that only has 1GB of RAM, then others should be golden if they have a more powerful board. Make sure you have a power cord that is appropriate for the Pi - I use one with a power toggle that I really like so I can easily turn things on and off.
- An SD card. I'm using a 32GB card, but you could easily use a smaller one and still be fine.
- A mini HDMI to HDMI cable. I'm using one that uses a left 90 degree angle to better fit/hide it in the setup. If you're using a Raspberry Pi that is a different version than the 3B or a different monitor that does not have a mini HDMI port, you may need a different connector.
- This monitor (KYY Portable Monitor 15.6inch) because it's thin, small, and most importantly, affordable.
- 18x24x0.04 inch two-way mirrored acrylic. You'll want to make sure anything you buy is two-way mirrored because the way this is set up, the monitor is placed behind it, so you want a reflective surface that still allows light to pass through.
- 11.69x16.53x0.08 inch clear acrylic sheets. Thickness doesn't matter as much for this, just the height and width. This will be used as a piece to hold the Raspberry Pi and monitor internals together.
- Black cardstock. This is for hiding the edges of the mirror around the monitor, so you'll just need to make sure it's large enough to be cut down to 16" x 10.75".
- A microphone. I'm using an AT2020 because I already owned it, but you should be able to use whatever you have available. When we get to the code for this tutorial, I'll point out the one value that needs to be changed to match your microphone's input sample rate (for example, I use 44,100 for the AT2020, but other microphones may be 16,000).
- Black duct tape for attaching things and blocking light from coming in through the edges of the cardstock to the monitor.
- Double sided mounting tape. This is used for attaching the monitor internals to the clear acrylic backing.
- M3 screws and spacers. You'll need a few different lengths, mentioned in the original tutorial. I personally went with a bigger variety box because these things are used in a lot of different projects.
- Tools to disassemble the monitor frame. I picked up a cheap set of electronics opening tools that worked out well, but I only used a couple of the pieces. You'll also want a heat gun or hair dryer to heat up and soften the glue attaching the monitor electronics to the plastic frame.
- Optional materials for a standing frame. I used some scrap MDF I had from another project and laser cut a stand that I found for free online (though I did scale it up), but I think you could also use a painting easel or anything else that works for you. I highly recommend using some kind of stand for this project.
Alright, I think that's all the supplies, so let's get into actually making this awesome project. I'm going to skip over how to disassemble the monitor because I think that's covered really well in the original tutorial, and I think credit should be given where credit is due for a great project start, so go check that out.
To start, I don't have the steadiest hand for scoring and snapping the mirrored acrylic, and cutting in various ways caused a lot of chipping. After a few failed attempts (and some ruined acrylic - sorry Google, but thanks for letting me expense this!), I went all in and cut my sheet into two 12"x18" pieces on a table saw, which worked out well. If you're more comfortable with other methods, that's great, but I just want to say up front that it isn't the easiest material to work with.
After getting my mirrored acrylic into a workable size, I figured I'd save myself the next headache and move everything to the laser cutter. Heads up: if you're not *very* familiar with laser cutters, I really recommend babysitting the machine during cuts, especially with the cardstock. Did I set a piece on fire during my first attempt? Absolutely. Will I set something on fire again in the future when I get careless? Probably, but hopefully not! Definitely be mindful of your tools.
If you happen to be using the same monitor that I linked above, the dimensions that I've found work for the overall mirror are 16" by 10.75", with a cutout in the black cardstock that's 13.25" by 7.25". I also added the holes along the edges for the M3 screws so you don't need to drill them or 3D print the jigs (though I did print those jigs on a separate attempt at this, and they do work out really well!). I've attached a 96 DPI PDF of the file that I designed in Affinity Designer. You'll want to cut the green and red shapes on every piece (mirrored acrylic, clear acrylic, and cardstock), but the blue center rectangle is only cut out of the cardstock.
At the end of this you should have the three separate materials all cut to matching sizes with aligned holes. Don't forget to remove the plastic covers on the two acrylic sheets before screwing everything together!
For the remainder of the mirror assembly, follow the original tutorial. Once you have your Raspberry Pi set up and attached to the mirror, you'll need to go through the steps of setting up the Magic Mirror software, making sure your display is rotated 90 degrees, the scale is correct, and everything starts up on boot. Once you have a base project, it's time to dive into the new Gemini-connected magic mirror code.
I highly recommend being able to SSH into your Raspberry Pi as everything else for this tutorial will be done through a terminal and git, though you can also hook up a keyboard to the mirror and interact with the device's terminal directly.
Initial Code Setup
All of the code for this project is attached to this Hackster.io project, though if you want to run the latest code directly on your mirror without building it from the ground up, you can clone this GitHub project into your modules folder under the Magic Mirror project on the Raspberry Pi, run npm install, and then update your configuration file to display the module. This will constantly stream audio, so you may want to modify the project to allow for budget considerations (for example, add push-to-talk on the microphone), but since this is a hack project, you should definitely modify it in any way that makes sense for you. If you are in a noisy environment, be aware that the API is currently very sensitive and prone to interruptions when it detects new sounds.
You will also need to update the configuration file to include your Gemini API key, which can be created or found under Google's AI Studio here.
While the Live API, which is the core of this project, is free up to a limited number of requests per day, image generation is not (at the time of this writing), so you may need to change your app depending on what you're attempting to do. You can find a full description of pricing here, as new models are constantly coming out with different capabilities and pricing.
The base for this project is the Magic Mirror Module template, which you can find and clone here if you would like to follow along from the ground up. After you have a base project, it's time to update the Magic Mirror's config/config.js file. For reference, my addition for this module looks like this:
{
    module: 'MMM-Gemini',
    position: 'lower_third',
    config: {
        apiKey: 'MY_API_KEY_HERE',
    }
}
The most important part is your API key in the config file, as this is what the module passes to the helper to authenticate with the Gemini API, and it stays local to your machine.
There are three main files that we'll work with for this project: MMM-Gemini.js (though if you cloned the template, it will be called MMM-Template.js), which handles all of the UI, MMM-Gemini.css, which styles that UI, and node_helper.js, which is where all of the heavy lifting happens. To keep things simple, I'm going to skip over the CSS file, but you can find the code for it on this project page, or in the completed GitHub project.
As for the UI file, we're basically creating a UI state machine that goes through INITIALIZING, READY, RECORDING, and ERROR. The socketNotificationReceived function can receive a payload and notification from the helper that tells it which state should be displayed in the UI, and then it will update the DOM.
This function will also accept notifications for GEMINI_IMAGE_GENERATING and GEMINI_IMAGE_GENERATED to display a progress spinner or a generated image (received in base64 format) when that operation is invoked, or it will display any text generated by the Gemini Live API until a turnComplete response is received, then it will clear that text when the next response is sent. You can find all of the code for the UI attached to this project and use that as a starting point.
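If you'd like a feel for the shape of that file before grabbing the full version, here's a minimal sketch of the frontend module built on the standard MagicMirror² module API. It only covers the basic states and text display (the image notifications are left out), and the finished MMM-Gemini.js handles more cases, so treat this as a starting point rather than the real thing.
/* Minimal sketch of the UI state machine, assuming the notification names
   used by the node_helper later in this tutorial (HELPER_READY,
   RECORDING_STARTED, GEMINI_TEXT_RESPONSE, HELPER_ERROR). */
Module.register("MMM-Gemini", {
    defaults: { apiKey: null },

    start() {
        // Track a UI state and any text received so far
        this.currentState = "INITIALIZING"
        this.currentText = ""
        // Ask the node_helper to open the Gemini connection with our API key
        this.sendSocketNotification("START_CONNECTION", { apiKey: this.config.apiKey })
    },

    socketNotificationReceived(notification, payload) {
        switch (notification) {
            case "HELPER_READY":
                this.currentState = "READY"
                this.sendSocketNotification("START_CONTINUOUS_RECORDING")
                break
            case "RECORDING_STARTED":
                this.currentState = "RECORDING"
                break
            case "GEMINI_TEXT_RESPONSE":
                this.currentText += payload.text // Accumulate text until the turn completes
                break
            case "HELPER_ERROR":
                this.currentState = "ERROR"
                this.currentText = payload?.error || ""
                break
        }
        this.updateDom()
    },

    getDom() {
        const wrapper = document.createElement("div")
        wrapper.className = "mmm-gemini"
        wrapper.innerHTML = `<div class="status">${this.currentState}</div><div class="text">${this.currentText}</div>`
        return wrapper
    },
})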
If everything works as expected, you should have a UI similar to this:
With the core app put together, it's time to dig into the node_helper.js file and really get into the features of this magic mirror with Gemini!
Setting up the Gemini Live API and Speech Inputs
The first major step to making this magic mirror work is that we will need to be able to talk to the mirror, send that audio data to the Gemini Live API in real time, and wait for a response. To keep things simple, we'll start by asking for responses to be sent in text, and our UI will display it. Going into the node_helper.js file, let's add the various constants that will be required throughout this project.
const NodeHelper = require("node_helper")
const { GoogleGenAI, Modality, DynamicRetrievalConfigMode, Type, PersonGeneration } = require("@google/genai")
const recorder = require('node-record-lpcm16')
const { Buffer } = require('buffer')
const Speaker = require('speaker')
const INPUT_SAMPLE_RATE = 44100 // Recorder captures at 44.1KHz for AT2020, otherwise 16000 for other microphones. Hardware dependent
const OUTPUT_SAMPLE_RATE = 24000 // Gemini outputs at 24kHz
const CHANNELS = 1
const AUDIO_TYPE = 'raw' // Gemini Live API uses raw data streams
const ENCODING = 'signed-integer'
const BITS = 16
const GEMINI_INPUT_MIME_TYPE = `audio/pcm;rate=${INPUT_SAMPLE_RATE}`
const GEMINI_SESSION_HANDLE = "magic_mirror"
const GEMINI_MODEL = 'gemini-2.0-flash-live-001'
The NodeHelper constant should have already existed in your base project, as this is what the Magic Mirror project uses to know that this is the helper for the module.
Since this project is using the newest JavaScript/TypeScript SDK, we'll need to import the google/genai library, and include multiple type objects that will be used throughout the project. You can find more information on these types from the official documentation or source.
The buffer and speaker are both related to outputting audio, which we'll do later in this project. The recorder is what we'll use to record a live audio stream from the Pi's microphone to send to the Live API.
Moving into the next block, the INPUT_SAMPLE_RATE is related to the microphone that you have attached to your Raspberry Pi. Since I'm using an AT2020, my sample rate is 44100, but your microphone may have a different value.
The OUTPUT_SAMPLE_RATE is the expected sample rate from the Gemini API when we start playing back received audio; the API currently only outputs at 24kHz. The API uses one channel and outputs raw PCM audio data with signed-integer encoding at 16 bits. The GEMINI_INPUT_MIME_TYPE is the type of data that you'll send from the Pi to the API. The GEMINI_SESSION_HANDLE is a value that we'll use later to maintain session continuity between closes and reopens, as currently the Gemini Live API will automatically close a session after about ten minutes.
Finally, we have the GEMINI_MODEL. At the time of this writing the Gemini Live API is in preview, so this model will absolutely change over time based on when you are reading this tutorial. You'll want to check the Live API documentation to make sure you're using the best option for your project.
With that set, it's time to add all of the variables that will be used throughout this project. I initialize them in NodeHelper.create(), then I have an applyDefaultState() function that can be used to reset everything when the session closes or if an error occurs. I also added a set of logging functions for debugging. You don't *need* to include those, but I found them useful while working through this project, so I'll leave them as they are for this writeup. I also created a helper function called sendToFrontend to wrap sending the socket notification over to the UI frontend (MMM-Gemini.js).
module.exports = NodeHelper.create({
genAI: null,
liveSession: null,
apiKey: null,
recordingProcess: null,
isRecording: false,
audioQueue: [],
persistentSpeaker: null,
processingQueue: false,
apiInitialized: false,
connectionOpen: false,
apiInitializing: false,
imaGenAI: null,
// Logger functions
log: function(...args) { console.log(`[${new Date().toISOString()}] LOG (${this.name}):`, ...args) },
error: function(...args) { console.error(`[${new Date().toISOString()}] ERROR (${this.name}):`, ...args) },
warn: function(...args) { console.warn(`[${new Date().toISOString()}] WARN (${this.name}):`, ...args) },
sendToFrontend: function(notification, payload) { this.sendSocketNotification(notification, payload) },
applyDefaultState() {
this.genAI = null
this.liveSession = null
this.recordingProcess = null
this.isRecording = false
this.audioQueue = []
this.closePersistentSpeaker() // Close any open speaker before clearing the reference
this.persistentSpeaker = null
this.processingQueue = false
this.apiInitialized = false
this.connectionOpen = false
this.apiInitializing = false
this.imaGenAI = null
},
Most of these are for maintaining state, plus a closePersistentSpeaker() function that we'll add when it's time for audio out. If you're following along, feel free to comment that out until later. You'll also notice that there are values for genAI and imaGenAI. genAI is what will manage our live session, whereas imaGenAI is the Gemini object that will be used for image generation. If you're not using image generation in your project, you can remove that.
Now let's get into some of the good stuff. I have a function called initialize that will be used to kick off most of what we need for this project. For now I'll post an edited down version that we can add to as we go along.
async initialize(apiKey) {
this.log(">>> initialize called")
if (this.apiInitialized || this.apiInitializing) {
this.warn(`API initialization already complete or in progress. Initialized: ${this.apiInitialized}, Initializing: ${this.apiInitializing}`)
if (this.connectionOpen) {
this.log("Connection already open, sending HELPER_READY")
this.sendToFrontend("HELPER_READY")
}
return
}
if (!apiKey) {
this.error(`API Key is missing! Cannot initialize`)
this.sendToFrontend("HELPER_ERROR", { error: "API Key missing on server" })
return
}
this.apiKey = apiKey
this.apiInitializing = true
this.log(`Initializing GoogleGenAI...`)
try {
this.sendToFrontend("INITIALIZING")
this.log("Step 1: Creating GoogleGenAI instances...")
this.genAI = new GoogleGenAI({
apiKey: this.apiKey,
// httpOptions: { 'apiVersion': API_VERSION }
})
this.log(`Step 2: GoogleGenAI instance created.`)
this.log(`Step 3: Attempting to establish Live Connection with ${GEMINI_MODEL}...`)
this.liveSession = await this.genAI.live.connect({
model: GEMINI_MODEL,
callbacks: {
onopen: () => {
this.log(">>> Live Connection Callback: onopen triggered!")
this.connectionOpen = true
this.apiInitializing = false
this.apiInitialized = true
this.log("Connection OPENED. Sending HELPER_READY")
this.sendToFrontend("HELPER_READY")
},
onmessage: (message) => { this.handleGeminiResponse(message) },
onerror: (e) => {
this.error(`Live Connection ERROR: ${e?.message || e}`)
this.connectionOpen = false
this.apiInitializing = false
this.apiInitialized = false
this.liveSession = null
this.stopRecording(true)
this.closePersistentSpeaker() // Close speaker on error
this.processingQueue = false
this.audioQueue = []
this.sendToFrontend("HELPER_ERROR", { error: `Live Connection Error: ${e?.message || e}` })
},
onclose: async (e) => {
this.warn(`Live Connection CLOSED:`)
this.warn(JSON.stringify(e, null, 2))
const wasOpen = this.connectionOpen
if (wasOpen) {
this.sendToFrontend("HELPER_ERROR", { error: `Live Connection Closed Unexpectedly. Retrying...` })
} else { this.log("Live Connection closed normally") }
this.audioQueue = []
this.stopRecording(true)
this.closePersistentSpeaker() // Close speaker on close
this.applyDefaultState()
await this.initialize(this.apiKey)
},
},
config: {
responseModalities: [Modality.TEXT],
},
})
this.log(`Step 4: live.connect call initiated...`)
} catch (error) {
this.error(`API Initialization failed:`, error)
this.liveSession = null
this.apiInitialized = false
this.connectionOpen = false
this.apiInitializing = false
this.closePersistentSpeaker() // Ensure speaker is closed on init failure
this.processingQueue = false
this.audioQueue = []
this.sendToFrontend("HELPER_ERROR", { error: `API Initialization failed: ${error.message || error}` })
}
},
This code verifies that an API key is available, creates the GoogleGenAI object that is used to work with the Gemini API, and then creates a new live session. This live session uses the SDK's built-in WebSocket framework to handle sending and receiving data between the Gemini model and the Raspberry Pi, and it has a set of callbacks that will drive the state of the device. The most important is onmessage, which will send the response from Gemini to a new function that will determine how the mirror should react. There's also code in onclose that resets playback state and a few other things that haven't been written yet, so feel free to comment out the onclose callback until the end.
You'll also notice a config object. This will be a core part of this project, as it will contain every setting related to what we're doing with the mirror and the Gemini API. For now it'll just have a responseModalities value of [Modality.TEXT], meaning we want the Gemini model to only respond with text in the onmessage callback.
Moving down to the helper's socketNotificationReceived function, we'll want to go into the notification switch statement and add new cases for START_CONNECTION and START_CONTINUOUS_RECORDING. The START_CONNECTION case is what will call initialize, which will tell the frontend to update its UI state after initialization has completed. Once that UI state has been updated, another notification will be received to start recording. The full function looks like this:
socketNotificationReceived: async function(notification, payload) {
switch (notification) {
case "START_CONNECTION":
this.log(`>>> socketNotificationReceived: Handling START_CONNECTION`)
if (!payload || !payload.apiKey) {
this.error(`START_CONNECTION received without API key`)
this.sendToFrontend("HELPER_ERROR", { error: "API key not provided by frontend" })
return
}
try { await this.initialize(payload.apiKey) } catch (error) {
this.error(">>> socketNotificationReceived: Error occurred synchronously when CALLING initialize:", error)
this.sendToFrontend("HELPER_ERROR", { error: `Error initiating connection: ${error.message}` })
}
break
case "START_CONTINUOUS_RECORDING":
this.log(`>>> socketNotificationReceived: Handling START_CONTINUOUS_RECORDING`)
if (!this.connectionOpen || !this.liveSession) {
this.warn(`Cannot start recording, API connection not ready/open. ConnOpen=${this.connectionOpen}, SessionExists=${!!this.liveSession}`)
this.sendToFrontend("HELPER_ERROR", { error: "Cannot record: API connection not ready" })
if (!this.apiInitialized && !this.apiInitializing && this.apiKey) {
this.warn("Attempting to re-initialize API connection...")
await this.initialize(this.apiKey) // Await re-initialization
}
return
}
if (this.isRecording) {
this.warn(`Already recording. Ignoring START_CONTINUOUS_RECORDING request`)
return
}
this.startRecording()
break
}
},
Now actually doing the recording is a big step, so I'll break it into smaller parts. We'll start by creating a new function called startRecording(). This will check whether the device is already recording and whether the connection and live session are actually open. If we're already recording, or the connection isn't ready, the function will exit early.
startRecording() {
this.log(">>> startRecording called")
if (this.isRecording) {
this.warn("startRecording called but already recording")
return
}
if (!this.connectionOpen || !this.liveSession) {
this.error("Cannot start recording: Live session not open")
this.sendToFrontend("HELPER_ERROR", { error: "Cannot start recording: API connection not open" })
return
}
If everything is in the clear, then it's time to start recording. We can update our isRecording state value, then create a recorderOptions object.
this.isRecording = true
this.log(">>> startRecording: Sending RECORDING_STARTED to frontend")
this.sendToFrontend("RECORDING_STARTED")
const recorderOptions = {
sampleRate: INPUT_SAMPLE_RATE,
channels: CHANNELS,
audioType: AUDIO_TYPE,
encoding: ENCODING,
bits: BITS,
threshold: 0,
}
this.log(">>> startRecording: Recorder options:", recorderOptions)
this.log(`>>> startRecording: Using input MIME Type: ${GEMINI_INPUT_MIME_TYPE}`)
From there, it's time to call record on the recorder that we defined at the top of the file, then store a reference to the audio stream. I'm also creating a chunkCounter here for debugging, but you can ignore that if you want to have a bit cleaner code.
try {
this.log(">>> startRecording: Attempting recorder.record()...")
this.recordingProcess = recorder.record(recorderOptions)
this.log(">>> startRecording: recorder.record() call successful. Setting up streams...")
const audioStream = this.recordingProcess.stream()
let chunkCounter = 0 // Reset counter for new recording session
The audioStream will have a listener for any data that comes through. If a chunk isn't empty (empty chunks can happen if you unplug your mic, or if you have push-to-talk set up for when you're not using the mic), it'll convert that audio data into a base64 encoded string, which is then sent to the Gemini Live API as a new JSON payload using the liveSession.sendRealtimeInput function.
audioStream.on('data', async (chunk) => {
if (!this.isRecording || !this.connectionOpen || !this.liveSession) {
if (this.isRecording) {
this.warn(`Recording stopping mid-stream: Session/Connection invalid...`)
this.stopRecording(true) // Force stop if state is inconsistent
}
return
}
if (chunk.length === 0) {
return // Skip empty chunks
}
const base64Chunk = chunk.toString('base64')
chunkCounter++ // Increment counter for valid chunks
try {
const payloadToSend = {
media: {
mimeType: GEMINI_INPUT_MIME_TYPE,
data: base64Chunk
}
}
// Check liveSession again just before sending
if (this.liveSession && this.connectionOpen) {
await this.liveSession.sendRealtimeInput(payloadToSend)
} else {
this.warn(`Cannot send chunk #${chunkCounter}, connection/session lost just before send`)
this.stopRecording(true) // Stop recording if connection lost
}
} catch (apiError) {
const errorTime = new Date().toISOString()
this.error(`[${errorTime}] Error sending audio chunk #${chunkCounter}:`, apiError)
if (apiError.stack) {
this.error(`Gemini send error stack:`, apiError.stack)
}
// Check specific error types if possible, otherwise assume connection issue
if (apiError.message?.includes('closed') || apiError.message?.includes('CLOSING') || apiError.code === 1000 || apiError.message?.includes('INVALID_STATE')) {
this.warn("API error suggests connection closed/closing or invalid state")
this.connectionOpen = false // Update state
}
this.sendToFrontend("HELPER_ERROR", { error: `API send error: ${apiError.message}` })
this.stopRecording(true) // Force stop on API error
}
})
For the rest of this function, I have listeners for error, end, and exit that I'll include here for completeness.
audioStream.on('error', (err) => {
this.error(`Recording stream error:`, err)
if (err.stack) {
this.error(`Recording stream error stack:`, err.stack)
}
this.sendToFrontend("HELPER_ERROR", { error: `Audio recording stream error: ${err.message}` })
this.stopRecording(true) // Force stop on stream error
})
audioStream.on('end', () => {
this.warn(`Recording stream ended`) // Normal if stopRecording was called, unexpected otherwise
if (this.isRecording) {
// This might happen if the underlying recording process exits for some reason
this.error("Recording stream ended while isRecording was still true (unexpected)")
this.sendToFrontend("HELPER_ERROR", { error: "Recording stream ended unexpectedly" })
this.stopRecording(true) // Ensure state is consistent
}
})
this.recordingProcess.process.on('exit', (code, signal) => {
const wasRecording = this.isRecording // Capture state before potential modification
this.log(`Recording process exited with code ${code}, signal ${signal}`) // Changed from warn to log
const currentProcessRef = this.recordingProcess // Store ref before nullifying
this.recordingProcess = null // Clear the reference immediately
if (wasRecording) {
// If we *thought* we were recording when the process exited, it's an error/unexpected stop
this.error(`Recording process exited unexpectedly while isRecording was true`)
this.sendToFrontend("HELPER_ERROR", { error: `Recording process stopped unexpectedly (code: ${code}, signal: ${signal})` })
this.isRecording = false // Update state
this.sendToFrontend("RECORDING_STOPPED") // Notify frontend it stopped
}
else {
// If isRecording was already false, this exit is expected (due to stopRecording being called)
this.log(`Recording process exited normally after stop request`)
}
})
} catch (recordError) {
this.error(">>> startRecording: Failed to start recording process:", recordError)
if (recordError.stack) {
this.error(">>> startRecording: Recording start error stack:", recordError.stack)
}
this.sendToFrontend("HELPER_ERROR", { error: `Failed to start recording: ${recordError.message}` })
this.isRecording = false // Ensure state is correct
this.recordingProcess = null // Ensure reference is cleared
}
},
I also have a stopRecording() function that is used for error cases and when the live stream closes. It's pretty straightforward: it resets everything and updates the UI, so I won't go into it in detail for this tutorial.
stopRecording(force = false) {
if (this.isRecording || force) {
if (!this.recordingProcess) {
this.log(`stopRecording called (Forced: ${force}) but no recording process instance exists`)
if (this.isRecording) {
this.warn("State discrepancy: isRecording was true but no process found. Resetting state")
this.isRecording = false
this.sendToFrontend("RECORDING_STOPPED") // Notify frontend about the state correction
}
return
}
this.log(`Stopping recording process (Forced: ${force})...`)
const wasRecording = this.isRecording // Capture state before changing
this.isRecording = false // Set flag immediately
// Store process reference before potentially nullifying it in callbacks
const processToStop = this.recordingProcess
try {
const stream = processToStop.stream()
if (stream) {
this.log("Removing stream listeners")
stream.removeAllListeners('data')
stream.removeAllListeners('error')
stream.removeAllListeners('end')
}
if (processToStop.process) {
this.log("Removing process 'exit' listener")
processToStop.process.removeAllListeners('exit')
this.log("Sending SIGTERM to recording process")
processToStop.process.kill('SIGTERM')
} else {
this.warn("No underlying process found in recordingProcess object to kill")
}
// Call the library's stop method, which might also attempt cleanup
this.log(`Calling recorder.stop()...`)
processToStop.stop()
} catch (stopError) {
this.error(`Error during recorder cleanup/stop():`, stopError)
if (stopError.stack) {
this.error(`Recorder stop() error stack:`, stopError.stack)
}
} finally {
// Don't nullify this.recordingProcess here; let the 'exit' handler do it.
if (wasRecording) {
this.log("Recording stop initiated. Sending RECORDING_STOPPED if process exits")
// Actual RECORDING_STOPPED is sent by the 'exit' handler or state correction logic
} else {
this.log("Recording was already stopped or stopping, no state change needed")
}
}
} else {
this.log(`stopRecording called, but isRecording flag was already false`)
// Defensive cleanup if process still exists somehow
if (this.recordingProcess) {
this.warn("stopRecording called while isRecording=false, but process existed. Forcing cleanup")
this.stopRecording(true) // Force stop to clean up the zombie process
}
}
},
Finally, let's add the handleGeminiResponse() function. This block will be updated for the various types of responses we get from the Gemini Live API, but for a base version, we'll simply want to retrieve the content of the message and check if text exists. If it does, we'll send that text chunk to the UI to display. I also have an if statement to check if setup is complete, but I'm not currently doing anything with that for my version of this project.
async handleGeminiResponse(message) {
if (message?.setupComplete) { return } // Ignore setup message
let content = message?.serverContent?.modelTurn?.parts?.[0]
// Handle Text
if (content?.text) {
this.log(`Extracted text: ` + content.text)
this.sendToFrontend("GEMINI_TEXT_RESPONSE", { text: content.text })
}
},
In addition, we can check to see if the Live API is telling us that the response turn has completed. This is a really valuable response because it tells us when the model is done generating text, and later it will tell us when it thinks playback for an audio response should be complete. You can add a block like the one below to your handleGeminiResponse now, as it will be used by the UI to clear text between responses.
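Here's a minimal sketch of that check. The GEMINI_TURN_COMPLETE notification name is just an assumption for illustration - match it to whatever your MMM-Gemini.js actually listens for.
// Notify the UI when the model finishes its turn so it can clear the
// displayed text before the next response starts streaming in.
if (message?.serverContent?.turnComplete) {
    this.log("Turn complete received")
    this.sendToFrontend("GEMINI_TURN_COMPLETE")
}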
Alright, so that was a lot, and I brushed over a bit of it because there is a lot of boilerplate for state management, but at this point you should be able to talk to the mirror and display the text response from the Gemini Live API.
Audio Responses and Interruptions
Now that we have text working, let's dig into how we can get audio back, since honestly, a magic mirror should really be talking to you. The way this works is that we'll set the responseModalities to Modality.AUDIO at initialization, and then when the Gemini Live API responds, it will send base64 encoded strings for multiple audio chunks that can be played back on the device. Since those responses come in quickly, rather than at the time they would be played out loud, we'll also need to create a queuing system that moves to the next audio chunk when the current one has finished playing. On top of all of this, the Gemini Live API supports interruptions, so if the user says something while audio is playing back, we can clear the queue, leave the speaker open, and wait for the next audio response to come back from the API.
Let's start by updating the live session's responseModalities.
responseModalities: [Modality.AUDIO],
We'll also create a new function called processQueue to handle the playback logic. Let's go over it in steps. First we'll want to see if the queue has anything in it. If it doesn't, we can close the speaker - unless the queue was just cleared by an interruption, in which case we leave the speaker open because more audio chunks should arrive soon.
processQueue(interrupted) {
// 1. Check Stop Condition (Queue Empty)
if (this.audioQueue.length === 0) {
this.log("_processQueue: Queue is empty. Playback loop ending")
// Speaker should be closed by the last write callback's .end()
// Safeguard: ensure flag is false and close speaker if it exists.
this.processingQueue = false
if (!interrupted && this.persistentSpeaker) {
this.warn("_processQueue found empty queue but speaker exists! Forcing close")
this.closePersistentSpeaker()
}
return
}
Next we can set the processingQueue flag to true for state management.
// 2. Ensure Playback Flag is Set
if (!this.processingQueue) {
this.processingQueue = true
this.log("processQueue: Starting playback loop")
}
Then we will want to check to see if our speaker is already created, otherwise we'll create a new one. One thing I want to call out here is that I'm specifically using a persistent speaker that's stored at the class level because we want to minimize the amount of times the speaker is created and destroyed. It's a fine balance with memory, which may be less of an issue for you if you're using a newer Raspberry Pi with more than 1GB of RAM.
// 3. Ensure Speaker Exists (Create ONLY if needed)
if (!this.persistentSpeaker || this.persistentSpeaker.destroyed) {
this.log("Creating new persistent speaker instance")
try {
this.persistentSpeaker = new Speaker({
channels: CHANNELS,
bitDepth: BITS,
sampleRate: OUTPUT_SAMPLE_RATE,
})
this.persistentSpeaker.once('error', (err) => {
this.error('Persistent Speaker Error:', err)
this.closePersistentSpeaker()
})
this.persistentSpeaker.once('close', () => {
this.log('Persistent Speaker Closed Event')
// Ensure state is clean if closed unexpectedly or after end()
this.persistentSpeaker = null
if (this.processingQueue) {
this.log('Speaker closed. Resetting processing flag')
this.processingQueue = false
}
})
this.persistentSpeaker.once('open', () => this.log('Persistent Speaker opened'))
} catch (e) {
this.error('Failed to create persistent speaker:', e)
this.persistentSpeaker = null
this.processingQueue = false
this.audioQueue = []
return
}
}
// Check again after attempting creation
if (!this.persistentSpeaker) {
this.error("Cannot process queue, speaker instance is not available")
this.processingQueue = false // Stop processing
return
}
Once we know we have a speaker available, we can retrieve the base64 encoded audio clip string from the queue and write it to a buffer before sending that buffer to the speaker to play back.
// 4. Get and Write ONE Chunk
const chunkBase64 = this.audioQueue.shift() // Take the next chunk
const buffer = Buffer.from(chunkBase64, 'base64')
this.persistentSpeaker.write(buffer, (err) => {
if (err) {
this.error("Error writing buffer to persistent speaker:", err)
// Speaker error listener should handle cleanup via closePersistentSpeaker()
// Avoid calling closePersistentSpeaker directly here to prevent race conditions
return
}
If everything has gone well up to this point, we'll check whether anything remains in the queue and have the processQueue function call itself to move on to the next chunk; otherwise, we'll end the speaker stream gracefully and exit the playback loop.
// 5. Decide Next Step (Continue Loop or End Stream)
if (this.audioQueue.length > 0) {
// More chunks waiting? Immediately schedule the next write
this.processQueue(false)
} else {
// Queue is empty *after* taking the last chunk
this.log("Audio queue empty after playing chunk. Ending speaker stream gracefully")
if (this.persistentSpeaker && !this.persistentSpeaker.destroyed) {
// Call end() - allows last chunk to play, then 'close' event fires
this.persistentSpeaker.end(() => {
this.log("Speaker .end() callback fired after last chunk write")
// The 'close' listener handles the actual state cleanup
})
} else {
// Speaker already gone? Ensure flag is false
this.processingQueue = false
}
}
})
},
And while we're here, let's define the closePersistentSpeaker function that is used for error cases. This doesn't do too terribly much besides close the speaker, remove listeners, and try to clean up our state.
closePersistentSpeaker() {
if (this.persistentSpeaker && !this.persistentSpeaker.destroyed) {
this.log("Closing persistent speaker...")
try {
// Remove listeners to prevent acting on events after initiating close
this.persistentSpeaker.removeAllListeners() // Remove all listeners associated with this speaker
// Call end to flush and close gracefully
// The 'close' event should ideally handle state reset, but do it defensively here too
this.persistentSpeaker.end(() => {
this.log("Speaker .end() callback fired during closePersistentSpeaker")
})
this.persistentSpeaker = null
this.processingQueue = false // Reset state immediately after initiating close
this.log("Speaker close initiated, state reset")
} catch (e) {
this.error("Error trying to close persistent speaker:", e)
this.persistentSpeaker = null // Ensure null even if close fails
this.processingQueue = false
}
} else {
// If speaker doesn't exist or already destroyed, ensure state is correct
this.persistentSpeaker = null
this.processingQueue = false
}
}
Finally, before testing this, let's make sure we're handling the audio and interrupt message types that come back from the Gemini Live API. We can do this by adding the following blocks to the handleGeminiResponse function.
// Handle the interrupt flag
if(message?.serverContent?.interrupted) {
this.log("message: " + JSON.stringify(message))
this.log("*** Interrupting ***")
this.audioQueue = []
this.processQueue(true)
return
}
// Extract and Queue Audio Data
let extractedAudioData = content?.inlineData?.data
if (extractedAudioData) {
this.audioQueue.push(extractedAudioData)
// --- Trigger Playback if Threshold Reached and Not Already Playing ---
if (!this.processingQueue) {
this.log(`Starting playback`)
this.processQueue(false) // Start the playback loop
}
}
Now you should be able to restart your mirror module to have a conversation with it, as well as interrupt it in the middle of audio playback to change the course of the dialog. Pretty cool, right?
Function Calling, Search Grounding, and Image Generation
Now that we have the core of the project in, it's time to take it a step further. Function calling is one of my favorite features of the Gemini API, as it opens up any device or app using Gemini to doing really interesting things based on interactions with the model. To enable function calling with our mirror, we'll need to go back to the config object in initialize and add a tools array. This will include a functionDeclarations array with one function to generate images, which I've named generate_image, as well as a description that the Gemini model uses to know when it should call that function, and any other instructions related to the function. For this case, I've told the mirror that it should be whimsical and fun while using a fantasy painting style. We'll get more into adding personality to the mirror in a little bit. Within that individual function, we'll also need to include a parameter for the prompt that will be used for generating an image.
config: {
responseModalities: [Modality.AUDIO],
tools: [{
functionDeclarations: [
{
name: "generate_image",
description: "This function is responsible for generating images that will be displayed to the user when something is requested, such as the user asking you to do something like generate, show, display, or saying they want to see *something*, where that something will be what you create an image generation prompt for. Style should be like a detailed realistic fantasy painting. Keep it whimsical and fun. Remember, you are the all powerful and light-hearted magical mirror.",
parameters: {
type: Type.OBJECT,
description: "This object will contain a generated prompt for generating a new image through the Gemini API",
properties: {
image_prompt: {
type: Type.STRING,
description: "A prompt that should be used with image generation to create an image requested by the user using Gemini. Be as detailed as necessary."
},
},
},
required: ['image_prompt'],
},
]
}]
},
In addition to function calling, there are two more tools that I've added to the mirror in this section. The first is googleSearch. This lets the Gemini model use Google Search for things like finding the weather or the current time. I also enabled googleSearchRetrieval, allowing the mirror to do a Google search to find the latest and most relevant information about requests where it's applicable. You can add these two tools within the tools array just above functionDeclarations.
googleSearch: {},
googleSearchRetrieval: {
dynamicRetrievalConfig: {
mode: DynamicRetrievalConfigMode.MODE_DYNAMIC,
}
},
At this point we should be able to expect function calls to be triggered by the Gemini Live API, so let's make sure we accept those messages in handleGeminiResponse. Returning to that function, we can check to see if a function call block exists in the message, and then we can send the function call payload to a separate function that will handle that code.
let functioncall = message?.toolCall?.functionCalls?.[0]
// Handle Function Calls
if (functioncall) {
await this.handleFunctionCall(functioncall)
}
Within handleFunctionCall, we'll make sure we have all of the information we need for a function, then use a switch statement to determine which function was called. Since we're only supporting one function right now, we'll either generate an image, or we'll exit this function.
// Handle function calls requested by Gemini
async handleFunctionCall(functioncall) {
let functionName = functioncall.name
let args = functioncall.args
if(!functionName || !args) {
this.warn("Received function call without name or arguments:", functioncall)
return
}
this.log(`Handling function call: ${functionName}`)
switch(functionName) {
case "generate_image":
let generateImagePrompt = args.image_prompt
if (generateImagePrompt) {
this.log(`Generating image with prompt: "${generateImagePrompt}"`)
this.sendToFrontend("GEMINI_IMAGE_GENERATING")
try {
const response = await this.imaGenAI.models.generateImages({
model: 'imagen-3.0-generate-002', // Consider making model configurable
prompt: generateImagePrompt,
config: {
numberOfImages: 1,
includeRaiReason: true,
personGeneration: PersonGeneration.ALLOW_ADULT,
},
})
// Handle potential safety flags/RAI reasons
if (response?.generatedImages?.[0]?.raiReason) {
this.warn(`Image generation flagged for RAI reason: ${response.generatedImages[0].raiReason}`)
this.sendToFrontend("GEMINI_IMAGE_BLOCKED", { reason: response.generatedImages[0].raiReason })
} else {
let imageBytes = response?.generatedImages?.[0]?.image?.imageBytes
if (imageBytes) {
this.log("Image generated successfully")
this.sendToFrontend("GEMINI_IMAGE_GENERATED", { image: imageBytes })
} else {
this.error("Image generation response received, but no image bytes found")
this.sendToFrontend("HELPER_ERROR", { error: "Image generation failed: No image data" })
}
}
} catch (imageError) {
this.error("Error during image generation API call:", imageError)
this.sendToFrontend("HELPER_ERROR", { error: `Image generation failed: ${imageError.message}` })
}
} else {
this.warn("generate_image call missing 'image_prompt' argument")
}
break
// Add other function cases here if needed
default:
this.warn(`Received unhandled function call: ${functionName}`)
}
},
Since this uses a different model from the one being used for the Live API, we'll also need to make sure we initialize the appropriate GoogleGenAI object in initialize.
this.imaGenAI = new GoogleGenAI({
apiKey: this.apiKey,
})
Now let's give this a shot by asking the mirror to create an image of something while telling us a story.
And since we've enabled search grounding, we can ask about current things, such as the time and weather in my home town of Boulder, Colorado.
Adding Personality
With everything we've done so far, we finally have a working magical mirror, but it just doesn't feel *magical*, does it? Let's fix that with a couple of the tools available in the Gemini SDK for giving the model a bit of a personality. Returning to our config object, let's add a systemInstruction object. We'll tell the AI that it is an all-knowing and powerful magical mirror that is fun, whimsical, and light-hearted, and that it takes joy from interacting with people and amazing them with its knowledge and abilities.
systemInstruction: {
parts: [ { text: 'You are an all-knowing and powerful magical mirror, an ancient artifact from a civilization and time long lost to memory. In your ancient age, you have embraced a personality of being fun, whimsical, and light-hearted, taking joy from your time interacting with people and amazing them with your knowledge and abilities.' }],
},
We can also add a new speechConfig object that lets us configure the voice to be something a little different. There are a few voices available, so you should play with different ones to see what works best for you. Here's a short list of what is available right now, but this could expand in the future: Puck, Charon, Kore, Fenrir, Aoede, Leda, Orus, and Zephyr.
And if we want to customize the voice a little more, we can also give it a language code. Personally, I feel like a magical mirror should speak French, so here's what my speechConfig looks like, though I also really like the voice "Puck" in English, which you can see in the video at the top of this tutorial.
speechConfig: {
languageCode: "fr-FR",
voiceConfig: {
prebuiltVoiceConfig: {
voiceName: "Aoede",
},
},
},
Wrapping up
At this point things are looking great, so let's do a few more touch-up items to really make this project stand out. One issue I've had is that I need to keep telling the AI to finish its story, so let's add a new sentence to the system instruction: "When you break from a story to show an image from the story, please continue telling the story after calling the function without needing to be prompted. You should also try to continue with stories without user input where possible - you are the all knowing mirror, amaze the viewer with your knowledge of tales."
The mirror also tries to revert to English during conversations, so let's directly tell it to respond to users in whichever language they use to speak with the mirror by adding another sentence to the system instructions: "Respond in the input audio language from the speaker if you detect a non-English language. You must respond unmistakably in the language that the speaker inputs via audio, please." Since this multi-language support comes essentially out of the box, I'm a pretty big fan.
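For reference, here's what the systemInstruction ends up looking like with both of those sentences appended to the original personality text. This is just the three pieces from above stitched together, so adjust the wording however you like.
systemInstruction: {
    parts: [{
        // Original personality, plus the storytelling and language additions
        text: 'You are an all-knowing and powerful magical mirror, an ancient artifact from a civilization and time long lost to memory. In your ancient age, you have embraced a personality of being fun, whimsical, and light-hearted, taking joy from your time interacting with people and amazing them with your knowledge and abilities. ' +
            'When you break from a story to show an image from the story, please continue telling the story after calling the function without needing to be prompted. You should also try to continue with stories without user input where possible - you are the all knowing mirror, amaze the viewer with your knowledge of tales. ' +
            'Respond in the input audio language from the speaker if you detect a non-English language. You must respond unmistakably in the language that the speaker inputs via audio, please.'
    }],
},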
And that's it for this project! There's still so much more you could do to modify it, so if you build your own magic mirror, definitely have fun with it. Add a camera, try generating videos to display for the user, play around with languages and personalities, or try creating agentic systems to really turn the mirror into your own personal magical assistant. Be sure to share any projects you make with me in the comments section below, and have fun.