Over the last year, AI systems like Google's Gemini have changed the landscape of app development, and they continue to do so with each new feature release. In this post I want to kick off a series that goes over how you can interact with the Gemini API from a simple IoT device to add intelligence to your projects. I'm using an ESP-12E because it's what I had lying around and it's pretty cheap, but the concepts will carry over to any kind of device.
To keep things simple for this introduction post, I will only go over how you would send text and get a response from the Gemini API. I get that the world probably doesn't need more chat bots, and I absolutely agree, but this is a foundational step for larger things that I will cover in later tutorial projects (like function calling, which I think is way more useful), so bear with me, things will get cooler soon!
I'm also not going to be a stickler for optimizations for a small project like this, so take what I put together with a grain of salt - I tend to subscribe to the philosophy of "make it work, make it right, make it fast," but I also tend to stop once I get to "make it work" :)
Hardware
As I just mentioned, the only hardware I'm using is an ESP-12E, which is an ESP8266-based module with built-in wifi (the ESP8266 has no Bluetooth; that's an ESP32 feature). Any ESP8266 board should do the trick for this tutorial. Once you have an appropriate board plugged into your computer and the Arduino IDE connected to it, it's time to write some code to do something with it.
Setup
I'm using the following four libraries for this project:
#include <ESP8266WiFi.h>
#include <ESP8266HTTPClient.h>
#include <WiFiClientSecure.h>
#include <ArduinoJson.h>
The first library, ESP8266WiFi, as the name implies, is what you'll use to connect to your local wifi. ESP8266HTTPClient and WiFiClientSecure are used for making REST calls to the Gemini API, and ArduinoJson is a library that makes it really easy to work with JSON data, such as the response you'll receive from the Gemini API.
To keep things easy, I also have a set of constant strings that hold my home wifi SSID, wifi password, and my Gemini API key.
const String ssid = "ssid";
const String password = "wifi_password";
const String API_key = "gemini_api_key";
If you're new to working with Gemini, you can go to AI Studio to create a key and try a variety of Gemini features across different models. At the time of this writing, Gemini 2.0 Flash is the latest model if you want to try it out over there, and the free tier allows up to 15 requests per minute and 1500 requests per day.
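If you end up calling the API in a loop, it's easy to blow through that per-minute limit. Here's a minimal sketch of a throttle check, assuming you track the time of the last request yourself. The names kMinIntervalMs and canSendRequest() are hypothetical helpers of mine, not part of any library, and this only covers the per-minute limit, not the daily one.

```cpp
#include <cstdint>

// Hypothetical helper, not part of the original sketch.
// 15 requests per minute works out to one request every 60000 / 15 = 4000 ms.
const uint32_t kMinIntervalMs = 60000U / 15;

// Returns true when at least kMinIntervalMs have passed since the last request.
// Unsigned subtraction keeps the comparison correct when the millisecond
// counter wraps around back to zero.
bool canSendRequest(uint32_t nowMs, uint32_t lastRequestMs) {
  return (nowMs - lastRequestMs) >= kMinIntervalMs;
}
```

On the board you would pass millis() as nowMs and store millis() in a variable each time you actually send a request.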
With that out of the way, it's time to connect the board to wifi. I've put together a simple function called setupWifi() to handle this.
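void setupWifi() {
  WiFi.begin(ssid, password);
  // Print dots once a second until the board has joined the network
  while (WiFi.status() != WL_CONNECTED) {
    delay(1000);
    Serial.print("...");
  }
  Serial.println();  // move to a fresh line after the progress dots
  Serial.print("IP address: ");
  Serial.println(WiFi.localIP());
}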
Finally, my setup() function just connects to the serial monitor and calls this setupWifi() function.
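void setup() {
  Serial.begin(115200);
  WiFi.mode(WIFI_STA);  // station mode: join an existing network
  WiFi.disconnect();    // drop any previous connection first
  while (!Serial) ;     // wait for the serial port to be ready
  setupWifi();
}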
Connecting to Gemini
Now let's get to the good part. To interact with Gemini from the ESP8266, I'm going to use the REST API directly from a new function I called makeGeminiRequest(), which accepts a String for the text that will be sent to the API. Within that function you can start by creating the wifi and http clients that will be used for all of your networking work.
void makeGeminiRequest(String input) {
  WiFiClientSecure client;
  client.setInsecure();
  HTTPClient http;
To keep things easy, I'm just setting the wifi client to insecure, which skips TLS certificate validation: the connection is still encrypted, but the board doesn't verify that it's really talking to Google. If you're interested in network security, then you probably already know way more about this area than I do, so feel free to leave a comment here on Hackster with how you're able to use the client without calling setInsecure().
Next you can start the HTTPS request using the client and the URL for the Gemini model (in this case gemini-2.0-flash, though new models keep coming out, so you might have better options available when reading this later). You'll also send a header and a JSON payload representing your request, which in this case is just a text field. Once the request is built, you can use POST to send it, and save the HTTP code for the response that you get back.
For readability, the JSON body looks like this:
{
  "contents": [
    {
      "parts": [
        {
          "text": "Write a cute story about cats."
        }
      ]
    }
  ]
}
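  if (http.begin(client, "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent?key=" + API_key)) {
    http.addHeader("Content-Type", "application/json");
    // Note: input is inserted as-is, so quotes or backslashes in it will produce invalid JSON.
    String payload = "{\"contents\": [{\"parts\":[{\"text\":\"" + input + "\"}]}]}";
    int httpCode = http.POST(payload);
    // Handle the response here (see the next snippet), before calling http.end().
    http.end();
  } else {
    Serial.println("Unable to connect");
  }
}  // end of makeGeminiRequest()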
Once the response comes back, you can retrieve its body from the http object. This next block belongs inside the if (http.begin(...)) branch, right after http.POST() and before http.end(), because httpCode is only in scope there and getString() needs the connection to still be open. In this example I'm printing the full JSON value for clarity, but you can skip that step and just deserialize the JSON, then do something with the response text. Here I'm just printing it to the console as well.
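    if (httpCode == HTTP_CODE_OK) {
      String response = http.getString();
      Serial.println("Payload: " + response);
      // 2048 bytes is enough for short replies; longer responses need a bigger document.
      DynamicJsonDocument doc(2048);
      DeserializationError err = deserializeJson(doc, response);
      if (err) {
        Serial.print("JSON parsing failed: ");
        Serial.println(err.c_str());
      } else {
        String responseText = doc["candidates"][0]["content"]["parts"][0]["text"];
        Serial.print("Response: ");
        Serial.println(responseText);
      }
    } else {
      Serial.print("Failed to reach Gemini, HTTP code: ");
      Serial.println(httpCode);
    }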
Conclusion
And that's it! In this tutorial you learned how to connect an ESP8266 board to the Gemini API for adding LLM intelligence to your IoT projects. You can test it out by sending a question to your board through the serial monitor, and then the response will be printed out.
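The snippets in this post don't include a loop() function, so here's a minimal sketch of one way to collect a question from the serial monitor one character at a time. feedChar() is a hypothetical helper of mine, not a library function, and it's plain C++ so you can try it off the board as well.

```cpp
#include <string>

// Hypothetical helper, not part of the original sketch.
// Feeds one character into `buffer`. Returns true when a newline arrives and
// the buffer holds a complete, non-empty line. The caller clears the buffer
// after using the line.
bool feedChar(std::string& buffer, char c) {
  if (c == '\r') return false;            // ignore carriage returns (CRLF line endings)
  if (c == '\n') return !buffer.empty();  // line finished; blank lines are skipped
  buffer += c;
  return false;
}
```

In loop(), you would pass each byte from Serial.read() to feedChar() while Serial.available() is greater than zero, and when it returns true, call makeGeminiRequest(String(buffer.c_str())) and clear the buffer. If blocking while you type is fine for your project, Serial.readStringUntil('\n') gets you a line in a single call instead.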
As more articles become available for this series, I'll add them here, but expect to see something soon on working with audio, as well as calling custom functions based on user input!