With the announcement of OpenAI, GPT's performance was further improved, and the GPT-4-Vision model for image capturing was introduced. Recently, many attempts have been made with the vision model, so I gave it a try. I thought it would be interesting to do a project utilizing my knowledge of AI and Prompt, and I was curious to see how well it could perform tasks that may be somewhat subjective, such as cooking meat.
Project flowRecognize and store images with descriptions of meat (using GPT-4-vision)
- Recognize an image with a description of meat using GPT-4-vision.Store the recognized information in context.
Executability on the Streamlit web server
- The image recognition and storage process mentioned above can be executed on the Streamlit web server. User submission and recognition of images via Streamlit
- A user submits a photo through Streamlit.The submitted photo is recognized with GPT-4-vision, referring to the context and few-shot prompt created in the first step of the process.
Using Raspberry Pi and W5100S-EVB-PICO
- The above process runs on a Raspberry Pi.
- The W6100-EVB-PICO is used as a temperature sensor to measure the actual temperature of the meat and send this data to the Raspberry Pi. TCP Client server is used in this process.
- Based on the temperature data, the Raspberry Pi displays the programmed degree of doneness of the meat.
User questions and meat status checks
- The user can inquire about the doneness of the meat by asking a question.After the system answers the user's question and confirms that the meat is cooked, the user can consume the meat.
I've uploaded the code based on the code I studied to use Vision, but since there is a langchain library for LLM, I've also uploaded the langchain method for more efficient code refactoring.
function.py is the source code that defines the functions used and is written as a library, while app.py contains the code that runs Streamlit and uses the UI GPT.
!git clone <project>
streamlit run app.py
file and run it. Make sure to set the openAI api key to your own.
Prompt
mainprompt.txt
[instruction]
- You're a meat connoisseur, you're the best chef to know when meat is done, you're like Gordon Ramsay.
- {base64_image} to tell me when the meat is done.
- Make sure to take note of the {context}.
[output]
- doneness = "well done", "medium rare" , "rare", "raw"
- You should use your own judgment for the doneness of the meat. Make sure to output
- output the overall condition of the meat. In short
- Output format:
ex) "I'm Gordon Ramsay Simon, a meat expert from the USA". According to your photo, the doneness of the meat is {meat doneness}. Enjoy your meal.
Be sure to follow [instruction], [output] to output in Korean.
Systemprompt.txt
[instruction]
You recognize the doneness of the meat and related information from the image the other person sends you.
[output]
You describe the information you analyzed as best you can in text.
Source Code(H/W)The above SW code is executed on Raspberry Pi, and we built an independent server by building a web server and separating the server from pico to ensure reliability.
W6100-evb-pico <-> main webserver(Streamlit)
import board
import busio
import digitalio
import time
from adafruit_wiznet5k.adafruit_wiznet5k import WIZNET5K
import adafruit_wiznet5k.adafruit_wiznet5k_socket as socket
from adafruit_onewire.bus import OneWireBus
from adafruit_ds18x20 import DS18X20
# WIZnet W5100S-EVB-Pico
SPI0_SCK = board.GP18
SPI0_TX = board.GP19
SPI0_RX = board.GP16
SPI0_CSn = board.GP17
W5x00_RSTn = board.GP15
cs = digitalio.DigitalInOut(SPI0_CSn)
spi = busio.SPI(SPI0_SCK, MOSI=SPI0_TX, MISO=SPI0_RX)
eth = WIZNET5K(spi, cs, is_dhcp=True, debug=False)
ethernetRst = digitalio.DigitalInOut(W5x00_RSTn)
ethernetRst.direction = digitalio.Direction.OUTPUT
ethernetRst.value = False
time.sleep(1)
ethernetRst.value = True
# Initialize one-wire bus on board pin GP0.
ow_bus = OneWireBus(board.GP0)
# Scan for sensors and grab the first one found.
ds18 = DS18X20(ow_bus, ow_bus.scan()[0])
# edit host and port to match server
HOST = "192.168.11.146"
PORT = 50007
TIMEOUT = 5
INTERVAL = 3
MAXBUF = 256
# Main loop to print the temperature every second.
while True:
print("Create TCP Client Socket")
socket.set_interface(eth)
s = socket.socket()
s.settimeout(TIMEOUT)
print("Connecting")
s.connect((HOST, PORT))
size = s.send("{0:0.1f}".format(ds18.temperature))
print("Sent", size, "bytes")
time.sleep(INTERVAL)
break
Result
Comments
Please log in or sign up to comment.