If you’re not already aware, OpenAI’s GPT-3 is an advanced machine learning model that can be used in everything from sentiment classification to code completion. Access to GPT-3 used to be invite-only, but OpenAI very recently opened access to the general public. Now is the perfect time to include it in your next project!
The best way to learn is by example, so I’ll show you the procedure I went through to build an API for a project of mine. My goal was to build a voice assistant that would begrudgingly complete my home automation requests while also insulting me. A typical request would look something like this:
Hey computer, I’d like you to turn on the kitchen light for me.
And a typical response would be something like this:
I’ll turn on your kitchen light, but we both know you’re only in there to get food from the refrigerator.
You can view a video of the finished project below, and full build instructions for the project here.
Three natural language processing tasks need to be accomplished here: extracting the desired object from the prompt, extracting the desired state of that object, and formulating an insulting response that incorporates contextual information from the prompt.
I could likely use a dumber NLP model to accomplish the first two tasks, but the third task would be pretty difficult for anything but GPT-3 to do well. GPT-3’s power is that it can accomplish all these tasks at once in the same text-based API. Here’s how to do it.
Step 1: Creating an Account
Start by creating an account with OpenAI. Once that’s done, open up the Playground. This is where you can interact with GPT-3 directly — it’s a great place to experiment! Start by entering some text, and click on “Generate” at the bottom to see what GPT-3 does. Playing around with GPT-3 this way is fun, but not particularly useful. To make the model useful, you need to provide some structure to your query.
A typical query for my insulting voice assistant looks something like this:
A snarky voice assistant that delivers insulting responses to queries.
Prompt: turn on my bedroom lamp
Object: bedroom lamp
Desired State: 1
Response: What, you're too lazy to get off your bed and do it yourself?<|endoftext|>
Prompt: please extinguish the kitchen light
Object: kitchen light
Desired State: 0
Response: I bet you were getting food again, weren't you?<|endoftext|>
...[More examples]...
Prompt: engage the door lock
Object:
Let’s break this down step by step. The general structure of a GPT-3 query is as follows:
1. Model task overview
2. Query Examples
3. Query
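The three-part structure above can be sketched in code. The snippet below is a minimal illustration of assembling such a prompt; the example data mirrors the query shown earlier, and `build_prompt` is a hypothetical helper of my own, not part of any OpenAI library.

```python
# Sketch of assembling a few-shot prompt: task overview, worked
# examples, then the incomplete query for GPT-3 to finish.

OVERVIEW = "A snarky voice assistant that delivers insulting responses to queries."

EXAMPLES = [
    {
        "prompt": "turn on my bedroom lamp",
        "object": "bedroom lamp",
        "state": "1",
        "response": "What, you're too lazy to get off your bed and do it yourself?",
    },
    {
        "prompt": "please extinguish the kitchen light",
        "object": "kitchen light",
        "state": "0",
        "response": "I bet you were getting food again, weren't you?",
    },
]

STOP_TOKEN = "<|endoftext|>"

def build_prompt(user_request: str) -> str:
    """Combine the task overview, the examples, and the new query."""
    parts = [OVERVIEW]
    for ex in EXAMPLES:
        parts.append(
            f"Prompt: {ex['prompt']}\n"
            f"Object: {ex['object']}\n"
            f"Desired State: {ex['state']}\n"
            f"Response: {ex['response']}{STOP_TOKEN}"
        )
    # End with the incomplete query that GPT-3 is expected to complete.
    parts.append(f"Prompt: {user_request}\nObject:")
    return "\n".join(parts)
```

Keeping the examples in a list like this also makes it easy to add or swap examples while you experiment.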
Step 2: Model Task Overview
The model task overview gives GPT-3 some context as to what you’d like it to accomplish. In my case, this is “A snarky voice assistant that delivers insulting responses to queries”. It may seem strange to describe in plain words the task you’d like a machine learning model to accomplish, but this is the approach that OpenAI recommends in its documentation.
Step 3: Query Examples
Next up are the examples. This is where you show GPT-3 how your API will work and the exact syntax it should follow. GPT-3 is a text completion model at its core, so having examples of how it should complete queries will help it a lot. There are many different ways to structure the query syntactically, but I have found that having some descriptive field text, followed by a colon, with fields separated by a newline works well.
In my specific case, I supply everything up to the “Object:” prompt and GPT-3 is expected to fill in the rest by extracting the desired object and state and coming up with a funny retort.
The last syntactic item to note here is the “<|endoftext|>” tag after the response field. GPT-3 will continue generating text until it hits a length limit or produces something called a stop token: a user-specified piece of text that, once generated, tells the model to stop. We’ll talk about these more later.
In the query above I give two examples, but I have omitted some for brevity. In general, I have found that including between three and eight examples works well for most tasks; the examples in the OpenAI documentation use around this number.
Step 4: The Query
The last component of your request should be the actual query you want the model to complete. In my case, this is the following:
Prompt: engage the door lock
Object:
Given the examples above this, GPT-3 should have more than enough context to figure out that you want it to generate something like the following text:
Prompt: engage the door lock
Object: Door Lock
Desired State: 1
Response: What, are you scared I'm going to break into your house or something?
You might have noticed that there’s no <|endoftext|> tag at the end of the response. Once GPT-3 encounters a stop token, it stops generating and returns all the text it produced, with the stop token excluded. All that’s left is to parse this text and extract the desired fields, and you’re basically done!
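That parsing step can be sketched as follows. Because the query ends at “Object:”, the generated text picks up from there; `parse_completion` below is a hypothetical helper of mine that assumes the model followed the field syntax exactly.

```python
def parse_completion(completion: str) -> dict:
    """Split a completion like ' door lock\nDesired State: 1\nResponse: ...'
    into its three fields."""
    # Re-attach the "Object:" label that was supplied in the prompt.
    lines = ("Object:" + completion).strip().splitlines()
    fields = {}
    for line in lines:
        # Each field is "Name: value"; split at the first colon.
        name, _, value = line.partition(":")
        fields[name.strip()] = value.strip()
    return {
        "object": fields["Object"],
        "state": int(fields["Desired State"]),
        "response": fields["Response"],
    }
```

For example, `parse_completion(" door lock\nDesired State: 1\nResponse: Fine.")` yields the object `"door lock"`, the state `1`, and the retort as a string.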
Step 5: Model Parameters
The last items you’ll need to play around with are model parameters, which you will find in the right pane of the Playground window.
Engine is the model on which your query will be executed. I generally stick with Davinci, as this is the largest and most creative model. That creativity is really only needed for the response; for the extraction tasks alone, a smaller model would likely suffice.
Temperature is how wacky and creative you let the model be. If the temperature is too high, GPT-3 is likely to produce very erratic results and not follow your instructions. If the temperature is too low, the model will likely be repetitive and won’t produce anything interesting. The way to select this parameter is to play around with it and see what works best for your application.
Response Length dictates how long the responses from the model can be. This is measured in tokens, which roughly equate to three-quarters of a word each in typical English text. All I needed in my case was very short responses, but if you want GPT-3 to generate paragraphs you can turn this parameter up.
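If you want to ballpark a token budget, OpenAI’s rule of thumb is about four characters of English text per token. The helper below is my own rough sketch of that heuristic, not a real tokenizer; exact counts come from the model’s tokenizer.

```python
def rough_token_estimate(text: str) -> int:
    """Very rough token estimate using the ~4-characters-per-token
    rule of thumb for English text. Only an approximation."""
    return max(1, round(len(text) / 4))
```

This is only useful for sizing the Response Length parameter; don’t rely on it for billing or hard limits.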
Stop Sequence is what we discussed before: a piece of text at which the model will stop generating, should it produce it. The <|endoftext|> token is a special sequence that works well with GPT-3.
Step 6: Using OpenAI APIs
The Playground is great for experimentation, but it’s obviously not suitable for use in a real project. Instead, you can use a language-specific API to deliver requests to GPT-3 and receive responses.
OpenAI officially supports Python and HTTP interfaces, and there are also community-supported packages in many other languages. The best place to go for instructions is the OpenAI documentation.
You can see how I used the Python API for my project on GitHub here, with some of the code copied below:
# Assumes the openai package is installed and openai.api_key has been set.
import openai

response = openai.Completion.create(
    engine="davinci",              # largest, most creative model
    prompt=prompt,                 # the full few-shot query text
    temperature=self.temperature,  # creativity setting
    max_tokens=self.resp_len,      # response length limit, in tokens
    top_p=1.0,
    frequency_penalty=0.0,
    presence_penalty=0.0,
    stop=[self.stop_token]         # e.g. "<|endoftext|>"
)
generated_text = response["choices"][0]["text"]  # the completion itself
Parsing the response is not particularly difficult. You can view my code for doing this on GitHub.
Step 7: Tips & Tricks
One interesting quirk of GPT-3 is that it sometimes “thinks for itself” and delivers a response that’s outside of the syntax you provided. With no error checking, this could cause your program to fail. You should always expect that the model will do something strange — make sure your text parsing has ample error checking!
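One way to guard against off-syntax completions is to validate the parsed fields and fall back to a canned reply instead of crashing. This is a defensive sketch under the field syntax shown earlier; `safe_parse` and the fallback reply are hypothetical names of my own.

```python
# Fallback used whenever the completion doesn't match the expected syntax.
FALLBACK = {"object": None, "state": None,
            "response": "I refuse to dignify that with a real answer."}

def safe_parse(completion: str) -> dict:
    """Parse a completion defensively, returning FALLBACK on any
    deviation from the Object / Desired State / Response syntax."""
    try:
        # Re-attach the "Object:" label supplied in the prompt.
        lines = ("Object:" + completion).strip().splitlines()
        fields = {}
        for line in lines:
            name, _, value = line.partition(":")
            fields[name.strip()] = value.strip()
        state = int(fields["Desired State"])
        if state not in (0, 1):
            raise ValueError(f"unexpected state {state}")
        return {"object": fields["Object"], "state": state,
                "response": fields["Response"]}
    except (KeyError, ValueError):
        # The model went off-script; return a safe default.
        return FALLBACK
```

Anything the automation code does downstream should check for the `None` fields before acting.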
Another thing to always consider with advanced text-completion models is model bias and safety. This is a huge can of worms to open, so I won’t go into too much detail on it; just know that there are literally thousands of papers written on the subject. If you’re just building a project for yourself this doesn’t matter too much, but if you’re planning on deploying your application you should think very carefully about how the bias present in the dataset on which the model was trained can present itself to an end-user.
That’s all! Let me know what you want to build with GPT-3 in the comments. To see my writing, check out my Medium profile or my website. If you’d like to see more of my projects, check out my Hackster profile.