We are going to build an API to serve our fine-tuned model. I considered building it on Gradio or Streamlit; however, in my experience, models are usually served through backend frameworks such as FastAPI and Django. Streamlit would have been easier for me, but it's my responsibility to make you aware of up-to-date, industry-standard technologies.
Let us build the API using FastAPI. With the virtual environment activated, run pip install fastapi uvicorn
FastAPI is the backend framework, and Uvicorn is the web server that serves the FastAPI application. Now, create a file named main.py and add the following code:
import json

from fastapi import FastAPI
from openai import OpenAI

client = OpenAI(api_key="sk-W-I-won't-tell-1")
app = FastAPI()


@app.post("/v1/chat/completions")
def chat_completions(user_prompt: str):
    system_prompt = (
        "Classify the given input text, "
        "return a JSON object containing the probability scores for: "
        "'toxic', 'indecent', 'threat', 'offensive', 'erotic', and 'spam'. "
        "Please respond with only the JSON object, without any additional text or explanation"
    )
    model = "ft:gpt-3.5-turbo-0125:nofoobar::9JC2TBQr"
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
    )
    # The model is instructed to return raw JSON, so parse it into a dict.
    return json.loads(response.choices[0].message.content)
We import the necessary libraries, initialize the OpenAI client with an API key, create a FastAPI instance, and define a route that handles POST requests to /v1/chat/completions. When a user sends a text prompt to this route, the application combines it with a system prompt, sends both to the OpenAI API using the specified fine-tuned model, and returns the response as a JSON object containing the probability scores for the different categories.
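Outside of a browser, you can call the endpoint programmatically. One subtlety: because user_prompt is declared as a plain str parameter, FastAPI expects it as a query parameter, not a JSON body. A minimal client sketch using only the standard library, assuming the server is running locally on port 8000 (the function and variable names here are my own, not part of the tutorial's code):

```python
import json
import urllib.request
from urllib.parse import urlencode

BASE_URL = "http://127.0.0.1:8000"


def build_request(text: str) -> urllib.request.Request:
    # user_prompt is a plain str parameter in the route signature,
    # so FastAPI reads it from the query string rather than the body.
    query = urlencode({"user_prompt": text})
    return urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions?{query}", method="POST"
    )


def classify(text: str) -> dict:
    # Send the request and parse the JSON score dictionary the API returns.
    with urllib.request.urlopen(build_request(text)) as resp:
        return json.loads(resp.read())


if __name__ == "__main__":
    print(classify("you are a wonderful person"))
```

If you prefer, the requests or httpx libraries make the same call more concise; the important part is passing user_prompt in the query string.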
Now start the server with uvicorn main:app --reload and visit: http://127.0.0.1:8000/docs
Try sending a request to the chat completions endpoint from the Swagger documentation. You should receive a proper dictionary of scores in return.
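Once you have that dictionary, a typical next step is to act on the scores. A small sketch, where the category names follow the system prompt above but the score values are made up purely for illustration:

```python
# Hypothetical response from the endpoint; the keys match the categories
# in the system prompt, the values are invented for this example.
scores = {
    "toxic": 0.93,
    "indecent": 0.11,
    "threat": 0.05,
    "offensive": 0.87,
    "erotic": 0.01,
    "spam": 0.02,
}

# Pick the single highest-scoring label...
top_label = max(scores, key=scores.get)

# ...or flag every category that crosses a moderation threshold.
flagged = [label for label, score in scores.items() if score >= 0.5]
```

Here top_label would be "toxic" and flagged would contain "toxic" and "offensive"; the 0.5 threshold is an arbitrary choice you would tune for your own moderation policy.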