Chat Completions — Prediction Guard

(Run this example in Google Colab here)

We briefly introduced few-shot chat prompts in the basic prompting tutorial. However, chat is a special scenario when it comes to LLMs because: (1) it is a very frequently occurring use case; (2) there are many models fine-tuned specifically for chat; and (3) the handling of message threads, context, and instructions in chat prompts is always the same.

As such, Prediction Guard has created a dedicated “chat completions” endpoint within its API and Python client. This tutorial demonstrates how to easily create a simple chatbot with the chat completions endpoint.

We will use Python to show an example:

Dependencies and Imports

You will need to install Prediction Guard into your Python environment.

$ pip install predictionguard

Now import PredictionGuard, set up your API key, and create the client.

import os

from predictionguard import PredictionGuard

# Set your Prediction Guard token as an environment variable.
os.environ["PREDICTIONGUARD_API_KEY"] = "<api key>"

client = PredictionGuard()
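Setting the key inline is convenient in a notebook. In a script you may prefer to export `PREDICTIONGUARD_API_KEY` in your shell and fail early if it is missing. A minimal sketch of such a check follows; `require_api_key` is a hypothetical helper written for this tutorial, not part of the Prediction Guard client:

```python
import os


def require_api_key(env=os.environ, name="PREDICTIONGUARD_API_KEY"):
    """Return the API key stored under `name`, or raise if it is unset or blank."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before creating the client.")
    return key
```

Calling `require_api_key()` before creating the client gives a clear error message when the variable is missing.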

Basic Chat Completion

Chat completions are enabled in the Prediction Guard API only for certain models. You don’t have to worry about special prompt templates when doing these completions, as they are already implemented.

You can find out more about the available Models in the docs.

To perform a chat completion, you need to create an array of messages. Each message object should have two fields:

role - “system”, “user”, or “assistant”
content - the text associated with the message

You can utilize a single “system” role prompt to give general instructions to the bot. Then you should include the message memory from your chatbot in the message array. This gives the model the relevant context from the conversation to respond appropriately.
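The message rules above can be checked before a request is sent. The following is a minimal sketch, not part of the Prediction Guard client: `validate_messages` and `ALLOWED_ROLES` are hypothetical names, and the "at most one system message" check is an assumption based on the advice to use a single system prompt.

```python
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_messages(messages):
    """Raise ValueError if `messages` breaks the structure described above."""
    system_count = 0
    for i, message in enumerate(messages):
        role = message.get("role")
        if role not in ALLOWED_ROLES:
            raise ValueError(f"message {i}: unknown role {role!r}")
        if not isinstance(message.get("content"), str):
            raise ValueError(f"message {i}: content must be a string")
        if role == "system":
            system_count += 1
    # Assumption: a thread carries at most one system prompt.
    if system_count > 1:
        raise ValueError("expected at most one system message")
    return messages
```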

import json

messages = [
    {
        "role": "system",
        "content": "You are a helpful assistant that provides clever and sometimes funny responses."
    },
    {
        "role": "user",
        "content": "What's up!"
    },
    {
        "role": "assistant",
        "content": "Well, technically vertically out from the center of the earth."
    },
    {
        "role": "user",
        "content": "Haha. Good one."
    }
]

# Send the whole thread; the model replies to the last user message.
result = client.chat.completions.create(
    model="neural-chat-7b-v3-3",
    messages=messages
)

# Pretty-print the full response.
print(json.dumps(
    result,
    sort_keys=True,
    indent=4,
    separators=(',', ': ')
))
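The printed result is a nested structure, and in practice you usually want only the assistant's text. The chatbot in the next section reads it from `result['choices'][0]['message']['content']`. The sketch below wraps that lookup in a small function. `extract_reply` is a hypothetical helper, and `sample_result` is a hand-written stand-in for a real response, included only so the snippet runs without an API call:

```python
def extract_reply(result):
    """Return the assistant text from a chat completion result.

    Assumes the dict layout used in this tutorial:
    result["choices"][0]["message"]["content"].
    """
    choices = result.get("choices") or []
    if not choices:
        raise ValueError("chat completion result contains no choices")
    return choices[0]["message"]["content"].strip()


# Hand-written stand-in for a real response, for illustration only.
sample_result = {
    "choices": [
        {"index": 0, "message": {"role": "assistant", "content": " Glad you liked it! "}}
    ]
}
print(extract_reply(sample_result))
```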

Simple Chatbot

Here we show the chat functionality with the simplest possible chat UI, which just asks for messages and prints the message thread. We build an evolving message thread and respond using the chat completions call from the Python client shown above.

print('Welcome to the Chatbot! Let me know how I can help you.')

while True:
    print('')
    request = input('User: ')
    # Typing "stop" (in any letter case) ends the conversation.
    if request.strip().lower() == 'stop':
        print('Bot: Bye!')
        break

    # Add the user's turn to the running message thread.
    messages.append({
        "role": "user",
        "content": request
    })

    # Request a completion and keep only the first line of the reply.
    response = client.chat.completions.create(
        model="neural-chat-7b-v3-3",
        messages=messages
    )['choices'][0]['message']['content'].strip().split('\n')[0].strip()

    # Store the reply so the next request includes it as context.
    messages.append({
        "role": "assistant",
        "content": response
    })
    print('Bot:', response)
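Because the loop appends every turn to `messages`, the thread keeps growing for as long as the conversation runs. Models accept only a finite amount of context (the limit for any given model is not stated here), so a long session may need to drop older turns. One possible approach is sketched below. `trim_history` and its `max_turns` default are hypothetical choices for illustration, not part of the Prediction Guard client:

```python
def trim_history(messages, max_turns=10):
    """Keep system messages plus the last `max_turns` non-system messages."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    # Guard against max_turns == 0, since rest[-0:] would return everything.
    recent = rest[-max_turns:] if max_turns > 0 else []
    return system + recent
```

Calling `messages = trim_history(messages)` just before each request keeps the system instructions while bounding the size of the thread. The trade-off is that the bot forgets anything said in the dropped turns.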

Using The SDKs

You can also try these examples using the other official SDKs:

Python, Go, Rust, JS, HTTP
