Using LLMs

Chat Completions

We briefly introduced few-shot chat prompts in the basic prompting tutorial. However, chat is a special scenario when it comes to LLMs because: (1) it is a very frequently occurring use case; (2) there are many models fine-tuned specifically for chat; and (3) the handling of message threads, context, and instructions in chat prompts is always the same.

As such, Prediction Guard has created a dedicated “chat completions” endpoint within its API and Python client. This tutorial demonstrates how to easily create a simple chatbot with the chat completions endpoint.

Dependencies and imports

Similar to the last notebook, you will need to install Prediction Guard and add your token.

$ pip install predictionguard

import os
import json

import predictionguard as pg

os.environ['PREDICTIONGUARD_TOKEN'] = "<your access token>"

Basic chat completion

Chat completions are enabled in the Prediction Guard API only for certain models. You don’t have to worry about special prompt templates when doing these completions, as they are already implemented.


To perform a chat completion, you need to create an array of messages. Each message object should have a:

  • role - “system”, “user”, or “assistant”
  • content - the text associated with the message

You can utilize a single “system” role prompt to give general instructions to the bot. Then you should include the message memory from your chatbot in the message array. This gives the model the relevant context from the conversation to respond appropriately.

messages = [
    {
        "role": "system",
        "content": "You are a helpful assistant that provides clever and sometimes funny responses."
    },
    {
        "role": "user",
        "content": "What's up!"
    },
    {
        "role": "assistant",
        "content": "Well, technically vertically out from the center of the earth."
    },
    {
        "role": "user",
        "content": "Haha. Good one."
    }
]

result = pg.Chat.create(
    model="Neural-Chat-7B",
    messages=messages
)

print(json.dumps(
    result,
    sort_keys=True,
    indent=4,
    separators=(',', ': ')
))
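
The result is a nested dictionary, and the chatbot loop in the next section reads the reply text from ['choices'][0]['message']['content']. As a minimal sketch of that lookup, the snippet below uses a hand-written stand-in dictionary instead of a live API call. The extract_reply helper and the sample_result values are hypothetical, and a real response will likely contain more fields than shown here.

```python
def extract_reply(result):
    """Return the assistant's text from a chat completion result."""
    return result["choices"][0]["message"]["content"]


# Hypothetical stand-in for the dictionary returned by pg.Chat.create(...);
# only the keys this tutorial actually reads are included.
sample_result = {
    "choices": [
        {"message": {"content": "Glad you liked it!"}}
    ]
}

reply = extract_reply(sample_result)
print(reply)
```

Wrapping the lookup in a small function keeps the indexing in one place if the response shape ever needs extra handling.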

Simple chatbot

Here we show the chat functionality with the simplest possible chat UI, which just asks for messages and prints the message thread. We will create an evolving message thread and respond using the chat completion call from the Python client shown above.

print('Welcome to the Chatbot! Let me know how I can help you')

while True:
    print('')
    request = input('User' + ': ')
    # Typing "Stop" or "stop" ends the conversation.
    if request == "Stop" or request == 'stop':
        print('Bot: Bye!')
        break
    else:
        # Add the user's message to the running thread.
        messages.append({
            "role": "user",
            "content": request
        })

        # Keep only the first line of the model's reply.
        response = pg.Chat.create(
            model="Neural-Chat-7B",
            messages=messages
        )['choices'][0]['message']['content'].split('\n')[0].strip()

        # Store the reply so the next turn has the full context.
        messages.append({
            "role": "assistant",
            "content": response
        })
        print('Bot: ', response)
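
Because this loop appends every turn to messages, the thread grows for as long as the conversation runs. The sketch below shows one possible way to cap the message memory while keeping the “system” prompt. The trim_messages helper and its max_turns parameter are hypothetical and not part of the Prediction Guard client. Whether trimming is needed at all depends on the model's context limits, which this tutorial does not specify.

```python
def trim_messages(messages, max_turns=10):
    """Keep all "system" messages plus the most recent max_turns other messages."""
    system = [m for m in messages if m["role"] == "system"]
    others = [m for m in messages if m["role"] != "system"]
    # A slice of [-0:] would return everything, so handle zero explicitly.
    recent = others[-max_turns:] if max_turns > 0 else []
    return system + recent


# Small illustrative thread: one system prompt followed by six turns.
thread = [{"role": "system", "content": "You are a helpful assistant."}]
for i in range(6):
    role = "user" if i % 2 == 0 else "assistant"
    thread.append({"role": role, "content": "message " + str(i)})

trimmed = trim_messages(thread, max_turns=2)
print([m["content"] for m in trimmed])
```

In the chatbot loop, one could pass trim_messages(messages) to the chat call instead of messages, at the cost of the model forgetting older turns.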