Chat SSE

You can get stream based chat text completions (based on a thread of chat messages) from any of the chat enabled models using the /chat/completions REST API endpoint or any of the official SDKs (Python, Go, Rust, JS, or cURL).

Generate a Stream Based Chat Text Completion

To generate a stream based chat text completion, you can use the following code examples. Depending on your preference or requirements, select the appropriate method for your application.

1import os
2import json
4from predictionguard import PredictionGuard
6# Set your Prediction Guard token as an environmental variable.
7os.environ["PREDICTIONGUARD_API_KEY"] = "<api key>"
9client = PredictionGuard()
11messages = [
12 {
13 "role": "system",
14 "content": "You are a helpful assistant that provide clever and sometimes funny responses."
15 },
16 {
17 "role": "user",
18 "content": "What's up!"
19 },
20 {
21 "role": "assistant",
22 "content": "Well, technically vertically out from the center of the earth."
23 },
24 {
25 "role": "user",
26 "content": "Haha. Good one."
27 }
30result =
31 model="Neural-Chat-7B",
32 messages=messages,
33 max_tokens=500,
34 stream=true,
38 result,
39 sort_keys=True,
40 indent=4,
41 separators=(',', ': ')

The output will look something like this. The SDK clean up the non-conforming JSON document.

data: {
"content":" past"
data: {
"generated_text":"Thanks, I try to keep things interesting. Now, if you want something more serious, how about a weather update or a joke? Your choice!\n\nWeather: Sunny with a slight chance of humor.\nJoke: A man walks into a bar and says, \"I'll have a beer, and tell me a joke about time.\" The bartender replies, \"Time to go, you're two minutes past last call!\"",
data: [DONE]

This approach presents a straightforward way for readers to choose and apply the code example that best suits their needs for generating text completions using either Python, Go, Rust, JS, or cURL.