Model Options

Using Prediction Guard gives you quick and easy access to state-of-the-art LLMs, without spending weeks figuring out implementation details, managing a bunch of different API specs, or setting up secure infrastructure for model deployments.

LLMs are hosted by Prediction Guard in a secure, privacy-conserving environment built in partnership with Intel's Liftoff program for startups.

Note - Prediction Guard does NOT save or share any data sent to these models (or responses from the models). Further, we are able to sign a BAA for customers needing HIPAA compliance. Contact support with any questions.

Note - We only integrate models that are licensed permissively for commercial use.

Open Access LLMs (what most of our customers use) 🚀

Open access models are amazing these days! Each of these models was trained by a talented team and released publicly under a permissive license. The data used to train each model and the prompt formatting for each model varies. We've tried to give you some of the relevant details here, but shoot us a message in Slack with any questions.

The best models (start here)

| Model Name | Type | Use Case | Prompt Format | Context Length | More Info |
| --- | --- | --- | --- | --- | --- |
| Nous-Hermes-Llama2-13B | Text Generation | Generating output in response to arbitrary instructions | Alpaca | 4096 | link (opens in a new tab) |
| Nous-Hermes-2-SOLAR-10.7B | Chat | Instruction following or chat-like applications | ChatML | 4096 | link (opens in a new tab) |
| Neural-Chat-7B | Chat | Instruction following or chat-like applications | Neural Chat | 4096 | link (opens in a new tab) |
| Yi-34B-Chat | Chat | Instruction following in English or Chinese | ChatML | 2048 | link (opens in a new tab) |
| sqlcoder-34b-alpha | Code Generation | Generating SQL queries from natural language prompts | SQLCoder | 4096 | link (opens in a new tab) |
| deepseek-coder-6.7b-instruct | Code Generation | Generating computer code or answering tech questions | Deepseek | 4096 | link (opens in a new tab) |
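
Each model expects its prompt in the format listed above. As a rough sketch (the exact template strings below are our best understanding of the Alpaca and ChatML conventions; check each model card for the authoritative format), prompt construction looks like this:

```python
def alpaca_prompt(instruction: str) -> str:
    """Wrap an instruction in an Alpaca-style template (e.g. for Nous-Hermes-Llama2-13B).

    Template details are illustrative -- verify against the model card.
    """
    return (
        "### Instruction:\n"
        f"{instruction}\n\n"
        "### Response:\n"
    )


def chatml_prompt(system: str, user: str) -> str:
    """Wrap system + user messages in a ChatML-style template (e.g. for Yi-34B-Chat).

    Template details are illustrative -- verify against the model card.
    """
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


print(alpaca_prompt("Summarize this article in one sentence."))
```

The returned string is what you would pass as the `prompt` in a completion request; the trailing `### Response:` / `<|im_start|>assistant` marker cues the model to start generating.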

Other models available

The models below are available in our API. However, these models scale to zero (i.e., they might not be ready for you to interact with). These models are less frequently accessed by our users, so we suggest you start with the models above. If your company requires one of these models to be up-and-running 24/7, reach out to us and we will help make that happen!

| Model Name | Model Card | Parameters | Context Length |
| --- | --- | --- | --- |
| Llama-2-13B | link (opens in a new tab) | 13B | 4096 |
| Llama-2-7B | link (opens in a new tab) | 7B | 4096 |
| Nous-Hermes-Llama2-7B | link (opens in a new tab) | 7B | 4096 |
| Camel-5B | link (opens in a new tab) | 5B | 2048 |
| Dolly-3B | link (opens in a new tab) | 3B | 2048 |
| Dolly-7B | link (opens in a new tab) | 7B | 2048 |
| Falcon-7B-Instruct | link (opens in a new tab) | 7B | 2048 |
| h2oGPT-6_9B | link (opens in a new tab) | 6.9B | 2048 |
| MPT-7B-Instruct | link (opens in a new tab) | 7B | 4096 |
| Pythia-6_9-Deduped | link (opens in a new tab) | 6.9B | 2048 |
| RedPajama-INCITE-Instruct-7B | link (opens in a new tab) | 7B | 2048 |
| WizardCoder | link (opens in a new tab) | 15.5B | 8192 |
| StarCoder | link (opens in a new tab) | 15.5B | 8192 |
ℹ️ Note - If you aren't actively using these models, they are scaled down. As such, your first call to a model might need to "wake up" that model's inference server. In such cases you will get the message "Waking up model. Try again in a few minutes." Waking up the model server typically takes around 5-15 minutes, depending on the size of the model. We are actively working on reducing these cold start times.
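
One way to handle this wake-up behavior in client code is a simple retry loop. A minimal sketch, where `call_model` stands in for whatever zero-argument callable performs your actual API request (a hypothetical placeholder, not part of any client library):

```python
import time

# Message returned while a scaled-to-zero model server is starting up.
WAKE_MESSAGE = "Waking up model. Try again in a few minutes."

def call_with_wakeup_retry(call_model, *, retries=4, wait_seconds=300):
    """Retry a model call while its scaled-to-zero server wakes up.

    `call_model` is any zero-argument callable returning the API response
    (hypothetical here -- substitute your real client call). Waits
    `wait_seconds` between attempts, sized for the 5-15 minute wake-up
    window mentioned above.
    """
    for attempt in range(retries):
        response = call_model()
        if WAKE_MESSAGE not in str(response):
            return response
        if attempt < retries - 1:
            time.sleep(wait_seconds)
    raise TimeoutError("Model server did not wake up in time.")
```

With `retries=4` and `wait_seconds=300`, the helper gives the server up to about 15 minutes before giving up.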

Closed LLMs (if you t̶r̶u̶s̶t̶ need them)

These models are integrated into our API, but they are not hosted by Prediction Guard in the same manner as the models above.

Note - You will need your own OpenAI API key to use the models below. Customers worried about data privacy, IP/PII leakage, HIPAA compliance, etc. should look into the "Open Access LLMs" above and/or our enterprise deployment option. Contact support with any questions.

| Model Name | Generation | Context Length |
| --- | --- | --- |
| OpenAI-gpt-3.5-turbo-instruct | GPT-3.5 | 4097 |
| OpenAI-davinci-002 | GPT-3.5 | 4097 |
| OpenAI-babbage-002 | GPT-3 | 2049 |
ℹ️ Note - To use the OpenAI models above, make sure you either: (1) define the environment variable OPENAI_API_KEY if you are using the Python client; or (2) set the header parameter OpenAI-ApiKey if you are using the REST API.
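
In Python, both options amount to a couple of lines. The `OpenAI-ApiKey` header name comes from the note above; the `x-api-key` header for your Prediction Guard credentials is an assumption here, so check the API reference for the exact authentication header:

```python
import os

# (1) Python client: expose your OpenAI key via the environment variable.
os.environ["OPENAI_API_KEY"] = "sk-..."  # placeholder, not a real key

# (2) REST API: pass the key in the OpenAI-ApiKey header alongside your
# Prediction Guard credentials. "x-api-key" is an assumed header name
# for the Prediction Guard key; "OpenAI-ApiKey" is from the note above.
headers = {
    "x-api-key": "<your Prediction Guard API key>",
    "OpenAI-ApiKey": os.environ["OPENAI_API_KEY"],
}
```

You would then attach `headers` to your HTTP requests (e.g., with the `requests` library) when calling the OpenAI-backed model endpoints.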