Basic Prompting
(Run this example in Google Colab here)
Prompt engineering (and AI engineering more broadly) is the emerging developer task of designing and optimizing prompts (and the associated workflows and infrastructure) for AI models to achieve specific goals or outcomes. It involves crafting high-quality inputs that elicit accurate and relevant responses from AI models. The next several examples will help get you up to speed on common prompt engineering strategies.
We will use Python for the examples that follow:
Dependencies and Imports
You will need to install Prediction Guard into your Python environment.
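For example, with pip (assuming the client is published on PyPI under the package name predictionguard):

```bash
pip install predictionguard
```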
Now import PredictionGuard, set up your API key, and create the client.
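A minimal setup sketch, assuming the client reads your key from the PREDICTIONGUARD_API_KEY environment variable:

```python
import os

from predictionguard import PredictionGuard

# Set your API key (PREDICTIONGUARD_API_KEY is the assumed variable name).
os.environ["PREDICTIONGUARD_API_KEY"] = "<your Prediction Guard API key>"

# Create the client.
client = PredictionGuard()
```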
Autocomplete
Because LLMs are trained to perform the task of text completion, the most basic kind of prompt that you might provide is an autocomplete prompt. Regardless of prompt structure, the model will compute the probabilities of the words, tokens, or characters that might follow the sequence of words, tokens, or characters that you provided in the prompt.
Depending on the desired outcome, the prompt may be a single sentence, a paragraph, or even a partial story. Additionally, the prompt may be open-ended, providing only a general topic or theme, or it may be more specific, outlining a particular scenario or plot.
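Here is a sketch of an autocomplete call using the client created above (the prompt text is illustrative, and we assume the generated completion is available at choices[0]['text'] in the response):

```python
# The model simply continues the text that we provide as the prompt.
result = client.completions.create(
    model="Hermes-2-Pro-Llama-3-8B",
    prompt="Once upon a time, in a land far away, there lived a"
)

print(result['choices'][0]['text'])
```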
This prompt should result in an output similar to:
Other examples include the following (note that we can also complete things like SQL statements):
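As a sketch (the prompt text is again illustrative), completing the start of a SQL statement might look like this:

```python
# Code-like text can be completed in the same way as natural language.
result = client.completions.create(
    model="Hermes-2-Pro-Llama-3-8B",
    prompt="SELECT name, email FROM users WHERE"
)

print(result['choices'][0]['text'])
```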
Zero-Shot Prompts
Autocomplete is a great place to start, but it is only that: a place to start. Throughout this workshop we will be putting on our prompt engineering hats to do some impressive things with generative AI. As we continue along that path, there is a general prompt structure that will pop up over and over again:
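Paraphrasing rather than quoting it, that structure generally looks like:

```
{task description / instruction}

{context or input data}

{output indicator}
```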
One of the easiest ways to leverage the above prompt structure is to describe a task (e.g., sentiment analysis), provide a single piece of data as context, and then provide a single output indicator. This is called a zero-shot prompt. Here is a zero-shot prompt for performing sentiment analysis:
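A sketch of that prompt and call (the instruction wording and input text are illustrative; the ### markers follow the Alpaca format discussed below):

```python
prompt = """### Instruction:
Respond with a sentiment label for the input text below. Use the label NEU for neutral sentiment, NEG for negative sentiment, and POS for positive sentiment.

### Input:
This workshop is amazing. I love it!

### Response:
"""

result = client.completions.create(
    model="Hermes-2-Pro-Llama-3-8B",
    prompt=prompt
)

# Post-process: strip extra whitespace and keep only the first word/label,
# because the model may continue generating text after the label.
print(result['choices'][0]['text'].strip().split()[0])
```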
Which should output POS.
Note - We are doing some post-processing on the text output (stripping out extra whitespace and keeping only the first word/label), because the model may just continue generating text in certain cases. We will return to this later in the tutorials.
Note - We are using a very specific prompt format (with the ### Instruction: etc. markers). This is the Alpaca prompt format that is preferred by the Hermes-2-Pro-Llama-3-8B model. Each model might have a different preferred prompt format, and you can find out more about that here.
Another example of zero-shot prompting is the following for question and answer:
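A sketch of such a question-and-answer prompt (the context and question shown are illustrative):

```python
prompt = """### Instruction:
Answer the question below using only the provided context. If the answer is not in the context, respond with "I don't know."

### Input:
Context: The Eiffel Tower is located in Paris and was completed in 1889.
Question: When was the Eiffel Tower completed?

### Response:
"""

result = client.completions.create(
    model="Hermes-2-Pro-Llama-3-8B",
    prompt=prompt
)

print(result['choices'][0]['text'].strip())
```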
Few-Shot Prompts
When your task is slightly more complicated or requires a few more leaps in reasoning to generate an appropriate response, you can turn to few-shot prompting (a.k.a. in-context learning). In few-shot prompting, a small number of gold-standard demonstrations are integrated into the prompt. These demonstrations serve as example (context, output) pairs for the model, which steer the probable output on the fly toward what we ideally want.
Although not always necessary (as seen above), few-shot prompting generally produces better results than single-shot prompting in terms of consistency and similarity to your ideal outputs.
Let’s reformat our sentiment prompt to include demonstrations:
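A sketch of the reformatted, few-shot version (the demonstrations and the final input are illustrative):

```python
prompt = """Classify the sentiment of the text below. Use the label NEU for neutral sentiment, NEG for negative sentiment, and POS for positive sentiment.

Text: That new restaurant is adorable. The food was brilliant!
Sentiment: POS

Text: I saw the aircraft on the runway.
Sentiment: NEU

Text: This was an awful seat, and the flight was delayed.
Sentiment: NEG

Text: The service was friendly and the room was spotless.
Sentiment: POS

Text: This flight was terrible. The staff was rude and we sat on the tarmac for hours.
Sentiment:"""

result = client.completions.create(
    model="Hermes-2-Pro-Llama-3-8B",
    prompt=prompt
)

# Keep only the generated label.
print(result['choices'][0]['text'].strip().split()[0])
```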
This should output NEG.
Another common example of few-shot prompting is chat conversation. Although Prediction Guard has specific functionality to support chat memory and threads, you can actually use any non-chat-specific model to generate a chat response. For example:
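A sketch of such a chat prompt (the conversation turns below are illustrative):

```python
prompt = """The following is a conversation with an AI assistant in Hinglish. The assistant is helpful, creative, clever, and very friendly.

Human: Hello
AI: Namaste! Main aapki kaise madad kar sakta hoon?
Human: Kya aap mujhe ek achhi book recommend kar sakte hain?
AI:"""

result = client.completions.create(
    model="Hermes-2-Pro-Llama-3-8B",
    prompt=prompt
)

# Keep the generated assistant turn for use in the next step.
hinglish_response = result['choices'][0]['text'].strip()
print(hinglish_response)
```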
This will output the Hinglish response similar to:
If you don’t speak Hinglish, you can check out the translation using another prompt:
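A sketch of that translation prompt, reusing the hinglish_response variable from the previous step:

```python
prompt = f"""### Instruction:
Translate the following Hinglish text into English.

### Input:
{hinglish_response}

### Response:
"""

result = client.completions.create(
    model="Hermes-2-Pro-Llama-3-8B",
    prompt=prompt
)

print(result['choices'][0]['text'].strip())
```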
Which will output similar to:
Using The SDKs
You can also try these examples using the other official SDKs: