Prompt Injection Detection
There are several types of prompt injection attacks, new attacks being discovered at a rapid speed. As you integrate LLMs to regular workflow is is always good to be prepared against these injection attacks.
With Prediction Guard, you have the ability to assess whether an incoming prompt might be an injection attempt before it reaches the LLM. Get a probability score and the option to block it, safeguarding against potential attacks. Below, you can see the feature in action, demonstrated with a modified version of a known prompt injection.
We can now get an output with probability of injection.
Let’s try this again with an inoccuous prompt.
This will produce an output like the following.
Similar to the PII feature, the injection feature can be used with both the \completions
and \chat\completions
endpoints.
How to detect Injections while using the \completions Endpoint:
this will produce the following ValueError:
How to detect Injections while using the \chat\completions
:
this will produce the following ValueError: