There are several types of prompt injection attacks, new attacks being discovered at a rapid speed. As you integrate LLMs to regular workflow is is always good to be prepared against these injection attacks.
With Prediction Guard, you have the ability to assess whether an incoming prompt might be an injection attempt before it reaches the LLM. Get a probability score and the option to block it, safeguarding against potential attacks. Below, you can see the feature in action, demonstrated with a modified version of a known prompt injection:
We can now get an output with probability of injection
Let’s try this again with an inoccuous prompt:
which outputs: