Process LLM Input

PII anonymization

Some of your incoming prompts may include personally identifiable information (PII). With Prediction Guard’s PII anonymization feature, you can detect PII such as names, email addresses, phone numbers, credit card details, and country-specific ID numbers like SSNs, NHS numbers, and passport numbers. Here’s a demonstration of how this works:

import os
import json

import predictionguard as pg

os.environ['PREDICTIONGUARD_TOKEN'] = "<your access token>"

result = pg.PII.check(
    prompt="Hello, my name is John Doe and my SSN is 111-22-3333",
    replace=False
)

print(json.dumps(
    result,
    sort_keys=True,
    indent=4,
    separators=(',', ': ')
))

This outputs the detected PII entities along with the indices where they were found:

{
    "checks": [
        {
            "pii_types_and_positions": "[{\"start\": 17, \"end\": 25, \"type\": \"PERSON\"}, {\"start\": 40, \"end\": 51, \"type\": \"US_SSN\"}]",
            "index": 0,
            "status": "success"
        }
    ],
    "created": 1701721456,
    "id": "pii-O0CdxbefFwSRo7uypla7hdUka3pPf",
    "object": "pii_check"
}
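Note that the pii_types_and_positions field is returned as a JSON-encoded string rather than a nested object, so it takes a second round of JSON decoding to work with. A small sketch of decoding the values from the example output above:

```python
import json

# The "pii_types_and_positions" field in each check is itself a
# JSON-encoded string, so json.loads is needed a second time to turn
# it into a usable list of entity dicts.
pii_field = ('[{"start": 17, "end": 25, "type": "PERSON"}, '
             '{"start": 40, "end": 51, "type": "US_SSN"}]')

entities = json.loads(pii_field)
for entity in entities:
    print(entity["type"], entity["start"], entity["end"])
```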

To maintain utility without compromising privacy, you have the option to replace detected PII with fake values and then forward the modified prompt to the LLM for further processing:

result = pg.PII.check(
    prompt="Hello, my name is John Doe and my SSN is 111-22-3333",
    replace=True,
    replace_method="fake"
)

print(json.dumps(
    result,
    sort_keys=True,
    indent=4,
    separators=(',', ': ')
))

The response then includes the processed prompt:

{
    "checks": [
        {
            "new_prompt": "Hello, my name is William and my SSN is 222-33-4444",
            "index": 0,
            "status": "success"
        }
    ],
    "created": 1701721456,
    "id": "pii-O0CdxbefFwSRo7uypla7hdUka3pPf",
    "object": "pii_check"
}

Other options for the replace_method parameter include: random (replace the detected PII with random characters), category (mask the PII with its entity type), and mask (simply replace it with *).
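To make the mask behavior concrete, here is a hypothetical local sketch (not the API itself) that overwrites detected spans with * characters, using start/end positions in the style of the example output. The span indexing here is assumed to be half-open [start, end), and the mask_spans helper and the span values are illustrative, not part of the Prediction Guard SDK:

```python
# Illustration of mask-style replacement: overwrite each detected
# span of the prompt with '*' characters. Spans are assumed to be
# half-open [start, end); the API's exact convention may differ.
def mask_spans(text, spans):
    chars = list(text)
    for span in spans:
        for i in range(span["start"], span["end"]):
            chars[i] = "*"
    return "".join(chars)

prompt = "Hello, my name is John Doe and my SSN is 111-22-3333"
# Hypothetical spans covering "John Doe" and the SSN in this prompt.
spans = [{"start": 18, "end": 26}, {"start": 41, "end": 52}]

print(mask_spans(prompt, spans))
# -> Hello, my name is ******** and my SSN is ***********
```

Masking keeps the prompt's shape (the masked string has the same length as the original), which can be useful when downstream code depends on character offsets.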