Zero Dependency Binary
Deploy Prediction Guard using a single-binary installer on a single-node system.
Minimum Requirements
These are the minimum recommended specifications for a Prediction Guard single-node system. Please keep in mind that actual hardware requirements may vary based on the models you choose to deploy.
- 32-core CPU
- 256 GB RAM
- Minimum of 100 GB of free disk space
- 1 NVIDIA GPU of a supported type (L4, L40S, A10, A100, H100/H200, or B100/B200) with drivers installed on the host
- Ubuntu or Debian Linux (LTS or newer)
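As a quick sanity check, the minimums above can be compared against the host with standard Linux tools. This is a sketch, not part of the installer; the threshold values simply mirror the list above, and `nvidia-smi` is only available once the NVIDIA drivers are installed:

```shell
#!/bin/sh
# Print host resources next to the documented minimums.
echo "CPU cores:       $(nproc)  (minimum: 32)"
echo "RAM (GB):        $(free -g | awk '/^Mem:/ {print $2}')  (minimum: 256)"
echo "Free disk on /:  $(df -BG --output=avail / | tail -n 1 | tr -d ' G')G  (minimum: 100G)"
# GPU and driver check; nvidia-smi exists only if NVIDIA drivers are installed.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
else
  echo "nvidia-smi not found: install NVIDIA drivers on the host first"
fi
```

Actual requirements vary with the models you deploy, so treat these numbers as a floor rather than a target.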
Create Your System in the Admin Console
- Navigate to admin.predictionguard.com and log in.
- Open the Systems page and click + Create System in the top-right.
- Provide a System Name
- If you intend to use any models that are restricted by an API token on Hugging Face, be sure to provide your Hugging Face API key.
- Click Create System.
Installation Instructions
- Download the installation package from the link provided by your Prediction Guard account representative.
- Untar the installation file.
- Run the installer.
Provide a password for the local admin console. This is rarely used, but can be helpful for updating certain fields on offline systems.
The installer runs a series of pre-flight checks to verify a compatible environment. If any pre-flight check fails, a message will indicate which checks are failing. Either resolve the issue (common causes include insufficient disk space or performance) or reach out to your Prediction Guard account representative for assistance.
Once the installer has completed, proceed to the next step.
- Shell into the system to run the bootstrap command.
- Retrieve the bootstrap command from admin.predictionguard.com by navigating to Systems, then clicking the Deploy button in the row of the system you wish to deploy. Click the Copy button above the deploy command, then proceed to the next step.
- Paste the bootstrap command into the terminal where you are shelled into the system. This will install your authentication token and begin the initial bootstrapping of Prediction Guard services. After a few minutes, verify the installation by checking the running pods in the predictionguard namespace.
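The pod check above is a standard kubectl invocation, assuming the installer has set up kubectl and a kubeconfig on the node:

```shell
# List pods in the predictionguard namespace; they should reach Running status.
# Guarded so the check degrades gracefully where kubectl is not on the PATH.
if command -v kubectl >/dev/null 2>&1; then
  kubectl get pods -n predictionguard
else
  echo "kubectl not found: run this on the node where the installer ran"
fi
```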
You should see running pods in the namespace, including pg-inside, indicating that the system has been installed successfully. The system should also show as Healthy in the Admin Console.
- Deploy any desired AI models from the Models page in the Admin Console. Pay attention to settings for the number of AI accelerators and the CPU and memory allocated to the model, and ensure it fits within your VM/machine.
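When judging whether a model fits on a GPU, a rough rule of thumb (an assumption of this guide, not an official Prediction Guard formula) is about 2 bytes per parameter for FP16 weights, plus headroom for the KV cache and runtime:

```shell
# Back-of-the-envelope GPU memory estimate for FP16 weights.
# PARAMS_B is hypothetical -- set it to your model's parameter count in billions.
PARAMS_B=8
WEIGHTS_GB=$((PARAMS_B * 2))
echo "~${WEIGHTS_GB} GB for FP16 weights alone; leave headroom for the KV cache"
```

For example, an 8B-parameter model needs roughly 16 GB for weights, so it fits comfortably on a single L40S or A100 but not on an L4.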
Configuring Ingress and Reverse Proxy
Prediction Guard comes preconfigured with NGINX and a default Ingress, which can be enabled for the system in the Edit section of the Systems page. There you can configure the desired domain names and have NGINX deploy into the predictionguard namespace with preconfigured settings for the Prediction Guard API. Then ensure that your DNS entry routes to the ingress IP of your VM/machine.
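To confirm the DNS side, resolve the configured domain and compare it against the machine's own addresses. The domain below is a placeholder; substitute the name you configured in the Admin Console:

```shell
# Placeholder domain -- replace with the name configured in the Admin Console.
DOMAIN="pg.example.com"
# Show what the domain currently resolves to (empty until DNS is configured).
getent hosts "$DOMAIN" | awk '{print $1}'
# Show this machine's addresses for comparison; the two should match.
hostname -I
```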

