Kubernetes
Deploy Prediction Guard on a multi-node Kubernetes cluster
Minimum Requirements
These are the minimum recommended specifications for a Prediction Guard multi-node cluster. Actual hardware requirements may vary based on the models you choose to deploy.
- 32-core CPU per node
- 256 GB RAM per node
- Minimum 100 GB of free disk space per node
- 1 NVIDIA GPU of a supported type (L4, L40S, A10, A100, H100/H200, B100/B200) with drivers installed on each node
- Ubuntu or Debian Linux (LTS or newer)
- Kubernetes cluster (v1.24 or newer)
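Before installing, you can sanity-check the cluster against these requirements. A minimal sketch, assuming the NVIDIA device plugin is installed so GPUs are advertised under the standard `nvidia.com/gpu` resource name:

```shell
# Check the Kubernetes server version (must be v1.24 or newer).
kubectl version

# List allocatable NVIDIA GPUs per node. The nvidia.com/gpu resource name is
# the standard one registered by the NVIDIA device plugin; each node should
# report at least 1.
kubectl get nodes -o custom-columns='NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu'
```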
Deployment Process
1. Create Your AI System in the Admin Console
If you have not already created your AI system in the Admin Console, follow the Quick Start or the Custom System guide to create your system and generate the installation command.
2. Get the Deployment Command
Navigate to your system in the Admin Console and click the Deploy button in the top-right corner of the system management page.

This opens the Deploy Command modal. Select kubectl as the deployment method, then click Copy to copy the generated installation command.

3. Execute the Installation on Your Cluster
Paste and run the copied command on a machine with kubectl access to your cluster. The command authenticates with your Prediction Guard instance and bootstraps all services into the predictionguard namespace.

After a few minutes, verify the installation:
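For example, with kubectl (the `predictionguard` namespace is created by the installation command):

```shell
# List all Prediction Guard pods; each should reach the Running state.
kubectl get pods -n predictionguard
```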
You should see running pods including pg-inside, indicating the system has been successfully installed. The system will also show as Healthy in the Admin Console.

Configuring Ingress and Reverse Proxy
Prediction Guard comes preconfigured for NGINX, and a default Ingress can be enabled for the system in the Edit section of the Systems page. There you can set the desired domain names; NGINX is then deployed into the predictionguard namespace with settings preconfigured for the Prediction Guard API. Finally, ensure that your DNS entries resolve to the ingress IP of your Kubernetes cluster.
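Once the Ingress is enabled, you can confirm it has been assigned an address before creating DNS records. A quick sketch (exact resource names inside the namespace may differ per install):

```shell
# Show the Ingress and the external address assigned by the ingress controller.
kubectl get ingress -n predictionguard

# If the NGINX controller is exposed via a LoadBalancer Service, its external
# IP appears here; create a DNS A record for your domain pointing to that IP.
kubectl get svc -n predictionguard
```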
Post-Deployment
Once deployed, your system is fully manageable from the Admin Console dashboard.

From here you can:
- API Keys: Manage API keys for secure access to your system endpoints
- Models: Deploy private, managed, or external models and their configurations
- MCP Servers: Configure Model Context Protocol servers and their connections
- Advanced Settings: Configure system settings, resource limits, networking, and cluster-specific options
Need help? Contact our support team for assistance with your Kubernetes deployment.

