InferenceWall supports three deployment paths: a Python package installed directly on the host, a Docker container, or a Kubernetes deployment via Helm. All three paths expose the same API server and SDK. Choose based on your existing infrastructure.

Installation

# Lite — heuristic engine only, zero ML deps
pip install inferwall

# Standard — adds ONNX classifier + FAISS semantic engine
pip install "inferwall[standard]"

# Full — adds LLM-judge for borderline cases
pip install "inferwall[full]"
Pre-built wheels are available for Linux x86_64, Linux aarch64, macOS arm64, and Windows x86_64. Requires Python >= 3.10.

Deployment profiles

| Profile  | Install                            | Engines                                | Latency    |
|----------|------------------------------------|----------------------------------------|------------|
| Lite     | pip install inferwall              | Heuristic (Rust)                       | <0.3ms p99 |
| Standard | pip install "inferwall[standard]"  | + Classifier (ONNX) + Semantic (FAISS) | <80ms p99  |
| Full     | pip install "inferwall[full]"      | + LLM-Judge                            | <2s p99    |
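The profiles are cumulative: each one adds engines on top of the previous tier. A minimal sketch of that relationship, using illustrative engine names rather than the SDK's actual identifiers:

```python
# Engine sets per deployment profile, per the table above.
# Names are illustrative, not InferenceWall's internal identifiers.
PROFILE_ENGINES = {
    "lite": ["heuristic"],
    "standard": ["heuristic", "classifier", "semantic"],
    "full": ["heuristic", "classifier", "semantic", "llm-judge"],
}

def engines_for(profile: str) -> list[str]:
    """Return the engines active for a deployment profile."""
    try:
        return PROFILE_ENGINES[profile]
    except KeyError:
        raise ValueError(f"unknown profile: {profile!r}") from None
```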

Post-install setup

1. Generate API keys

inferwall admin setup
This generates a scan key (iwk_scan_…) and an admin key (iwk_admin_…) and writes them to .env.local.
2. Set environment variables

Export the generated keys before starting the server:
export IW_API_KEY=iwk_scan_yourkey
export IW_ADMIN_KEY=iwk_admin_yourkey
Or source the generated file directly:
source .env.local
3. Start the server

inferwall serve

# Or source the generated env file and serve in one command
source .env.local && inferwall serve
The server listens on 0.0.0.0:8000 by default.
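A client needs a reachable URL, and 0.0.0.0 is a bind address, not a destination. A small sketch of deriving the base URL from the IW_HOST and IW_PORT variables (documented in the table below), substituting localhost for the bind-all address:

```python
import os

def server_base_url(env=os.environ) -> str:
    """Build a client-facing base URL from IW_HOST/IW_PORT with the documented defaults."""
    host = env.get("IW_HOST", "0.0.0.0")
    port = env.get("IW_PORT", "8000")
    # 0.0.0.0 means "bind all interfaces"; clients should connect via localhost.
    if host == "0.0.0.0":
        host = "localhost"
    return f"http://{host}:{port}"
```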
4. Install ML models (Standard and Full only)

If you installed the standard or full profile, download the ML models:
inferwall models install --profile standard
Models are cached in ~/.cache/inferwall/models/ and downloaded from HuggingFace (~730 MB for Standard).
5. Run a health check

Confirm the server is up and signatures are loaded:
curl http://localhost:8000/v1/health
In development, you can skip API key setup entirely. Run inferwall serve without setting IW_API_KEY or IW_ADMIN_KEY and scan without any Authorization header. Dev mode is not suitable for production.
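In production, scan requests carry the key in an Authorization header. This page does not show the exact header scheme, so the sketch below assumes a Bearer scheme; note how omitting IW_API_KEY falls back to the header-less dev-mode behavior described above:

```python
import os

def scan_headers(env=os.environ) -> dict[str, str]:
    """Build request headers for a scan call. Assumes a Bearer scheme (not
    confirmed by this page); with no key set, dev mode sends no auth header."""
    headers = {"Content-Type": "application/json"}
    key = env.get("IW_API_KEY")
    if key:
        headers["Authorization"] = f"Bearer {key}"
    return headers
```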

Environment variables

| Variable     | Description                                | Default         |
|--------------|--------------------------------------------|-----------------|
| IW_API_KEY   | Scan API key                               | None (dev mode) |
| IW_ADMIN_KEY | Admin API key                              | None (dev mode) |
| IW_HOST      | Server bind host                           | 0.0.0.0         |
| IW_PORT      | Server port                                | 8000            |
| IW_TLS       | TLS mode: auto, off, or acme               | off             |
| IW_PROFILE   | Deployment profile: lite, standard, full   | lite            |
| IW_LOG_LEVEL | Log verbosity: debug, info, warning, error | info            |
| IW_REDIS_URL | Redis URL for distributed sessions         | None            |

TLS modes

| Mode | Behavior                                                  |
|------|-----------------------------------------------------------|
| off  | Plain HTTP (default)                                      |
| auto | TLS using a certificate at the path provided in IW_TLS    |
| acme | Automatic certificate provisioning via ACME/Let's Encrypt |
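Since an invalid mode is easy to typo in an environment variable, validating it early is worthwhile. A hypothetical validation helper (not part of the CLI) against the three documented modes:

```python
VALID_TLS_MODES = {"off", "auto", "acme"}

def validate_tls_mode(value: str) -> str:
    """Normalize and check an IW_TLS mode value against the documented set."""
    mode = value.strip().lower()
    if mode not in VALID_TLS_MODES:
        raise ValueError(
            f"IW_TLS must be one of {sorted(VALID_TLS_MODES)}, got {value!r}"
        )
    return mode
```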

Redis for distributed sessions

Set IW_REDIS_URL to enable distributed rate limiting and session state across multiple InferenceWall instances:
export IW_REDIS_URL=redis://redis:6379
When unset, InferenceWall uses in-process state, which is scoped to a single instance.

Health check endpoints

| Endpoint             | Purpose                                           | Use in                    |
|----------------------|---------------------------------------------------|---------------------------|
| GET /v1/health/live  | Liveness — is the process alive?                  | Kubernetes livenessProbe  |
| GET /v1/health/ready | Readiness — can it handle requests?               | Kubernetes readinessProbe |
| GET /v1/health       | Full health with signature count and engine status| Monitoring dashboards     |
# Liveness
curl http://localhost:8000/v1/health/live

# Readiness
curl http://localhost:8000/v1/health/ready

# Full health
curl http://localhost:8000/v1/health
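In deploy scripts it is common to block until the readiness endpoint succeeds. A generic polling sketch with an injectable probe, so the same loop works with any HTTP client pointed at /v1/health/ready (the function name and parameters are illustrative):

```python
import time

def wait_until_ready(probe, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll `probe` (a zero-arg callable returning True when ready) until it
    succeeds or the timeout elapses. In practice the probe would issue a GET
    to /v1/health/ready and return True on a 200 response."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            if probe():
                return True
        except Exception:
            pass  # server not accepting connections yet; keep retrying
        time.sleep(interval)
    return False
```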

Further reading

- Environment variables reference: complete list of all environment variables with types, defaults, and valid values.
- Health API: response schemas for the liveness, readiness, and full health endpoints.