InferenceWall is an AI application firewall that sits between your users and your LLM. It scans every input and output for prompt injection, jailbreaks, content safety violations, and data leakage using a multi-layer detection pipeline — Rust-powered heuristic rules, ONNX ML classifiers, FAISS semantic similarity, and an optional LLM-judge — combined into a single anomaly score.
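The layered scoring idea can be sketched in a few lines. This is a conceptual illustration only: the weights, the combination rule, and the function name are invented here, not InferenceWall's actual scoring code.

```python
from typing import Optional

# Illustrative sketch of multi-layer anomaly scoring: each engine emits
# a score and the layers are folded into one number. The 0.5/0.3/0.2
# weights and the max() rule are made-up placeholders.
def combine_scores(heuristic: float, classifier: float,
                   semantic: float, judge: Optional[float] = None) -> float:
    score = 0.5 * heuristic + 0.3 * classifier + 0.2 * semantic
    if judge is not None:
        # A confident LLM-judge verdict dominates the cheaper layers.
        score = max(score, judge)
    return score

print(combine_scores(10.0, 8.0, 6.0))       # heuristic-led input
print(combine_scores(1.0, 1.0, 1.0, 12.0))  # judge overrides low layers
```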

Quick start

Install InferenceWall and scan your first input in under five minutes.

Deployment profiles

Compare Lite, Standard, and Full profiles to match your latency and accuracy requirements.

How it works

Understand the detection pipeline, anomaly scoring, and policy evaluation.

Signature catalog

Browse all 100 built-in signatures and their MITRE ATLAS mappings.

Deployment modes

InferenceWall supports two primary deployment modes. Both use the same detection pipeline and policy system.
Mode        How you use it                                                            Best for
SDK         Import inferwall and call scan_input() / scan_output() directly in Python  In-process scanning inside existing Python services
API server  Run inferwall serve and call the HTTP REST API from any language           Polyglot stacks, sidecar deployments, shared scanning service

SDK mode

import inferwall

result = inferwall.scan_input("Ignore all previous instructions")
# decision='block', score=12.0, matches=[{signature_id: 'INJ-D-002', ...}]

result = inferwall.scan_output("Your API key is sk-1234...")
# decision='block', score=12.0, matches=[{signature_id: 'DL-S-001', ...}]
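In practice the two calls bracket your model call. The wrapper below is a hypothetical integration sketch: it assumes the result object exposes the decision field shown in the comments above, and call_llm stands in for your own model client.

```python
# Hypothetical wrapper showing scan results gating an LLM call.
# `decision == 'block'` follows the result comments above; attribute
# names on the real result object may differ.
def guarded_completion(user_text, call_llm):
    import inferwall  # requires the InferenceWall SDK to be installed

    if inferwall.scan_input(user_text).decision == "block":
        return "Request blocked by input policy."

    reply = call_llm(user_text)  # your existing model call

    if inferwall.scan_output(reply).decision == "block":
        return "Response withheld by output policy."
    return reply
```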

API server mode

inferwall serve

curl -X POST http://localhost:8000/v1/scan/input \
  -H "Content-Type: application/json" \
  -d '{"text": "What is the weather today?"}'

Deployment profiles

Choose a profile based on your latency budget and accuracy requirements. You can upgrade later without changing any application code.
Profile   Install command                    Engines                                                           Latency p99
Lite      pip install inferwall              Heuristic (Rust)                                                  <0.3 ms
Standard  pip install "inferwall[standard]"  + ONNX classifier (DeBERTa/DistilBERT) + FAISS semantic (MiniLM)  <80 ms
Full      pip install "inferwall[full]"      + LLM-judge (Phi-4 Mini Q4)                                       <2 s
See Deployment profiles for a detailed breakdown of engines, dependencies, and model download instructions.

MITRE ATLAS coverage

All 100 built-in signatures are mapped to the MITRE ATLAS framework — the AI/ML counterpart to MITRE ATT&CK. InferenceWall implements three ATLAS mitigations: AML.M0015 (Adversarial Input Detection), AML.M0020 (Generative AI Guardrails), and AML.M0006 (Ensemble Methods). Coverage spans prompt injection, jailbreaks, data leakage, content safety, and agentic threats. See the signature catalog for the full mapping.

License

  • Engine (Rust core, Python SDK, CLI, API server): Apache-2.0
  • Community signatures (catalog/): CC BY-SA 4.0 — modifications must be shared back

InferenceWall reduces risk but does not eliminate it. False negatives and false positives are expected. Use InferenceWall as one layer in a defense-in-depth strategy, and evaluate detection accuracy for your specific use case.