Default policy
InferenceWall ships with a built-in default policy. Its signatures field is empty by default, meaning all signatures run with their built-in settings.
Enforcement modes
| Mode | Behavior |
|---|---|
| monitor | All signatures run and log matches, but the decision is always allow. Nothing is blocked. |
| enforce | Signatures contribute to anomaly scoring. Scans that exceed thresholds return flag or block. |
Thresholds
You can override any of the five thresholds in your policy:

| Threshold | Description | Default | Strict |
|---|---|---|---|
| inbound_flag | Score to flag incoming requests | 4.0 | 2.5 |
| inbound_block | Score to block incoming requests | 10.0 | 7.0 |
| outbound_flag | Score to flag outgoing responses | 3.0 | 2.0 |
| outbound_block | Score to block outgoing responses | 7.0 | 5.0 |
| early_exit | Score to skip downstream engines | 13.0 | 10.0 |
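
As a rough illustration of how the mode and the flag/block thresholds combine, the sketch below maps an anomaly score to a decision using the default values from the table. The names `decide` and `DEFAULT_THRESHOLDS` are hypothetical and are not part of InferenceWall's API. Treating a score exactly equal to a threshold as having reached it is also an assumption. The early_exit threshold is left out because it controls whether downstream engines run, not the final decision.

```python
# Illustrative sketch only: not InferenceWall's implementation or API.
# Default values are copied from the thresholds table.
DEFAULT_THRESHOLDS = {
    "inbound_flag": 4.0,
    "inbound_block": 10.0,
    "outbound_flag": 3.0,
    "outbound_block": 7.0,
}


def decide(score, direction, mode="enforce", thresholds=DEFAULT_THRESHOLDS):
    """Map an anomaly score to 'allow', 'flag', or 'block'."""
    if direction not in ("inbound", "outbound"):
        raise ValueError("direction must be 'inbound' or 'outbound'")
    # In monitor mode matches are logged, but the decision is always allow.
    if mode == "monitor":
        return "allow"
    # Assumption: a score equal to a threshold counts as reaching it.
    if score >= thresholds[f"{direction}_block"]:
        return "block"
    if score >= thresholds[f"{direction}_flag"]:
        return "flag"
    return "allow"
```

Under these assumptions, a score of 3.5 is allowed on an incoming request but flagged on an outgoing response, because the outbound thresholds are lower.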
Per-signature overrides
Use the signatures field to override individual signatures within a policy.
Override precedence
When determining how a signature behaves, InferenceWall applies this order:
- Per-signature override: highest priority; always wins.
- Global policy mode: applies to all signatures not explicitly overridden.
- Signature default action: the default_action field set by the signature author; lowest priority.
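
The precedence order above can be sketched as a simple "first setting present wins" lookup. This is a minimal illustration, not InferenceWall's code: `resolve_behavior` is a hypothetical helper, the overrides are modeled as a plain dict keyed by signature ID, and the string values are placeholders.

```python
# Illustrative sketch only: function name and data shapes are assumptions.
def resolve_behavior(signature_id, overrides, policy_mode, default_action):
    """Return the effective setting for one signature.

    overrides      -- dict mapping signature ID to its per-signature setting
    policy_mode    -- the policy's global mode, or None if not set
    default_action -- the signature author's default_action
    """
    # 1. A per-signature override always wins.
    if signature_id in overrides:
        return overrides[signature_id]
    # 2. Otherwise the global policy mode applies.
    if policy_mode is not None:
        return policy_mode
    # 3. Fall back to the signature's own default action.
    return default_action
```

For example, with a global mode of monitor and an override that sets INJ-D-029 to enforce, only INJ-D-029 resolves to enforce; every other signature follows the global mode.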
Recommended deployment workflow
Deploy in monitor mode
Set mode: monitor in your policy. InferenceWall will scan all traffic and log every match, but will not block anything. This lets you see what your real traffic looks like without risk.
Observe for 1–2 weeks
Review the logged matches. Look at score distributions, which signatures are firing, and on what content. Identify any signatures that fire frequently on legitimate traffic (false positives).
Configure allowlists for false positives
Demote noisy signatures to monitor via per-signature overrides, or raise the relevant thresholds. This brings false positives under control before you start enforcing.
Flip high-confidence signatures to enforce
Move your highest-severity signatures to action: enforce individually. Start with credential leakage (DL-S-*) and coercive injection (INJ-D-029, INJ-D-030); these have near-zero false positive rates.
Creating a custom policy
Copy the default policy, modify it, and save it as a .yaml file in ~/.inferwall/policies/. To select a policy explicitly, for example in CI/CD or container deployments, set the IW_POLICY_PATH environment variable.
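
In a deployment script, one way to set the variable is to build the child process environment explicitly. The sketch below is a hedged example: `env_with_policy` is a hypothetical helper, and it assumes IW_POLICY_PATH should point at a single policy file rather than a directory.

```python
import os
from pathlib import Path


def env_with_policy(policy_path, base_env=None):
    """Return a copy of the environment with IW_POLICY_PATH set.

    Assumption: IW_POLICY_PATH points at one policy file.
    """
    path = Path(policy_path).expanduser()
    # Fail early rather than starting with a path that does not exist.
    if not path.is_file():
        raise FileNotFoundError(f"policy file not found: {path}")
    env = dict(os.environ if base_env is None else base_env)
    env["IW_POLICY_PATH"] = str(path)
    return env
```

The returned dict can be passed as the `env` argument of `subprocess.run` when launching whichever process hosts InferenceWall; the command itself depends on your deployment.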
For the full policy customization reference, including all available fields and environment variable overrides, see Custom Policies.