Output Policies API
Per-org output content-scan policy: read the current configuration with GET and apply updates with PATCH. Admin-only, gated by ENABLE_OUTPUT_FILTERING.
Endpoints
| Method | Path | Description |
|---|---|---|
| GET | /output-policies | Return the org's policy (merged with defaults) |
| PATCH | /output-policies | Merge supplied fields into the policy |
Both require a Bearer API key from a user with admin or owner
role. Both return 404 when ENABLE_OUTPUT_FILTERING is off
globally on the backend.
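
As a rough illustration of the authentication requirement above, the following sketch builds a GET or PATCH request with a Bearer header using only the Python standard library. The base URL `https://api.example.com` and the helper names `build_policy_request` and `fetch_policy` are assumptions made for this example and are not part of the API.

```python
import json
import urllib.request

# Hypothetical base URL; substitute your deployment's API host.
BASE_URL = "https://api.example.com"


def build_policy_request(api_key, patch=None):
    """Build (but do not send) a GET or PATCH request for /output-policies."""
    headers = {"Authorization": f"Bearer {api_key}"}
    data = None
    method = "GET"
    if patch is not None:
        # A PATCH carries only the fields that should change.
        method = "PATCH"
        data = json.dumps(patch).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        f"{BASE_URL}/output-policies", data=data, headers=headers, method=method
    )


def fetch_policy(api_key):
    """Send the GET request and decode the JSON body."""
    with urllib.request.urlopen(build_policy_request(api_key)) as resp:
        return json.load(resp)
```

With `urllib`, the 401, 403, and 404 responses surface as `urllib.error.HTTPError`, so a caller would typically wrap `fetch_policy` in a `try` block and inspect the exception's `code`.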
Policy shape
{
  "enabled": true,
  "mode": "flag",
  "libraries": ["pii", "credentials", "prompt_injection"],
  "deny_severity_threshold": "critical",
  "redact_severity_threshold": "warning",
  "request_id": "req_..."
}

- enabled: master switch. When false, no outcome scans run and receipts carry output_scan_flags = null.
- mode: one of "flag" (record-only), "deny" (refuse receipt on threshold), or "redact" (hash the cleaned outcome).
- libraries: which pattern libraries to run. Valid values: pii, credentials, prompt_injection.
- deny_severity_threshold: in deny mode, the minimum hit severity that blocks the receipt. One of info, warning, critical.
- redact_severity_threshold: unused today (Phase 2 redacts every match regardless of severity). Reserved for a future per-severity redaction toggle.
GET /output-policies
Returns the full blob. An org that never called PATCH receives the default policy (enabled, flag mode, all libraries, critical deny threshold, warning redact threshold).
PATCH /output-policies
Request body
All fields optional. Only the fields supplied are changed:
{
  "mode": "deny",
  "deny_severity_threshold": "critical"
}

The above leaves enabled, libraries, and redact_severity_threshold untouched.
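
The merge semantics can be modelled locally with a small sketch. It assumes a shallow merge in which a supplied field (including the whole libraries list) replaces the stored value; the server's exact handling of list fields is not stated here. `DEFAULT_POLICY` mirrors the documented defaults, and `merge_policy` is a hypothetical helper name.

```python
import copy

# Defaults as described for an org that never called PATCH.
DEFAULT_POLICY = {
    "enabled": True,
    "mode": "flag",
    "libraries": ["pii", "credentials", "prompt_injection"],
    "deny_severity_threshold": "critical",
    "redact_severity_threshold": "warning",
}


def merge_policy(stored, patch):
    """Overlay stored fields on the defaults, then overlay the patch fields."""
    merged = copy.deepcopy(DEFAULT_POLICY)
    merged.update(stored)  # what the org saved earlier, if anything
    merged.update(patch)   # only the supplied keys change
    return merged


updated = merge_policy({}, {"mode": "deny", "deny_severity_threshold": "critical"})
print(updated["mode"])     # deny
print(updated["enabled"])  # True
```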
Errors
| Code | When |
|---|---|
| 400 INVALID_POLICY_FIELD | Request body has an unrecognised key |
| 400 INVALID_POLICY_MODE | mode not in {flag, deny, redact} |
| 400 INVALID_POLICY_LIBRARY | Library name not in {pii, credentials, prompt_injection} |
| 400 INVALID_POLICY_SEVERITY | Threshold not in {info, warning, critical} |
| 400 INVALID_POLICY_ENABLED | enabled is not a boolean |
| 401 | Missing or invalid Bearer token |
| 403 | Caller does not have the admin or owner role on the org |
| 404 | ENABLE_OUTPUT_FILTERING is off |
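
A client can catch most 400 responses before sending a PATCH. The sketch below maps a request body to the 400 codes in the table. Several points are assumptions rather than documented behaviour: the order in which checks run, that request_id is response metadata and not a settable field, and that a non-list libraries value maps to INVALID_POLICY_LIBRARY. `validate_patch` is a hypothetical helper name.

```python
# Tuples rather than sets, so unhashable inputs (lists, dicts) compare safely.
KNOWN_FIELDS = (
    "enabled",
    "mode",
    "libraries",
    "deny_severity_threshold",
    "redact_severity_threshold",
)
VALID_MODES = ("flag", "deny", "redact")
VALID_LIBRARIES = ("pii", "credentials", "prompt_injection")
VALID_SEVERITIES = ("info", "warning", "critical")


def validate_patch(body):
    """Return the first matching 400 error code, or None if the body looks valid."""
    if any(key not in KNOWN_FIELDS for key in body):
        return "INVALID_POLICY_FIELD"
    if "enabled" in body and not isinstance(body["enabled"], bool):
        return "INVALID_POLICY_ENABLED"
    if "mode" in body and body["mode"] not in VALID_MODES:
        return "INVALID_POLICY_MODE"
    if "libraries" in body:
        libs = body["libraries"]
        # Assumption: a non-list value is reported with the library error code.
        if not isinstance(libs, list) or any(lib not in VALID_LIBRARIES for lib in libs):
            return "INVALID_POLICY_LIBRARY"
    for field in ("deny_severity_threshold", "redact_severity_threshold"):
        if field in body and body[field] not in VALID_SEVERITIES:
            return "INVALID_POLICY_SEVERITY"
    return None


print(validate_patch({"mode": "block"}))  # INVALID_POLICY_MODE
print(validate_patch({"mode": "deny"}))   # None
```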
How outcome scans land on receipts
Every POST /actions/{id}/notarize call with a completed outcome
runs the org's active policy against outcome_details. See
guides/output-filtering for the
full behaviour per mode; the short version:
| Mode | Receipt outcome |
|---|---|
| flag | Signed receipt with output_scan_flags populated |
| deny (threshold met) | 422 OUTPUT_SCAN_VIOLATION, no receipt |
| deny (threshold not met) | Signed receipt with flags |
| redact | Receipt signed over the cleaned outcome, flags populated |
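
The table can be read as a small decision function. The sketch below is a local model, not the backend's implementation. It assumes severities are ordered info < warning < critical and that a hit "meets" the threshold when its severity is at or above it. `receipt_outcome` and its string results are hypothetical names chosen for the example.

```python
# Assumed ordering: info < warning < critical.
SEVERITY_RANK = {"info": 0, "warning": 1, "critical": 2}


def receipt_outcome(policy, hit_severities):
    """Summarise what notarize would do for a list of scan-hit severities."""
    if not policy["enabled"]:
        # Master switch off: no scan runs and the receipt carries null flags.
        return "signed, output_scan_flags = null"
    mode = policy["mode"]
    if mode == "deny":
        threshold = SEVERITY_RANK[policy["deny_severity_threshold"]]
        if any(SEVERITY_RANK[s] >= threshold for s in hit_severities):
            return "422 OUTPUT_SCAN_VIOLATION, no receipt"
        return "signed with flags"
    if mode == "redact":
        return "signed over cleaned outcome, flags populated"
    return "signed with output_scan_flags populated"


policy = {"enabled": True, "mode": "deny", "deny_severity_threshold": "critical"}
print(receipt_outcome(policy, ["warning"]))           # signed with flags
print(receipt_outcome(policy, ["info", "critical"]))  # 422 OUTPUT_SCAN_VIOLATION, no receipt
```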
The output_scan_flags blob is part of the signed canonical JSON,
so a verifier that re-computes the signature will detect tampering if
anyone edits the flags after notarization.
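
To show why including the flags in the signed payload matters, the sketch below uses stand-ins throughout: the real canonicalisation rules and signature scheme are not specified in this section, so sorted-key compact JSON and HMAC-SHA256 with a demo key are assumptions, as are the receipt field values and the shape of output_scan_flags.

```python
import hashlib
import hmac
import json


def canonical_bytes(receipt):
    """Stand-in canonical form: sorted keys, no insignificant whitespace."""
    return json.dumps(receipt, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(receipt, key):
    """Stand-in signature: HMAC-SHA256 over the canonical bytes."""
    return hmac.new(key, canonical_bytes(receipt), hashlib.sha256).hexdigest()


def verify(receipt, signature, key):
    """Re-compute the signature and compare in constant time."""
    return hmac.compare_digest(sign(receipt, key), signature)


key = b"demo-key"  # hypothetical key, for illustration only
receipt = {"outcome_details": "done", "output_scan_flags": {"hits": 1}}
signature = sign(receipt, key)

# Editing the flags after signing changes the canonical bytes.
tampered = dict(receipt, output_scan_flags=None)
print(verify(receipt, signature, key))   # True
print(verify(tampered, signature, key))  # False
```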