Output Policies API

Read and update the per-org output content-scan policy: GET returns the current config, PATCH applies partial updates. Admin-only, and gated by ENABLE_OUTPUT_FILTERING.

Endpoints

Method	Path	Description
GET	`/output-policies`	Return the org's policy (merged with defaults)
PATCH	`/output-policies`	Merge supplied fields into the policy

Both require a Bearer API key from a user with admin or owner role. Both return 404 when ENABLE_OUTPUT_FILTERING is off globally on the backend.
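As a rough illustration of the authentication requirement, the sketch below builds GET and PATCH requests with Python's standard library. The base URL `https://api.example.com` and the helper names `build_policy_request` and `send` are assumptions made for this example; substitute the host and key for your own deployment.

```python
import json
import urllib.request

# Hypothetical base URL (an assumption); replace with your deployment's API host.
BASE_URL = "https://api.example.com"


def build_policy_request(api_key, patch=None):
    """Build a GET request (patch is None) or a PATCH request for /output-policies."""
    headers = {"Authorization": f"Bearer {api_key}"}
    data = None
    method = "GET"
    if patch is not None:
        method = "PATCH"
        data = json.dumps(patch).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        f"{BASE_URL}/output-policies", data=data, headers=headers, method=method
    )


def send(request):
    """Send the request and decode the JSON response body (performs network I/O)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With `urllib`, a 401, 403, or 404 response surfaces as `urllib.error.HTTPError`, so a caller that wants to distinguish "filtering is off" from "not authorised" would inspect the exception's `code` attribute.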

Policy shape

{
  "enabled": true,
  "mode": "flag",
  "libraries": ["pii", "credentials", "prompt_injection"],
  "deny_severity_threshold": "critical",
  "redact_severity_threshold": "warning",
  "request_id": "req_..."
}

enabled — master switch. When false, no outcome scans run and receipts carry output_scan_flags = null.
mode — one of "flag" (record-only), "deny" (refuse receipt on threshold), or "redact" (hash the cleaned outcome).
libraries — which pattern libraries to run. Valid values: pii, credentials, prompt_injection.
deny_severity_threshold — in deny mode, the minimum hit severity that blocks the receipt. info | warning | critical.
redact_severity_threshold — unused today (Phase 2 redacts every match regardless of severity). Reserved for a future per-severity redaction toggle.

GET /output-policies

Returns the full policy object. An org that has never called PATCH receives the default policy (enabled, flag mode, all libraries, critical deny threshold, warning redact threshold).

PATCH /output-policies

Request body

All fields optional. Only the fields supplied are changed:

{
  "mode": "deny",
  "deny_severity_threshold": "critical"
}

The above leaves enabled, libraries, and redact_severity_threshold untouched.
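The merge behaviour can be modelled locally as two dictionary overlays. This is only a sketch of the documented semantics, not the server's implementation: `DEFAULT_POLICY`, `effective_policy`, and `apply_patch` are names invented here, and the sketch assumes a supplied `libraries` list replaces the stored list wholesale rather than being unioned with it.

```python
import copy

# Defaults as described for an org that has never called PATCH.
DEFAULT_POLICY = {
    "enabled": True,
    "mode": "flag",
    "libraries": ["pii", "credentials", "prompt_injection"],
    "deny_severity_threshold": "critical",
    "redact_severity_threshold": "warning",
}


def effective_policy(stored):
    """Overlay an org's stored fields on a copy of the defaults (what GET returns)."""
    return {**copy.deepcopy(DEFAULT_POLICY), **stored}


def apply_patch(stored, patch):
    """Return the stored fields after a PATCH: supplied keys replace, others stay."""
    return {**stored, **patch}


# An org with no stored policy sends the request body shown above.
stored = apply_patch({}, {"mode": "deny", "deny_severity_threshold": "critical"})
policy = effective_policy(stored)
```

Here `policy["mode"]` becomes `"deny"` while `enabled`, `libraries`, and `redact_severity_threshold` keep their default values.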

Errors

Code	When
400 `INVALID_POLICY_FIELD`	Request body has an unrecognised key
400 `INVALID_POLICY_MODE`	`mode` not in `{flag, deny, redact}`
400 `INVALID_POLICY_LIBRARY`	Library name not in `{pii, credentials, prompt_injection}`
400 `INVALID_POLICY_SEVERITY`	Threshold not in `{info, warning, critical}`
400 `INVALID_POLICY_ENABLED`	`enabled` is not a boolean
401	Missing or invalid Bearer token
403	Caller does not have the admin or owner role on the org
404	`ENABLE_OUTPUT_FILTERING` is off
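A client can catch most of the 400 cases before sending a request. The following is a minimal sketch of such a pre-flight check; `first_policy_error` is a hypothetical helper, the order in which the checks run is an assumption, and so are the choices to treat only the five policy fields as recognised keys and to report a non-list `libraries` value as `INVALID_POLICY_LIBRARY`. The server remains the authority on what it accepts.

```python
ALLOWED_FIELDS = {
    "enabled",
    "mode",
    "libraries",
    "deny_severity_threshold",
    "redact_severity_threshold",
}
MODES = {"flag", "deny", "redact"}
LIBRARIES = {"pii", "credentials", "prompt_injection"}
SEVERITIES = {"info", "warning", "critical"}


def _is_one_of(value, allowed):
    """True only for a string that appears in the allowed set."""
    return isinstance(value, str) and value in allowed


def first_policy_error(patch):
    """Return the first matching error code from the table, or None if none applies."""
    if set(patch) - ALLOWED_FIELDS:
        return "INVALID_POLICY_FIELD"
    if "enabled" in patch and not isinstance(patch["enabled"], bool):
        return "INVALID_POLICY_ENABLED"
    if "mode" in patch and not _is_one_of(patch["mode"], MODES):
        return "INVALID_POLICY_MODE"
    if "libraries" in patch:
        libraries = patch["libraries"]
        if not isinstance(libraries, list) or not all(
            _is_one_of(lib, LIBRARIES) for lib in libraries
        ):
            return "INVALID_POLICY_LIBRARY"
    for key in ("deny_severity_threshold", "redact_severity_threshold"):
        if key in patch and not _is_one_of(patch[key], SEVERITIES):
            return "INVALID_POLICY_SEVERITY"
    return None
```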

How outcome scans land on receipts

Every POST /actions/{id}/notarize call with a completed outcome runs the org's active policy against outcome_details. See guides/output-filtering for the full behaviour per mode; the short version:

Mode	Receipt outcome
`flag`	Signed receipt with `output_scan_flags` populated
`deny` (threshold met)	422 `OUTPUT_SCAN_VIOLATION`, no receipt
`deny` (threshold not met)	Signed receipt with flags
`redact`	Receipt signed over the cleaned outcome, flags populated
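The table can be read as a small decision function. The sketch below is illustrative only: `notarize_outcome` and its return strings are invented for this example, and it assumes severities are ordered info < warning < critical and that a hit "meets" the deny threshold when its severity is at or above it.

```python
# Assumed ordering of severities, lowest to highest.
SEVERITY_RANK = {"info": 0, "warning": 1, "critical": 2}


def notarize_outcome(policy, hit_severities):
    """Map a policy and the severities of scan hits to the receipt outcome."""
    if not policy["enabled"]:
        # Master switch off: no scan runs at all.
        return "signed receipt, output_scan_flags = null"
    if policy["mode"] == "deny":
        threshold = SEVERITY_RANK[policy["deny_severity_threshold"]]
        if any(SEVERITY_RANK[s] >= threshold for s in hit_severities):
            return "422 OUTPUT_SCAN_VIOLATION, no receipt"
        return "signed receipt with flags"
    if policy["mode"] == "redact":
        return "receipt signed over the cleaned outcome, flags populated"
    # Remaining mode is "flag": record-only.
    return "signed receipt with output_scan_flags populated"
```

For example, a policy in deny mode with a critical threshold still yields a signed receipt when the only hit is a warning, and returns the 422 once any critical hit appears.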

The output_scan_flags object is part of the signed canonical JSON, so a verifier that recomputes the signature will detect tampering if anyone edits the flags after notarization.
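To see why covering the flags with the signature matters, here is a toy model of tamper detection. It is not the service's actual scheme: the HMAC-SHA256 construction, the demo key, the canonicalisation rule (sorted keys, compact separators), and the receipt and flag fields are all assumptions chosen to keep the example self-contained.

```python
import hashlib
import hmac
import json

# Stand-in key for the demo only; the real signing scheme is not described here.
DEMO_KEY = b"demo-key-not-a-real-secret"


def canonical_bytes(receipt):
    """Serialise with sorted keys and no extra whitespace (assumed canonical form)."""
    return json.dumps(receipt, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(receipt):
    """Compute a hex MAC over the canonical bytes of the receipt."""
    return hmac.new(DEMO_KEY, canonical_bytes(receipt), hashlib.sha256).hexdigest()


def verify(receipt, signature):
    """Recompute the MAC and compare it in constant time."""
    return hmac.compare_digest(sign(receipt), signature)


# Hypothetical receipt shape; only output_scan_flags is named by this page.
receipt = {
    "action_id": "act_demo",
    "output_scan_flags": [{"library": "pii", "severity": "warning"}],
}
signature = sign(receipt)

# Someone strips the flags after notarization.
tampered = {**receipt, "output_scan_flags": []}
```

In this model `verify(receipt, signature)` succeeds and `verify(tampered, signature)` fails, because the flags contribute to the bytes being signed.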