Aira

Output Policies API

Read and update the per-org output content-scan policy: GET returns the current config and PATCH applies updates. Both endpoints are admin-only and gated by ENABLE_OUTPUT_FILTERING.

Endpoints

| Method | Path | Description |
|---|---|---|
| GET | /output-policies | Return the org's policy (merged with defaults) |
| PATCH | /output-policies | Merge supplied fields into the policy |

Both endpoints require a Bearer API key from a user with the admin or owner role. Both return 404 when ENABLE_OUTPUT_FILTERING is disabled globally on the backend.
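
As a rough illustration of the authentication requirement, the sketch below builds (but does not send) a GET request with a Bearer token using Python's standard library. The base URL `https://api.example.com` and the helper name `build_get_policy_request` are placeholders invented for this example, not part of the documented API.

```python
import urllib.request

# Hypothetical base URL; substitute the real API host for your deployment.
BASE_URL = "https://api.example.com"


def build_get_policy_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET /output-policies request."""
    return urllib.request.Request(
        f"{BASE_URL}/output-policies",
        method="GET",
        # The key must belong to a user with the admin or owner role.
        headers={"Authorization": f"Bearer {api_key}"},
    )


req = build_get_policy_request("YOUR_API_KEY")
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; a 404 response would then indicate that ENABLE_OUTPUT_FILTERING is off on the backend.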

Policy shape

{
  "enabled": true,
  "mode": "flag",
  "libraries": ["pii", "credentials", "prompt_injection"],
  "deny_severity_threshold": "critical",
  "redact_severity_threshold": "warning",
  "request_id": "req_..."
}
  • enabled — master switch. When false, no outcome scans run and receipts carry output_scan_flags = null.
  • mode — one of "flag" (record-only), "deny" (refuse receipt on threshold), or "redact" (hash the cleaned outcome).
  • libraries — which pattern libraries to run. Valid values: pii, credentials, prompt_injection.
  • deny_severity_threshold — in deny mode, the minimum hit severity that blocks the receipt. info | warning | critical.
  • redact_severity_threshold — unused today (Phase 2 redacts every match regardless of severity). Reserved for a future per-severity redaction toggle.

GET /output-policies

Returns the full policy object. An org that has never called PATCH receives the default policy: enabled, flag mode, all libraries, a critical deny threshold, and a warning redact threshold.
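
The "merged with defaults" behaviour can be pictured as overlaying whatever the org has stored on top of the default policy. The following is a minimal sketch of that idea, not the backend's implementation; `DEFAULT_POLICY` mirrors the defaults listed above, and `effective_policy` is a hypothetical helper name.

```python
import copy

# Defaults as described in this page.
DEFAULT_POLICY = {
    "enabled": True,
    "mode": "flag",
    "libraries": ["pii", "credentials", "prompt_injection"],
    "deny_severity_threshold": "critical",
    "redact_severity_threshold": "warning",
}


def effective_policy(stored: dict) -> dict:
    """Overlay the org's stored fields on a copy of the defaults."""
    merged = copy.deepcopy(DEFAULT_POLICY)  # avoid mutating the shared defaults
    merged.update(stored)
    return merged


never_patched = effective_policy({})
patched_once = effective_policy({"mode": "deny"})
print(never_patched == DEFAULT_POLICY)
print(patched_once["mode"], patched_once["libraries"])
```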

PATCH /output-policies

Request body

All fields are optional. Only the fields supplied are changed:

{
  "mode": "deny",
  "deny_severity_threshold": "critical"
}

The above leaves enabled, libraries, and redact_severity_threshold untouched.
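
Because only supplied fields are changed, a client can send just the difference between the current and desired policy. The sketch below shows one way to compute such a body. The helper `minimal_patch` is hypothetical, and excluding `request_id` rests on the assumption that it is response metadata rather than a policy field.

```python
import json

# Policy fields named in this page; request_id is assumed to be response
# metadata and is deliberately left out of any PATCH body.
POLICY_FIELDS = (
    "enabled",
    "mode",
    "libraries",
    "deny_severity_threshold",
    "redact_severity_threshold",
)


def minimal_patch(current: dict, desired: dict) -> dict:
    """Return only the policy fields whose desired value differs from current."""
    return {
        key: desired[key]
        for key in POLICY_FIELDS
        if key in desired and desired[key] != current.get(key)
    }


current = {
    "enabled": True,
    "mode": "flag",
    "libraries": ["pii", "credentials", "prompt_injection"],
    "deny_severity_threshold": "critical",
    "redact_severity_threshold": "warning",
}
desired = {"mode": "deny", "deny_severity_threshold": "critical"}

patch = minimal_patch(current, desired)
print(json.dumps(patch))  # {"mode": "deny"}
```

Sending the unchanged `deny_severity_threshold` as well, as in the request body above, is equally valid; the diff only keeps the payload small.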

Errors

| Code | When |
|---|---|
| 400 INVALID_POLICY_FIELD | Request body has an unrecognised key |
| 400 INVALID_POLICY_MODE | mode not in {flag, deny, redact} |
| 400 INVALID_POLICY_LIBRARY | Library name not in {pii, credentials, prompt_injection} |
| 400 INVALID_POLICY_SEVERITY | Threshold not in {info, warning, critical} |
| 400 INVALID_POLICY_ENABLED | enabled is not a boolean |
| 401 | Missing or invalid Bearer token |
| 403 | Caller is not an admin or owner on the org |
| 404 | ENABLE_OUTPUT_FILTERING is off |
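
To catch the 400 cases before making a request, a client could pre-validate the body against the rules in the table. The sketch below is one possible client-side check. The function name `validate_patch` is hypothetical, and the order in which checks run is an assumption, since this page does not say which code the server returns when several fields are invalid at once.

```python
from typing import Optional

VALID_MODES = {"flag", "deny", "redact"}
VALID_LIBRARIES = {"pii", "credentials", "prompt_injection"}
VALID_SEVERITIES = {"info", "warning", "critical"}
KNOWN_FIELDS = {
    "enabled",
    "mode",
    "libraries",
    "deny_severity_threshold",
    "redact_severity_threshold",
}


def validate_patch(body: dict) -> Optional[str]:
    """Return the documented 400 code the body would trigger, or None if valid.

    The check order here is an assumption, not documented server behaviour.
    """
    if set(body) - KNOWN_FIELDS:
        return "INVALID_POLICY_FIELD"
    if "enabled" in body and not isinstance(body["enabled"], bool):
        return "INVALID_POLICY_ENABLED"
    if "mode" in body:
        mode = body["mode"]
        if not isinstance(mode, str) or mode not in VALID_MODES:
            return "INVALID_POLICY_MODE"
    if "libraries" in body:
        libs = body["libraries"]
        if not isinstance(libs, list) or any(
            not isinstance(lib, str) or lib not in VALID_LIBRARIES for lib in libs
        ):
            return "INVALID_POLICY_LIBRARY"
    for key in ("deny_severity_threshold", "redact_severity_threshold"):
        if key in body:
            value = body[key]
            if not isinstance(value, str) or value not in VALID_SEVERITIES:
                return "INVALID_POLICY_SEVERITY"
    return None


print(validate_patch({"mode": "deny", "deny_severity_threshold": "critical"}))
print(validate_patch({"mode": "block"}))
```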

How outcome scans land on receipts

Every POST /actions/{id}/notarize call with a completed outcome runs the org's active policy against outcome_details. See guides/output-filtering for the full behaviour per mode; the short version:

| Mode | Receipt outcome |
|---|---|
| flag | Signed receipt with output_scan_flags populated |
| deny (threshold met) | 422 OUTPUT_SCAN_VIOLATION, no receipt |
| deny (threshold not met) | Signed receipt with flags |
| redact | Receipt signed over the cleaned outcome, flags populated |
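
The table can be read as a small decision function over the policy and the severities of the scan hits. The sketch below encodes it under two assumptions: severities are ordered info < warning < critical, and "threshold met" means at least one hit at or above deny_severity_threshold. The function name `receipt_outcome` is hypothetical.

```python
# Assumed ordering of severities, lowest to highest.
SEVERITY_ORDER = {"info": 0, "warning": 1, "critical": 2}


def receipt_outcome(policy: dict, hit_severities: list) -> str:
    """Describe what a notarize call would yield for the given scan hits."""
    if not policy["enabled"]:
        # With the master switch off, no scans run at all.
        return "signed receipt, output_scan_flags = null"
    mode = policy["mode"]
    if mode == "deny":
        threshold = SEVERITY_ORDER[policy["deny_severity_threshold"]]
        if any(SEVERITY_ORDER[s] >= threshold for s in hit_severities):
            return "422 OUTPUT_SCAN_VIOLATION, no receipt"
        return "signed receipt with flags"
    if mode == "redact":
        return "receipt signed over the cleaned outcome, flags populated"
    return "signed receipt with output_scan_flags populated"  # flag mode


deny_policy = {"enabled": True, "mode": "deny", "deny_severity_threshold": "critical"}
print(receipt_outcome(deny_policy, ["warning"]))
print(receipt_outcome(deny_policy, ["warning", "critical"]))
```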

The output_scan_flags object is part of the signed canonical JSON, so a verifier that recomputes the signature will detect tampering if anyone edits the flags after notarization.
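
To see why including the flags in the signed payload makes edits detectable, consider the toy model below. It is not the real receipt scheme: the signing algorithm, key handling, and canonicalization rules are not described on this page, so HMAC-SHA256 over sorted-key JSON stands in for them, and the receipt fields and flag shape shown are invented for illustration.

```python
import hashlib
import hmac
import json


def canonical_json(payload: dict) -> bytes:
    """Stand-in canonical form: sorted keys, compact separators, UTF-8."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(payload: dict, key: bytes) -> str:
    """Stand-in signature: HMAC-SHA256 over the canonical bytes."""
    return hmac.new(key, canonical_json(payload), hashlib.sha256).hexdigest()


def verify(payload: dict, signature: str, key: bytes) -> bool:
    """Recompute the signature and compare in constant time."""
    return hmac.compare_digest(sign(payload, key), signature)


key = b"demo-key"  # illustrative only
# Hypothetical receipt and flag shape, for demonstration.
receipt = {
    "outcome_details": "example outcome",
    "output_scan_flags": [{"library": "pii", "severity": "warning"}],
}
signature = sign(receipt, key)
ok_before = verify(receipt, signature, key)

# Editing the flags after signing changes the canonical bytes.
receipt["output_scan_flags"] = []
ok_after = verify(receipt, signature, key)
print(ok_before, ok_after)  # True False
```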
