Aira

Policy Engine

Aira evaluates every agent action against your policies at authorize() time, before the agent executes. Three modes: rules, AI, and consensus.

Overview

The policy engine runs when the agent calls authorize(), before the agent has done anything in the real world. Aira evaluates the action against your active policies and produces one of three outcomes:

  • allow — the Authorization returns status=authorized and the agent may execute.
  • require_approval — the Authorization returns status=pending_approval and the action is held for a human reviewer.
  • deny — the SDK raises POLICY_DENIED and the agent must not execute.

Org admins define policies in the dashboard. No SDK changes are required.

Three Evaluation Modes

| Mode | How it works | Best for |
|------|--------------|----------|
| Rules | Deterministic conditions on action fields. | Simple, predictable gates (e.g. block every wire transfer above 10k). |
| AI | A single LLM evaluates against a natural language policy. | Nuanced decisions that need context (e.g. "deny if the action could expose PII"). |
| Consensus | Multiple LLMs evaluate independently and vote. | High-stakes decisions where a single model might be wrong. |

Every evaluation, whether it allows, holds, or denies, produces a cryptographic artifact signed with Ed25519. If the action is later notarized, the policy evaluation is embedded in the final receipt payload.

The policy consensus mode is distinct from the Cases API. Cases are a standalone API for multi-model decision evaluation. They share the same underlying scoring algorithm but serve different purposes.


Rules Mode

Rules are deterministic conditions evaluated against action fields. No LLM is involved. Evaluation is instant and predictable.

Condition Format

Each rule is a JSON condition object:

{
  "field": "action_type",
  "operator": "equals",
  "value": "wire_transfer"
}

Supported Operators

| Operator | Description | Example |
|----------|-------------|---------|
| equals | Exact match. | action_type equals wire_transfer |
| not_equals | Does not match. | status not equals test |
| in | Value in list. | agent_id in ["billing-agent", "payments-agent"] |
| not_in | Value not in list. | agent_id not in ["test-agent"] |
| contains | String contains substring. | details contains transfer |
| gt / gte | Greater than / greater than or equal. | amount gt 10000 |
| lt / lte | Less than / less than or equal. | amount lt 100 |
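
To make the operator semantics concrete, here is a minimal Python sketch of how a single condition object could be checked against an action. It is not Aira's implementation: the `matches` helper, the `OPERATORS` table, and the choice to treat a missing field as a non-match are all assumptions made for illustration. Only the operator names come from the table above.

```python
from typing import Any, Callable, Dict

# Operator names come from the table above; the comparison semantics
# coded here (e.g. `contains` as a substring test) are assumptions.
OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "equals": lambda actual, expected: actual == expected,
    "not_equals": lambda actual, expected: actual != expected,
    "in": lambda actual, expected: actual in expected,
    "not_in": lambda actual, expected: actual not in expected,
    "contains": lambda actual, expected: expected in actual,
    "gt": lambda actual, expected: actual > expected,
    "gte": lambda actual, expected: actual >= expected,
    "lt": lambda actual, expected: actual < expected,
    "lte": lambda actual, expected: actual <= expected,
}


def matches(condition: Dict[str, Any], action: Dict[str, Any]) -> bool:
    """Return True if a single condition object matches the action."""
    field = condition["field"]
    if field not in action:
        # Assumption: a field that is absent from the action never matches.
        return False
    compare = OPERATORS[condition["operator"]]
    return compare(action[field], condition["value"])


# Hypothetical action used only for this example.
action = {"action_type": "wire_transfer", "amount": 12000}
print(matches({"field": "amount", "operator": "gt", "value": 10000}, action))
```

Keeping the operators in a lookup table means an unknown operator name fails loudly with a `KeyError` instead of silently evaluating to false.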

Compound Conditions

Combine conditions with all (AND) or any (OR):

{
  "all": [
    { "field": "action_type", "operator": "equals", "value": "wire_transfer" },
    { "field": "agent_id", "operator": "in", "value": ["billing-agent", "payments-agent"] }
  ]
}

When the condition matches, the policy's decision (allow, require_approval, or deny) is applied.
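
The all/any combination can be read as a small recursive evaluation. The sketch below is illustrative only: `evaluate` and `LEAF_OPS` are hypothetical names, only a subset of operators is included, and it assumes that all and any groups may be nested inside each other, which the text above does not state.

```python
import operator
from typing import Any, Dict

# A subset of leaf operators, enough for this sketch.
LEAF_OPS = {
    "equals": operator.eq,
    "not_equals": operator.ne,
    "in": lambda actual, allowed: actual in allowed,
    "gt": operator.gt,
    "lt": operator.lt,
}


def evaluate(condition: Dict[str, Any], action: Dict[str, Any]) -> bool:
    """Recursively evaluate a condition tree against an action."""
    if "all" in condition:  # AND: every child must match
        return all(evaluate(child, action) for child in condition["all"])
    if "any" in condition:  # OR: at least one child must match
        return any(evaluate(child, action) for child in condition["any"])
    field = condition["field"]
    if field not in action:
        # Assumption: a missing field never matches.
        return False
    return LEAF_OPS[condition["operator"]](action[field], condition["value"])


rule = {
    "all": [
        {"field": "action_type", "operator": "equals", "value": "wire_transfer"},
        {"field": "agent_id", "operator": "in", "value": ["billing-agent", "payments-agent"]},
    ]
}
action = {"action_type": "wire_transfer", "agent_id": "billing-agent"}
print(evaluate(rule, action))
```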


AI Mode

A single LLM evaluates the action against a natural language policy written by the admin.

How It Works

  1. The admin writes a policy in plain English.
  2. On every authorize() call, the action details are sent to the configured model along with the policy text.
  3. The model returns a structured decision: allow, require_approval, or deny, with reasoning and confidence.

Example Policy

Policy text (written by admin in dashboard):

Deny any action that sends customer data to a third-party API unless the agent is the data-export-agent and the endpoint is on our approved list.

Evaluation result:

{
  "policy_uuid": "pol_01JAB...",
  "policy_name": "PII Export Gate",
  "decision": "deny",
  "reasoning": "Action sends customer email addresses to an external webhook that is not on the approved endpoint list.",
  "confidence": 0.94
}

The model used for evaluation is configured per policy in the dashboard. Any supported model can be used.
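
If you consume evaluation results in your own tooling, it can help to validate them before acting on them. The following is a sketch, not part of the SDK: `PolicyEvaluation` and `parse_evaluation` are hypothetical names, the field names are taken from the example result above, and the assumption that confidence lies between 0 and 1 is inferred from that example rather than documented.

```python
import json
from dataclasses import dataclass

# The three decisions named in this document.
VALID_DECISIONS = {"allow", "require_approval", "deny"}


@dataclass(frozen=True)
class PolicyEvaluation:
    policy_uuid: str
    policy_name: str
    decision: str
    reasoning: str
    confidence: float


def parse_evaluation(raw: str) -> PolicyEvaluation:
    """Parse and validate an evaluation result encoded as JSON."""
    data = json.loads(raw)
    decision = data["decision"]
    if decision not in VALID_DECISIONS:
        raise ValueError(f"unexpected decision: {decision!r}")
    confidence = float(data["confidence"])
    # Assumption: confidence is a probability-like score in [0, 1].
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    return PolicyEvaluation(
        policy_uuid=data["policy_uuid"],
        policy_name=data["policy_name"],
        decision=decision,
        reasoning=data.get("reasoning", ""),
        confidence=confidence,
    )


# "pol_example" is a placeholder identifier, not a real policy.
raw = json.dumps({
    "policy_uuid": "pol_example",
    "policy_name": "PII Export Gate",
    "decision": "deny",
    "reasoning": "Endpoint is not on the approved list.",
    "confidence": 0.94,
})
print(parse_evaluation(raw).decision)
```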


Consensus Mode

Multiple LLMs evaluate the action independently. The result is a vote with disagreement scoring.

How It Works

  1. The admin selects 2 to 5 models for the policy.
  2. Each model independently evaluates the action against the policy text.
  3. Agreement is scored using the consensus scoring algorithm.
  4. If models disagree beyond the threshold, the policy decision is require_approval and the action is held for a human.

Example with 3 Models

Policy: "Deny any financial transaction above 25,000 EUR unless the customer has been verified in the last 30 days."

| Model | Decision | Confidence |
|-------|----------|------------|
| claude-sonnet-4-6 | deny | 0.92 |
| gpt-5.4 | deny | 0.88 |
| gemini-3.1-pro | deny | 0.85 |

Result: All three models return deny with high confidence, so authorize() raises POLICY_DENIED. No human review is needed.

If one model had returned allow, the disagreement score would exceed the threshold and the decision would flip to require_approval. The Authorization would return status=pending_approval and a human would review the action, following the same held-action flow described in Human Approval.
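
The vote-then-escalate behavior can be sketched as follows. This is not Aira's consensus scoring algorithm, which is not specified here. As a stand-in, the sketch defines disagreement as the fraction of models that did not vote with the majority, ignores per-model confidence, and uses a hypothetical threshold of 0.25. The `consensus` function name is also an assumption.

```python
from collections import Counter
from typing import List, Tuple


def consensus(decisions: List[str], threshold: float = 0.25) -> Tuple[str, float]:
    """Return (final_decision, disagreement) for a list of model decisions.

    Disagreement here is the share of votes outside the majority; this is
    a simplified stand-in for the real scoring algorithm.
    """
    if not decisions:
        raise ValueError("at least one model decision is required")
    top_decision, top_count = Counter(decisions).most_common(1)[0]
    disagreement = 1.0 - top_count / len(decisions)
    if disagreement > threshold:
        # Models disagree too much: hold the action for a human.
        return "require_approval", disagreement
    return top_decision, disagreement


print(consensus(["deny", "deny", "deny"]))
print(consensus(["deny", "deny", "allow"]))
```

With three models and this stand-in score, a single dissenting vote gives a disagreement of one third, which exceeds the assumed threshold and escalates to a human, mirroring the example above.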


Creating Policies

Policies are created and managed in the Aira dashboard.

  1. Go to Settings > Policies in the dashboard.
  2. Click Create Policy.
  3. Choose the evaluation mode:
    • Rules opens the condition builder (visual UI for building JSON conditions).
    • AI opens a text editor for the natural language policy, plus a model selector.
    • Consensus opens the same text editor, plus a multi-model selector.
  4. Set the policy decision on match: allow, require_approval, or deny.
  5. Set the priority (higher-priority policies are evaluated first).
  6. Optionally scope the policy to specific agents or action types.
  7. Save and activate.

Policies can be created in draft mode and activated later. Draft policies are not evaluated.


Priority and Evaluation Order

Policies are evaluated in priority order, highest first. The rules are:

  1. Higher priority first. Policies with a higher priority number are evaluated before lower ones.
  2. First deny wins. If any policy denies, evaluation stops immediately and authorize() raises POLICY_DENIED.
  3. require_approval beats allow. If any policy requires approval and none deny, the Authorization returns status=pending_approval.
  4. All allow means authorized. If all matching policies return allow, the Authorization returns status=authorized.
  5. No matching policies means authorized. If no policies match (or none are configured), the Authorization returns status=authorized.
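
The five rules above can be sketched as a small resolution function. This is an illustration under assumptions, not the engine's code: `PolicyResult` and `resolve` are hypothetical names, a decision of `None` stands for a policy that did not match, and the sketch takes already-computed decisions, whereas the real engine stops evaluating as soon as a policy denies. The status strings are the ones used in this document.

```python
from typing import List, NamedTuple, Optional


class PolicyResult(NamedTuple):
    priority: int
    decision: Optional[str]  # None means the policy did not match the action


def resolve(results: List[PolicyResult]) -> str:
    """Combine per-policy decisions into a final action status."""
    status = "authorized"  # rules 4 and 5: all allow, or nothing matched
    # Rule 1: walk policies from highest to lowest priority.
    for result in sorted(results, key=lambda r: r.priority, reverse=True):
        if result.decision == "deny":
            # Rule 2: the first deny ends evaluation.
            return "denied_by_policy"
        if result.decision == "require_approval":
            # Rule 3: approval outranks allow, but a later deny still wins.
            status = "pending_approval"
    return status


print(resolve([PolicyResult(10, "allow"), PolicyResult(5, "require_approval")]))
```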

Policy Evaluation Audit Trail

Every policy evaluation is persisted server-side and linked to the action it evaluated.

  • Stored with the action: each evaluation records the policy ID, policy name, decision, reasoning (for AI and consensus modes), confidence, and the models used.
  • Surfaced on the Authorization response: when a policy raises a warning (for example, holding an action for approval), the message appears in the warnings field. The decision itself is reflected in the action's status (authorized, pending_approval, or denied_by_policy).
  • Audit trail in dashboard: every evaluation is visible in the action's detail view under "Policy Evaluations".

This creates a durable record that the policy was checked at authorization time, before the agent acted.


Testing Policies

Test policies before activating them using the dry-run endpoint.

POST /api/v1/policies/{policy_uuid}/dry-run
Authorization: Bearer aira_live_xxxxx
Content-Type: application/json

{
  "action_type": "wire_transfer",
  "details": "Transfer 50,000 EUR to vendor account",
  "agent_id": "payments-agent",
  "model_id": "claude-sonnet-4-6"
}

Response:

{
  "policy_uuid": "pol_01JAB...",
  "policy_name": "High-Value Transfer Gate",
  "decision": "deny",
  "reasoning": "Wire transfer amount exceeds 25,000 EUR threshold.",
  "confidence": 0.96,
  "dry_run": true
}

Dry-run evaluations are not recorded in the audit trail and do not affect any real action.
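
As a sketch of calling the dry-run endpoint from Python using only the standard library: the path, headers, and body fields below are taken from the request above, while the base URL (`https://aira.example.com`), the `build_dry_run_request` and `send` helpers, and the placeholder key and policy ID are assumptions for illustration. Substitute the host and credentials for your own deployment.

```python
import json
import urllib.request
from typing import Any, Dict


def build_dry_run_request(
    base_url: str, api_key: str, policy_uuid: str, action: Dict[str, Any]
) -> urllib.request.Request:
    """Build (but do not send) a dry-run request for one policy."""
    url = f"{base_url.rstrip('/')}/api/v1/policies/{policy_uuid}/dry-run"
    return urllib.request.Request(
        url,
        data=json.dumps(action).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> Dict[str, Any]:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Placeholder values; the base URL and identifiers are hypothetical.
request = build_dry_run_request(
    "https://aira.example.com",
    "aira_live_placeholder",
    "pol_example",
    {
        "action_type": "wire_transfer",
        "details": "Transfer 50,000 EUR to vendor account",
        "agent_id": "payments-agent",
    },
)
print(request.get_method(), request.full_url)
```

Separating request construction from sending keeps the payload easy to inspect or unit test without touching the network.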


API Reference

Full CRUD operations for policies are available via the API. See Policies API Reference for complete endpoint documentation.

Key endpoints:

| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | /api/v1/policies | Create a policy. |
| GET | /api/v1/policies | List all policies. |
| GET | /api/v1/policies/{id} | Get a policy. |
| PATCH | /api/v1/policies/{id} | Update a policy. |
| DELETE | /api/v1/policies/{id} | Delete a policy. |
| POST | /api/v1/policies/{id}/dry-run | Test a policy (dry run). |
| POST | /api/v1/policies/{id}/activate | Activate a draft policy. |
| POST | /api/v1/policies/{id}/deactivate | Deactivate a policy. |

SDK Integration

No SDK changes are needed. Policies are evaluated server-side on every authorize() call automatically.

When a policy denies, the SDK raises AiraError with code POLICY_DENIED. When a policy requires approval, the authorize() call returns normally with status="pending_approval".

Python

from aira import Aira, AiraError

aira = Aira(api_key="aira_live_...")


# send_wire and queue stand in for your own payment function and review queue.
def pay_vendor():
    try:
        auth = aira.authorize(
            action_type="wire_transfer",
            details="Send 75,000 EUR to vendor X",
            agent_id="payments-agent",
        )
    except AiraError as e:
        if e.code == "POLICY_DENIED":
            # A policy denied the action: do not execute it.
            print(f"Blocked by policy: {e.message}")
            return
        raise

    if auth.status == "authorized":
        # Execute the action, then notarize
        result = send_wire(75000, to="vendor")
        aira.notarize(
            action_uuid=auth.action_uuid,
            outcome="completed",
            outcome_details=f"Sent. ref={result.id}",
        )
    elif auth.status == "pending_approval":
        # Policy held this for human review
        queue.enqueue(auth.action_uuid)

TypeScript

import { Aira, AiraError } from "aira-sdk";

const aira = new Aira({ apiKey: "aira_live_..." });

// sendWire and queue stand in for your own payment function and review queue.
async function payVendor(): Promise<void> {
  try {
    const auth = await aira.authorize({
      actionType: "wire_transfer",
      details: "Send 75,000 EUR to vendor X",
      agentId: "payments-agent",
    });

    if (auth.status === "authorized") {
      // Execute the action, then notarize
      const result = await sendWire(75000, "vendor");
      await aira.notarize({
        actionId: auth.action_uuid,
        outcome: "completed",
        outcomeDetails: `Sent. ref=${result.id}`,
      });
    } else if (auth.status === "pending_approval") {
      // Policy held this for human review
      await queue.enqueue(auth.action_uuid);
    }
  } catch (e) {
    if (e instanceof AiraError && e.code === "POLICY_DENIED") {
      // A policy denied the action: do not execute it.
      console.error(`Blocked: ${e.message}`);
      return;
    }
    throw e;
  }
}

The policy decision is reflected in the action's status on the Authorization response. If a policy raised a warning (for example, held the action for approval), the message appears in the warnings array:

{
  "action_uuid": "act_01J9B...",
  "status": "pending_approval",
  "created_at": "2026-04-07T14:30:00.000Z",
  "request_id": "req_01JAB...",
  "warnings": [
    "Action held for approval by policy 'High-Value Transfer Gate'."
  ]
}

A denial is surfaced as a 403 POLICY_DENIED error with the offending policy ID in details.policy_uuid. Full evaluation context (reasoning, confidence, model outputs) remains available in the dashboard audit trail.
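
If you call the HTTP API directly rather than through the SDK, the two response shapes above can be handled along these lines. This is a sketch under assumptions: `describe_authorization` is a hypothetical helper, and the error body layout (a top-level `code` next to `details.policy_uuid`) is inferred from the description above rather than confirmed by it.

```python
from typing import Any, Dict


def describe_authorization(status_code: int, body: Dict[str, Any]) -> str:
    """Summarize an authorize() HTTP response as a short message."""
    # Assumption: the 403 body carries `code` and `details.policy_uuid`.
    if status_code == 403 and body.get("code") == "POLICY_DENIED":
        policy = body.get("details", {}).get("policy_uuid", "unknown")
        return f"denied by policy {policy}"
    if body.get("status") == "pending_approval":
        warnings = body.get("warnings") or []
        reason = "; ".join(warnings) or "no warning message"
        return f"held for approval: {reason}"
    if body.get("status") == "authorized":
        return "authorized"
    return f"unexpected response ({status_code})"


# Placeholder identifiers, used only for this example.
held = {
    "action_uuid": "act_example",
    "status": "pending_approval",
    "warnings": ["Action held for approval by policy 'High-Value Transfer Gate'."],
}
print(describe_authorization(200, held))
```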
