
Performance

Benchmarks and latency numbers — what Aira adds to your agent stack.

The short answer

Aira adds ~95μs of in-process overhead per action (policy evaluation + Ed25519 receipt signing). On a real API call over the network, total authorize latency is ~90ms — dominated by TLS + DB round trip, not crypto.

For comparison: a typical LLM call (Claude, GPT, Gemini) takes 1-5 seconds, so a ~90ms authorize call adds roughly 2-9% on top of it, and the ~95μs of in-process work is negligible.

Cryptographic primitives

Measured on a single core. These are the same primitives the Aira backend uses to mint receipts, evaluate policies, and build Merkle trees.

| Operation | p50 | Throughput | What it does |
|---|---|---|---|
| Policy evaluation (3 rules) | 0.5μs | 2M ops/s | Evaluate conditions against action context |
| SHA-256 hash | 0.5μs | 1.85M ops/s | Fingerprint the action payload |
| Canonical JSON | 2.3μs | 429K ops/s | Deterministic serialization for signing |
| Content scan (6 patterns) | 5.8μs | 171K ops/s | Regex PII/credential detection |
| Full receipt mint | 95μs | 10.5K ops/s | Policy eval + SHA-256 + Ed25519 sign |
| Ed25519 signature | 95μs | 10.6K ops/s | Receipt signing |
| Ed25519 verify | 204μs | 4.9K ops/s | Receipt verification |
| Merkle root (100 receipts) | 78μs | 12.8K ops/s | Settlement/bundle commitment |
| Merkle root (1,000 receipts) | 904μs | 1.1K ops/s | Large settlement batch |
| Drift divergence (KL) | 1.1μs | 889K ops/s | Behavioral drift detection |

These numbers are in-process CPU time only — no network, no database. Run the benchmark yourself: python scripts/benchmark.py in the Python SDK repo.
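
To make the table more concrete, here is a minimal sketch of three of these primitives (canonical JSON, a SHA-256 fingerprint, and a Merkle root) with a small p50 timer, using only the Python standard library. It is not the SDK's scripts/benchmark.py. The function names, the sorted-keys canonical form, and the duplicate-last-node rule for odd Merkle levels are assumptions made for illustration, and Aira's actual encoding may differ.

```python
import hashlib
import json
import statistics
import time


def canonical_json(obj) -> bytes:
    # Assumed canonical form: sorted keys, no extra whitespace, UTF-8 bytes.
    return json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def fingerprint(obj) -> str:
    # SHA-256 over the canonical bytes, so key order does not change the hash.
    return hashlib.sha256(canonical_json(obj)).hexdigest()


def merkle_root(leaves: list) -> bytes:
    # Hash each leaf, then hash pairs level by level until one node remains.
    # Assumption: an odd level duplicates its last node before pairing.
    if not leaves:
        return hashlib.sha256(b"").digest()
    level = [hashlib.sha256(leaf).digest() for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


def p50_us(fn, iterations: int = 2000) -> float:
    # Median wall-clock time of fn() in microseconds.
    samples = []
    for _ in range(iterations):
        start = time.perf_counter_ns()
        fn()
        samples.append((time.perf_counter_ns() - start) / 1000)
    return statistics.median(samples)


if __name__ == "__main__":
    action = {"action_type": "benchmark", "details": "test"}
    receipts = [canonical_json({"id": i, **action}) for i in range(100)]
    print(f"canonical JSON: {p50_us(lambda: canonical_json(action)):.2f}us")
    print(f"fingerprint:    {p50_us(lambda: fingerprint(action)):.2f}us")
    print(f"merkle (100):   {p50_us(lambda: merkle_root(receipts)):.2f}us")
```

Pure-Python timings will generally be slower than the figures in the table, so treat the output as a relative comparison between operations rather than a reproduction of the published numbers.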

API latency (real-world)

Measured against the production cloud API (api.airaproof.com, EU/Frankfurt):

| Endpoint | p50 | What it includes |
|---|---|---|
| POST /actions (authorize) | ~90ms | TLS + auth + policy eval + DB write + receipt |
| GET /health | ~80ms | TLS + DB health check |
| POST /actions/{id}/notarize | ~100ms | TLS + auth + Ed25519 sign + RFC 3161 timestamp |

These numbers include network latency from the client. From within the same datacenter (e.g., your agent and Aira on the same network), expect ~10-20ms.

Gateway overhead

The Aira Gateway is a transparent LLM proxy. It adds:

| Phase | Overhead | What |
|---|---|---|
| Pre-flight (authorize) | ~5ms | Policy evaluation + action creation |
| Post-flight (notarize) | ~10ms | Receipt minting + signature |
| Total added latency | ~15ms | On top of the upstream LLM call |

A typical LLM call takes 1-5 seconds. Aira adds ~15ms, which is about 1.5% of a 1-second call and 0.3% of a 5-second call.
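
The relative overhead depends on how long the upstream call takes. As a small worked calculation using the ~15ms total from the table (the function and constant names are illustrative only):

```python
def overhead_percent(added_ms: float, llm_call_ms: float) -> float:
    """Added latency as a percentage of the upstream LLM call duration."""
    return 100.0 * added_ms / llm_call_ms


# Pre-flight (~5ms) + post-flight (~10ms), taken from the gateway table.
GATEWAY_OVERHEAD_MS = 15.0

for llm_seconds in (1, 2, 5):
    pct = overhead_percent(GATEWAY_OVERHEAD_MS, llm_seconds * 1000)
    print(f"{llm_seconds}s LLM call: {pct:.2f}% overhead")

# 1s LLM call: 1.50% overhead
# 2s LLM call: 0.75% overhead
# 5s LLM call: 0.30% overhead
```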

Concurrent capacity

On a single 4-vCPU server (the minimum spec for self-hosted):

| Metric | Value |
|---|---|
| Concurrent authorize requests | ~200/s |
| Concurrent receipt verifications | ~100/s |
| Concurrent gateway proxies | Limited by upstream LLM rate limits |
| WebSocket / SSE streams | ~500 concurrent |

The cloud deployment runs 4 uvicorn workers with async I/O — most of the time is spent waiting on DB and external APIs, not CPU.
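
To illustrate why async I/O helps when requests mostly wait rather than compute, here is a toy simulation. It is not Aira's server code: `handle_authorize` is a hypothetical stand-in that replaces the DB write with `asyncio.sleep`, and the 20ms wait is an arbitrary assumption.

```python
import asyncio
import time


async def handle_authorize(io_wait_s: float = 0.02) -> str:
    # Stand-in for a DB write or external API call. While this coroutine
    # waits, the event loop is free to run other requests.
    await asyncio.sleep(io_wait_s)
    return "authorized"


async def run_batch(n_requests: int, io_wait_s: float = 0.02) -> float:
    # Run all requests concurrently on one event loop and return elapsed seconds.
    start = time.perf_counter()
    await asyncio.gather(*(handle_authorize(io_wait_s) for _ in range(n_requests)))
    return time.perf_counter() - start


if __name__ == "__main__":
    n, wait = 200, 0.02
    elapsed = asyncio.run(run_batch(n, wait))
    print(f"{n} simulated requests took {elapsed * 1000:.0f}ms")
    print(f"handling them one at a time would take ~{n * wait * 1000:.0f}ms")
```

Because the waits overlap, the batch finishes in roughly the time of a single wait. A real server is additionally bounded by connection pool size and database throughput, which this sketch does not model.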

What determines your actual throughput

  1. Database — PostgreSQL handles the action/receipt writes. With connection pooling (asyncpg), this is rarely the bottleneck.
  2. AI provider rate limits — Cases and gateway calls are limited by OpenAI/Anthropic/Google rate limits, not Aira.
  3. Network — Client-to-Aira latency. Co-locate your agents with Aira for best results.
  4. Memory — The spaCy NER model for content scanning uses ~200MB. The rest of the API uses minimal memory.

Self-hosted performance

Self-hosted deployments have the same performance characteristics as cloud. The recommended spec (8 GB RAM, 4 vCPUs) handles typical enterprise workloads with room to spare.

For high-throughput use cases (1,000+ actions/minute), increase to 8 vCPUs and add a connection pooler (PgBouncer) in front of PostgreSQL.

Run your own benchmarks

```bash
# Crypto primitives (no server needed)
pip install cryptography
python scripts/benchmark.py

# API latency (requires running Aira)
curl -w "total: %{time_total}s\n" -X POST https://your-domain/api/v1/actions \
  -H "Authorization: Bearer aira_live_..." \
  -H "Content-Type: application/json" \
  -d '{"action_type":"benchmark","details":"test"}'
```
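
A single curl run gives one sample. To get a p50 comparable to the tables on this page, you can repeat the request and take the median. The following is a minimal sketch using only the standard library: `time_call` and `post_action` are hypothetical helper names, and the URL and key are the same placeholders as in the curl command.

```python
import statistics
import time
import urllib.request


def time_call(send, runs: int = 20) -> dict:
    # Call send() repeatedly and summarize wall-clock latency in milliseconds.
    samples_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        send()
        samples_ms.append((time.perf_counter() - start) * 1000)
    return {
        "p50_ms": statistics.median(samples_ms),
        "min_ms": min(samples_ms),
        "max_ms": max(samples_ms),
    }


def post_action(base_url: str, api_key: str) -> None:
    # Same request as the curl example. Each call opens a new connection,
    # so the measured time includes the TLS handshake.
    req = urllib.request.Request(
        f"{base_url}/api/v1/actions",
        data=b'{"action_type":"benchmark","details":"test"}',
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        resp.read()


# Usage (substitute your own domain and key):
# stats = time_call(lambda: post_action("https://your-domain", "aira_live_..."))
# print(stats)
```

Note that every run creates a real action on the server, so point this at a test environment rather than production data.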
