Performance
Benchmarks and latency numbers — what Aira adds to your agent stack.
The short answer
Aira adds ~95μs of in-process overhead per action (policy evaluation + Ed25519 receipt signing). On a real API call over the network, total authorize latency is ~90ms — dominated by TLS + DB round trip, not crypto.
For comparison, a typical LLM call (Claude, GPT, Gemini) takes 1-5 seconds, so Aira's overhead is small relative to the model call itself.
Cryptographic primitives
Measured on a single core. These are the same primitives the Aira backend uses to mint receipts, evaluate policies, and build Merkle trees.
| Operation | p50 | Throughput | What it does |
|---|---|---|---|
| Policy evaluation (3 rules) | 0.5μs | 2M ops/s | Evaluate conditions against action context |
| SHA-256 hash | 0.5μs | 1.85M ops/s | Fingerprint the action payload |
| Canonical JSON | 2.3μs | 429K ops/s | Deterministic serialization for signing |
| Content scan (6 patterns) | 5.8μs | 171K ops/s | Regex PII/credential detection |
| Full receipt mint | 95μs | 10.5K ops/s | Policy eval + SHA-256 + Ed25519 sign |
| Ed25519 signature | 95μs | 10.6K ops/s | Receipt signing |
| Ed25519 verify | 204μs | 4.9K ops/s | Receipt verification |
| Merkle root (100 receipts) | 78μs | 12.8K ops/s | Settlement/bundle commitment |
| Merkle root (1,000 receipts) | 904μs | 1.1K ops/s | Large settlement batch |
| Drift divergence (KL) | 1.1μs | 889K ops/s | Behavioral drift detection |
These numbers are in-process CPU time only: no network, no database. Run the benchmark yourself with `python scripts/benchmark.py` in the Python SDK repo.
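To get a feel for how such p50 figures are collected, here is a minimal standard-library sketch that times three of the primitives from the table: canonical JSON, SHA-256 fingerprinting, and a Merkle root over 100 receipts. It is not the SDK's `scripts/benchmark.py`. The function names, the canonical form (sorted keys, compact separators), and the Merkle construction (duplicating the last node on odd levels) are all assumptions made for illustration, and Aira's actual implementation may differ. Absolute numbers will also differ from the table depending on hardware and interpreter.

```python
import hashlib
import json
import statistics
import time


def canonical_json(obj) -> bytes:
    # Assumption: sorted keys + compact separators give a deterministic encoding.
    # Aira's real canonical form may follow a different scheme.
    return json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def fingerprint(payload: dict) -> str:
    # SHA-256 over the canonical bytes, so key order does not change the hash.
    return hashlib.sha256(canonical_json(payload)).hexdigest()


def merkle_root(leaves: list[bytes]) -> bytes:
    # Illustrative construction: hash each leaf, then hash pairs level by level.
    if not leaves:
        return hashlib.sha256(b"").digest()
    level = [hashlib.sha256(leaf).digest() for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


def p50_microseconds(fn, iterations: int = 2000) -> float:
    # Time each call individually and report the median in microseconds.
    samples = []
    for _ in range(iterations):
        start = time.perf_counter_ns()
        fn()
        samples.append((time.perf_counter_ns() - start) / 1000)
    return statistics.median(samples)


payload = {"action_type": "benchmark", "details": "test"}
receipts = [f"receipt-{i}".encode("utf-8") for i in range(100)]

print(f"canonical JSON:      {p50_microseconds(lambda: canonical_json(payload)):.2f} us")
print(f"SHA-256 fingerprint: {p50_microseconds(lambda: fingerprint(payload)):.2f} us")
print(f"Merkle root (100):   {p50_microseconds(lambda: merkle_root(receipts)):.2f} us")
```

Reporting the median rather than the mean keeps a single slow iteration (garbage collection, scheduler noise) from skewing the result, which is why p50 is the usual headline number for micro-benchmarks like these.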
API latency (real-world)
Measured against the production cloud API (api.airaproof.com, EU/Frankfurt):
| Endpoint | p50 | What it includes |
|---|---|---|
| POST /actions (authorize) | ~90ms | TLS + auth + policy eval + DB write + receipt |
| GET /health | ~80ms | TLS + DB health check |
| POST /actions/{id}/notarize | ~100ms | TLS + auth + Ed25519 sign + RFC 3161 timestamp |
These numbers include network latency from the client. From within the same datacenter (e.g., your agent and Aira on the same network), expect ~10-20ms.
Gateway overhead
The Aira Gateway is a transparent LLM proxy. It adds:
| Phase | Overhead | What |
|---|---|---|
| Pre-flight (authorize) | ~5ms | Policy evaluation + action creation |
| Post-flight (notarize) | ~10ms | Receipt minting + signature |
| Total added latency | ~15ms | On top of the upstream LLM call |
A typical LLM call takes 1-5 seconds, so ~15ms of added latency works out to roughly 0.3% to 1.5% overhead.
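The relative overhead is the added latency divided by the duration of the upstream call. A small sketch of that arithmetic, using the ~15ms total from the table and the 1-5 second range quoted above (the helper name `overhead_share` is illustrative, not part of any Aira API):

```python
def overhead_share(added_ms: float, llm_call_s: float) -> float:
    """Added latency as a percentage of the upstream LLM call duration."""
    return added_ms / (llm_call_s * 1000) * 100


GATEWAY_OVERHEAD_MS = 15  # "Total added latency" from the table above

for llm_call_s in (1, 2, 5):
    share = overhead_share(GATEWAY_OVERHEAD_MS, llm_call_s)
    print(f"{llm_call_s}s LLM call: {share:.2f}% added by the gateway")
```

For a 1-second call this gives 1.50%, for a 2-second call 0.75%, and for a 5-second call 0.30%, so the share sits slightly above 1% only at the fast end of the range.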
Concurrent capacity
On a single 4-vCPU server (the minimum spec for self-hosted):
| Metric | Value |
|---|---|
| Concurrent authorize requests | ~200/s |
| Concurrent receipt verifications | ~100/s |
| Concurrent gateway proxies | Limited by upstream LLM rate limits |
| WebSocket / SSE streams | ~500 concurrent |
The cloud deployment runs 4 uvicorn workers with async I/O — most of the time is spent waiting on DB and external APIs, not CPU.
What determines your actual throughput
- Database — PostgreSQL handles the action/receipt writes. With connection pooling (asyncpg), this is rarely the bottleneck.
- AI provider rate limits — Cases and gateway calls are limited by OpenAI/Anthropic/Google rate limits, not Aira.
- Network — Client-to-Aira latency. Co-locate your agents with Aira for best results.
- Memory — The spaCy NER model for content scanning uses ~200MB. The rest of the API uses minimal memory.
Self-hosted performance
Self-hosted deployments have the same performance characteristics as cloud. The recommended spec (8 GB RAM, 4 vCPUs) handles typical enterprise workloads with room to spare.
For high-throughput use cases (1,000+ actions/minute), increase to 8 vCPUs and add a connection pooler (PgBouncer) in front of PostgreSQL.
Run your own benchmarks
```bash
# Crypto primitives (no server needed)
pip install cryptography
python scripts/benchmark.py

# API latency (requires running Aira)
curl -w "total: %{time_total}s\n" -X POST https://your-domain/api/v1/actions \
  -H "Authorization: Bearer aira_live_..." \
  -H "Content-Type: application/json" \
  -d '{"action_type":"benchmark","details":"test"}'
```
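A single `curl` timing is noisy, so it can help to repeat the request and take the median. The sketch below does this with the Python standard library only. It reuses the endpoint path, headers, and body from the `curl` command above. The function names are hypothetical helpers, not part of the Aira SDK, and the base URL and API key are placeholders you supply yourself.

```python
import json
import statistics
import time
import urllib.request


def build_authorize_request(base_url: str, api_key: str) -> urllib.request.Request:
    # Same path, headers, and body as the curl example above.
    body = json.dumps({"action_type": "benchmark", "details": "test"}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/actions",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def send_once(request: urllib.request.Request) -> None:
    # Each call opens a fresh connection, so TLS setup is included in the
    # timing, much like a one-off curl invocation.
    with urllib.request.urlopen(request, timeout=10) as response:
        response.read()


def latency_summary(send, runs: int = 50) -> dict:
    # Time `send()` repeatedly and summarize wall-clock latency in milliseconds.
    samples_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        send()
        samples_ms.append((time.perf_counter() - start) * 1000)
    return {"p50": statistics.median(samples_ms), "max": max(samples_ms)}


# Example usage (performs real network calls, so it is left commented out):
# request = build_authorize_request("https://your-domain", "<your API key>")
# print(latency_summary(lambda: send_once(request)))
```

Because `latency_summary` accepts any callable, the same helper can time other endpoints or an SDK call. Note that each run creates a real action on the server, so point it at a test environment.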