Performance

The short answer

Aira adds ~95μs of in-process overhead per action (policy evaluation + Ed25519 receipt signing). On a real API call over the network, total authorize latency is ~90ms — dominated by TLS + DB round trip, not crypto.

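For comparison: a typical LLM call (Claude, GPT, Gemini) takes 1-5 seconds, so a ~90ms authorize round trip is a small fraction of the end-to-end time (roughly 2-9%), and the ~95μs of in-process work is negligible.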

Cryptographic primitives

Measured on a single core. These are the same primitives the Aira backend uses to mint receipts, evaluate policies, and build Merkle trees.

Operation	p50	Throughput	What it does
Policy evaluation (3 rules)	0.5μs	2M ops/s	Evaluate conditions against action context
SHA-256 hash	0.5μs	1.85M ops/s	Fingerprint the action payload
Canonical JSON	2.3μs	429K ops/s	Deterministic serialization for signing
Content scan (6 patterns)	5.8μs	171K ops/s	Regex PII/credential detection
Full receipt mint	95μs	10.5K ops/s	Policy eval + SHA-256 + Ed25519 sign
Ed25519 signature	95μs	10.6K ops/s	Receipt signing
Ed25519 verify	204μs	4.9K ops/s	Receipt verification
Merkle root (100 receipts)	78μs	12.8K ops/s	Settlement/bundle commitment
Merkle root (1,000 receipts)	904μs	1.1K ops/s	Large settlement batch
Drift divergence (KL)	1.1μs	889K ops/s	Behavioral drift detection

These numbers are in-process CPU time only: no network, no database. Run the benchmark yourself with `python benchmark.py` in the examples repo.
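To show how p50 figures like these can be collected, here is a minimal micro-benchmark sketch that uses only the Python standard library. It is not the contents of `benchmark.py`. The function names are hypothetical, and both the canonical JSON scheme (sorted keys, compact separators) and the Merkle convention (pairwise SHA-256, odd node paired with itself) are assumptions rather than Aira's actual formats. Ed25519 is left out because it needs the third-party `cryptography` package.

```python
import hashlib
import json
import statistics
import time


def canonical_json(obj) -> bytes:
    """Deterministic serialization: sorted keys, compact separators (assumed scheme)."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 fingerprint of a byte payload."""
    return hashlib.sha256(data).hexdigest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Pairwise SHA-256 tree; an odd node is paired with itself (assumed convention)."""
    if not leaves:
        return hashlib.sha256(b"").digest()
    level = [hashlib.sha256(leaf).digest() for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


def bench(fn, iterations: int = 2000) -> float:
    """Return the p50 latency of fn() in microseconds."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter_ns()
        fn()
        samples.append((time.perf_counter_ns() - start) / 1000)
    return statistics.median(samples)


if __name__ == "__main__":
    action = {"action_type": "benchmark", "details": "test"}
    payload = canonical_json(action)
    receipts = [f"receipt-{i}".encode() for i in range(100)]
    for name, fn in [
        ("canonical JSON", lambda: canonical_json(action)),
        ("SHA-256", lambda: sha256_hex(payload)),
        ("Merkle root (100)", lambda: merkle_root(receipts)),
    ]:
        print(f"{name}: p50 {bench(fn):.1f} us")
```

Absolute numbers from a pure-Python loop like this include interpreter and timer overhead, so they are best used to compare machines or runs rather than to reproduce the table exactly.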

API latency (real-world)

Measured against the production cloud API (api.airaproof.com, EU/Frankfurt):

Endpoint	p50	What it includes
`POST /actions` (authorize)	~90ms	TLS + auth + policy eval + DB write + receipt
`GET /health`	~80ms	TLS + DB health check
`POST /actions/{id}/notarize`	~100ms	TLS + auth + Ed25519 sign + RFC 3161 timestamp

These numbers include network latency from the client. From within the same datacenter (e.g., your agent and Aira on the same network), expect ~10-20ms.

Gateway overhead

The Aira Gateway is a transparent LLM proxy. It adds:

Phase	Overhead	What
Pre-flight (authorize)	~5ms	Policy evaluation + action creation
Post-flight (notarize)	~10ms	Receipt minting + signature
Total added latency	~15ms	On top of the upstream LLM call

A typical LLM call takes 1-5 seconds, so ~15ms of added latency is roughly 0.3-1.5% overhead.
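The relative overhead depends on how long the upstream call takes. As a worked check using the gateway figures from the table (5ms pre-flight plus 10ms post-flight), the following sketch computes the percentage for a few call durations. The function name is illustrative.

```python
def overhead_percent(added_ms: float, upstream_s: float) -> float:
    """Added latency as a percentage of the upstream call duration."""
    return added_ms / (upstream_s * 1000) * 100


GATEWAY_OVERHEAD_MS = 5 + 10  # pre-flight + post-flight, from the table above

for upstream_s in (1, 2, 5):
    pct = overhead_percent(GATEWAY_OVERHEAD_MS, upstream_s)
    print(f"{upstream_s}s call: {pct:.2f}% overhead")
```

For a 1-second call this gives 1.50%, and for a 5-second call 0.30%.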

Concurrent capacity

On a single 4-vCPU server (the minimum spec for self-hosted):

Metric	Value
Concurrent authorize requests	~200/s
Concurrent receipt verifications	~100/s
Concurrent gateway proxies	Limited by upstream LLM rate limits
WebSocket / SSE streams	~500 concurrent

The cloud deployment runs 4 uvicorn workers with async I/O — most of the time is spent waiting on DB and external APIs, not CPU.
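To illustrate why I/O-bound work scales well under async I/O, here is a small sketch in which each simulated request only waits (standing in for a database round trip). It is not Aira's server code. `handle_request` and `run_batch` are hypothetical names, and the 10ms wait is an arbitrary placeholder.

```python
import asyncio
import time


async def handle_request(db_wait_s: float) -> None:
    """Stand-in for one request: the sleep simulates waiting on the database."""
    await asyncio.sleep(db_wait_s)


async def run_batch(n_requests: int, db_wait_s: float) -> float:
    """Run n_requests concurrently and return the wall-clock time in seconds."""
    start = time.perf_counter()
    await asyncio.gather(*(handle_request(db_wait_s) for _ in range(n_requests)))
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = asyncio.run(run_batch(n_requests=200, db_wait_s=0.01))
    print(f"200 requests with 10 ms of simulated I/O each: {elapsed * 1000:.0f} ms total")
```

Because the waits overlap on a single event loop, the batch finishes in far less than the 2 seconds that 200 sequential 10ms waits would take. Real requests also spend some CPU time, which does not overlap within one worker.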

What determines your actual throughput

Database — PostgreSQL handles the action/receipt writes. With connection pooling (asyncpg), this is rarely the bottleneck.
AI provider rate limits — Cases and gateway calls are limited by OpenAI/Anthropic/Google rate limits, not Aira.
Network — Client-to-Aira latency. Co-locate your agents with Aira for best results.
Memory — The spaCy NER model for content scanning uses ~200MB. The rest of the API uses minimal memory.

Self-hosted performance

Self-hosted deployments have the same performance characteristics as cloud. The recommended spec (8 GB RAM, 4 vCPUs) handles typical enterprise workloads with room to spare.

For high-throughput use cases (1,000+ actions/minute), increase to 8 vCPUs and add a connection pooler (PgBouncer) in front of PostgreSQL.
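As a rough sizing aid, Little's law relates arrival rate and per-request latency to the average number of requests in flight, which gives a lower bound on how many database connections are busy at once. The sketch below plugs in figures quoted on this page (1,000 actions/minute, ~20ms same-datacenter and ~90ms over-the-network latency). The 10,000/minute row is a hypothetical load added for comparison, and the function name is illustrative.

```python
def in_flight(actions_per_minute: float, latency_ms: float) -> float:
    """Little's law: average in-flight requests = arrival rate x time in system."""
    return (actions_per_minute / 60) * (latency_ms / 1000)


for rate in (1_000, 10_000):
    for latency_ms in (20, 90):
        avg = in_flight(rate, latency_ms)
        print(f"{rate}/min at {latency_ms} ms: {avg:.1f} requests in flight on average")
```

This is an average under steady load. Bursts need headroom on top of it, which is where a pooler helps.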

Run your own benchmarks

# Crypto primitives (no server needed)
pip install cryptography
python scripts/benchmark.py

# API latency (requires running Aira)
curl -w "total: %{time_total}s\n" -X POST https://your-domain/api/v1/actions \
  -H "Authorization: Bearer aira_live_..." \
  -H "Content-Type: application/json" \
  -d '{"action_type":"benchmark","details":"test"}'
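A single curl run gives one sample. To get a p50 over several requests, a sketch like the following sends the same request repeatedly using only the Python standard library. The URL and API key are the same placeholders as in the curl command and must be replaced. `post_action` and `measure_p50_ms` are hypothetical helper names, not part of any Aira SDK.

```python
import json
import statistics
import time
import urllib.request

# Placeholders: substitute your own deployment URL and API key.
URL = "https://your-domain/api/v1/actions"
API_KEY = "aira_live_..."


def post_action() -> None:
    """Send the same request as the curl command above."""
    body = json.dumps({"action_type": "benchmark", "details": "test"}).encode("utf-8")
    request = urllib.request.Request(
        URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        response.read()


def measure_p50_ms(send, samples: int = 20) -> float:
    """Call send() repeatedly and return the median wall-clock latency in ms."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        send()
        timings.append((time.perf_counter() - start) * 1000)
    return statistics.median(timings)


# Example, once URL and API_KEY point at a running deployment:
# print(f"p50: {measure_p50_ms(post_action):.1f} ms")
```

`urllib.request` opens a new connection for every call, so each sample includes a fresh TLS handshake, as with a one-off curl invocation. A client that reuses connections would typically report lower numbers.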
