Context
SentryAgent.ai has completed four phases: Phase 1 (MVP — core agent registry, OAuth 2.0, audit log), Phase 2 (Production-Ready — Vault, 4 language SDKs, OPA, React dashboard, Prometheus, Terraform multi-region), Phase 3 (Enterprise — multi-tenancy, W3C DIDs, OIDC, AGNTCY federation, webhooks, SOC 2 controls), and Phase 4 (Developer Growth — production hardening, developer portal, CLI, agent marketplace, GitHub Actions, Stripe billing). The product is technically complete, commercially launched, and has an active developer community.
Phase 5 operates on a stable, proven foundation. Every new workstream is additive — no existing service is refactored, only extended. The architecture constraint that governs Phase 5 is: any new service MUST follow the existing DI pattern (constructor injection of typed interfaces), MUST emit Prometheus metrics, and MUST be covered by the existing QA gate (>80% coverage, OpenAPI spec-first).
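The DI constraint above can be illustrated with a minimal sketch. The interface and class names here (Counter, DelegationStore, DelegationStatsService) are hypothetical and not taken from the codebase; they only show constructor injection of typed interfaces with a metric emitted on each call.

```typescript
// Hypothetical interfaces: stand-ins for the project's real typed dependencies.
interface Counter {
  inc(labels: Record<string, string>): void;
}

interface DelegationStore {
  countActive(tenantId: string): Promise<number>;
}

class DelegationStatsService {
  private readonly store: DelegationStore;
  private readonly requests: Counter;

  // Dependencies arrive through the constructor as interfaces, so tests can
  // pass fakes and production wiring can pass PostgreSQL and Prometheus clients.
  constructor(store: DelegationStore, requests: Counter) {
    this.store = store;
    this.requests = requests;
  }

  activeDelegations(tenantId: string): Promise<number> {
    // The metric is emitted on every call, before the store is queried.
    this.requests.inc({ operation: "active_delegations" });
    return this.store.countActive(tenantId);
  }
}

// In-memory fakes used in place of the real database and metrics registry.
const recordedLabels: Record<string, string>[] = [];
const fakeCounter: Counter = {
  inc: (labels) => {
    recordedLabels.push(labels);
  },
};
const fakeStore: DelegationStore = {
  countActive: (tenantId) => Promise.resolve(tenantId === "tenant-1" ? 3 : 0),
};

const statsService = new DelegationStatsService(fakeStore, fakeCounter);
```

Because the service only sees interfaces, the same class can be exercised in unit tests toward the >80% coverage gate without a database or a metrics registry.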
Goals / Non-Goals
Goals:
- Complete language SDK parity — Rust is the only major language missing a first-party SDK
- Introduce A2A delegation as a first-class authorization primitive aligned with AGNTCY multi-agent workflows
- Give paying tenants visibility into their own usage patterns through analytics
- Expose multi-tier rate limits as a self-service commercial lever
- Eliminate DX friction for new developers — scaffold generation reduces time-to-first-request to under 5 minutes
- Certify AGNTCY compliance formally — this is a competitive moat
Non-Goals:
- Real-time WebSocket-based analytics streaming (batch/polling is acceptable for MVP analytics)
- Full marketplace monetization (agent listings with pricing): the marketplace stays discovery-only, with no transactions
- Native mobile SDK (iOS/Android)
- GraphQL API surface
- Webhook delivery for analytics events (Phase 6 if needed)
Decisions
ADR-1: Rust SDK is a standalone Cargo crate in sdk-rust/ with no code generation
Decision: The Rust SDK is a hand-authored Cargo crate at sdk-rust/, not generated from the OpenAPI spec using openapi-generator.
Rationale: Code generation produces idiomatically poor Rust — openapi-generator's Rust output does not use async/await idiomatically, does not produce proper thiserror-based error types, and generates unwrap() calls in critical paths. Hand-authored code ensures idiomatic Rust: async/await throughout, Arc<Mutex<TokenCache>> for thread-safe token caching, Result<T, AgentIdPError> for every fallible operation, and zero unwrap() in library code. The SDK API surface mirrors the Go SDK pattern (the most recently authored, cleanest SDK) to minimize cognitive load for polyglot teams.
Alternatives considered: openapi-generator with its rust generator (openapi-generator generate -g rust): produces non-idiomatic output, requires post-processing, and is hard to maintain. progenitor (Oxide): excellent output but requires forking Oxide's toolchain and adds a complex build dependency.
ADR-2: A2A delegation chains are stored in PostgreSQL, verified cryptographically at request time
Decision: Delegation chains are stored as rows in a delegation_chains table. Each row captures: delegator agent ID, delegatee agent ID, granted scopes, expiry, and a cryptographic signature over the delegation payload using the delegator's credential secret. Verification at POST /oauth2/token/verify-delegation reconstructs and verifies the chain signature.
Rationale: Storing the full delegation chain in the database enables: (1) audit log entries with full chain context, (2) revocation of any link in a chain (invalidating all downstream delegations), and (3) analytics over delegation depth and patterns. Cryptographic signing at issuance means the database is the source of truth but is not trusted blindly — the chain is independently verifiable.
Alternatives considered: JWT-encoded delegation claims only (no DB storage) — enables verification without a DB hit but prevents revocation and audit. Blockchain-anchored delegation — extreme overkill for MVP scale, operational complexity exceeds benefit.
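A minimal sketch of signing at issuance and verifying at request time. The ADR only says the payload is signed "using the delegator's credential secret"; HMAC-SHA256 via Node's crypto module is an assumption made here, as are the payload field names and the canonical serialization.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical payload shape, based on the columns listed in ADR-2.
interface DelegationPayload {
  delegatorId: string;
  delegateeId: string;
  scopes: string[];
  expiresAt: number; // Unix seconds
}

// Serialize with a fixed field order and sorted scopes so that the same
// delegation always produces the same bytes to sign.
function canonicalize(p: DelegationPayload): string {
  return JSON.stringify([p.delegatorId, p.delegateeId, p.scopes.slice().sort(), p.expiresAt]);
}

// Assumption: the signature is an HMAC-SHA256 keyed with the credential secret.
function signDelegation(p: DelegationPayload, secret: string): string {
  return createHmac("sha256", secret).update(canonicalize(p)).digest("hex");
}

function verifyDelegation(
  p: DelegationPayload,
  signature: string,
  secret: string,
  nowSeconds: number,
): boolean {
  const expected = Buffer.from(signDelegation(p, secret), "hex");
  const actual = Buffer.from(signature, "hex");
  // timingSafeEqual throws on a length mismatch, so compare lengths first.
  if (expected.length !== actual.length) return false;
  return timingSafeEqual(expected, actual) && nowSeconds < p.expiresAt;
}

const sampleDelegation: DelegationPayload = {
  delegatorId: "agent-a",
  delegateeId: "agent-b",
  scopes: ["agents:read"],
  expiresAt: 2000,
};
const sampleSignature = signDelegation(sampleDelegation, "demo-secret");
```

Recomputing the signature from the stored row is what lets the database act as the source of truth without being trusted blindly: a row edited after issuance no longer matches its signature.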
ADR-3: Analytics are computed from usage_events table using pre-aggregated daily summaries
Decision: The analytics endpoints (GET /analytics/usage-summary, GET /analytics/agent-activity, GET /analytics/token-trends) query a new analytics_daily_aggregates table that is populated by a nightly aggregation job (pg_cron or a Node.js cron via node-cron). Raw usage_events rows are not queried at API request time.
Rationale: The usage_events table is append-only and grows without bound. Scanning it for date-range analytics would produce full-table scans at production scale. Pre-aggregated daily summaries (tenant_id, agent_id, date, metric_type, count) enable O(days) queries regardless of event volume. The aggregation job runs at 00:05 UTC daily to aggregate the previous day's events.
Alternatives considered: Real-time aggregation using PostgreSQL window functions: acceptable at small scale, but degrades catastrophically at 10M+ events. TimescaleDB hypertables: an excellent solution, but TimescaleDB is a PostgreSQL extension that would have to be installed and operated on every database instance, an infrastructure dependency disproportionate to Phase 5 scope.
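The aggregation step can be sketched in memory as follows. This is an illustration only: the real job would run as SQL against usage_events, and the UsageEvent shape (including a per-row count and an ISO timestamp) and the metric names in the sample data are assumptions.

```typescript
// Hypothetical row shapes; column names follow the tuple given in ADR-3.
interface UsageEvent {
  tenantId: string;
  agentId: string;
  metricType: string;
  occurredAt: string; // ISO 8601 timestamp in UTC
  count: number;
}

interface DailyAggregate {
  tenantId: string;
  agentId: string;
  date: string; // YYYY-MM-DD
  metricType: string;
  count: number;
}

function aggregateKey(tenantId: string, agentId: string, date: string, metricType: string): string {
  return [tenantId, agentId, date, metricType].join("|");
}

// Sum the events of one UTC day into one row per (tenant, agent, date, metric).
function aggregateDay(events: UsageEvent[], date: string): DailyAggregate[] {
  const totals = new Map<string, DailyAggregate>();
  for (const e of events) {
    if (e.occurredAt.slice(0, 10) !== date) continue;
    const key = aggregateKey(e.tenantId, e.agentId, date, e.metricType);
    const row = totals.get(key);
    if (row) {
      row.count += e.count;
    } else {
      totals.set(key, { tenantId: e.tenantId, agentId: e.agentId, date, metricType: e.metricType, count: e.count });
    }
  }
  return Array.from(totals.values());
}

// Upsert semantics: a row for the same key is overwritten, never added to,
// so re-running the job for a date cannot double-count.
function upsertAggregates(table: Map<string, DailyAggregate>, rows: DailyAggregate[]): void {
  for (const r of rows) {
    table.set(aggregateKey(r.tenantId, r.agentId, r.date, r.metricType), r);
  }
}

const sampleEvents: UsageEvent[] = [
  { tenantId: "t1", agentId: "a1", metricType: "token_issued", occurredAt: "2024-05-01T10:00:00Z", count: 2 },
  { tenantId: "t1", agentId: "a1", metricType: "token_issued", occurredAt: "2024-05-01T23:59:00Z", count: 3 },
  { tenantId: "t1", agentId: "a1", metricType: "token_issued", occurredAt: "2024-05-02T00:01:00Z", count: 7 },
  { tenantId: "t1", agentId: "a2", metricType: "api_call", occurredAt: "2024-05-01T12:00:00Z", count: 1 },
];
const aggregateTable = new Map<string, DailyAggregate>();
upsertAggregates(aggregateTable, aggregateDay(sampleEvents, "2024-05-01"));
```

A date-range query then reads one row per day per key, which is where the O(days) bound in the rationale comes from.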
ADR-4: Multi-tier rate limits are enforced in a new TierRateLimiter middleware that reads tier from tenant_subscriptions
Decision: A new TierRateLimiter middleware replaces the flat rate limiter for authenticated routes. It reads the tenant's current tier (free | pro | enterprise) from a Redis-cached lookup of tenant_subscriptions and applies the tier-appropriate rate limit from a static tier definition map. The tier definition map is the single source of truth — also returned verbatim by GET /tiers.
Rationale: The existing RateLimiterRedis middleware applies a single flat limit across all tenants. Multi-tier enforcement requires per-tenant limit keys and per-tier limit configurations. rate-limiter-flexible already supports both: the key passed to consume() can be the tenant ID, and one limiter instance per tier (each with its own keyPrefix and points) carries that tier's configuration. Centralizing tier definitions in a static config (not a database table) avoids the complexity of dynamic tier management and keeps tier changes as code changes (reviewed, versioned, deployed).
Alternatives considered: API gateway (Kong, AWS API Gateway) for rate limiting — correct long-term architecture but adds operational complexity and cost beyond Phase 5 scope. Per-tenant custom limits stored in DB — too flexible, hard to reason about, no self-service model.
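A minimal sketch of a static tier map driving per-tenant limits. The numbers are placeholders, since the source does not state the real limits, and the in-memory fixed-window counter stands in for the Redis-backed limiter so that the example stays self-contained.

```typescript
type Tier = "free" | "pro" | "enterprise";

interface TierLimit {
  requestsPerMinute: number;
}

// Placeholder values: the real limits are not given in this document.
const TIER_LIMITS: Record<Tier, TierLimit> = {
  free: { requestsPerMinute: 60 },
  pro: { requestsPerMinute: 600 },
  enterprise: { requestsPerMinute: 6000 },
};

const WINDOW_MS = 60000;

// In-memory stand-in for the Redis-backed limiter: one counter per
// (tier, tenant) pair, reset at the start of every one-minute window.
class FixedWindowTierLimiter {
  private windows = new Map<string, { windowStart: number; used: number }>();

  allow(tenantId: string, tier: Tier, nowMs: number): boolean {
    const limit = TIER_LIMITS[tier].requestsPerMinute;
    const windowStart = Math.floor(nowMs / WINDOW_MS) * WINDOW_MS;
    const key = tier + ":" + tenantId;
    const current = this.windows.get(key);
    if (!current || current.windowStart !== windowStart) {
      this.windows.set(key, { windowStart, used: 1 });
      return true;
    }
    if (current.used >= limit) return false;
    current.used += 1;
    return true;
  }
}

const tierLimiter = new FixedWindowTierLimiter();
```

Because TIER_LIMITS is a plain constant, the same object could be serialized as the GET /tiers response, which keeps the enforced limits and the advertised limits from drifting apart.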
ADR-5: Scaffold generator produces a ZIP archive served from GET /sdk/scaffold/:agentId
Decision: ScaffoldService generates an in-memory ZIP archive (using archiver) containing language-specific starter files pre-populated with the agent's clientId and the API URL. The endpoint streams the ZIP directly from memory — no disk I/O, no S3.
Rationale: Scaffold generation is a low-frequency operation that is not latency-sensitive (developers use it once per new project). In-memory generation avoids disk I/O, eliminates cleanup complexity, and produces no persistent artifacts on the server. The archiver library supports in-memory streaming to an HTTP response via Node.js streams. Each scaffold is generated on demand and is not cached, because the agent's credentials could rotate between requests.
Alternatives considered: Pre-built scaffold templates on S3 with client ID injected at runtime — adds AWS dependency, complicates credential injection. GitHub template repositories — developer must authenticate with GitHub, adds friction. Static downloadable templates — not pre-wired with agent credentials, defeats the purpose.
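A minimal sketch of the file set a TypeScript scaffold might contain, built as an in-memory map of path to content. The ZIP step is omitted so that the example needs no third-party library; a real ScaffoldService would append these entries to an archive stream. The SENTRYAGENT_API_URL variable name and the starter file contents are hypothetical. The sketch pre-populates AGENT_CLIENT_ID as ADR-5 describes and leaves the secret as a placeholder.

```typescript
interface ScaffoldInput {
  agentId: string;
  clientId: string;
  apiUrl: string;
}

// Returns file path -> file content for a TypeScript starter project.
function buildTypeScriptScaffold(input: ScaffoldInput): Map<string, string> {
  const files = new Map<string, string>();

  files.set(
    "package.json",
    JSON.stringify({ name: "agent-" + input.agentId, version: "0.1.0", private: true }, null, 2) + "\n",
  );

  files.set(
    ".env.example",
    [
      "SENTRYAGENT_API_URL=" + input.apiUrl, // hypothetical variable name
      "AGENT_CLIENT_ID=" + input.clientId,
      "AGENT_CLIENT_SECRET=<your-client-secret>", // never filled in by the server
      "",
    ].join("\n"),
  );

  files.set(
    "index.ts",
    [
      "// Starter entry point: reads its configuration from the environment.",
      "const clientId = process.env.AGENT_CLIENT_ID;",
      "const apiUrl = process.env.SENTRYAGENT_API_URL;",
      "console.log(`Agent ${clientId} configured for ${apiUrl}`);",
      "",
    ].join("\n"),
  );

  files.set(
    "README.md",
    "# agent-" + input.agentId + "\n\nCopy .env.example to .env and add your client secret.\n",
  );

  return files;
}

const sampleScaffold = buildTypeScriptScaffold({
  agentId: "abc123",
  clientId: "client-123",
  apiUrl: "https://api.example.test",
});
```

Keeping generation a pure function of (agentId, clientId, language, apiUrl) fits the decision not to cache: every request rebuilds the files from the current agent record.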
ADR-6: AGNTCY compliance report is generated on demand from live system state, not cached
Decision: GET /agntcy/compliance-report queries live system state — registered agents, DID documents, OIDC configuration, federation policies, audit log retention settings — and generates a structured compliance report in real time. No pre-computed report cache.
Rationale: Compliance reports must reflect current system state. A cached report could misrepresent configuration that has changed since the last cache population. The compliance report endpoint is not on the critical path (it is used by compliance officers, not application code) — latency of 500–2000ms is acceptable. The report format is machine-readable JSON (with an optional PDF export hint for human-readable presentation).
Alternatives considered: Pre-generated nightly compliance reports stored in S3 — stale by definition, adds S3 dependency. Compliance report built into the monitoring stack (Grafana) — mixing compliance and observability concerns violates single responsibility.
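One possible shape for on-demand report generation, as a sketch. Only the agntcy_spec_version field is named in this document; the other field names, the check identifiers, the sample state, and the 90-day retention threshold are illustrative assumptions.

```typescript
interface ComplianceCheck {
  id: string;
  passed: boolean;
  detail: string;
}

interface ComplianceReport {
  agntcy_spec_version: string;
  generated_at: string;
  compliant: boolean;
  checks: ComplianceCheck[];
}

// Each check reads live state when it is invoked, so nothing is cached.
type CheckFn = () => ComplianceCheck;

function buildComplianceReport(specVersion: string, checks: CheckFn[], now: Date): ComplianceReport {
  const results = checks.map((run) => run());
  return {
    agntcy_spec_version: specVersion,
    generated_at: now.toISOString(),
    compliant: results.every((c) => c.passed),
    checks: results,
  };
}

// Hypothetical live state and checks, for illustration only.
const liveState = { registeredAgents: 12, agentsWithDid: 12, auditRetentionDays: 30 };
const sampleChecks: CheckFn[] = [
  () => ({
    id: "did-documents",
    passed: liveState.agentsWithDid === liveState.registeredAgents,
    detail: liveState.agentsWithDid + "/" + liveState.registeredAgents + " agents have a DID document",
  }),
  () => ({
    id: "audit-retention",
    passed: liveState.auditRetentionDays >= 90, // illustrative threshold
    detail: "retention is " + liveState.auditRetentionDays + " days",
  }),
];

const sampleReport = buildComplianceReport("example-version", sampleChecks, new Date(0));
```

Since the checks are plain functions over current state, a second call after a configuration change yields a different report with no invalidation logic.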
Component Architecture — How Phase 5 Extends Phase 4
┌────────────────────────────────────────────────────────────────────────────────┐
│                       SentryAgent.ai Platform - Phase 5                        │
│                                                                                │
│  ┌──────────────────────────────┐    ┌──────────────────────────────────────┐  │
│  │  Developer Portal (Next.js)  │    │  Web Dashboard (React 18)            │  │
│  │  ┌────────────────────────┐  │    │ ┌──────────────────────────────────┐ │  │
│  │  │ API Explorer           │  │    │ │ Analytics Tab (NEW - WS3)        │ │  │
│  │  │ (Elements v5 - WS5)    │  │    │ │ - Agent Activity Heatmap         │ │  │
│  │  ├────────────────────────┤  │    │ │ - Token Issuance Trends          │ │  │
│  │  │ Scaffold Download (WS5)│  │    │ │ - Rotation Frequency             │ │  │
│  │  └────────────────────────┘  │    │ └──────────────────────────────────┘ │  │
│  └──────────────────────────────┘    └──────────────────────────────────────┘  │
│                 │                                        │                     │
│                 └──────────────┬─────────────────────────┘                     │
│                                │ HTTPS                                         │
│  ┌─────────────────────────────▼────────────────────────────────────────────┐  │
│  │  Express API (Node.js / TypeScript)                                      │  │
│  │                                                                          │  │
│  │  ┌──────────────┐  ┌─────────────┐  ┌──────────────┐  ┌───────────────┐  │  │
│  │  │ Delegation   │  │ Analytics   │  │ Tiers &      │  │ Scaffold      │  │  │
│  │  │ Router (WS2) │  │ Router (WS3)│  │ Upgrade (WS4)│  │ Router (WS5)  │  │  │
│  │  └──────┬───────┘  └──────┬──────┘  └──────┬───────┘  └──────┬────────┘  │  │
│  │         │                 │                │                 │           │  │
│  │  ┌──────▼───────┐  ┌──────▼──────┐  ┌──────▼───────┐  ┌──────▼────────┐  │  │
│  │  │Delegation    │  │Analytics    │  │BillingService│  │Scaffold       │  │  │
│  │  │Service (WS2) │  │Service (WS3)│  │(extended WS4)│  │Service (WS5)  │  │  │
│  │  └──────┬───────┘  └──────┬──────┘  └──────────────┘  └──────┬────────┘  │  │
│  │         │                 │                                  │           │  │
│  │  ┌──────▼─────────────────▼──────────────────────────────────▼────────┐  │  │
│  │  │  TierRateLimiter Middleware (WS4)                                  │  │  │
│  │  │  Reads tenant tier from Redis → applies tier-specific limits       │  │  │
│  │  └────────────────────────────────────────────────────────────────────┘  │  │
│  │                                                                          │  │
│  │  ┌──────────────────┐  ┌──────────────────────────────────────────────┐  │  │
│  │  │ AGNTCY Routes    │  │ Existing Phase 1–4 Routes (unchanged)        │  │  │
│  │  │ (WS6)            │  │ /agents, /oauth2, /credentials, /audit,      │  │  │
│  │  │ /agntcy/         │  │ /marketplace, /billing, /health, /oidc, etc. │  │  │
│  │  │ compliance-report│  └──────────────────────────────────────────────┘  │  │
│  │  │ /agents/:id/     │                                                    │  │
│  │  │ agent-card       │                                                    │  │
│  │  └──────────────────┘                                                    │  │
│  └──────────────────────────────────────────────────────────────────────────┘  │
│                                       │                                        │
│               ┌───────────────────────┼──────────────────────┐                 │
│               │                       │                      │                 │
│  ┌────────────▼────────┐  ┌───────────▼──────────┐  ┌────────▼──────────────┐  │
│  │ PostgreSQL 14+      │  │ Redis 7+             │  │ External Services     │  │
│  │                     │  │                      │  │                       │  │
│  │ delegation_chains   │  │ tier_cache:{tenantId}│  │ Stripe (billing)      │  │
│  │ (WS2 - new)         │  │ delegation_cache:{id}│  │ HashiCorp Vault       │  │
│  │ analytics_daily_    │  │ analytics_cache:{k}  │  │ OPA Policy Engine     │  │
│  │ aggregates (WS3)    │  │                      │  │                       │  │
│  │ tenant_subscriptions│  │                      │  │                       │  │
│  │ usage_events        │  │                      │  │                       │  │
│  │ (Phase 4 - existing)│  │                      │  │                       │  │
│  └─────────────────────┘  └──────────────────────┘  └───────────────────────┘  │
│                                                                                │
│  ┌──────────────────────────────────────────────────────────────────────────┐  │
│  │  External SDKs & Tooling                                                 │  │
│  │  sdk-rust/ (WS1 - new)  │  cli/ (extended WS5)  │  AGNTCY Test Suite     │  │
│  └──────────────────────────────────────────────────────────────────────────┘  │
└────────────────────────────────────────────────────────────────────────────────┘
System-Level Data Flows
WS2: A2A Delegation Flow
Agent A (Delegator)                 SentryAgent.ai API               Agent B (Delegatee)
│                                   │                                │
│ POST /oauth2/token/delegate       │                                │
│ { agentId: A, delegateeId: B,     │                                │
│   scopes: [...], ttl: 3600 }      │                                │
│──────────────────────────────────>│                                │
│                                   │ 1. Authenticate Agent A        │
│                                   │ 2. Validate B exists           │
│                                   │ 3. Verify scopes ⊆ A's scopes  │
│                                   │ 4. Sign delegation payload     │
│                                   │    with A's credential         │
│                                   │ 5. INSERT delegation_chains    │
│                                   │ 6. Return delegation token     │
│ { delegationToken, chainId }      │                                │
│<──────────────────────────────────│                                │
│                                   │                                │
│ (out of band: share delegationToken with Agent B)                  │
│───────────────────────────────────────────────────────────────────>│
│                                   │                                │
│                                   │ POST /oauth2/token/verify-delegation
│                                   │ { delegationToken }            │
│                                   │<───────────────────────────────│
│                                   │ 1. Decode token                │
│                                   │ 2. Fetch chain from DB         │
│                                   │ 3. Verify signature            │
│                                   │ 4. Check expiry & revocation   │
│                                   │ 5. Return chain + scopes       │
│                                   │ { valid, scopes, chainId }     │
│                                   │───────────────────────────────>│
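The "check expiry & revocation" step, combined with the rule that revoking any link invalidates everything downstream, can be sketched as a walk from the presented delegation back to its root. The parentChainId field and the in-memory map are assumptions; the source does not describe how links reference each other.

```typescript
// Hypothetical link shape: each delegation may point at the delegation
// it was derived from (null when issued directly by the scope owner).
interface ChainLink {
  chainId: string;
  parentChainId: string | null;
  scopes: string[];
  expiresAt: number; // Unix seconds
  revoked: boolean;
}

// Returns the scopes usable through this chain, or null if any link on the
// path to the root is missing, revoked, or expired.
function effectiveScopes(
  links: Map<string, ChainLink>,
  chainId: string,
  nowSeconds: number,
): string[] | null {
  let scopes: string[] | null = null;
  let current: string | null = chainId;
  const visited = new Set<string>();

  while (current !== null) {
    if (visited.has(current)) return null; // defensive guard against cycles
    visited.add(current);

    const link: ChainLink | undefined = links.get(current);
    if (!link || link.revoked || link.expiresAt <= nowSeconds) return null;

    // A downstream link can never carry more than its ancestors granted.
    scopes = scopes === null
      ? link.scopes.slice()
      : scopes.filter((s) => link.scopes.indexOf(s) !== -1);
    current = link.parentChainId;
  }
  return scopes;
}

const chainLinks = new Map<string, ChainLink>();
chainLinks.set("c1", { chainId: "c1", parentChainId: null, scopes: ["agents:read", "agents:write"], expiresAt: 2000, revoked: false });
chainLinks.set("c2", { chainId: "c2", parentChainId: "c1", scopes: ["agents:read"], expiresAt: 1500, revoked: false });
```

Under these assumptions, revoking c1 makes effectiveScopes return null for c2 as well, with no need to update the downstream rows.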
WS3: Analytics Aggregation Flow
API Request (any route)        Middleware                      PostgreSQL
│                              │                               │
│ (every authenticated req)    │                               │
│─────────────────────────────>│                               │
│                              │ increment in-memory counter   │
│                              │ {tenantId, agentId, metric}   │
│                              │                               │
│                              │ (every 60s flush, Phase 4)    │
│                              │──────────────────────────────>│
│                              │ INSERT usage_events           │
│                              │                               │
                               │                               │
(00:05 UTC daily)              │                               │
node-cron job                  │                               │
──────────────────────────────>│                               │
                               │ aggregate usage_events        │
                               │ for previous day              │
                               │──────────────────────────────>│
                               │ INSERT analytics_daily_       │
                               │ aggregates (upsert)           │

GET /analytics/agent-activity  AnalyticsController             AnalyticsService
│                              │                               │
│─────────────────────────────>│                               │
│                              │ checkCache(Redis)             │
│                              │ (miss) → queryAggregates()    │
│                              │──────────────────────────────>│
│                              │ SELECT from                   │
│                              │ analytics_daily_aggregates    │
│                              │<──────────────────────────────│
│                              │ writeCache(Redis, 5min TTL)   │
│ { agents: [...heatmap] }     │                               │
│<─────────────────────────────│                               │
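The read path in the second diagram is a cache-aside lookup with a five-minute TTL. A minimal sketch, with an in-memory TtlCache standing in for Redis and an injected clock so the expiry is easy to follow; the class and function names are hypothetical.

```typescript
interface CacheEntry<T> {
  value: T;
  expiresAtMs: number;
}

// In-memory stand-in for the Redis cache used by the analytics endpoints.
class TtlCache<T> {
  private entries = new Map<string, CacheEntry<T>>();

  get(key: string, nowMs: number): T | undefined {
    const entry = this.entries.get(key);
    if (!entry || entry.expiresAtMs <= nowMs) return undefined;
    return entry.value;
  }

  set(key: string, value: T, ttlMs: number, nowMs: number): void {
    this.entries.set(key, { value, expiresAtMs: nowMs + ttlMs });
  }
}

const FIVE_MINUTES_MS = 5 * 60 * 1000;

// checkCache -> (miss) queryAggregates -> writeCache, as in the flow above.
function readThrough<T>(
  cache: TtlCache<T>,
  key: string,
  nowMs: number,
  queryAggregates: () => T,
): T {
  const hit = cache.get(key, nowMs);
  if (hit !== undefined) return hit;
  const fresh = queryAggregates();
  cache.set(key, fresh, FIVE_MINUTES_MS, nowMs);
  return fresh;
}

// Example wiring: count how often the aggregate query actually runs.
let aggregateQueryCount = 0;
const activityCache = new TtlCache<string[]>();
const loadActivity = (): string[] => {
  aggregateQueryCount += 1;
  return ["agent-1", "agent-2"];
};
```

Since the aggregates only change once a day, a short TTL mostly protects PostgreSQL from dashboard refreshes rather than guarding against stale data.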
WS5: Scaffold Generation Flow
Developer (CLI)                   SentryAgent.ai API          ScaffoldService
│                                 │                           │
│ sentryagent scaffold            │                           │
│   --agent-id abc123             │                           │
│   --language typescript         │                           │
│                                 │                           │
│ GET /sdk/scaffold/abc123        │                           │
│   ?language=typescript          │                           │
│────────────────────────────────>│                           │
│                                 │ authenticate request      │
│                                 │ fetch agent credentials   │
│                                 │──────────────────────────>│
│                                 │ generateScaffold(         │
│                                 │   agentId, clientId,      │
│                                 │   language, apiUrl)       │
│                                 │ build ZIP in-memory:      │
│                                 │   - package.json          │
│                                 │   - index.ts              │
│                                 │   - .env.example          │
│                                 │   - README.md             │
│                                 │<──────────────────────────│
│ (ZIP stream, Content-           │                           │
│  Disposition: attachment)       │                           │
│<────────────────────────────────│                           │
│                                 │                           │
│ unzip → ready-to-run project    │                           │
Risks / Trade-offs
- [Risk] Rust SDK compile times in CI. Mitigation: Use sccache in CI to cache compiled Rust dependencies. The SDK has minimal dependencies, so compile time is bounded.
- [Risk] A2A delegation scope creep. Mitigation: Delegated scopes are strictly a subset of the delegator's own scopes (enforced at issuance, not just verification). A delegatee cannot escalate privileges beyond what the delegator holds.
- [Risk] Analytics aggregation job failure leaves stale data. Mitigation: The aggregation job is idempotent (upsert on (tenant_id, agent_id, date, metric_type)). A failed job can be re-run for any date without producing duplicate data.
- [Risk] Scaffold ZIP includes clientId but not clientSecret. Mitigation: The scaffold .env.example includes AGENT_CLIENT_ID=<your-client-id> with a placeholder for AGENT_CLIENT_SECRET=<your-client-secret>. The secret is never returned by the scaffold endpoint; developers copy it from the credentials page once.
- [Risk] Elements (Swagger UI v5) breaking change in portal. Mitigation: Elements is a drop-in React component. The existing swagger-ui-react dependency is replaced, not wrapped. The /api-explorer page is isolated, so no other portal pages are affected.
- [Risk] AGNTCY compliance report reflects live state but AGNTCY spec may update. Mitigation: The report includes the AGNTCY spec version it was evaluated against (agntcy_spec_version field). Report consumers can detect when the evaluation is stale relative to a newer AGNTCY spec.
Migration Plan
- WS1 first (independent, no API changes): Build and publish the Rust SDK. No server-side migrations required.
- WS2 second (requires migration 008_add_delegation_chains.sql): Apply migration first, then deploy delegation endpoints. No breaking changes to existing endpoints.
- WS3 + WS4 in parallel (WS3 requires migration 009_add_analytics_aggregates.sql; WS4 requires no migration): Apply WS3 migration, deploy analytics endpoints, schedule nightly aggregation job. WS4 tier rate limiter deploys behind the TIER_RATE_LIMITING_ENABLED feature flag.
- WS5 (extends portal and CLI, independent deployments): Deploy portal with Elements upgrade. Publish updated CLI to npm with the scaffold command.
- WS6 last (reads live system state, no migrations): Deploy AGNTCY compliance endpoints. Run interoperability test suite in CI on every commit going forward.
Rollback strategy per workstream:
- WS1 (Rust SDK): Publishing to crates.io is permanent; a release is yanked if a critical bug is found. No server-side rollback needed.
- WS2 (A2A): Disable delegation routes via the A2A_ENABLED=false feature flag. The delegation_chains table is additive, so leaving it in place causes no harm.
- WS3 (Analytics): Disable analytics routes via ANALYTICS_ENABLED=false. The aggregation job is a cron; disable it in deployment config.
- WS4 (Tiers): Revert the TierRateLimiter middleware to the flat RateLimiterRedis middleware via TIER_RATE_LIMITING_ENABLED=false.
- WS5 (DX): Revert portal deploy to previous version. Publish CLI patch release removing scaffold command.
- WS6 (AGNTCY): Disable AGNTCY routes via the AGNTCY_ENABLED=false feature flag. No state changes; the endpoints are read-only.
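The flag-based rollbacks above could be wired roughly as follows. The flag names come from this plan; the parsing rule (a flag is on only when set to the exact string "true") and the helper names are assumptions.

```typescript
type Env = Record<string, string | undefined>;

type FlagName =
  | "A2A_ENABLED"
  | "ANALYTICS_ENABLED"
  | "TIER_RATE_LIMITING_ENABLED"
  | "AGNTCY_ENABLED";

// Assumption: anything other than the exact string "true" counts as
// disabled, so an unset or mistyped flag fails closed.
function isEnabled(env: Env, flag: FlagName): boolean {
  return env[flag] === "true";
}

// WS4 rollback: fall back to the flat limiter when the flag is off.
function selectRateLimiter(env: Env): "TierRateLimiter" | "RateLimiterRedis" {
  return isEnabled(env, "TIER_RATE_LIMITING_ENABLED") ? "TierRateLimiter" : "RateLimiterRedis";
}

// WS2, WS3, and WS6 rollback: only mount the route groups whose flag is on.
function enabledRouteGroups(env: Env): string[] {
  const groups: string[] = [];
  if (isEnabled(env, "A2A_ENABLED")) groups.push("delegation");
  if (isEnabled(env, "ANALYTICS_ENABLED")) groups.push("analytics");
  if (isEnabled(env, "AGNTCY_ENABLED")) groups.push("agntcy");
  return groups;
}

const rolledBackEnv: Env = {
  A2A_ENABLED: "false",
  ANALYTICS_ENABLED: "true",
  TIER_RATE_LIMITING_ENABLED: "false",
};
```

Passing the environment in as a parameter, rather than reading process.env inside each helper, keeps the rollback behavior testable without touching the real process environment.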