- devops docs: 8 files updated for Phase 6 state; field-trial.md added (946-line runbook) - developer docs: api-reference (50+ endpoints), quick-start, 5 existing guides updated, 5 new guides added - engineering docs: all 12 files updated (services, architecture, SDK guide, testing, overview) - OpenSpec archives: phase-7-devops-field-trial, developer-docs-phase6-update, engineering-docs-phase6-update - VALIDATOR.md + scripts/start-validator.sh: V&V Architect tooling added - .gitignore: exclude session artifacts, build artifacts, and agent workspaces Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
10 KiB
WS2 — Architecture Documentation Updates
Target file: docs/engineering/02-architecture.md
Operation: Surgical replacements and additions to the existing document. Apply in the order listed below.
Change 1 — Replace Component Diagram
Location: Section ## 1. Component Diagram
Old text (entire Mermaid block and surrounding content — replace from \``mermaidthrough the closing````):
```mermaid
graph TD
Client["Client (AI Agent / Browser / CI)"]
Client -->|HTTPS| ExpressApp["Express App (AgentIdP)"]
subgraph ExpressApp["Express App — src/app.ts"]
Router["Router (src/routes/)"]
AuthMW["authMiddleware (src/middleware/auth.ts)"]
OpaMW["opaMiddleware (src/middleware/opa.ts)"]
Controller["Controller (src/controllers/)"]
Service["Service (src/services/)"]
Repository["Repository (src/repositories/)"]
Router --> AuthMW --> OpaMW --> Controller --> Service --> Repository
end
Repository -->|parameterized SQL| PG["PostgreSQL 14\n(agents, credentials, audit_events, token_revocations)"]
Service -->|Redis commands| Redis["Redis 7\n(token revocation list, monthly counts, rate-limit counters)"]
Service -->|KV v2 read/write| Vault["HashiCorp Vault\n(opt-in — when VAULT_ADDR is set)"]
ExpressApp -->|evaluate input| OPA["OPA Policy Engine\n(policies/authz.rego + data/scopes.json)"]
ExpressApp -->|expose| Metrics["/metrics (prom-client)"]
Dashboard["Dashboard SPA (React 18 + Vite 5)\ndashboard/dist/ served from /dashboard"]
Client -->|browser| Dashboard
Dashboard -->|REST API calls| ExpressApp
Grafana["Grafana (port 3001)"] -->|scrapes| Metrics
**New text (replace with the expanded diagram):**
graph TD
Client["Client (AI Agent / Browser / CI)"]
Client -->|HTTPS| ExpressApp["Express App (AgentIdP)"]
subgraph ExpressApp["Express App — src/app.ts"]
Router["Router (src/routes/)"]
AuthMW["authMiddleware (src/middleware/auth.ts)"]
TierMW["tierMiddleware (src/middleware/tier.ts)"]
OpaMW["opaMiddleware (src/middleware/opa.ts)"]
Controller["Controller (src/controllers/)"]
Service["Service (src/services/)"]
Repository["Repository (src/repositories/)"]
Router --> AuthMW --> TierMW --> OpaMW --> Controller --> Service --> Repository
end
Repository -->|parameterized SQL| PG["PostgreSQL 14\n(agents, credentials, audit_events,\nanalytics_events, organizations,\nfederation_partners, webhook_subscriptions,\nagent_did_keys, delegation_chains)"]
Service -->|Redis commands| Redis["Redis 7\n(token revocation list, daily tier counters,\nJWKS cache, compliance report cache,\nDID document cache)"]
Service -->|KV v2 read/write| Vault["HashiCorp Vault\n(opt-in — credentials, DID private keys,\nwebhook secrets — when VAULT_ADDR is set)"]
ExpressApp -->|evaluate input| OPA["OPA Policy Engine\n(policies/authz.rego + data/scopes.json)"]
ExpressApp -->|expose| Metrics["/metrics (prom-client)"]
ExpressApp -->|checkout session / webhooks| Stripe["Stripe\n(billing — when STRIPE_SECRET_KEY is set)"]
Dashboard["Dashboard SPA (React 18 + Vite 5)\ndashboard/dist/ served from /dashboard"]
Portal["Developer Portal (Next.js 14)\nportal/ — served separately on port 3002"]
Client -->|browser| Dashboard
Client -->|browser| Portal
Dashboard -->|REST API calls| ExpressApp
Portal -->|REST API calls| ExpressApp
Grafana["Grafana (port 3001)"] -->|scrapes| Metrics
OIDCProvider["OIDC Provider (oidc-provider v9)\nmounted at /oidc — A2A delegation tokens"]
ExpressApp --- OIDCProvider
---
## Change 2 — Add New Services to Section 2 (HTTP Request Lifecycle)
**Location:** Section `## 2. HTTP Request Lifecycle`
**Find the paragraph that starts with:**
- The service (
src/services/*.ts) executes all business logic — enforces free-tier limits, resolves domain rules, and calls repositories.
**Replace that single numbered item with:**
- The service (
src/services/*.ts) executes all business logic — enforces tier limits, resolves domain rules, and calls repositories. Phase 3–6 introduces specialised services:AnalyticsService(fire-and-forget event recording),TierService(enforces per-tier agent and call limits),ComplianceService(AGNTCY compliance reports, cached 5 min in Redis),FederationService(cross-IdP JWT verification with cached JWKS),DIDService(W3C DID document generation and caching),WebhookService(subscription management with Vault-backed HMAC secrets), andBillingService(Stripe Checkout and webhook processing). The service has no knowledge of HTTP.
---
## Change 3 — Add Tier Enforcement Middleware Description
**Location:** Section `## 2. HTTP Request Lifecycle`
**Find item 5:**
opaMiddleware(src/middleware/opa.ts) evaluates the OPA policy
**Insert a new item between item 4 (authMiddleware) and item 5 (opaMiddleware). Re-number subsequent items accordingly. The new item is:**
tierMiddleware(src/middleware/tier.ts) enforces per-tier daily API call limits. It reads the organisation's current tier fromTierService.fetchTier(orgId), checks the daily call counter from Redis keyrate:tier:calls:<orgId>againstTIER_CONFIG[tier].maxCallsPerDay, increments the counter on each passing request (fire-and-forgetINCRwith TTL set to next UTC midnight), and throwsTierLimitError(429) when the limit is reached. This middleware is applied only to API routes, not to/health,/metrics, or/dashboard.
Re-number the former item 5 (opaMiddleware) through the end of the list as 6 through 11 (adding one to each subsequent number).
---
## Change 4 — Add New Data Flows Section
**Location:** After the closing of `## 3. OAuth 2.0 Client Credentials Flow` and before `## 4. Multi-Region Deployment Topology`
**Insert the following new section:**
```markdown
---
## 3b. Analytics Event Capture Flow
Every successful token issuance writes a fire-and-forget analytics event:
```mermaid
sequenceDiagram
participant Controller as TokenController
participant OAuth2Svc as OAuth2Service
participant AnalyticsSvc as AnalyticsService
participant PG as PostgreSQL
Controller->>OAuth2Svc: issueToken(clientId, clientSecret, scope, ...)
OAuth2Svc->>OAuth2Svc: signToken() — RS256 JWT
OAuth2Svc-->>Controller: ITokenResponse
Note over OAuth2Svc,AnalyticsSvc: fire-and-forget (void)
OAuth2Svc-)AnalyticsSvc: recordEvent(tenantId, 'token_issued')
AnalyticsSvc-)PG: INSERT INTO analytics_events ... ON CONFLICT DO UPDATE count + 1
recordEvent uses PostgreSQL UPSERT — one row per (organization_id, date, metric_type). If the INSERT conflicts (same date, same org, same metric), the count column is incremented atomically. This keeps the table compact (one row per day per metric type per org) and fast to query.
3c. Tier Enforcement Middleware Chain
sequenceDiagram
actor Agent
participant TierMW as tierMiddleware
participant TierSvc as TierService
participant Redis
participant PG as PostgreSQL
Agent->>TierMW: API request (with valid Bearer token)
TierMW->>TierSvc: fetchTier(orgId)
TierSvc->>PG: SELECT tier FROM organizations WHERE organization_id = $1
PG-->>TierSvc: 'pro'
TierSvc-->>TierMW: 'pro'
TierMW->>Redis: GET rate:tier:calls:<orgId>
Redis-->>TierMW: "4999" (current daily count)
Note over TierMW: TIER_CONFIG['pro'].maxCallsPerDay = 50000 — limit not reached
TierMW-)Redis: INCR rate:tier:calls:<orgId> (fire-and-forget, TTL = next UTC midnight)
TierMW->>Agent: next() — request proceeds to opaMiddleware
When the counter equals or exceeds the tier limit, tierMiddleware throws TierLimitError (429) before opaMiddleware runs. The daily counter resets at UTC midnight via Redis TTL.
3d. A2A Delegation End-to-End Flow
sequenceDiagram
actor Delegator as Delegator Agent
actor Delegatee as Delegatee Agent
participant AgentIdP
participant DelegationSvc as DelegationService
participant OIDCProvider as OIDC Provider
participant PG as PostgreSQL
Delegator->>AgentIdP: POST /api/v1/oauth2/token/delegate<br/>{ delegatee_id, scope }
AgentIdP->>DelegationSvc: createDelegation(delegatorId, delegateeId, scope)
DelegationSvc->>PG: INSERT INTO delegation_chains ...
PG-->>DelegationSvc: chain_id
DelegationSvc->>OIDCProvider: issue delegation JWT (delegator claims + delegatee sub)
OIDCProvider-->>DelegationSvc: signed delegation token
DelegationSvc-->>AgentIdP: IDelegationChain (with token)
AgentIdP-->>Delegator: 201 { token, chain_id }
Note over Delegatee,AgentIdP: Delegatee uses the delegation token
Delegatee->>AgentIdP: POST /api/v1/oauth2/token/verify-delegation<br/>{ token }
AgentIdP->>DelegationSvc: verifyDelegation(token, delegateeId)
DelegationSvc->>PG: SELECT * FROM delegation_chains WHERE chain_id = $1 AND status = 'active'
PG-->>DelegationSvc: chain row (not expired, not revoked)
DelegationSvc->>OIDCProvider: verify token signature
OIDCProvider-->>DelegationSvc: verified claims
DelegationSvc-->>AgentIdP: IDelegationVerifyResult { valid: true, ... }
AgentIdP-->>Delegatee: 200 { valid: true, delegatorId, scope }
Change 5 — Add New PostgreSQL Tables to Section 2
Location: Section ## 2. HTTP Request Lifecycle, item 8 (Repository layer description).
Find the text:
8. The repository (`src/repositories/*.ts`) executes parameterized SQL against PostgreSQL via `node-postgres`, or issues Redis commands via the `redis` client. No business logic lives here.
Replace with:
8. The repository (`src/repositories/*.ts`) executes parameterized SQL against PostgreSQL via `node-postgres`, or issues Redis commands via the `redis` client. No business logic lives here. Phase 3–6 added the following tables: `analytics_events` (daily metric counters), `organizations` (org tier and billing), `federation_partners` (cross-IdP trust registry), `webhook_subscriptions` and `webhook_deliveries` (outbound event delivery), `agent_did_keys` (public EC keys for DID documents), `delegation_chains` (A2A delegation records), `tenant_subscriptions` (Stripe subscription status).