Architecture
Component Overview
      ┌───────────────────────────────────────────┐
      │        Next.js Portal (port 3001)         │
      │        portal/ (Next.js 14)               │
      │  /login  /agents  /credentials  /audit    │
      │  /analytics  /settings/tier  /compliance  │
      │  /webhooks  /marketplace                  │
      └────────────────┬──────────────────────────┘
                       │ HTTP (localhost:3000)
      ┌────────────────▼──────────────────────────┐
      │        AgentIdP Application               │
      │        Node.js / Express (port 3000)      │
      │                                           │
      │  TLS MW → Helmet → CORS → Morgan          │
      │  Metrics MW → OrgContext MW               │
      │  UsageMetering MW → TierEnforcement MW    │
      │  Auth MW → OPA MW → Routes                │
      │        ↓                                  │
      │  Controllers → Services → Repos           │
      └──────────┬───────────────┬────────────────┘
                 │               │
┌────────────────▼──────┐  ┌─────▼────────────┐
│ PostgreSQL 14         │  │ Redis 7          │
│ Port 5432             │  │ Port 6379        │
│                       │  │                  │
│ 26 migrations         │  │ Rate limits      │
│ (001–026)             │  │ Token revoke     │
│ organizations         │  │ Monthly counts   │
│ agents + DID keys     │  │ Tier counters    │
│ credentials           │  │ Compliance cache │
│ audit_events          │  │                  │
│ token_revocations     │  └──────────────────┘
│ oidc_keys             │
│ federation_partners   │  ┌──────────────────┐
│ webhook_subscriptions │  │ HashiCorp Vault  │
│ + deliveries          │  │ (optional)       │
│ agent_marketplace     │  │ KV v2: creds     │
│ github_oidc_trust     │  └──────────────────┘
│ billing               │
│ delegation_chains     │  ┌──────────────────┐
│ analytics_events      │  │ Stripe           │
│ tenant_tiers          │  │ (optional)       │
│                       │  │ Billing/upgrades │
└───────────────────────┘  └──────────────────┘
Components
AgentIdP Application
A stateless Express HTTP server. Every request is handled independently, with no per-request or session state held in process memory. It can therefore be scaled horizontally (multiple instances) as long as all instances share the same PostgreSQL and Redis.
Internal layers:
| Layer | Responsibility |
|---|---|
| Routes | Wire HTTP methods and paths to controllers |
| TLS middleware | Redirect HTTP → HTTPS when ENFORCE_TLS=true |
| Auth middleware | Validate Bearer JWT (RS256 + Redis revocation check) |
| OrgContext middleware | Resolve organization_id from JWT and attach to req |
| UsageMetering middleware | Fire-and-forget analytics event recording |
| TierEnforcement middleware | Enforce daily API call and token limits via Redis (when TIER_ENFORCEMENT=true) |
| OPA middleware | Scope-based authorization via embedded Wasm or JSON policy |
| Controllers | Parse and validate request, call service, return response |
| Services | Business logic; DB access goes through repositories, except for the services listed with "direct pool queries" in the Service Map |
| Repositories | All SQL queries — no business logic |
| Utils | JWT sign/verify, bcrypt, error types, async handler |
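The controller, service, and repository split in the table can be sketched as follows. This is a minimal illustration, not the project's code: the Agent shape, the class and method names, and the in-memory Map standing in for PostgreSQL are all assumptions.

```typescript
// Hypothetical domain type; the real agent schema is described in database.md.
interface Agent {
  id: string;
  organizationId: string;
  name: string;
}

// Repository: owns all storage access and holds no business rules.
// An in-memory Map stands in for parameterised SQL here.
class InMemoryAgentRepository {
  private rows = new Map<string, Agent>();

  async insert(agent: Agent): Promise<Agent> {
    this.rows.set(agent.id, agent);
    return agent;
  }

  async findByOrganization(organizationId: string): Promise<Agent[]> {
    return Array.from(this.rows.values()).filter(
      (a) => a.organizationId === organizationId
    );
  }
}

// Service: business rules only; storage goes through the repository.
class AgentService {
  private repo: InMemoryAgentRepository;
  private nextId = 1;

  constructor(repo: InMemoryAgentRepository) {
    this.repo = repo;
  }

  async register(organizationId: string, name: string): Promise<Agent> {
    const trimmed = name.trim();
    if (trimmed.length === 0) {
      throw new Error("agent name must not be empty");
    }
    const id = `agent-${this.nextId++}`;
    return this.repo.insert({ id, organizationId, name: trimmed });
  }
}

interface HttpResult {
  status: number;
  body: unknown;
}

// Controller: validates the request shape, calls the service, maps to an HTTP result.
class AgentController {
  private service: AgentService;

  constructor(service: AgentService) {
    this.service = service;
  }

  async create(req: {
    organizationId: string;
    body: { name?: unknown };
  }): Promise<HttpResult> {
    const name = req.body.name;
    if (typeof name !== "string" || name.trim().length === 0) {
      return { status: 400, body: { error: "name is required" } };
    }
    const agent = await this.service.register(req.organizationId, name);
    return { status: 201, body: agent };
  }
}
```

Because each layer only depends on the one below it, the repository can be swapped for a test double without touching the controller or service.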
PostgreSQL 14+
Primary durable data store. All agent identities, credentials, audit events, and token revocation records live here. See database.md for schema details.
The application connects via a connection pool (pg.Pool) initialised from DATABASE_URL. The pool is a singleton shared across all request handlers.
Redis 7+
Ephemeral store for token revocation lookups, rate limiting, usage counters, and cached compliance reports:
| Key pattern | Example | Purpose | TTL |
|---|---|---|---|
| revoked:<jti> | revoked:f1e2d3c4-... | Revoked token JTI | Remaining token lifetime |
| rate:<client_id>:<window> | rate:a1b2c3...:29086156 | Request count per window | RATE_LIMIT_WINDOW_MS |
| monthly:<client_id>:<year>:<month> | monthly:a1b2c3...:2026:3 | Monthly token issuance count | End of month |
| rate:tier:calls:<tenantId> | rate:tier:calls:org-uuid | Daily API call counter for tier enforcement | Until midnight UTC |
| rate:tier:tokens:<tenantId> | rate:tier:tokens:org-uuid | Daily token issuance counter for tier enforcement | Until midnight UTC |
| compliance:report:<tenantId> | compliance:report:org-uuid | Cached compliance report JSON | 5 minutes |
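The key patterns and TTLs in the table could be produced by helpers along these lines. The function names are hypothetical. The window index is assumed to be the epoch time divided by the window length (which matches the example value 29086156 for a 60-second window), the month is assumed to be 1-based and unpadded (matching the example ending in 2026:3), and the month boundary is assumed to be UTC.

```typescript
// Assumed default; the real value comes from RATE_LIMIT_WINDOW_MS.
const DEFAULT_WINDOW_MS = 60000;

// rate:<client_id>:<window>, where <window> is the index of the current fixed window.
function rateKey(
  clientId: string,
  nowMs: number,
  windowMs: number = DEFAULT_WINDOW_MS
): string {
  return `rate:${clientId}:${Math.floor(nowMs / windowMs)}`;
}

// monthly:<client_id>:<year>:<month>, assuming a 1-based, unpadded UTC month.
function monthlyKey(clientId: string, now: Date): string {
  return `monthly:${clientId}:${now.getUTCFullYear()}:${now.getUTCMonth() + 1}`;
}

// Daily tier counters, keyed by tenant.
function tierCallsKey(tenantId: string): string {
  return `rate:tier:calls:${tenantId}`;
}

function tierTokensKey(tenantId: string): string {
  return `rate:tier:tokens:${tenantId}`;
}

// TTL for revoked:<jti>: the remaining lifetime of the token, never negative.
// expSeconds is the JWT "exp" claim (seconds since the epoch).
function revokedTtlSeconds(expSeconds: number, now: Date): number {
  return Math.max(0, expSeconds - Math.floor(now.getTime() / 1000));
}

// TTL for the daily tier counters: seconds until the next midnight UTC.
function secondsUntilUtcMidnight(now: Date): number {
  const next = Date.UTC(
    now.getUTCFullYear(),
    now.getUTCMonth(),
    now.getUTCDate() + 1
  );
  return Math.ceil((next - now.getTime()) / 1000);
}

// TTL for the monthly counter: seconds until the first day of the next UTC month.
function secondsUntilUtcMonthEnd(now: Date): number {
  const next = Date.UTC(now.getUTCFullYear(), now.getUTCMonth() + 1, 1);
  return Math.ceil((next - now.getTime()) / 1000);
}
```

Keeping key construction in one place makes it harder for the writer and the reader of a counter to drift apart on the key format.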
Redis is supplementary, not the source of truth. Token revocations are also written to the token_revocations PostgreSQL table for durability across Redis restarts. After a Redis restart the revocation list starts cold, so previously revoked tokens pass the Redis revocation check again. A PostgreSQL-backed warm-up that would close this gap is planned (Phase 2) and not yet implemented.
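The dual write and the cold-cache gap described above can be illustrated with in-memory stand-ins. The interfaces, class names, and the order of the two writes are assumptions made for this sketch; the real code uses a Redis client and the token_revocations repository.

```typescript
// Stand-in for the Redis client: only the two operations this sketch needs.
interface RevocationCache {
  set(key: string, ttlSeconds: number): Promise<void>;
  has(key: string): Promise<boolean>;
}

// Stand-in for the token_revocations repository.
interface RevocationStore {
  insert(jti: string, expiresAtSeconds: number): Promise<void>;
}

class MemoryCache implements RevocationCache {
  private keys = new Set<string>();

  // The TTL is ignored in this stand-in; Redis would drop the key when it elapses.
  async set(key: string, _ttlSeconds: number): Promise<void> {
    this.keys.add(key);
  }

  async has(key: string): Promise<boolean> {
    return this.keys.has(key);
  }
}

class MemoryStore implements RevocationStore {
  rows: { jti: string; expiresAtSeconds: number }[] = [];

  async insert(jti: string, expiresAtSeconds: number): Promise<void> {
    this.rows.push({ jti, expiresAtSeconds });
  }
}

async function revokeToken(
  jti: string,
  expSeconds: number,
  nowSeconds: number,
  cache: RevocationCache,
  store: RevocationStore
): Promise<void> {
  // Durable write first (assumed order), so the record survives a Redis restart.
  await store.insert(jti, expSeconds);
  // Cache entry lives only for the remaining token lifetime.
  const ttl = expSeconds - nowSeconds;
  if (ttl > 0) {
    await cache.set(`revoked:${jti}`, ttl);
  }
}

// The auth path consults only the cache, which is why a cold cache is a gap.
function isRevoked(jti: string, cache: RevocationCache): Promise<boolean> {
  return cache.has(`revoked:${jti}`);
}
```

Replacing the cache with a fresh instance models a Redis restart: the durable row is still present, but the lookup used by the auth path no longer finds the revocation.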
Request Data Flow
HTTP Request
│
▼
Express Router (matches path + method)
│
▼
Auth Middleware
- Extract Bearer token from Authorization header
- Verify RS256 signature using JWT_PUBLIC_KEY
- Check Redis for revocation (key: revoked:<jti>)
- Attach decoded payload to req.user
│
▼
Rate Limit Middleware
- Key: rate:<client_id>:<60s-window>
- Increment counter in Redis (INCR + EXPIRE)
- Set X-RateLimit-* headers
- Reject with 429 if count > 100
│
▼
Controller
- Validate request body / query params (Joi schemas)
- Call service method
- Return HTTP response
│
▼
Service
- Business logic and orchestration
- Calls one or more repositories
- Fires audit log writes (async, fire-and-forget)
│
▼
Repository
- Executes parameterised SQL queries
- Maps DB rows to typed interfaces
- Returns typed results to service
│
▼
PostgreSQL / Redis
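The rate-limit step in the flow above (INCR + EXPIRE on a per-window key, reject above 100) can be sketched as a fixed-window counter. The CounterStore interface is a stand-in for the Redis client, the specific X-RateLimit header names are assumptions (the flow only says X-RateLimit-*), and setting the expiry on the first hit of a window is one common choice rather than a confirmed detail.

```typescript
// Stand-in for the two Redis commands the flow mentions.
interface CounterStore {
  incr(key: string): Promise<number>;
  expire(key: string, seconds: number): Promise<void>;
}

class MemoryCounterStore implements CounterStore {
  private counts = new Map<string, number>();

  async incr(key: string): Promise<number> {
    const next = (this.counts.get(key) ?? 0) + 1;
    this.counts.set(key, next);
    return next;
  }

  async expire(_key: string, _seconds: number): Promise<void> {
    // No-op in this stand-in; Redis would delete the key after the TTL.
  }
}

interface RateLimitDecision {
  allowed: boolean;
  headers: Record<string, string>;
}

async function checkRateLimit(
  store: CounterStore,
  clientId: string,
  nowMs: number,
  limit: number = 100,
  windowMs: number = 60000
): Promise<RateLimitDecision> {
  const window = Math.floor(nowMs / windowMs);
  const key = `rate:${clientId}:${window}`;

  const count = await store.incr(key);
  if (count === 1) {
    // First request in this window: let the key expire with the window.
    await store.expire(key, Math.ceil(windowMs / 1000));
  }

  const resetEpochSeconds = Math.ceil(((window + 1) * windowMs) / 1000);
  return {
    // The caller responds with 429 when allowed is false.
    allowed: count <= limit,
    headers: {
      "X-RateLimit-Limit": String(limit),
      "X-RateLimit-Remaining": String(Math.max(0, limit - count)),
      "X-RateLimit-Reset": String(resetEpochSeconds),
    },
  };
}
```

A fixed window is simple and cheap, at the cost of allowing a short burst of up to twice the limit across a window boundary.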
Service Map
| Route prefix | Controller | Service(s) | Repository/ies |
|---|---|---|---|
| /api/v1/agents | AgentController | AgentService | AgentRepository |
| /api/v1/credentials | CredentialController | CredentialService | CredentialRepository |
| /api/v1/token | TokenController | OAuth2Service | TokenRepository, CredentialRepository, AgentRepository |
| /api/v1/audit | AuditController | AuditService | AuditRepository |
| /api/v1/organizations | OrgController | OrgService | OrgRepository |
| /api/v1/compliance/* | ComplianceController | ComplianceService | AuditRepository |
| /api/v1/analytics/* | AnalyticsController | AnalyticsService | direct pool queries |
| /api/v1/tiers/* | TierController | TierService | pool queries, Stripe SDK |
| /api/v1/webhooks | WebhookController | WebhookService | WebhookRepository |
| /api/v1/federation | FederationController | FederationService | direct pool queries |
| /api/v1/marketplace | MarketplaceController | MarketplaceService | direct pool queries |
| /api/v1/billing | BillingController | BillingService | direct pool queries |
| /.well-known/did.json, /api/v1/did/* | DIDController | DIDService | AgentRepository |
| /.well-known/openid-configuration, /api/v1/oidc/* | OIDCController | OIDCKeyService, IDTokenService | direct pool queries |
| /api/v1/oidc/trust-policies | OIDCTrustPolicyController | OIDCTrustPolicyService | direct pool queries |
| /api/v1/delegation | DelegationController | DelegationService | direct pool queries |
| /api/v1/scaffold | ScaffoldController | ScaffoldService | none |
| /health | inline | none | pool, redis |
| /metrics | inline | none | prom-client |
New Services (Phases 3–6)
| Service | Source file | Responsibility |
|---|---|---|
| AnalyticsService | src/services/AnalyticsService.ts | Fire-and-forget recordEvent, time-series getTokenTrend, heatmap getAgentActivity, per-agent getAgentUsageSummary |
| TierService | src/services/TierService.ts | getStatus (reads tenant_tiers), initiateUpgrade (creates Stripe Checkout Session), applyUpgrade (handles Stripe webhook), enforceAgentLimit |
| ComplianceService | src/services/ComplianceService.ts | generateReport (Redis-cached 5 min), exportAgentCards (AGNTCY format) |
| DelegationService | src/services/DelegationService.ts | A2A delegation chain creation and verification |
| DIDService | src/services/DIDService.ts | did:web identifier generation and DID document management |
| OIDCKeyService | src/services/OIDCKeyService.ts | OIDC key rotation, JWKS endpoint |
| IDTokenService | src/services/IDTokenService.ts | OIDC ID token issuance |
| FederationService | src/services/FederationService.ts | Cross-tenant agent identity federation |
| WebhookService | src/services/WebhookService.ts | Event subscriptions, delivery with retry, dead-letter queue |
| VaultService | src/services/VaultService.ts | HashiCorp Vault KV v2 read/write for credential storage |
| BillingService | src/services/BillingService.ts | Stripe customer and subscription management |
| MarketplaceService | src/services/MarketplaceService.ts | Agent listing and discovery |
| OIDCTrustPolicyService | src/services/OIDCTrustPolicyService.ts | GitHub OIDC trust policy management |
| EventPublisher | src/services/EventPublisher.ts | Routes domain events to webhook delivery and Kafka (if configured) |
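Several entries above rely on fire-and-forget writes (for example recordEvent in AnalyticsService). A minimal sketch of that pattern follows; the event shape, the makeRecorder helper, and the error callback are hypothetical and only illustrate that the caller never awaits the write and never sees its failure.

```typescript
// Hypothetical event shape for illustration.
interface AnalyticsEvent {
  tenantId: string;
  type: string;
  at: number;
}

// Anything that persists an event, for example an INSERT into analytics_events.
type EventSink = (event: AnalyticsEvent) => Promise<void>;

function makeRecorder(
  sink: EventSink,
  onError: (err: unknown) => void
): (event: AnalyticsEvent) => void {
  return function recordEvent(event: AnalyticsEvent): void {
    // Start the write without awaiting it, so the request path is not delayed.
    // Wrapping in a promise chain also catches a sink that throws synchronously.
    Promise.resolve()
      .then(() => sink(event))
      .catch(onError); // failures are reported, never rethrown to the caller
  };
}
```

The trade-off is that a failed write is lost unless onError logs or counts it, which is acceptable for usage analytics but would not be for data that must be durable.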
Ports
| Service | Internal port | Exposed port (local dev) |
|---|---|---|
| AgentIdP app | 3000 | 3000 |
| Next.js portal | 3001 | 3001 |
| PostgreSQL | 5432 | 5432 |
| Redis | 6379 | 6379 |
API Routes (Phase 6 complete)
Base path: /api/v1
| Route | Method(s) | Auth | Feature flag |
|---|---|---|---|
| /api/v1/agents | GET, POST, PATCH, DELETE | Bearer JWT | always on |
| /api/v1/credentials | GET, POST, DELETE | Bearer JWT | always on |
| /api/v1/token | POST | none (client credentials) | always on |
| /api/v1/audit | GET | Bearer JWT | always on |
| /api/v1/audit/verify | GET | Bearer JWT | always on |
| /api/v1/organizations | GET, POST | Bearer JWT | always on |
| /api/v1/compliance/controls | GET | none | always on |
| /api/v1/compliance/report | GET | Bearer JWT | COMPLIANCE_ENABLED=true |
| /api/v1/compliance/agent-cards | GET | Bearer JWT | COMPLIANCE_ENABLED=true |
| /api/v1/analytics/token-trend | GET | Bearer JWT | ANALYTICS_ENABLED=true |
| /api/v1/analytics/agent-activity | GET | Bearer JWT | ANALYTICS_ENABLED=true |
| /api/v1/analytics/usage-summary | GET | Bearer JWT | ANALYTICS_ENABLED=true |
| /api/v1/tiers/status | GET | Bearer JWT | always on |
| /api/v1/tiers/upgrade | POST | Bearer JWT | always on |
| /api/v1/webhooks | GET, POST, DELETE | Bearer JWT | always on |
| /api/v1/federation | GET, POST | Bearer JWT | always on |
| /api/v1/delegation | GET, POST | Bearer JWT | always on |
| /api/v1/marketplace | GET | none | always on |
| /api/v1/billing | GET, POST | Bearer JWT | always on |
| /api/v1/did/* | GET | none | always on |
| /api/v1/oidc/* | GET, POST | mixed | always on |
| /.well-known/openid-configuration | GET | none | always on |
| /.well-known/jwks.json | GET | none | always on |
| /.well-known/did.json | GET | none | always on |
| /health | GET | none | always on |
| /metrics | GET | none | always on |
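Routes gated by COMPLIANCE_ENABLED or ANALYTICS_ENABLED could be guarded by a small middleware like the one below. This is a sketch under assumptions: the requireFlag helper and the ResponseLike shape are hypothetical, and the 404 returned for a disabled feature is a guess, since the status code is not documented here.

```typescript
// Environment passed in explicitly so the guard is easy to test;
// in the application this would be process.env.
type Env = Record<string, string | undefined>;

// The small part of an Express-style response that the guard needs.
interface ResponseLike {
  status(code: number): ResponseLike;
  json(body: unknown): void;
}

function requireFlag(env: Env, flag: string) {
  return (_req: unknown, res: ResponseLike, next: () => void): void => {
    // Only the exact string "true" enables the feature, as in the table.
    if (env[flag] === "true") {
      next();
      return;
    }
    // Assumption: a disabled feature answers 404.
    res.status(404).json({ error: `${flag} is not enabled` });
  };
}
```

With Express, such a guard would be mounted in front of the gated router, for example ahead of the handlers under /api/v1/analytics.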
Graceful Shutdown
The server listens for SIGTERM and SIGINT. On receipt:
- server.close() is called, which stops accepting new connections
- In-flight requests complete
- process.exit(0) is called
The PostgreSQL pool and Redis client are not explicitly closed in the current shutdown path. This is safe for single-instance deployments; connection cleanup is handled by the OS.
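The shutdown sequence can be sketched as below, with an optional list of resources to close before exiting. Closing the pool and the Redis client is not what the current shutdown path does; it is shown only as one way to extend it. The createShutdown helper, the ServerLike and Closable interfaces, and the non-zero exit code on error are assumptions of this sketch.

```typescript
// The part of Node's http.Server that the sequence needs.
interface ServerLike {
  close(callback: (err?: Error) => void): void;
}

// Anything with an async close, for example a wrapper around a pool or a Redis client.
interface Closable {
  close(): Promise<void>;
}

function createShutdown(
  server: ServerLike,
  resources: Closable[],
  exit: (code: number) => void
): () => void {
  let shuttingDown = false;

  return () => {
    if (shuttingDown) return; // ignore a second SIGTERM or SIGINT
    shuttingDown = true;

    // Stop accepting new connections; the callback runs once
    // in-flight requests have completed.
    server.close((err) => {
      // Close each resource, ignoring individual failures so exit still happens.
      Promise.all(resources.map((r) => r.close().catch(() => undefined))).then(
        () => exit(err ? 1 : 0)
      );
    });
  };
}

// Wiring in a Node.js process (shown as comments, not executed here):
//   const shutdown = createShutdown(server, [], (code) => process.exit(code));
//   process.on("SIGTERM", shutdown);
//   process.on("SIGINT", shutdown);
```

Passing an empty resource list reproduces the behaviour described above: close the server, let in-flight requests finish, then exit with code 0.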