sentryagent-idp/docs/devops/architecture.md
SentryAgent.ai Developer 8cabc0191c docs: commit all Phase 6 documentation updates and OpenSpec archives
- devops docs: 8 files updated for Phase 6 state; field-trial.md added (946-line runbook)
- developer docs: api-reference (50+ endpoints), quick-start, 5 existing guides updated, 5 new guides added
- engineering docs: all 12 files updated (services, architecture, SDK guide, testing, overview)
- OpenSpec archives: phase-7-devops-field-trial, developer-docs-phase6-update, engineering-docs-phase6-update
- VALIDATOR.md + scripts/start-validator.sh: V&V Architect tooling added
- .gitignore: exclude session artifacts, build artifacts, and agent workspaces

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-07 02:24:24 +00:00


Architecture

Component Overview

                    ┌───────────────────────────────────────────┐
                    │         Next.js Portal (port 3001)         │
                    │         portal/ — Next.js 14               │
                    │  /login /agents /credentials /audit        │
                    │  /analytics /settings/tier /compliance     │
                    │  /webhooks /marketplace                    │
                    └────────────────┬──────────────────────────┘
                                     │ HTTP (localhost:3000)
                    ┌────────────────▼──────────────────────────┐
                    │         AgentIdP Application               │
                    │         Node.js / Express (port 3000)      │
                    │                                            │
                    │  TLS MW → Helmet → CORS → Morgan           │
                    │  Metrics MW → OrgContext MW                │
                    │  UsageMetering MW → TierEnforcement MW     │
                    │  Auth MW → OPA MW → Routes                 │
                    │        ↓                                   │
                    │  Controllers → Services → Repos            │
                    └──────────┬───────────────┬────────────────┘
                               │               │
              ┌────────────────▼──┐   ┌────────▼────────┐
              │   PostgreSQL 14    │   │    Redis 7       │
              │    Port 5432       │   │   Port 6379      │
              │                    │   │                  │
              │  26 migrations     │   │  Rate limits     │
              │  (001–026)         │   │  Token revoke    │
              │  organizations     │   │  Monthly counts  │
              │  agents + DID keys │   │  Tier counters   │
              │  credentials       │   │  Compliance cache│
              │  audit_events      │   │                  │
              │  token_revocations │   └──────────────────┘
              │  oidc_keys         │
              │  federation_partne-│   ┌──────────────────┐
              │  rs                │   │  HashiCorp Vault  │
              │  webhook_subscript-│   │  (optional)       │
              │  ions + deliveries │   │  KV v2 — creds    │
              │  agent_marketplace │   └──────────────────┘
              │  github_oidc_trust │
              │  billing           │   ┌──────────────────┐
              │  delegation_chains │   │  Stripe           │
              │  analytics_events  │   │  (optional)       │
              │  tenant_tiers      │   │  Billing/upgrades │
              └────────────────────┘   └──────────────────┘

Components

AgentIdP Application

A stateless Express HTTP server. Each request is handled independently, and no request or session state is held in process memory, so the application can be scaled horizontally (multiple instances) as long as all instances share the same PostgreSQL database and Redis instance.

Internal layers:

| Layer | Responsibility |
| --- | --- |
| Routes | Wire HTTP methods and paths to controllers |
| TLS middleware | Redirect HTTP → HTTPS when ENFORCE_TLS=true |
| Auth middleware | Validate Bearer JWT (RS256 + Redis revocation check) |
| OrgContext middleware | Resolve organization_id from JWT and attach to req |
| UsageMetering middleware | Fire-and-forget analytics event recording |
| TierEnforcement middleware | Enforce daily API call and token limits via Redis (when TIER_ENFORCEMENT=true) |
| OPA middleware | Scope-based authorization via embedded Wasm or JSON policy |
| Controllers | Parse and validate request, call service, return response |
| Services | Business logic; no direct DB access |
| Repositories | All SQL queries; no business logic |
| Utils | JWT sign/verify, bcrypt, error types, async handler |
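
The separation between controllers, services, and repositories can be illustrated with a minimal TypeScript sketch. Everything named here (AgentRecord, InMemoryAgentRepository, registerAgent, the 409 status for duplicates) is hypothetical and chosen for illustration; an in-memory array stands in for PostgreSQL, and the real AgentService and AgentRepository may look quite different.

```typescript
// Hypothetical record shape, for illustration only.
interface AgentRecord {
  id: string;
  organizationId: string;
  name: string;
}

// Repository layer: data access only, no business rules.
interface AgentRepository {
  insert(agent: AgentRecord): AgentRecord;
  findByName(organizationId: string, name: string): AgentRecord | undefined;
}

// In-memory stand-in for the SQL-backed repository (assumption).
class InMemoryAgentRepository implements AgentRepository {
  private readonly rows: AgentRecord[] = [];

  insert(agent: AgentRecord): AgentRecord {
    this.rows.push(agent);
    return agent;
  }

  findByName(organizationId: string, name: string): AgentRecord | undefined {
    const matches = this.rows.filter(
      (row) => row.organizationId === organizationId && row.name === name,
    );
    return matches[0];
  }
}

// Service layer: business rules only, no SQL.
class AgentService {
  constructor(private readonly repo: AgentRepository) {}

  registerAgent(organizationId: string, name: string, id: string): AgentRecord {
    if (this.repo.findByName(organizationId, name)) {
      throw new Error("agent name already exists in this organization");
    }
    return this.repo.insert({ id, organizationId, name });
  }
}

// Controller layer: validate input, call the service, shape the response.
function createAgentController(service: AgentService) {
  return (body: { organizationId?: string; name?: string; id?: string }) => {
    if (!body.organizationId || !body.name || !body.id) {
      return { status: 400, body: { error: "organizationId, name and id are required" } };
    }
    try {
      const agent = service.registerAgent(body.organizationId, body.name, body.id);
      return { status: 201, body: agent };
    } catch (err) {
      return { status: 409, body: { error: (err as Error).message } };
    }
  };
}
```

Because the service depends only on the AgentRepository interface, a sketch like this can be exercised in unit tests without a database.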

PostgreSQL 14+

Primary durable data store. All agent identities, credentials, audit events, and token revocation records live here. See database.md for schema details.

The application connects via a connection pool (pg.Pool) initialised from DATABASE_URL. The pool is a singleton shared across all request handlers.

Redis 7+

Ephemeral store for the following key patterns:

| Key pattern | Example | Purpose | TTL |
| --- | --- | --- | --- |
| `revoked:<jti>` | `revoked:f1e2d3c4-...` | Revoked token JTI | Remaining token lifetime |
| `rate:<client_id>:<window>` | `rate:a1b2c3...:29086156` | Request count per window | RATE_LIMIT_WINDOW_MS |
| `monthly:<client_id>:<year>:<month>` | `monthly:a1b2c3...:2026:3` | Monthly token issuance count | End of month |
| `rate:tier:calls:<tenantId>` | `rate:tier:calls:org-uuid` | Daily API call counter for tier enforcement | Until midnight UTC |
| `rate:tier:tokens:<tenantId>` | `rate:tier:tokens:org-uuid` | Daily token issuance counter for tier enforcement | Until midnight UTC |
| `compliance:report:<tenantId>` | `compliance:report:org-uuid` | Cached compliance report JSON | 5 minutes |
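
As an illustration of how keys and TTLs of this shape could be derived, here is a small TypeScript sketch. The function names are hypothetical. It assumes that the window component is the number of whole windows since the Unix epoch (the example value 29086156 is consistent with a 60-second window), and that the month is 1-based, unpadded, and taken in UTC; the actual implementation may differ on any of these points.

```typescript
// revoked:<jti>
function revokedKey(jti: string): string {
  return `revoked:${jti}`;
}

// TTL for a revoked-token key: the remaining token lifetime in seconds.
// `expUnixSeconds` is the JWT `exp` claim.
function revocationTtlSeconds(expUnixSeconds: number, nowMs: number): number {
  return Math.max(0, expUnixSeconds - Math.floor(nowMs / 1000));
}

// rate:<client_id>:<window>, where <window> is assumed to be the index of
// the current fixed window since the Unix epoch.
function rateKey(clientId: string, nowMs: number, windowMs: number): string {
  return `rate:${clientId}:${Math.floor(nowMs / windowMs)}`;
}

// monthly:<client_id>:<year>:<month>, assuming a 1-based UTC month.
function monthlyKey(clientId: string, now: Date): string {
  return `monthly:${clientId}:${now.getUTCFullYear()}:${now.getUTCMonth() + 1}`;
}

// rate:tier:calls:<tenantId> and rate:tier:tokens:<tenantId>
function tierCallsKey(tenantId: string): string {
  return `rate:tier:calls:${tenantId}`;
}

function tierTokensKey(tenantId: string): string {
  return `rate:tier:tokens:${tenantId}`;
}

// TTL for the daily tier counters: seconds until the next midnight UTC.
// Date.UTC normalises day overflow, so month and year ends are handled.
function secondsUntilMidnightUtc(now: Date): number {
  const nextMidnightMs = Date.UTC(
    now.getUTCFullYear(),
    now.getUTCMonth(),
    now.getUTCDate() + 1,
  );
  return Math.ceil((nextMidnightMs - now.getTime()) / 1000);
}
```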

Redis is supplementary, not the source of truth. Token revocations are also written to the token_revocations PostgreSQL table so that they survive Redis restarts. After a Redis restart, however, the revocation cache is cold: previously revoked tokens that have not yet expired will pass the Redis revocation check until the PostgreSQL-backed warm-up is implemented (Phase 2).
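
The warm-up is described as not yet implemented, so the following is only a sketch of what such a step might look like, not a description of existing code. It assumes a hypothetical row shape for token_revocations (a jti plus the token expiry) and uses a synchronous in-memory class in place of Redis; a real Redis client would be asynchronous.

```typescript
// Row shape assumed for the token_revocations table (hypothetical columns).
interface RevocationRow {
  jti: string;
  expiresAtMs: number; // token expiry as epoch milliseconds
}

// Minimal synchronous stand-in for the Redis operations used here
// (set a key with a TTL, check whether a key exists).
class InMemoryTtlStore {
  private readonly expiry: { [key: string]: number } = {};

  setWithTtl(key: string, ttlSeconds: number, nowMs: number): void {
    this.expiry[key] = nowMs + ttlSeconds * 1000;
  }

  exists(key: string, nowMs: number): boolean {
    const expiresAtMs = this.expiry[key];
    return expiresAtMs !== undefined && expiresAtMs > nowMs;
  }
}

// Re-create revoked:<jti> keys from the durable rows after a cache restart.
// Returns the number of keys written.
function warmRevocationCache(
  rows: RevocationRow[],
  store: InMemoryTtlStore,
  nowMs: number,
): number {
  let restored = 0;
  for (const row of rows) {
    // TTL is the remaining token lifetime, matching the key table above.
    const ttlSeconds = Math.ceil((row.expiresAtMs - nowMs) / 1000);
    if (ttlSeconds <= 0) continue; // expired tokens already fail JWT verification
    store.setWithTtl(`revoked:${row.jti}`, ttlSeconds, nowMs);
    restored++;
  }
  return restored;
}
```

Running a step like this before the server starts accepting traffic would close the window in which revoked tokens are accepted.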

Request Data Flow

HTTP Request
    │
    ▼
Express Router (matches path + method)
    │
    ▼
Auth Middleware
  - Extract Bearer token from Authorization header
  - Verify RS256 signature using JWT_PUBLIC_KEY
  - Check Redis for revocation (key: revoked:<jti>)
  - Attach decoded payload to req.user
    │
    ▼
Rate Limit Middleware
  - Key: rate:<client_id>:<60s-window>
  - Increment counter in Redis (INCR + EXPIRE)
  - Set X-RateLimit-* headers
  - Reject with 429 if count > 100
    │
    ▼
Controller
  - Validate request body / query params (Joi schemas)
  - Call service method
  - Return HTTP response
    │
    ▼
Service
  - Business logic and orchestration
  - Calls one or more repositories
  - Fires audit log writes (async, fire-and-forget)
    │
    ▼
Repository
  - Executes parameterised SQL queries
  - Maps DB rows to typed interfaces
  - Returns typed results to service
    │
    ▼
PostgreSQL / Redis
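
The rate limit step in the flow above (INCR + EXPIRE on a per-window key, reject above 100) can be sketched as follows. This is an illustration under assumptions, not the project's middleware: the CounterStore interface and checkRateLimit are hypothetical names, the store is a synchronous in-memory stand-in for Redis, and the specific X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset header names are a common convention that the source only refers to as X-RateLimit-*.

```typescript
// The two Redis operations the flow relies on, modelled synchronously.
interface CounterStore {
  incr(key: string): number;
  expire(key: string, seconds: number): void;
}

// In-memory stand-in for Redis; expiry is a no-op because each window
// uses a distinct key and the sketch never revisits old windows.
class InMemoryCounterStore implements CounterStore {
  private readonly counts: { [key: string]: number } = {};

  incr(key: string): number {
    this.counts[key] = (this.counts[key] || 0) + 1;
    return this.counts[key];
  }

  expire(_key: string, _seconds: number): void {
    // intentionally empty in this sketch
  }
}

interface RateLimitDecision {
  allowed: boolean; // false means the caller should respond with 429
  headers: { [name: string]: string };
}

// Fixed-window counter: one key per client per window.
function checkRateLimit(
  store: CounterStore,
  clientId: string,
  nowMs: number,
  limit: number = 100,
  windowMs: number = 60000,
): RateLimitDecision {
  const windowIndex = Math.floor(nowMs / windowMs);
  const key = `rate:${clientId}:${windowIndex}`;

  const count = store.incr(key);
  if (count === 1) {
    // First hit in this window: let the key expire with the window.
    store.expire(key, Math.ceil(windowMs / 1000));
  }

  const resetUnixSeconds = Math.floor(((windowIndex + 1) * windowMs) / 1000);
  return {
    allowed: count <= limit,
    headers: {
      "X-RateLimit-Limit": String(limit),
      "X-RateLimit-Remaining": String(Math.max(0, limit - count)),
      "X-RateLimit-Reset": String(resetUnixSeconds),
    },
  };
}
```

A fixed window is simple and needs only one counter per client, at the cost of allowing short bursts across a window boundary.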

Service Map

| Route prefix | Controller | Service(s) | Repository/ies |
| --- | --- | --- | --- |
| /api/v1/agents | AgentController | AgentService | AgentRepository |
| /api/v1/credentials | CredentialController | CredentialService | CredentialRepository |
| /api/v1/token | TokenController | OAuth2Service | TokenRepository, CredentialRepository, AgentRepository |
| /api/v1/audit | AuditController | AuditService | AuditRepository |
| /api/v1/organizations | OrgController | OrgService | OrgRepository |
| /api/v1/compliance/* | ComplianceController | ComplianceService | AuditRepository |
| /api/v1/analytics/* | AnalyticsController | AnalyticsService | direct pool queries |
| /api/v1/tiers/* | TierController | TierService | pool queries, Stripe SDK |
| /api/v1/webhooks | WebhookController | WebhookService | WebhookRepository |
| /api/v1/federation | FederationController | FederationService | direct pool queries |
| /api/v1/marketplace | MarketplaceController | MarketplaceService | direct pool queries |
| /api/v1/billing | BillingController | BillingService | direct pool queries |
| /.well-known/did.json, /api/v1/did/* | DIDController | DIDService | AgentRepository |
| /.well-known/openid-configuration, /api/v1/oidc/* | OIDCController | OIDCKeyService, IDTokenService | direct pool queries |
| /api/v1/oidc/trust-policies | OIDCTrustPolicyController | OIDCTrustPolicyService | direct pool queries |
| /api/v1/delegation | DelegationController | DelegationService | direct pool queries |
| /api/v1/scaffold | ScaffoldController | ScaffoldService | |
| /health | inline | | pool, redis |
| /metrics | inline | | prom-client |

New Services (Phases 3–6)

| Service | Source file | Responsibility |
| --- | --- | --- |
| AnalyticsService | src/services/AnalyticsService.ts | Fire-and-forget recordEvent, time-series getTokenTrend, heatmap getAgentActivity, per-agent getAgentUsageSummary |
| TierService | src/services/TierService.ts | getStatus (reads tenant_tiers), initiateUpgrade (creates Stripe Checkout Session), applyUpgrade (handles Stripe webhook), enforceAgentLimit |
| ComplianceService | src/services/ComplianceService.ts | generateReport (Redis-cached 5 min), exportAgentCards (AGNTCY format) |
| DelegationService | src/services/DelegationService.ts | A2A delegation chain creation and verification |
| DIDService | src/services/DIDService.ts | did:web identifier generation and DID document management |
| OIDCKeyService | src/services/OIDCKeyService.ts | OIDC key rotation, JWKS endpoint |
| IDTokenService | src/services/IDTokenService.ts | OIDC ID token issuance |
| FederationService | src/services/FederationService.ts | Cross-tenant agent identity federation |
| WebhookService | src/services/WebhookService.ts | Event subscriptions, delivery with retry, dead-letter queue |
| VaultService | src/services/VaultService.ts | HashiCorp Vault KV v2 read/write for credential storage |
| BillingService | src/services/BillingService.ts | Stripe customer and subscription management |
| MarketplaceService | src/services/MarketplaceService.ts | Agent listing and discovery |
| OIDCTrustPolicyService | src/services/OIDCTrustPolicyService.ts | GitHub OIDC trust policy management |
| EventPublisher | src/services/EventPublisher.ts | Routes domain events to webhook delivery and Kafka (if configured) |
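
The "Redis-cached 5 min" behaviour listed for ComplianceService.generateReport follows the cache-aside pattern, which can be sketched as below. The ReportCache class and getComplianceReport function are hypothetical names, the cache is a synchronous in-memory stand-in for Redis, and the report builder is passed in as a parameter; only the key pattern compliance:report:<tenantId> and the 5-minute TTL come from this document.

```typescript
interface CachedEntry {
  json: string;
  expiresAtMs: number;
}

// Synchronous in-memory stand-in for Redis GET and SET with a TTL.
class ReportCache {
  private readonly entries: { [key: string]: CachedEntry } = {};

  get(key: string, nowMs: number): string | undefined {
    const entry = this.entries[key];
    if (entry === undefined || entry.expiresAtMs <= nowMs) return undefined;
    return entry.json;
  }

  set(key: string, json: string, ttlSeconds: number, nowMs: number): void {
    this.entries[key] = { json, expiresAtMs: nowMs + ttlSeconds * 1000 };
  }
}

const REPORT_TTL_SECONDS = 300; // 5 minutes, per the key table

// Cache-aside: return the cached JSON if present, otherwise build the
// report, store it with a TTL, and return the fresh value.
function getComplianceReport(
  tenantId: string,
  cache: ReportCache,
  buildReport: (tenantId: string) => object,
  nowMs: number,
): { report: unknown; cacheHit: boolean } {
  const key = `compliance:report:${tenantId}`;

  const cached = cache.get(key, nowMs);
  if (cached !== undefined) {
    return { report: JSON.parse(cached), cacheHit: true };
  }

  const report = buildReport(tenantId);
  cache.set(key, JSON.stringify(report), REPORT_TTL_SECONDS, nowMs);
  return { report, cacheHit: false };
}
```

With this shape, a report can be up to five minutes stale, which is the trade-off accepted in exchange for not recomputing it on every request.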

Ports

| Service | Internal port | Exposed port (local dev) |
| --- | --- | --- |
| AgentIdP app | 3000 | 3000 |
| Next.js portal | 3001 | 3001 |
| PostgreSQL | 5432 | 5432 |
| Redis | 6379 | 6379 |

API Routes (Phase 6 complete)

Base path: /api/v1

| Route | Method(s) | Auth | Feature flag |
| --- | --- | --- | --- |
| /api/v1/agents | GET, POST, PATCH, DELETE | Bearer JWT | always on |
| /api/v1/credentials | GET, POST, DELETE | Bearer JWT | always on |
| /api/v1/token | POST | none (client credentials) | always on |
| /api/v1/audit | GET | Bearer JWT | always on |
| /api/v1/audit/verify | GET | Bearer JWT | always on |
| /api/v1/organizations | GET, POST | Bearer JWT | always on |
| /api/v1/compliance/controls | GET | none | always on |
| /api/v1/compliance/report | GET | Bearer JWT | COMPLIANCE_ENABLED=true |
| /api/v1/compliance/agent-cards | GET | Bearer JWT | COMPLIANCE_ENABLED=true |
| /api/v1/analytics/token-trend | GET | Bearer JWT | ANALYTICS_ENABLED=true |
| /api/v1/analytics/agent-activity | GET | Bearer JWT | ANALYTICS_ENABLED=true |
| /api/v1/analytics/usage-summary | GET | Bearer JWT | ANALYTICS_ENABLED=true |
| /api/v1/tiers/status | GET | Bearer JWT | always on |
| /api/v1/tiers/upgrade | POST | Bearer JWT | always on |
| /api/v1/webhooks | GET, POST, DELETE | Bearer JWT | always on |
| /api/v1/federation | GET, POST | Bearer JWT | always on |
| /api/v1/delegation | GET, POST | Bearer JWT | always on |
| /api/v1/marketplace | GET | none | always on |
| /api/v1/billing | GET, POST | Bearer JWT | always on |
| /api/v1/did/* | GET | none | always on |
| /api/v1/oidc/* | GET, POST | mixed | always on |
| /.well-known/openid-configuration | GET | none | always on |
| /.well-known/jwks.json | GET | none | always on |
| /.well-known/did.json | GET | none | always on |
| /health | GET | none | always on |
| /metrics | GET | none | always on |
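
The feature-flag column can be read as a simple lookup from route to environment variable. The sketch below shows one way to express that; the isRouteEnabled function and the prefix-matching approach are assumptions made for illustration, and this document does not say how the application responds when a flagged route is disabled.

```typescript
type Env = { [name: string]: string | undefined };

// Routes that the table marks as gated, with the flag that enables them.
// Matching by path prefix is an assumption of this sketch.
const FLAGGED_ROUTES: Array<{ prefix: string; flag: string }> = [
  { prefix: "/api/v1/compliance/report", flag: "COMPLIANCE_ENABLED" },
  { prefix: "/api/v1/compliance/agent-cards", flag: "COMPLIANCE_ENABLED" },
  { prefix: "/api/v1/analytics/", flag: "ANALYTICS_ENABLED" },
];

// Returns true for "always on" routes, and for gated routes only when
// the corresponding variable is exactly the string "true".
function isRouteEnabled(path: string, env: Env): boolean {
  for (const rule of FLAGGED_ROUTES) {
    if (path.indexOf(rule.prefix) === 0) {
      return env[rule.flag] === "true";
    }
  }
  return true;
}
```

In a Node.js process the second argument would typically be process.env.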

Graceful Shutdown

The server listens for SIGTERM and SIGINT. On receipt:

  1. server.close() is called — stops accepting new connections
  2. In-flight requests complete
  3. process.exit(0) is called

The PostgreSQL pool and Redis client are not explicitly closed in the current shutdown path. This is safe for single-instance deployments; connection cleanup is handled by the OS.
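
For deployments where explicit cleanup is wanted, the shutdown sequence could be extended along the lines of the sketch below. This is not the current shutdown path: createShutdownHandler and the cleanup hooks are hypothetical, and the hooks are synchronous for brevity, whereas closing a real pg.Pool or Redis client is asynchronous. The ClosableServer interface mirrors the callback form of Node's server.close().

```typescript
// The part of http.Server this sketch needs.
interface ClosableServer {
  close(callback: (err?: Error) => void): void;
}

// Builds a signal handler that follows the three documented steps and
// runs optional cleanup hooks (an addition of this sketch) before exiting.
function createShutdownHandler(
  server: ClosableServer,
  exit: (code: number) => void,
  cleanup: Array<() => void> = [],
): () => void {
  let shuttingDown = false;

  return (): void => {
    if (shuttingDown) return; // ignore repeated signals
    shuttingDown = true;

    // 1. Stop accepting new connections.
    server.close(() => {
      // 2. This callback runs once in-flight requests have completed.
      for (const step of cleanup) step();
      // 3. Exit the process.
      exit(0);
    });
  };
}

// Wiring in a Node.js process, assuming an http.Server named `server`:
//   const shutdown = createShutdownHandler(server, process.exit);
//   process.on("SIGTERM", shutdown);
//   process.on("SIGINT", shutdown);
```

Injecting the exit function keeps the handler testable, since a test can observe the exit code without terminating the test process.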