feat(openspec): Phase 3 Enterprise — proposal, design, specs, and tasks

Scaffolds the phase-3-enterprise OpenSpec change (proposal only — awaiting CEO
approval before implementation). 6 workstreams, 95 implementation tasks:

WS1: Multi-Tenancy (21 tasks) — org model, RLS, admin API
WS2: W3C DIDs (12 tasks) — DID:WEB, agent DID documents, AGNTCY cards
WS3: OIDC (12 tasks) — oidc-provider, ID tokens, JWKS, discovery
WS4: Federation (11 tasks) — cross-instance trust, JWT assertions
WS5: Webhooks (17 tasks) — subscriptions, Bull queue, HMAC, retry
WS6: SOC2 (22 tasks) — encryption at rest, Merkle audit chain, controls

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
SentryAgent.ai Developer
2026-03-29 12:53:31 +00:00
parent d42c653eea
commit cb7d079ef6
10 changed files with 2922 additions and 0 deletions

@@ -0,0 +1,165 @@
# Phase 3: Enterprise — Change Proposal
**Date**: 2026-03-29
**Author**: Virtual Architect
**Status**: Proposed — awaiting CEO approval
---
## Summary
Phase 1 delivered a complete, working AgentIdP MVP. Phase 2 made it production-ready: Vault-backed secrets, multi-language SDKs, OPA policy engine, React dashboard, Prometheus/Grafana observability, and multi-region Terraform deployment. Phase 3 makes AgentIdP enterprise-grade: the platform moves from a single-tenant developer tool to a multi-tenant enterprise identity platform with W3C DID support, OIDC compliance, AGNTCY federation, real-time event streaming, and SOC 2 Type II controls.
---
## Problem Statement
Phase 1 and Phase 2 are functional and production-ready but have the following enterprise gaps:
| Gap | Risk |
|-----|------|
| Single-tenant architecture | Cannot serve enterprise customers with isolated data requirements |
| No W3C DID support | Not fully AGNTCY-compliant; agents lack interoperable decentralized identifiers |
| OAuth 2.0 only, no OIDC | Cannot integrate with standard enterprise identity ecosystems (SSO, SCIM) |
| No cross-instance federation | Multi-organization agent identity cannot be verified across AgentIdP deployments |
| No webhook/event streaming | Operators cannot react to agent lifecycle events in real time |
| No SOC 2 controls | Cannot pass enterprise security reviews; blocks revenue from regulated industries |
---
## Proposed Changes
### 1. Multi-Tenancy
Introduce an Organization model so a single AgentIdP instance can serve multiple isolated organizations. Each organization has its own namespace of agents, credentials, audit log, and rate limits. A new Admin API provides organization lifecycle management. All existing agent, credential, and audit endpoints become organization-scoped.
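The isolation requirement can be illustrated with a minimal TypeScript sketch. It assumes a hypothetical `OrgContext` derived from the caller's credentials and an in-memory agent list standing in for the database; the names `OrgContext`, `listAgents`, `getAgent`, and `AgentNotFoundError` are illustrative and not part of the proposal.

```typescript
// Hypothetical shapes; the real models live in the proposal's specs.
interface Agent {
  id: string;
  organizationId: string;
  name: string;
}

// Assumed to be populated by the auth middleware from the caller's token.
interface OrgContext {
  organizationId: string;
}

export class AgentNotFoundError extends Error {
  constructor(id: string) {
    super(`agent not found: ${id}`);
    this.name = "AgentNotFoundError";
  }
}

// Every read takes the org context, so an unscoped query cannot be written by accident.
export function listAgents(ctx: OrgContext, all: readonly Agent[]): Agent[] {
  return all.filter((agent) => agent.organizationId === ctx.organizationId);
}

export function getAgent(ctx: OrgContext, all: readonly Agent[], id: string): Agent {
  const agent = all.find((candidate) => candidate.id === id);
  // Agents in other organizations are reported as "not found" so that
  // their existence is not revealed to the caller.
  if (!agent || agent.organizationId !== ctx.organizationId) {
    throw new AgentNotFoundError(id);
  }
  return agent;
}
```

In a PostgreSQL-backed service the same invariant would more likely be enforced in the query or by row-level security; the sketch only shows the contract that each endpoint is expected to honor.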
### 2. W3C Decentralized Identifiers (DIDs)
Issue a W3C `did:web` identifier for every registered agent. Serve DID Documents at `/.well-known/did.json` (instance root) and `/agents/:id/did` (per-agent). Expose a DID resolution endpoint. Produce AGNTCY-format agent cards from DID Documents.
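As a rough illustration, the `did:web` method maps an identifier to an HTTPS URL: a bare domain resolves to `/.well-known/did.json`, and colon-separated path segments resolve to `<path>/did.json`. The sketch below implements that mapping. The `agents:<id>` path layout and the helper names `agentDid` and `didWebToUrl` are assumptions made for illustration; the proposal itself only names the `/agents/:id/did` route.

```typescript
// Build a did:web identifier for an agent.
// Assumption: agents live under an "agents" path segment on the instance host.
export function agentDid(host: string, agentId: string): string {
  // A port colon in the host must be percent-encoded as %3A.
  return `did:web:${encodeURIComponent(host)}:agents:${encodeURIComponent(agentId)}`;
}

// Map a did:web identifier to the URL where its DID Document is expected.
export function didWebToUrl(did: string): string {
  const prefix = "did:web:";
  if (!did.startsWith(prefix)) {
    throw new Error(`not a did:web identifier: ${did}`);
  }
  const [host, ...path] = did.slice(prefix.length).split(":");
  if (!host) {
    throw new Error("did:web identifier is missing a host");
  }
  const authority = decodeURIComponent(host);
  const segments = path.map((segment) => decodeURIComponent(segment));
  return segments.length === 0
    ? `https://${authority}/.well-known/did.json`
    : `https://${authority}/${segments.join("/")}/did.json`;
}
```

Under this mapping a generic resolver would look for a per-agent document at a path ending in `/did.json`, so the `/agents/:id/did` route may need an alias at that path if agents are meant to be resolvable by standard `did:web` tooling.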
### 3. AGNTCY Federation
Enable cross-instance agent identity federation using signed JWT assertions. Operators register trusted federation partners. Tokens issued by a trusted remote AgentIdP instance can be verified locally, enabling multi-organization and cross-enterprise agent identity interoperability aligned with AGNTCY standards.
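The verification step might look roughly like the following sketch. It hand-rolls a compact signed assertion with Node's built-in `crypto` module so that it runs without dependencies; the actual implementation is expected to use the existing `jsonwebtoken` package. The EdDSA algorithm choice, the claim set, the issuer URL, and the function names are all assumptions.

```typescript
import { generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

// Assumed minimal claim set; the real assertion format is defined in the specs.
interface Claims {
  iss: string; // issuing AgentIdP instance
  sub: string; // agent identifier
  exp: number; // expiry, seconds since epoch
}

const encode = (value: object): string =>
  Buffer.from(JSON.stringify(value)).toString("base64url");

export function issueAssertion(claims: Claims, privateKey: KeyObject): string {
  const signingInput = `${encode({ alg: "EdDSA", typ: "JWT" })}.${encode(claims)}`;
  const signature = sign(null, Buffer.from(signingInput), privateKey).toString("base64url");
  return `${signingInput}.${signature}`;
}

// `trusted` plays the role of the trust registry: issuer URL -> partner public key.
export function verifyAssertion(
  token: string,
  trusted: ReadonlyMap<string, KeyObject>,
  nowSec: number,
): Claims {
  const parts = token.split(".");
  if (parts.length !== 3) throw new Error("malformed assertion");
  const [header, payload, signature] = parts as [string, string, string];

  // Pin the algorithm instead of trusting whatever the header asks for.
  const { alg } = JSON.parse(Buffer.from(header, "base64url").toString("utf8")) as { alg?: string };
  if (alg !== "EdDSA") throw new Error("unsupported algorithm");

  const claims = JSON.parse(Buffer.from(payload, "base64url").toString("utf8")) as Claims;
  const key = trusted.get(claims.iss);
  if (!key) throw new Error(`untrusted issuer: ${claims.iss}`);

  const valid = verify(null, Buffer.from(`${header}.${payload}`), key, Buffer.from(signature, "base64url"));
  if (!valid) throw new Error("invalid signature");
  if (claims.exp <= nowSec) throw new Error("assertion expired");
  return claims;
}

// Usage: a partner key pair generated locally, for illustration only.
const partner = generateKeyPairSync("ed25519");
const registry = new Map<string, KeyObject>([["https://idp.partner.example", partner.publicKey]]);
const assertion = issueAssertion(
  { iss: "https://idp.partner.example", sub: "agent-42", exp: 2_000 },
  partner.privateKey,
);
console.log(verifyAssertion(assertion, registry, 1_000).sub);
```

The issuer is looked up in the registry before the signature is checked, so an assertion from an unregistered instance is rejected without any key material being tried against it.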
### 4. OpenID Connect (OIDC)
Add a full OIDC layer on top of the existing OAuth 2.0 implementation using the `oidc-provider` certified library. Exposes OIDC Discovery, JWKS, ID tokens with agent claims, and an `/agent-info` endpoint (the agent-identity equivalent of the OIDC `/userinfo` endpoint).
### 5. Webhooks and Event Streaming
Real-time event notifications for all agent lifecycle events: agent created, suspended, revoked, credential rotated, token issued. Operators create webhook subscriptions with HMAC-SHA256 signing. Delivery is async via a Redis-backed queue with exponential backoff retry. An optional Kafka/NATS adapter is available for high-throughput environments.
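A minimal sketch of the signing and retry behavior follows, using only Node's built-in `crypto` module. Signing over `timestamp.body`, the hex encoding, and the backoff base and cap are assumptions rather than settled design; the queue itself (Redis-backed in the proposal) is not shown.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Sign the timestamp together with the body so a captured delivery
// cannot be replayed later with a fresh timestamp.
export function signPayload(secret: string, timestamp: number, body: string): string {
  return createHmac("sha256", secret).update(`${timestamp}.${body}`).digest("hex");
}

// Receiver-side check. The comparison is constant-time to avoid leaking
// how many leading bytes of a guessed signature were correct.
export function verifySignature(
  secret: string,
  timestamp: number,
  body: string,
  signature: string,
): boolean {
  const expected = Buffer.from(signPayload(secret, timestamp, body), "hex");
  const received = Buffer.from(signature, "hex");
  return expected.length === received.length && timingSafeEqual(expected, received);
}

// Exponential backoff: the delay doubles per attempt (1-based) up to a cap.
// The 1 s base and 60 s cap are illustrative defaults.
export function retryDelayMs(attempt: number, baseMs = 1_000, capMs = 60_000): number {
  return Math.min(capMs, baseMs * 2 ** (attempt - 1));
}
```

A receiver would typically also reject deliveries whose timestamp is too old, which bounds the replay window even if a signed request is captured in transit.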
### 6. SOC 2 Type II Preparation
Implement the technical controls required for SOC 2 Type II audit: encryption at rest via PostgreSQL column-level encryption for secrets, TLS enforcement on all inbound connections, automated secrets rotation, security event alerting via Prometheus alerting rules, and audit log immutability proof using a Merkle hash chain appended to each `audit_logs` row.
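The audit immutability idea can be sketched as a linear hash chain, where each row's hash covers the previous row's hash. This is a simplification: the proposal's Merkle construction may differ, and the row shape, genesis value, and function names below are assumptions. An in-memory array stands in for the `audit_logs` table.

```typescript
import { createHash } from "node:crypto";

// Hypothetical row shape; the real audit_logs columns are defined in the specs.
interface AuditRow {
  id: number;
  event: string;
  prevHash: string;
  hash: string;
}

// Assumed sentinel used as the "previous hash" of the first row.
const GENESIS = "0".repeat(64);

function rowHash(prevHash: string, id: number, event: string): string {
  // JSON-encode the fields so that field boundaries are unambiguous.
  return createHash("sha256").update(JSON.stringify([prevHash, id, event])).digest("hex");
}

// Append a row whose hash commits to the entire history before it.
export function appendRow(chain: readonly AuditRow[], event: string): AuditRow[] {
  const prev = chain[chain.length - 1];
  const prevHash = prev ? prev.hash : GENESIS;
  const id = chain.length + 1;
  return [...chain, { id, event, prevHash, hash: rowHash(prevHash, id, event) }];
}

// Recompute every hash from the start; any edited, removed, or reordered
// row breaks the chain at that point.
export function verifyChain(chain: readonly AuditRow[]): boolean {
  let prevHash = GENESIS;
  for (const row of chain) {
    if (row.prevHash !== prevHash || row.hash !== rowHash(prevHash, row.id, row.event)) {
      return false;
    }
    prevHash = row.hash;
  }
  return true;
}
```

Verification only proves internal consistency. To detect wholesale replacement or truncation of the log, the latest hash would also need to be anchored somewhere the database writer cannot modify.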
---
## Out of Scope for Phase 3
- Rust/C++ SDKs (Phase 4)
- Azure Terraform module (Phase 4)
- SCIM provisioning (Phase 4)
- End-user (human operator) identity management (out of product scope — AgentIdP is agent-first)
---
## Capabilities Table
### New Capabilities
| Workstream | Capability | Type |
|-----------|-----------|------|
| Multi-Tenancy | Organization model with isolated agent namespaces | New |
| Multi-Tenancy | Admin API: create, list, update, delete organizations | New |
| Multi-Tenancy | Per-organization rate limits and audit logs | New |
| Multi-Tenancy | Organization member management | New |
| W3C DIDs | `did:web` identifier on every registered agent | New |
| W3C DIDs | DID Document endpoint per agent | New |
| W3C DIDs | Instance-level root DID Document | New |
| W3C DIDs | DID resolution endpoint | New |
| W3C DIDs | AGNTCY-format agent card from DID Document | New |
| OIDC | OIDC Discovery endpoint (`/.well-known/openid-configuration`) | New |
| OIDC | JWKS endpoint (`/.well-known/jwks.json`) | New |
| OIDC | ID token with agent claims in token response | Modified |
| OIDC | `/agent-info` endpoint (agent claims) | New |
| Federation | Trust registry: register and list federation partners | New |
| Federation | Cross-instance token verification endpoint | New |
| Federation | Signed JWT assertion inter-IdP protocol | New |
| Webhooks | Webhook subscription management (CRUD) | New |
| Webhooks | HMAC-SHA256 signed delivery with retry | New |
| Webhooks | Delivery history log | New |
| Webhooks | Kafka/NATS adapter (optional) | New |
| SOC 2 | PostgreSQL column-level encryption for secrets at rest | New |
| SOC 2 | TLS enforcement middleware (reject non-TLS) | New |
| SOC 2 | Automated secrets rotation schedule | New |
| SOC 2 | Security event alerting (Prometheus alerting rules) | New |
| SOC 2 | Merkle hash chain on `audit_logs` for immutability proof | New |
| SOC 2 | Compliance documentation (controls matrix, runbook) | New |
### Modified Capabilities
| Workstream | Capability | Change |
|-----------|-----------|--------|
| Multi-Tenancy | `POST /agents` | Now scoped to `organizationId` |
| Multi-Tenancy | `GET /agents` | Results restricted to the caller's organization |
| Multi-Tenancy | `GET /audit` | Restricted to caller's organization by default |
| Multi-Tenancy | Rate limiting | Per-organization limits in addition to global |
| OIDC | `POST /oauth2/token` | Returns `id_token` in addition to `access_token` |
| SOC 2 | Audit log write path | Computes and appends Merkle hash on insert |
---
## Repository Impact
| Area | Impact |
|------|--------|
| `src/` | New services: OrgService, DIDService, OIDCService, FederationService, WebhookService, SOC2Controls |
| `src/db/migrations/` | 8 to 10 new migration files |
| `src/types/index.ts` | ~80 new interfaces/types |
| `src/middleware/` | New TLS enforcement middleware, updated auth middleware for org context |
| `src/routes/` | 6 new route files |
| `/.well-known/` | 3 new well-known endpoints |
| `policies/` | Updated Rego policies for org-scoped permissions |
| `dashboard/` | New Organization management pages |
| `monitoring/` | New alerting rules for SOC 2 security events |
| `docs/` | Compliance documentation, federation setup guide, webhook integration guide |
---
## New Dependencies
| Workstream | Package | Purpose | CEO Approval Required |
|-----------|---------|---------|----------------------|
| Multi-Tenancy | None | Row-level tenancy in existing PostgreSQL | No |
| W3C DIDs | `did-resolver` | W3C DID resolution | Yes |
| W3C DIDs | `web-did-resolver` | `did:web` method resolver | Yes |
| OIDC | `oidc-provider` | Certified OIDC server library | Yes |
| Federation | None (uses existing `jsonwebtoken`) | Signed JWT assertions | No |
| Webhooks | `bull` (Redis-backed queue) | Async webhook delivery queue | Yes |
| Webhooks | `kafkajs` (optional, Kafka adapter) | Kafka event streaming | Yes |
| SOC 2 | `node-forge` | Column-level encryption primitives | Yes |
---
## Delivery Sequence
Multi-tenancy is a prerequisite for all enterprise customer work — it must land first. DID support and OIDC are independent and can proceed in parallel. Federation depends on DIDs being in place. Webhooks are standalone. SOC 2 controls cut across the entire codebase and are implemented last to ensure all features they protect are already present.
```
1. Multi-Tenancy (prerequisite — all enterprise features assume org context)
2. W3C DIDs (parallel)
   OIDC (parallel)
3. Federation (depends on DIDs)
4. Webhooks (standalone)
5. SOC 2 (cuts across all workstreams — implemented after all features are stable)
```
---
## Success Criteria
- All new dependencies CEO-approved before implementation begins
- All new API endpoints have OpenAPI 3.0 specs before implementation
- Multi-tenancy isolation verified: no cross-organization data leakage
- DID Documents are W3C DID Core 1.0 compliant and resolve correctly
- OIDC implementation (built on `oidc-provider`) passes the OpenID Foundation conformance test suite
- Federation token verification rejects tampered assertions
- Webhook delivery achieves a >99.9% eventual success rate once retries are counted
- SOC 2 controls pass independent technical review
- TypeScript strict mode + zero `any` maintained throughout
- >80% test coverage on all new services