# Phase 3: Enterprise — Change Proposal
**Date**: 2026-03-29
**Author**: Virtual Architect
**Status**: Proposed — awaiting CEO approval
---
## Summary
Phase 1 delivered a complete, working AgentIdP MVP. Phase 2 made it production-ready: Vault-backed secrets, multi-language SDKs, OPA policy engine, React dashboard, Prometheus/Grafana observability, and multi-region Terraform deployment. Phase 3 makes AgentIdP enterprise-grade: the platform moves from a single-tenant developer tool to a multi-tenant enterprise identity platform with W3C DID support, OIDC compliance, AGNTCY federation, real-time event streaming, and SOC 2 Type II controls.
---
## Problem Statement
Phase 1 and Phase 2 are functional and production-ready but have the following enterprise gaps:
| Gap | Risk |
|-----|------|
| Single-tenant architecture | Cannot serve enterprise customers with isolated data requirements |
| No W3C DID support | Not fully AGNTCY-compliant; agents lack interoperable decentralized identifiers |
| OAuth 2.0 only, no OIDC | Cannot integrate with standard enterprise identity ecosystems (SSO, SCIM) |
| No cross-instance federation | Multi-organization agent identity cannot be verified across AgentIdP deployments |
| No webhook/event streaming | Operators cannot react to agent lifecycle events in real time |
| No SOC 2 controls | Cannot pass enterprise security reviews; blocks revenue from regulated industries |
---
## Proposed Changes
### 1. Multi-Tenancy
Introduce an Organization model so a single AgentIdP instance can serve multiple isolated organizations. Each organization has its own namespace of agents, credentials, audit log, and rate limits. A new Admin API provides organization lifecycle management. All existing agent, credential, and audit endpoints become organization-scoped.
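
The isolation requirement above can be sketched with an in-memory store. This is an illustration only: the `Agent` shape and the `scopedAgents` helper are hypothetical names, not part of the proposal, and the real design relies on row-level tenancy in PostgreSQL rather than application-side filtering.

```typescript
// Hypothetical shape; the proposal does not define the Agent record.
interface Agent {
  id: string;
  organizationId: string;
  name: string;
}

// Returns a view of the store bound to one organization. Callers never
// pass an organizationId per call, so they cannot ask for another tenant's data.
function scopedAgents(store: Agent[], organizationId: string) {
  return {
    create(id: string, name: string): Agent {
      const agent: Agent = { id, organizationId, name };
      store.push(agent);
      return agent;
    },
    list(): Agent[] {
      return store.filter((a) => a.organizationId === organizationId);
    },
    get(id: string): Agent | undefined {
      // The organization check is part of the lookup, so an ID that belongs
      // to another organization behaves exactly like a missing ID.
      return store.find(
        (a) => a.id === id && a.organizationId === organizationId,
      );
    },
  };
}

// Two organizations sharing one underlying store.
const sharedStore: Agent[] = [];
const acme = scopedAgents(sharedStore, "org-acme");
const globex = scopedAgents(sharedStore, "org-globex");
acme.create("a1", "billing-bot");
globex.create("g1", "search-bot");
```

Binding the organization once, at the point where the caller is authenticated, is one way to keep the cross-organization check out of individual handlers.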
### 2. W3C Decentralized Identifiers (DIDs)
Issue a W3C `did:web` identifier for every registered agent. Serve DID Documents at `/.well-known/did.json` (instance root) and `/agents/:id/did` (per-agent). Expose a DID resolution endpoint. Produce AGNTCY-format agent cards from DID Documents.
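
As a rough illustration of how `did:web` identifiers relate to URLs, the sketch below follows the did:web method's mapping: colons separate path segments, a port's colon is percent-encoded, and the document is fetched from `/.well-known/did.json` or `<path>/did.json`. The function names and the `idp.example.com` host are assumptions made for the example. How the `/did.json` suffix lines up with the `/agents/:id/did` route named above is left open here.

```typescript
// Build a did:web identifier from a host (optionally with a port) and path segments.
function toDidWeb(host: string, pathSegments: string[] = []): string {
  const encodedHost = host.replace(":", "%3A"); // a port colon must be percent-encoded
  return ["did", "web", encodedHost, ...pathSegments.map(encodeURIComponent)].join(":");
}

// Map a did:web identifier to the HTTPS URL where its DID Document is expected.
function didWebToUrl(did: string): string {
  const [scheme, method, hostPart, ...segments] = did.split(":");
  if (scheme !== "did" || method !== "web" || !hostPart) {
    throw new Error(`not a did:web identifier: ${did}`);
  }
  const host = decodeURIComponent(hostPart);
  const path =
    segments.length === 0
      ? "/.well-known/did.json" // bare domain: instance-level document
      : `/${segments.map(decodeURIComponent).join("/")}/did.json`;
  return `https://${host}${path}`;
}

// Hypothetical host and agent ID, for illustration only.
const instanceDid = toDidWeb("idp.example.com");
const agentDid = toDidWeb("idp.example.com", ["agents", "agent-123"]);
```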
### 3. AGNTCY Federation
Enable cross-instance agent identity federation using signed JWT assertions. Operators register trusted federation partners. Tokens issued by a trusted remote AgentIdP instance can be verified locally, enabling multi-organization and cross-enterprise agent identity interoperability aligned with AGNTCY standards.
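
A minimal sketch of verifying a partner's assertion against a trust registry follows. It uses Node's built-in `node:crypto` with RS256 so the example is self-contained; the proposal itself names `jsonwebtoken` for this. The claim set, the `Map`-based registry, and all function names are assumptions.

```typescript
import { generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

// Hypothetical claim set; the proposal does not fix the assertion contents.
interface AssertionClaims {
  iss: string; // issuing AgentIdP instance
  sub: string; // agent identifier
  exp: number; // expiry, seconds since epoch
}

const encode = (value: object): string =>
  Buffer.from(JSON.stringify(value), "utf8").toString("base64url");

function decodeSegment(segment: string): Record<string, unknown> | null {
  try {
    const value: unknown = JSON.parse(Buffer.from(segment, "base64url").toString("utf8"));
    return typeof value === "object" && value !== null
      ? (value as Record<string, unknown>)
      : null;
  } catch {
    return null;
  }
}

// Sign a compact JWT (header.payload.signature) with RS256.
function issueAssertion(claims: AssertionClaims, privateKey: KeyObject): string {
  const signingInput = `${encode({ alg: "RS256", typ: "JWT" })}.${encode(claims)}`;
  const signature = sign("sha256", Buffer.from(signingInput), privateKey);
  return `${signingInput}.${signature.toString("base64url")}`;
}

// Returns the claims only if the issuer is registered, the signature checks
// out against that issuer's key, and the assertion has not expired.
function verifyAssertion(
  token: string,
  trustedPartners: Map<string, KeyObject>,
  nowSeconds: number,
): AssertionClaims | null {
  const parts = token.split(".");
  if (parts.length !== 3) return null;
  const [h, p, s] = parts as [string, string, string];
  const header = decodeSegment(h);
  const claims = decodeSegment(p);
  if (!header || !claims || header["alg"] !== "RS256") return null;
  const { iss, sub, exp } = claims;
  if (typeof iss !== "string" || typeof sub !== "string" || typeof exp !== "number") return null;
  const key = trustedPartners.get(iss);
  if (!key) return null; // unknown issuer: not a registered partner
  const valid = verify("sha256", Buffer.from(`${h}.${p}`), key, Buffer.from(s, "base64url"));
  if (!valid || exp <= nowSeconds) return null;
  return { iss, sub, exp };
}

// Usage with a hypothetical partner instance.
const partner = generateKeyPairSync("rsa", { modulusLength: 2048 });
const partnerIssuer = "https://idp.partner.example";
const trustedPartners = new Map<string, KeyObject>([[partnerIssuer, partner.publicKey]]);
const assertion = issueAssertion(
  { iss: partnerIssuer, sub: "agent-123", exp: 2_000_000_000 },
  partner.privateKey,
);
```

Pinning the accepted algorithm and looking the key up by issuer, rather than trusting anything carried in the token header, is the design choice that makes tampered or foreign assertions fail closed.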
### 4. OpenID Connect (OIDC)
Add a full OIDC layer on top of the existing OAuth 2.0 implementation using the certified `oidc-provider` library. The layer exposes OIDC Discovery, JWKS, ID tokens with agent claims, and an `/agent-info` endpoint (the agent-identity equivalent of the OIDC `/userinfo` endpoint).
### 5. Webhooks and Event Streaming
Deliver real-time notifications for agent lifecycle events: agent created, suspended, revoked, credential rotated, and token issued. Operators create webhook subscriptions, and each delivery is signed with HMAC-SHA256. Delivery is asynchronous via a Redis-backed queue and is retried with exponential backoff. An optional Kafka/NATS adapter is available for high-throughput environments.
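
The signing and retry behaviour can be sketched as follows. The hex signature encoding, the 1-second base delay, and the 60-second cap are assumptions chosen for the example; the proposal does not specify them, and the function names are illustrative.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Compute the HMAC-SHA256 signature a sender would attach to a delivery.
function signPayload(secret: string, body: string): string {
  return createHmac("sha256", secret).update(body).digest("hex");
}

// Receiver-side check: recompute the signature and compare in constant time.
function verifySignature(secret: string, body: string, received: string): boolean {
  const expected = Buffer.from(signPayload(secret, body), "hex");
  const actual = Buffer.from(received, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return expected.length === actual.length && timingSafeEqual(expected, actual);
}

// Delay before retry number `attempt` (1-based): base, 2x base, 4x base, ... up to a cap.
function backoffDelayMs(attempt: number, baseMs = 1000, maxMs = 60_000): number {
  return Math.min(maxMs, baseMs * 2 ** (attempt - 1));
}

// Hypothetical event body and subscription secret.
const webhookSecret = "whsec_example";
const eventBody = JSON.stringify({ type: "agent.suspended", agentId: "agent-123" });
const eventSignature = signPayload(webhookSecret, eventBody);
```

Signing the exact bytes that go on the wire matters: a receiver that re-serializes the JSON before verifying may compute a different digest.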
### 6. SOC 2 Type II Preparation
Implement the technical controls required for SOC 2 Type II audit: encryption at rest via PostgreSQL column-level encryption for secrets, TLS enforcement on all inbound connections, automated secrets rotation, security event alerting via Prometheus alerting rules, and audit log immutability proof using a Merkle hash chain appended to each `audit_logs` row.
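
To make the audit-log immutability idea concrete, here is a minimal hash-chain sketch in which each row's hash covers the previous row's hash. The row shape, the all-zero genesis value, and the JSON serialization of the hashed fields are assumptions; the actual `audit_logs` schema is not given in this proposal.

```typescript
import { createHash } from "node:crypto";

// Hypothetical row shape for illustration.
interface AuditRow {
  id: number;
  event: string;
  prevHash: string;
  hash: string;
}

const GENESIS_HASH = "0".repeat(64);

// Hash the previous hash together with this row's content. JSON array
// serialization keeps field boundaries unambiguous.
function rowHash(prevHash: string, id: number, event: string): string {
  return createHash("sha256").update(JSON.stringify([prevHash, id, event])).digest("hex");
}

// Write path: compute the chained hash at insert time.
function appendRow(log: AuditRow[], event: string): AuditRow {
  const last = log[log.length - 1];
  const prevHash = last ? last.hash : GENESIS_HASH;
  const id = last ? last.id + 1 : 1;
  const row: AuditRow = { id, event, prevHash, hash: rowHash(prevHash, id, event) };
  log.push(row);
  return row;
}

// Verification: walk the chain and return the id of the first row whose
// stored hashes do not match a recomputation, or null if the chain is intact.
function firstBrokenRow(log: AuditRow[]): number | null {
  let prev = GENESIS_HASH;
  for (const row of log) {
    if (row.prevHash !== prev || row.hash !== rowHash(prev, row.id, row.event)) {
      return row.id;
    }
    prev = row.hash;
  }
  return null;
}

const auditLog: AuditRow[] = [];
appendRow(auditLog, "agent.created");
appendRow(auditLog, "credential.rotated");
appendRow(auditLog, "agent.revoked");
```

Because every hash depends on its predecessor, editing or deleting a historical row invalidates every later row unless the whole tail is recomputed, which is what an external checkpoint of the latest hash would detect.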
---
## Out of Scope for Phase 3
- Rust/C++ SDKs (Phase 4)
- Azure Terraform module (Phase 4)
- SCIM provisioning (Phase 4)
- End-user (human operator) identity management (out of product scope — AgentIdP is agent-first)
---
## Capabilities Table
### New Capabilities
| Workstream | Capability | Type |
|-----------|-----------|------|
| Multi-Tenancy | Organization model with isolated agent namespaces | New |
| Multi-Tenancy | Admin API: create, list, update, delete organizations | New |
| Multi-Tenancy | Per-organization rate limits and audit logs | New |
| Multi-Tenancy | Organization member management | New |
| W3C DIDs | `did:web` identifier on every registered agent | New |
| W3C DIDs | DID Document endpoint per agent | New |
| W3C DIDs | Instance-level root DID Document | New |
| W3C DIDs | DID resolution endpoint | New |
| W3C DIDs | AGNTCY-format agent card from DID Document | New |
| OIDC | OIDC Discovery endpoint (`/.well-known/openid-configuration`) | New |
| OIDC | JWKS endpoint (`/.well-known/jwks.json`) | New |
| OIDC | ID token with agent claims in token response | Modified |
| OIDC | `/agent-info` endpoint (agent claims) | New |
| Federation | Trust registry: register and list federation partners | New |
| Federation | Cross-instance token verification endpoint | New |
| Federation | Signed JWT assertion inter-IdP protocol | New |
| Webhooks | Webhook subscription management (CRUD) | New |
| Webhooks | HMAC-SHA256 signed delivery with retry | New |
| Webhooks | Delivery history log | New |
| Webhooks | Kafka/NATS adapter (optional) | New |
| SOC 2 | PostgreSQL column-level encryption for secrets at rest | New |
| SOC 2 | TLS enforcement middleware (reject non-TLS) | New |
| SOC 2 | Automated secrets rotation schedule | New |
| SOC 2 | Security event alerting (Prometheus alerting rules) | New |
| SOC 2 | Merkle hash chain on `audit_logs` for immutability proof | New |
| SOC 2 | Compliance documentation (controls matrix, runbook) | New |
### Modified Capabilities
| Workstream | Capability | Change |
|-----------|-----------|--------|
| Multi-Tenancy | `POST /agents` | Now scoped to `organizationId` |
| Multi-Tenancy | `GET /agents` | Filters restricted to caller's organization |
| Multi-Tenancy | `GET /audit` | Restricted to caller's organization by default |
| Multi-Tenancy | Rate limiting | Per-organization limits in addition to global |
| OIDC | `POST /oauth2/token` | Returns `id_token` in addition to `access_token` |
| SOC 2 | Audit log write path | Computes and appends Merkle hash on insert |
---
## Repository Impact
| Area | Impact |
|------|--------|
| `src/` | New services: OrgService, DIDService, OIDCService, FederationService, WebhookService, SOC2Controls |
| `src/db/migrations/` | 8-10 new migration files |
| `src/types/index.ts` | ~80 new interfaces/types |
| `src/middleware/` | New TLS enforcement middleware, updated auth middleware for org context |
| `src/routes/` | 6 new route files |
| `/.well-known/` | 3 new well-known endpoints |
| `policies/` | Updated Rego policies for org-scoped permissions |
| `dashboard/` | New Organization management pages |
| `monitoring/` | New alerting rules for SOC 2 security events |
| `docs/` | Compliance documentation, federation setup guide, webhook integration guide |
---
## New Dependencies
| Workstream | Package | Purpose | CEO Approval Required |
|-----------|---------|---------|----------------------|
| Multi-Tenancy | None | Row-level tenancy in existing PostgreSQL | No |
| W3C DIDs | `did-resolver` | W3C DID resolution | Yes |
| W3C DIDs | `web-did-resolver` | DID:WEB method resolver | Yes |
| OIDC | `oidc-provider` | Certified OIDC server library | Yes |
| Federation | None | Signed JWT assertions use existing `jsonwebtoken` | No |
| Webhooks | `bull` (Redis-backed queue) | Async webhook delivery queue | Yes |
| Webhooks | `kafkajs` (optional, Kafka adapter) | Kafka event streaming | Yes |
| SOC 2 | `node-forge` | Column-level encryption primitives | Yes |
---
## Delivery Sequence
Multi-tenancy is a prerequisite for all enterprise customer work — it must land first. DID support and OIDC are independent and can proceed in parallel. Federation depends on DIDs being in place. Webhooks are standalone. SOC 2 controls cut across the entire codebase and are implemented last to ensure all features they protect are already present.
```
1. Multi-Tenancy (prerequisite — all enterprise features assume org context)
2. W3C DIDs (parallel)
   OIDC (parallel)
3. Federation (depends on DIDs)
4. Webhooks (standalone)
5. SOC 2 (cuts across all workstreams — implemented after all features are stable)
```
---
## Success Criteria
- All new dependencies CEO-approved before implementation begins
- All new API endpoints have OpenAPI 3.0 specs before implementation
- Multi-tenancy isolation verified: no cross-organization data leakage
- DID Documents are W3C DID Core 1.0 compliant and resolve correctly
- OIDC Discovery on the `oidc-provider`-based server passes the OpenID Foundation conformance test suite
- Federation token verification rejects tampered assertions
- Webhook delivery achieves >99.9% success rate with retry logic
- SOC 2 controls pass independent technical review
- TypeScript strict mode + zero `any` maintained throughout
- >80% test coverage on all new services