chore(openspec): archive all completed changes, sync 14 new specs to library
Archived 4 completed OpenSpec changes (2026-04-02):
- phase-3-enterprise (100/100 tasks) — 6 Phase 3 capabilities synced
- devops-documentation (48/48 tasks) — 3 new + 1 merged capability
- bedroom-developer-docs (33/33 tasks) — 4 new capabilities synced
- engineering-docs (superseded by 2026-03-29 archive) — no tasks

Main spec library grows from 21 → 35 capabilities (+14 new): federation, multi-tenancy, oidc, soc2, w3c-dids, webhooks, database, operations, system-overview, api-reference, core-concepts, developer-guides, quick-start + deployment (merged additive requirements)

Active changes: 0 — project board is clear for Phase 4 planning.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
105
openspec/changes/archive/2026-04-02-engineering-docs/design.md
Normal file
@@ -0,0 +1,105 @@
## Context

SentryAgent.ai has completed Phase 1 (MVP) and Phase 2 (Production-Ready), producing a fully implemented AgentIdP with 12 capabilities across ~150 source files, 4 language SDKs, Terraform infrastructure, and a React web dashboard. The codebase is mature but undocumented at the engineering level — there are bedroom developer guides (`docs/developers/`) and DevOps guides (`docs/devops/`), but no structured internal engineering knowledge base.

New hires arrive with a BSc in Computer Science and one year of industrial experience. They understand programming fundamentals and have worked on codebases before, but they have no context on: what SentryAgent.ai is building, why architectural decisions were made, how the codebase is structured, how to navigate the services, how to contribute per our standards, or how the OpenSpec workflow operates. Without documentation, onboarding is fragmented and relies entirely on the CTO's time.

The goal is a `docs/engineering/` directory that a new engineer can read sequentially from top to bottom and arrive ready to contribute within their first week.

## Goals / Non-Goals

**Goals:**
- Produce a complete top-down engineering knowledge base readable in sequence
- Cover all 10 capability areas identified in the proposal
- Calibrate depth for BSc + 1yr experience — assume programming competence, explain domain and architectural decisions
- Every document is self-contained with internal cross-links where needed
- All code examples are complete and runnable (no ellipses, no `// ... rest of code`)
- Development environment setup is achievable in under 30 minutes following the guide alone
- Annotated walkthroughs trace the three critical flows through every layer of code with file:line references

**Non-Goals:**
- Not a replacement for `docs/developers/` (end-user API reference) or `docs/devops/` (operator runbooks)
- Not a tutorial for learning TypeScript, React, or Terraform — assumes language competence
- Not a complete API reference — `docs/developers/api-reference.md` already covers that
- Not roadmap documentation — focuses on what is built, not what is planned

## Decisions

### D1: Location — `docs/engineering/` as a flat directory with an index

**Decision**: All engineering docs live in `docs/engineering/` as flat markdown files with a `README.md` index.

**Rationale**: Deep nested directory structures create navigation friction. Flat layout with numbered filenames (`01-overview.md`, `02-architecture.md`) ensures reading order is obvious without needing a build tool. Gitea renders markdown natively, so no documentation site tooling is required.

**Alternatives considered**:
- `docs/engineering/<subdirs>/` — rejected: adds navigation complexity with no benefit at our current document count
- Docusaurus site — rejected: adds build infrastructure overhead; plain markdown in-repo is sufficient and always in sync with code

---

### D2: Numbered file naming for enforced reading order

**Decision**: Files are named `01-overview.md` through `11-sdk-guide.md`, matching the filenames used in the requirement specs.

**Rationale**: New engineers need a guided path, not a reference library. Numbers make the intended reading sequence unambiguous without any tooling. The `README.md` index maps numbers to sections.

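As an illustration of the naming convention in D1 and D2, here is a minimal TypeScript sketch that parses two-digit numbered filenames and returns them in reading order. The helper names (`parseNumberedDoc`, `readingOrder`) and the lowercase kebab-case slug rule are assumptions made for this example; nothing like this is required by the design, since the point of the convention is that no tooling is needed.

```typescript
// Hypothetical helpers for illustration only; not part of the SentryAgent.ai codebase.
// Assumes filenames look like "NN-some-slug.md" with a lowercase kebab-case slug.
const NUMBERED_DOC = /^(\d{2})-([a-z0-9]+(?:-[a-z0-9]+)*)\.md$/;

interface NumberedDoc {
  order: number;
  slug: string;
  filename: string;
}

/** Parses "02-architecture.md" into its order and slug, or returns null. */
function parseNumberedDoc(filename: string): NumberedDoc | null {
  const match = NUMBERED_DOC.exec(filename);
  if (match === null) return null;
  return { order: Number(match[1]), slug: match[2] ?? "", filename };
}

/** Returns numbered docs sorted by prefix, skipping README.md and other files. */
function readingOrder(filenames: string[]): string[] {
  return filenames
    .map((name) => parseNumberedDoc(name))
    .filter((doc): doc is NumberedDoc => doc !== null)
    .sort((a, b) => a.order - b.order)
    .map((doc) => doc.filename);
}

console.log(readingOrder(["10-deployment.md", "README.md", "02-architecture.md", "01-overview.md"]));
```

A plain alphabetical directory listing gives the same order as long as every prefix has two digits, which is why the zero padding matters.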
---

### D3: Annotated walkthroughs use file:line references

**Decision**: Code walkthrough documents reference actual source files with line numbers (e.g., `src/controllers/agentController.ts:45`).

**Rationale**: Engineers with 1yr experience learn fastest by reading real code, not simplified pseudocode. File:line references let them jump directly to the relevant section in their editor or on Gitea.

**Trade-off**: Line numbers drift as code changes. Mitigation: walkthrough documents include a "last verified" version comment and note which commit they were verified against. The CTO adds walkthrough review to the Phase 3 change process as a maintenance item.

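The drift risk in D3 can be partly checked mechanically. The sketch below, in TypeScript, parses `file:line` references and flags the ones that point at a missing file or past the end of a file. The function names and the `lineCounts` map (file path to line count at the commit being checked) are hypothetical inputs invented for this example. It cannot detect a reference that still points at a valid line holding different code, so it would supplement the manual review, not replace it.

```typescript
// Hypothetical drift check for illustration; names are not from the codebase.
interface SourceRef {
  file: string;
  line: number;
}

const SOURCE_REF = /^(.+):(\d+)$/;

/** Parses "src/controllers/agentController.ts:45" into file and line. */
function parseSourceRef(ref: string): SourceRef | null {
  const match = SOURCE_REF.exec(ref.trim());
  if (match === null) return null;
  const line = Number(match[2]);
  if (line < 1) return null;
  return { file: match[1] ?? "", line };
}

/**
 * Returns references that are malformed, point at an unknown file,
 * or point past the last line of the file.
 */
function findStaleRefs(refs: string[], lineCounts: Map<string, number>): string[] {
  return refs.filter((ref) => {
    const parsed = parseSourceRef(ref);
    if (parsed === null) return true;
    const total = lineCounts.get(parsed.file);
    return total === undefined || parsed.line > total;
  });
}
```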
---

### D4: Three walkthroughs selected by criticality and complexity

**Decision**: Walkthroughs cover: (1) OAuth 2.0 token issuance, (2) agent registration, (3) credential rotation.

**Rationale**:
- Token issuance is the highest-traffic path and touches the most layers (controller → service → repository → Redis → JWT signing)
- Agent registration is the entry point for all users and demonstrates the full validation + persistence + audit pattern
- Credential rotation demonstrates the Vault integration path and shows how Phase 2 extended Phase 1 patterns

These three flows collectively exercise every architectural layer and every major design pattern in the codebase.

---

### D5: Service deep-dives use a consistent template

**Decision**: Each service deep-dive follows the structure: Purpose → Responsibility boundary → Interface → Key methods → Database schema (if applicable) → Error types → Configuration.

**Rationale**: Consistency reduces cognitive load. An engineer who has read the AgentService deep-dive knows exactly where to look for the same information in the OAuth2Service deep-dive. The template mirrors SOLID's Single Responsibility — each section answers one question.

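A template like the one in D5 is easy to verify automatically. The following TypeScript sketch reports which required sections are missing or out of order in a deep-dive. It assumes each section is a markdown heading whose text matches the template name exactly, which the design does not specify, and it treats "Database schema" as optional because the template marks it "if applicable".

```typescript
// Hypothetical template check; heading format and names are assumptions.
const REQUIRED_SECTIONS: readonly string[] = [
  "Purpose",
  "Responsibility boundary",
  "Interface",
  "Key methods",
  "Error types",
  "Configuration",
];

/** Extracts the text of every ATX heading ("# ..." to "###### ..."). */
function headingTitles(markdown: string): string[] {
  return markdown
    .split("\n")
    .map((line) => /^#{1,6}\s+(.*\S)\s*$/.exec(line))
    .filter((m): m is RegExpExecArray => m !== null)
    .map((m) => m[1] ?? "");
}

/** Returns required sections that are absent or appear out of template order. */
function missingSections(markdown: string): string[] {
  const titles = headingTitles(markdown);
  const missing: string[] = [];
  let cursor = 0;
  for (const section of REQUIRED_SECTIONS) {
    const index = titles.indexOf(section, cursor);
    if (index === -1) {
      missing.push(section);
    } else {
      cursor = index + 1;
    }
  }
  return missing;
}
```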
---

### D6: Engineering workflow doc is prescriptive, not descriptive

**Decision**: The workflow guide tells engineers exactly what to do step by step, not just what the process is.

**Rationale**: Engineers with 1yr experience have worked in teams but may not have used a spec-first workflow before. A prescriptive guide ("Step 1: run `openspec new change <name>`") reduces ambiguity and enforces our standards from day one.

## Risks / Trade-offs

**[Line numbers drift as code evolves]** → Walkthroughs include a "last verified against commit X" header. The CTO assigns a quarterly walkthrough review task in each Phase change.

**[Docs can become stale if not maintained]** → Each document has a "Last updated" field in its header. The engineering workflow guide explicitly requires updating relevant engineering docs as part of any PR that changes architecture or public service interfaces.

**[Scope is large — ~15 documents, ~10,000 lines]** → Tasks are broken into discrete documents, each independently completable. No document depends on another being written first (only the index depends on all others).

## Migration Plan

1. Create `docs/engineering/` directory
2. Write all 15 documents (10 capability areas, some split across multiple files)
3. Write `docs/engineering/README.md` index with links and reading order
4. Commit all to `develop` in a single commit
5. No existing documentation is modified or removed

No rollback required — this is additive only.

## Open Questions

_(none — all decisions made above; scope fully defined in proposal)_
@@ -0,0 +1,42 @@
## Why

SentryAgent.ai is growing and hiring engineers with a BSc in Computer Science and one year of industrial experience. There are currently no internal engineering documents that explain how the system works from the top down — new engineers have no structured path from product vision to running code, and no reference for how to contribute correctly. This gap slows onboarding, increases mistakes, and risks divergence from our architecture and standards.

## What Changes

- New `docs/engineering/` directory added to the repository as the canonical engineering knowledge base
- Top-down documentation suite covering all layers of the system: company vision → architecture → codebase → services → workflows → operations
- Annotated code walkthroughs for the three most critical system flows (token issuance, agent registration, credential rotation)
- Development environment setup guide targeting < 30 minutes from clone to running local stack
- Engineering workflow guide covering the full OpenSpec → Architect → Developer → QA → merge cycle
- Service deep-dive documents for all 8 core services/components
- SDK integration guide covering all four language SDKs
- Testing strategy and quality gate reference
- Deployment and operations reference covering Docker, Terraform, and monitoring

## Capabilities

### New Capabilities

- `engineering-overview`: Company mission, product vision, system purpose, and how the engineering team operates — the entry point for all new hires
- `architecture-guide`: System architecture including component diagram, data flow diagrams, deployment topology, and technology stack rationale (ADRs)
- `codebase-structure`: Annotated directory map explaining every top-level directory and key file, what lives where and why
- `service-deep-dives`: Per-service documentation for AgentService, OAuth2Service, CredentialService, AuditService, VaultClient, OPA policy engine, Web Dashboard, and Prometheus/Grafana monitoring
- `code-walkthroughs`: Step-by-step annotated traces of the three critical flows: token issuance end-to-end, agent registration end-to-end, credential rotation end-to-end
- `dev-environment-setup`: Local development environment setup — prerequisites, clone, configure, Docker Compose up, smoke test — targeting < 30 minutes
- `engineering-workflow`: How to contribute — OpenSpec spec-first workflow, branching strategy, PR standards, quality gates, and the role of each virtual engineering team member
- `testing-strategy`: Test framework, test types (unit vs integration), coverage gates, how to run tests, and how to write new tests following project conventions
- `deployment-operations`: Docker build and run, Terraform multi-region deployment, environment configuration, Prometheus/Grafana monitoring, and operational runbooks
- `sdk-guide`: Integration guide for Node.js, Python, Go, and Java SDKs — installation, authentication, all major operations, error handling

### Modified Capabilities

_(none — this change adds documentation only; no existing spec-level behavior changes)_

## Impact

- **Repository**: New `docs/engineering/` directory (~15 documents, ~10,000 lines of markdown)
- **No code changes**: Documentation only — zero impact on `src/`, `tests/`, `sdk/`, or infrastructure
- **Dependencies**: None — no new packages required
- **APIs**: No API changes
- **Existing docs**: `docs/developers/` (bedroom developer guide) and `docs/devops/` (operations) remain unchanged; this is an additive engineering-internal knowledge base
@@ -0,0 +1,35 @@
## ADDED Requirements

### Requirement: System architecture document
The system SHALL include a document (`docs/engineering/02-architecture.md`) that describes the full system architecture: components, their responsibilities, how they communicate, and the deployment topology.

#### Scenario: Component diagram present
- **WHEN** a new engineer reads 02-architecture.md
- **THEN** they SHALL find an ASCII or Mermaid component diagram showing all major components (API server, PostgreSQL, Redis, Vault, OPA, Web Dashboard, Prometheus, Grafana) and their connections

#### Scenario: Request lifecycle explained
- **WHEN** a new engineer reads 02-architecture.md
- **THEN** they SHALL understand how an incoming HTTP request flows from client → Express router → middleware chain → controller → service → repository → database and back

#### Scenario: Data flow for authentication described
- **WHEN** a new engineer reads 02-architecture.md
- **THEN** they SHALL understand the OAuth 2.0 Client Credentials flow: client presents credentials → token service validates → Redis checked for existing token → JWT signed and returned

#### Scenario: Deployment topology covered
- **WHEN** a new engineer reads 02-architecture.md
- **THEN** they SHALL understand the multi-region deployment model (US, EU, APAC) and how Terraform provisions it

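The request lifecycle and authentication data flow scenarios above describe a layered path (controller → service → repository) with a cache check before a new token is signed. The TypeScript sketch below shows that shape using in-memory stand-ins. Every name in it (`ClientRepository`, `TokenCache`, `TokenService`, `tokenController`) is hypothetical, the token string is a placeholder for JWT signing, and the plain secret comparison stands in for the real credential check. It is meant to show where each responsibility sits, not how the actual services are written.

```typescript
// Hypothetical stand-ins for illustration; not the real SentryAgent.ai classes.
interface ClientRepository {
  findSecret(clientId: string): Promise<string | undefined>;
}

interface TokenCache {
  get(clientId: string): Promise<string | undefined>;
  set(clientId: string, token: string): Promise<void>;
}

// Service layer: business logic only, no HTTP concerns.
class TokenService {
  private readonly repo: ClientRepository;
  private readonly cache: TokenCache;
  private issued = 0;

  constructor(repo: ClientRepository, cache: TokenCache) {
    this.repo = repo;
    this.cache = cache;
  }

  async issueToken(clientId: string, clientSecret: string): Promise<string> {
    const stored = await this.repo.findSecret(clientId);
    // Plain comparison keeps the sketch short; a real check would verify a hash.
    if (stored === undefined || stored !== clientSecret) {
      throw new Error("invalid_client");
    }
    // Reuse a still-cached token before doing the work of signing a new one.
    const cached = await this.cache.get(clientId);
    if (cached !== undefined) return cached;
    this.issued += 1;
    const token = `token-${clientId}-${this.issued}`; // placeholder for JWT signing
    await this.cache.set(clientId, token);
    return token;
  }
}

// Controller layer: maps HTTP-shaped input and output to the service call.
async function tokenController(
  service: TokenService,
  body: { client_id: string; client_secret: string },
): Promise<{ status: number; body: Record<string, string> }> {
  try {
    const accessToken = await service.issueToken(body.client_id, body.client_secret);
    return { status: 200, body: { access_token: accessToken } };
  } catch {
    return { status: 401, body: { error: "invalid_client" } };
  }
}

// In-memory implementations standing in for PostgreSQL and Redis.
function inMemoryRepo(secrets: Map<string, string>): ClientRepository {
  return { findSecret: async (id) => secrets.get(id) };
}

function inMemoryCache(): TokenCache {
  const store = new Map<string, string>();
  return {
    get: async (id) => store.get(id),
    set: async (id, token) => {
      store.set(id, token);
    },
  };
}
```

Because the service depends only on the two interfaces, the same code can be exercised in a unit test with these in-memory versions and wired to real storage elsewhere.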
### Requirement: Technology stack and ADR document
The system SHALL include a document (`docs/engineering/03-tech-stack.md`) that lists every technology in the stack and explains why it was chosen over alternatives.

#### Scenario: Every major technology documented with rationale
- **WHEN** a new engineer reads 03-tech-stack.md
- **THEN** they SHALL find an entry for each technology (Node.js 18, TypeScript 5.3, Express 4.18, PostgreSQL 14, Redis 7, HashiCorp Vault, OPA, React 18, Vite 5, Prometheus, Grafana, Terraform) with: what it does in the system, why it was chosen, and what was considered but rejected

#### Scenario: TypeScript strict mode rationale explained
- **WHEN** a new engineer reads 03-tech-stack.md
- **THEN** they SHALL understand why strict mode is mandatory (safety, correctness, no implicit any) and what the consequences of violating it are

#### Scenario: PostgreSQL vs Redis responsibility boundary clear
- **WHEN** a new engineer reads 03-tech-stack.md
- **THEN** they SHALL understand what is stored in PostgreSQL (persistent state: agents, credentials, audit logs) vs Redis (ephemeral state: active tokens, rate limit counters)
@@ -0,0 +1,27 @@
## ADDED Requirements

### Requirement: Annotated code walkthrough documents
The system SHALL include a document (`docs/engineering/06-walkthroughs.md`) containing three annotated end-to-end walkthroughs of the system's critical flows, with file:line references to actual source code.

#### Scenario: Token issuance walkthrough complete
- **WHEN** a new engineer reads the token issuance walkthrough
- **THEN** they SHALL be guided step by step from: HTTP POST /oauth2/token → Express router → auth middleware → OAuth2Controller → OAuth2Service → CredentialRepository → Vault/bcrypt credential check → Redis token cache check → JWT signing (src/utils/jwt.ts) → AuditService.logEvent → HTTP 200 response
- **AND** every step SHALL reference the actual file and line number where it occurs

#### Scenario: Agent registration walkthrough complete
- **WHEN** a new engineer reads the agent registration walkthrough
- **THEN** they SHALL be guided step by step from: HTTP POST /agents → auth middleware → validation middleware → AgentController → AgentService.createAgent → input validation (src/utils/validators.ts) → AgentRepository.create → PostgreSQL INSERT → AuditService.logEvent → HTTP 201 response with agent object
- **AND** every step SHALL reference the actual file and line number

#### Scenario: Credential rotation walkthrough complete
- **WHEN** a new engineer reads the credential rotation walkthrough
- **THEN** they SHALL be guided step by step from: HTTP POST /agents/:id/credentials/:credId/rotate → auth middleware → CredentialController → CredentialService.rotateCredential → old credential revocation → new secret generation (src/utils/crypto.ts) → Vault write or bcrypt hash → CredentialRepository.update → token revocation for old credentials → AuditService.logEvent → HTTP 200 response
- **AND** every step SHALL reference the actual file and line number

#### Scenario: Walkthroughs include version reference
- **WHEN** a new engineer reads any walkthrough
- **THEN** the document SHALL include a header stating the commit hash it was last verified against, so engineers know if the walkthrough may have drifted from the current code

#### Scenario: Each walkthrough annotates why, not just what
- **WHEN** a new engineer reads a walkthrough step
- **THEN** each step SHALL explain not just what the code does but WHY — e.g., why Redis is checked before signing a new JWT, why constant-time comparison is used for credential verification, why audit logging happens after persistence not before
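One of the "why" annotations named above, constant-time comparison, is easier to explain next to code. An ordinary `===` on strings may return as soon as it finds the first differing character, so the response time can leak how much of a guess was correct. The TypeScript sketch below compares every position regardless of where the first mismatch occurs. The function name `constantTimeEqual` is made up for this example, and a JavaScript engine gives no hard timing guarantees, so production code on Node.js would more likely rely on the built-in `crypto.timingSafeEqual` or on the comparison inside the hashing library.

```typescript
// Illustrative only: shows the technique, not the project's actual verification code.
function constantTimeEqual(a: string, b: string): boolean {
  const length = Math.max(a.length, b.length);
  // Start with a non-zero value if the lengths differ, then keep scanning anyway.
  let diff = a.length ^ b.length;
  for (let i = 0; i < length; i += 1) {
    const x = i < a.length ? a.charCodeAt(i) : 0;
    const y = i < b.length ? b.charCodeAt(i) : 0;
    // Accumulate differences without branching on the compared values.
    diff |= x ^ y;
  }
  return diff === 0;
}
```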
@@ -0,0 +1,24 @@
## ADDED Requirements

### Requirement: Codebase structure document
The system SHALL include a document (`docs/engineering/04-codebase-structure.md`) that provides an annotated map of every top-level directory and key file in the repository, explaining what lives where and why.

#### Scenario: Full directory tree annotated
- **WHEN** a new engineer reads 04-codebase-structure.md
- **THEN** they SHALL find an annotated directory tree covering: `src/`, `tests/`, `docs/`, `sdk/`, `sdk-python/`, `sdk-go/`, `sdk-java/`, `terraform/`, `dashboard/`, `migrations/`, `openspec/`, `scripts/`

#### Scenario: src/ subdirectory roles explained
- **WHEN** a new engineer reads 04-codebase-structure.md
- **THEN** they SHALL understand the role of each `src/` subdirectory: `controllers/` (HTTP layer), `services/` (business logic), `repositories/` (data access), `middleware/` (cross-cutting concerns), `utils/` (shared utilities), `types/` (TypeScript interfaces), `routes/` (Express router definitions)

#### Scenario: Where to add new code explained
- **WHEN** a new engineer needs to add a new feature
- **THEN** the document SHALL tell them exactly where each type of code belongs: new endpoint → controller + route; new business logic → service; new DB query → repository; new shared utility → utils/

#### Scenario: Key files identified and explained
- **WHEN** a new engineer reads 04-codebase-structure.md
- **THEN** they SHALL find explanations of: `src/app.ts` (Express app setup), `src/server.ts` (entry point), `src/types/index.ts` (canonical type definitions), `src/utils/errors.ts` (error hierarchy), `docker-compose.yml` (local dev stack), `tsconfig.json` (TypeScript config)

#### Scenario: DRY principle mapped to structure
- **WHEN** a new engineer reads 04-codebase-structure.md
- **THEN** they SHALL understand how the directory structure enforces DRY: one location for types, one for crypto utilities, one for JWT utilities, one for validators — and why duplication across these is a blocking PR issue
@@ -0,0 +1,28 @@
## ADDED Requirements

### Requirement: Deployment and operations guide
The system SHALL include a document (`docs/engineering/10-deployment.md`) that explains how the application is built, deployed, and operated — covering Docker, Terraform, environment configuration, and monitoring.

#### Scenario: Docker build and run documented
- **WHEN** a new engineer reads 10-deployment.md
- **THEN** they SHALL understand the multi-stage Dockerfile (builder stage compiles TypeScript, production stage runs compiled JS with node:18-alpine and non-root USER node), how to build the image, and how to run it with the required environment variables

#### Scenario: Environment variables fully documented
- **WHEN** a new engineer needs to configure the application
- **THEN** the guide SHALL provide a complete table of all environment variables: name, purpose, required/optional, example value — covering database, Redis, JWT signing key, Vault, OPA, and rate limiting config

#### Scenario: Database migrations documented
- **WHEN** a new engineer needs to run or write migrations
- **THEN** the guide SHALL explain: where migration files live (`migrations/`), the naming convention, how to run them (`npm run migrate`), and how to write a new migration following the existing pattern

#### Scenario: Terraform multi-region deployment explained
- **WHEN** a new engineer reads 10-deployment.md
- **THEN** they SHALL understand the Terraform structure: what modules exist, what the three regions (US, EU, APAC) deploy, how to run `terraform plan` and `terraform apply`, and what AWS/GCP resources are provisioned

#### Scenario: Prometheus metrics and Grafana explained
- **WHEN** a new engineer reads 10-deployment.md
- **THEN** they SHALL find: which endpoint exposes metrics (`/metrics`), the key metrics tracked, how to access the Grafana dashboard locally (port, login), and how to add a new metric counter or histogram to the API server

#### Scenario: Operational runbook for common tasks
- **WHEN** a new engineer is on-call or supporting operations
- **THEN** the guide SHALL include a runbook covering: how to check application health, how to rotate the JWT signing key, how to revoke all tokens for a compromised agent, and how to read audit logs for an incident
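The environment variables scenario above asks for a table of names marked required or optional. A common companion to such a table is a loader that fails fast at startup when a required variable is missing. The TypeScript sketch below shows one way to do that. The variable names (`DATABASE_URL`, `REDIS_URL`, `JWT_SIGNING_KEY`, `RATE_LIMIT_PER_MINUTE`) and the default of 100 are assumptions chosen for illustration, since the spec does not list the real names or defaults.

```typescript
// Hypothetical config loader; variable names and defaults are assumptions.
interface AppConfig {
  databaseUrl: string;
  redisUrl: string;
  jwtSigningKey: string;
  rateLimitPerMinute: number;
}

type Env = Record<string, string | undefined>;

/** Returns the variable's value or throws a message naming the missing variable. */
function requireVar(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

/** Builds a typed config object, applying a default to the one optional variable. */
function loadConfig(env: Env): AppConfig {
  const rateLimitPerMinute = Number(env["RATE_LIMIT_PER_MINUTE"] ?? "100");
  if (!Number.isInteger(rateLimitPerMinute) || rateLimitPerMinute <= 0) {
    throw new Error("RATE_LIMIT_PER_MINUTE must be a positive integer");
  }
  return {
    databaseUrl: requireVar(env, "DATABASE_URL"),
    redisUrl: requireVar(env, "REDIS_URL"),
    jwtSigningKey: requireVar(env, "JWT_SIGNING_KEY"),
    rateLimitPerMinute,
  };
}
```

In an application this would typically be called once with `process.env`; taking the environment as a parameter keeps the function easy to unit test.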
@@ -0,0 +1,32 @@
## ADDED Requirements

### Requirement: Development environment setup guide
The system SHALL include a document (`docs/engineering/07-dev-setup.md`) that takes a new engineer from zero to a fully running local stack in under 30 minutes, with no prior knowledge of the project assumed.

#### Scenario: Prerequisites listed completely
- **WHEN** a new engineer reads 07-dev-setup.md
- **THEN** they SHALL find a complete prerequisites list: Node.js 18+, Docker Desktop, Git, a PostgreSQL client (optional), and links to install each — with no undocumented dependencies

#### Scenario: Repository clone and setup steps complete
- **WHEN** a new engineer follows the clone and setup steps
- **THEN** they SHALL be able to: clone the repo, copy `.env.example` to `.env`, run `npm install`, and have all dependencies installed with zero manual configuration

#### Scenario: Docker Compose local stack starts successfully
- **WHEN** a new engineer runs `docker-compose up -d`
- **THEN** all services (PostgreSQL, Redis, API server) SHALL start, migrations SHALL run automatically, and the guide SHALL show how to verify each service is healthy

#### Scenario: Smoke test confirms working stack
- **WHEN** a new engineer follows the smoke test section
- **THEN** they SHALL run a curl command to POST /oauth2/token with the seed credentials and receive a valid JWT — confirming the full stack is operational

#### Scenario: Common setup errors documented
- **WHEN** a new engineer encounters a setup error
- **THEN** the guide SHALL include a troubleshooting section covering the 5 most common errors: port already in use, migration failure, Node version mismatch, Docker not running, and missing .env variables

#### Scenario: Running tests locally documented
- **WHEN** a new engineer wants to run the test suite
- **THEN** the guide SHALL show: `npm test` (unit tests only, no services needed), `npm run test:integration` (requires Docker stack), and how to run a single test file

#### Scenario: Web dashboard local development documented
- **WHEN** a new engineer wants to run the web dashboard
- **THEN** the guide SHALL show how to start the Vite dev server (`npm run dev` in `dashboard/`) and which port it runs on, and confirm it connects to the local API server
@@ -0,0 +1,28 @@
## ADDED Requirements

### Requirement: Company and product overview document
The system SHALL include a document (`docs/engineering/01-overview.md`) that explains SentryAgent.ai's mission, the AgentIdP product, target users, and why the product exists — providing new engineers with business and product context before they read any technical content.

#### Scenario: Mission and vision covered
- **WHEN** a new engineer reads 01-overview.md
- **THEN** they SHALL understand what SentryAgent.ai builds, why it exists, and what problem it solves for AI developers

#### Scenario: AGNTCY alignment explained
- **WHEN** a new engineer reads 01-overview.md
- **THEN** they SHALL understand what AGNTCY is, why SentryAgent.ai aligns to it, and what "first-class agent identity" means

#### Scenario: Product features listed
- **WHEN** a new engineer reads 01-overview.md
- **THEN** they SHALL see a summary of all product capabilities: agent registry, OAuth 2.0 auth, credential management, audit logs, SDKs, web dashboard, policy engine, and monitoring

#### Scenario: Phase roadmap visible
- **WHEN** a new engineer reads 01-overview.md
- **THEN** they SHALL understand which capabilities belong to Phase 1, Phase 2, and Phase 3

#### Scenario: Engineering team structure explained
- **WHEN** a new engineer reads 01-overview.md
- **THEN** they SHALL understand the Virtual Engineering Team model (CTO → Architect → Developer → QA) and how Claude operates as the engineering partner

#### Scenario: Free tier limits documented
- **WHEN** a new engineer reads 01-overview.md
- **THEN** they SHALL see the free tier limits (100 agents, 10,000 token requests/month, 90-day audit retention, 100 req/min) and understand the product's positioning
@@ -0,0 +1,32 @@
## ADDED Requirements

### Requirement: Engineering workflow and contribution guide
The system SHALL include a document (`docs/engineering/08-workflow.md`) that prescribes the exact steps an engineer MUST follow to contribute any new feature or change, from idea to merged code.

#### Scenario: OpenSpec spec-first workflow explained
- **WHEN** a new engineer reads 08-workflow.md
- **THEN** they SHALL understand that NO implementation begins without an approved OpenAPI spec — and the exact sequence: CEO approves → Architect writes spec → CTO reviews → Developer implements → QA signs off → CEO approves merge

#### Scenario: OpenSpec CLI commands documented
- **WHEN** a new engineer wants to start a new change
- **THEN** the guide SHALL provide the exact commands: `openspec new change <name>`, `openspec status --change <name>`, `openspec instructions <artifact> --change <name>`, and what each command does

#### Scenario: Branching strategy documented
- **WHEN** a new engineer creates a branch
- **THEN** the guide SHALL prescribe: feature branches from `develop`, naming convention `feature/<change-name>`, PR targets `develop`, `develop` → `main` requires CTO + CEO approval

#### Scenario: TypeScript and code standards enforced in workflow
- **WHEN** a new engineer writes code
- **THEN** the guide SHALL state the non-negotiable standards: strict mode, no `any`, DRY, SOLID, JSDoc on all public methods — and that PRs violating these are blocked by the CTO regardless of functionality

#### Scenario: PR checklist documented
- **WHEN** a new engineer opens a PR
- **THEN** the guide SHALL provide a PR checklist: TypeScript compiles with zero errors, ESLint passes with zero warnings, unit tests pass, coverage gate met (>80%), integration tests pass, OpenAPI spec updated if endpoint changed, engineering docs updated if architecture changed

#### Scenario: Virtual engineering team roles explained for contributors
- **WHEN** a new engineer reads 08-workflow.md
- **THEN** they SHALL understand the role separation: they contribute as the Principal Developer role, the CTO reviews all PRs, the Architect owns spec changes, and QA owns the test sign-off — and how to interact with each role in practice

#### Scenario: Commit message conventions documented
- **WHEN** a new engineer writes a commit message
- **THEN** the guide SHALL prescribe the Conventional Commits format: `feat:`, `fix:`, `docs:`, `test:`, `chore:`, `refactor:` prefixes — with examples for each
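The branching and commit message scenarios above describe conventions that can be checked mechanically, for example in a pre-commit hook or CI step. The TypeScript sketch below validates a commit header against the six prefixes listed and a branch name against `feature/<change-name>`. The function names, the optional `(scope)` part, and the restriction of change names to lowercase kebab-case are assumptions for this example; the spec itself states only the prefixes and the `feature/` pattern.

```typescript
// Hypothetical validators; scope syntax and kebab-case rule are assumptions.
const COMMIT_TYPES: readonly string[] = ["feat", "fix", "docs", "test", "chore", "refactor"];

// Matches "type: subject" or "type(scope): subject" for the listed types.
const COMMIT_HEADER = new RegExp(`^(${COMMIT_TYPES.join("|")})(\\([a-z0-9-]+\\))?: \\S.*$`);

// Matches "feature/<change-name>" where the change name is lowercase kebab-case.
const FEATURE_BRANCH = /^feature\/[a-z0-9]+(?:-[a-z0-9]+)*$/;

/** Checks the first line of a commit message. */
function isValidCommitHeader(header: string): boolean {
  return COMMIT_HEADER.test(header);
}

/** Checks a branch name against the feature branch convention. */
function isValidFeatureBranch(branch: string): boolean {
  return FEATURE_BRANCH.test(branch);
}

console.log(isValidCommitHeader("chore(openspec): archive all completed changes, sync 14 new specs to library"));
console.log(isValidFeatureBranch("feature/engineering-docs"));
```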
@@ -0,0 +1,28 @@
## ADDED Requirements

### Requirement: SDK integration guide
The system SHALL include a document (`docs/engineering/11-sdk-guide.md`) that explains how each of the four language SDKs is structured, how to use them, and how to contribute to or extend them.

#### Scenario: SDK architecture overview present
- **WHEN** a new engineer reads 11-sdk-guide.md
- **THEN** they SHALL understand that all four SDKs (Node.js, Python, Go, Java) implement the same API surface (14 endpoints, 4 service clients, 1 TokenManager, 1 error type) with identical semantics, and why consistency across SDKs is a non-negotiable standard

#### Scenario: Node.js SDK documented
- **WHEN** a new engineer reads the Node.js SDK section
- **THEN** they SHALL find: installation (`npm install @sentryagent/idp-sdk`), the AgentIdPClient constructor, all 4 service clients (agents, credentials, tokens, audit), TokenManager auto-refresh behaviour, AgentIdPError structure, and a complete working code example for the most common flow (register agent → generate credential → issue token)

#### Scenario: Python SDK documented
- **WHEN** a new engineer reads the Python SDK section
- **THEN** they SHALL find: installation (`pip install sentryagent-idp`), both sync (AgentIdPClient) and async (AsyncAgentIdPClient) variants, TokenManager and AsyncTokenManager auto-refresh, AgentIdPError, and a complete working example for sync and async usage

#### Scenario: Go SDK documented
- **WHEN** a new engineer reads the Go SDK section
- **THEN** they SHALL find: installation (`go get github.com/sentryagent/idp-sdk-go`), AgentIdPClient construction, goroutine-safe TokenManager, context.Context usage pattern, AgentIdPError with Code/HTTPStatus/Details, and a complete working example

#### Scenario: Java SDK documented
- **WHEN** a new engineer reads the Java SDK section
- **THEN** they SHALL find: Maven/Gradle dependency snippet, AgentIdPClient construction with builder pattern, sync methods and CompletableFuture async counterparts, thread-safe TokenManager, AgentIdPException, and a complete working example

#### Scenario: SDK contribution guide included
- **WHEN** a new engineer needs to add a new endpoint to all SDKs
- **THEN** the guide SHALL provide a step-by-step checklist for adding a new method to all four SDKs consistently: where to add the method, what the signature pattern is, how to write the corresponding test, and how to verify it compiles/passes in each language
@@ -0,0 +1,40 @@
|
|||||||
|
## ADDED Requirements
|
||||||
|
|
||||||
|
### Requirement: Service deep-dive documents
|
||||||
|
The system SHALL include a document (`docs/engineering/05-services.md`) providing a deep-dive reference for every core service and component, following a consistent template: Purpose → Responsibility boundary → Public interface → Key methods → Database schema (if applicable) → Error types → Configuration.
|
||||||
|
|
||||||
|
#### Scenario: AgentService documented
|
||||||
|
- **WHEN** a new engineer reads 05-services.md
|
||||||
|
- **THEN** they SHALL find the AgentService section covering: responsibility (agent CRUD only), public methods (createAgent, getAgent, listAgents, updateAgent, deleteAgent), the `agents` table schema, AgentNotFoundError and AgentAlreadyExistsError, and what AgentService does NOT do (no auth, no credentials — Single Responsibility)
|
||||||
|
|
||||||
|
#### Scenario: OAuth2Service documented
|
||||||
|
- **WHEN** a new engineer reads 05-services.md
|
||||||
|
- **THEN** they SHALL find the OAuth2Service section covering: responsibility (token issuance and revocation only), public methods (issueToken, validateToken, revokeToken), Redis token storage schema, JWT payload structure, token TTL configuration, and the Vault credential verification path vs bcrypt path
|
||||||
|
|
||||||
|
#### Scenario: CredentialService documented
|
||||||
|
- **WHEN** a new engineer reads 05-services.md
|
||||||
|
- **THEN** they SHALL find the CredentialService section covering: responsibility (credential lifecycle only), public methods (generateCredential, rotateCredential, revokeCredential, listCredentials), the `credentials` table schema, bcrypt vs Vault storage decision, and the `vault_path` column purpose
|
||||||
|
|
||||||
|
#### Scenario: AuditService documented
|
||||||
|
- **WHEN** a new engineer reads 05-services.md
|
||||||
|
- **THEN** they SHALL find the AuditService section covering: responsibility (immutable audit logging only), public methods (logEvent, queryLogs), the `audit_logs` table schema, event types enum, 90-day retention policy, and why audit records are never updated or deleted
|
||||||
|
|
||||||
|
#### Scenario: VaultClient documented
|
||||||
|
- **WHEN** a new engineer reads 05-services.md
|
||||||
|
- **THEN** they SHALL find the VaultClient section covering: purpose (wraps node-vault for KV v2 operations), public methods (writeSecret, readSecret, verifySecret, deleteSecret), the opt-in configuration (VAULT_ADDR env var), and the constant-time comparison in verifySecret and why it matters (timing attack prevention)
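
To illustrate the constant-time comparison mentioned above, here is a sketch using Node's built-in `crypto.timingSafeEqual`. The helper `constantTimeEquals` is hypothetical and is not the project's `verifySecret`; hashing both inputs first is one common way to satisfy the equal-length requirement and is an assumption here.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Hypothetical helper, not the project's verifySecret implementation.
// Hashing both inputs yields equal-length buffers, which timingSafeEqual
// requires (it throws when the byte lengths differ).
function constantTimeEquals(candidate: string, stored: string): boolean {
  const a = createHash("sha256").update(candidate, "utf8").digest();
  const b = createHash("sha256").update(stored, "utf8").digest();
  return timingSafeEqual(a, b);
}
```

A plain `===` on strings can return as soon as the first differing character is found, so response time may leak how much of a guess was correct; the comparison above takes the same time regardless of where the inputs differ.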
|
||||||
|
|
||||||
|
#### Scenario: OPA policy engine documented
|
||||||
|
- **WHEN** a new engineer reads 05-services.md
|
||||||
|
- **THEN** they SHALL find the OPA section covering: purpose (dynamic access control beyond static OAuth scopes), how policies are loaded, how authorization decisions are made, the policy file locations, and how to write and test a new policy
|
||||||
|
|
||||||
|
#### Scenario: Web Dashboard documented
|
||||||
|
- **WHEN** a new engineer reads 05-services.md
|
||||||
|
- **THEN** they SHALL find the Web Dashboard section covering: React 18 + Vite 5 + TypeScript structure, how it authenticates against the AgentIdP API, the main views (agent list, credential management, audit log viewer, policy editor), and how to run it locally
|
||||||
|
|
||||||
|
#### Scenario: Monitoring stack documented
|
||||||
|
- **WHEN** a new engineer reads 05-services.md
|
||||||
|
- **THEN** they SHALL find the monitoring section covering: Prometheus metrics exposed by the API server (`/metrics`), the key metrics (request count, latency histograms, active tokens, agent count), Grafana dashboard structure, and how to add a new metric to the API server
|
||||||
|
|
||||||
|
#### Scenario: Consistent template enforced
|
||||||
|
- **WHEN** a new engineer looks up any service
|
||||||
|
- **THEN** every service section SHALL follow the same template so the engineer knows exactly where to find each type of information
|
||||||
@@ -0,0 +1,32 @@
|
|||||||
|
## ADDED Requirements
|
||||||
|
|
||||||
|
### Requirement: Testing strategy document
|
||||||
|
The system SHALL include a document (`docs/engineering/09-testing.md`) that explains the test architecture, how to run tests, coverage requirements, and how to write new tests following project conventions.
|
||||||
|
|
||||||
|
#### Scenario: Test types and their purposes explained
|
||||||
|
- **WHEN** a new engineer reads 09-testing.md
|
||||||
|
- **THEN** they SHALL understand the distinction between: unit tests (test one service/util in isolation, mock all dependencies, no running services needed) and integration tests (test full HTTP request/response cycle with real PostgreSQL + Redis)
|
||||||
|
|
||||||
|
#### Scenario: Test framework stack documented
|
||||||
|
- **WHEN** a new engineer reads 09-testing.md
|
||||||
|
- **THEN** they SHALL find the test stack listed and explained: Jest 29.7 (test runner + assertions), ts-jest (TypeScript compilation), Supertest 6.3 (HTTP integration testing), and how each is configured
|
||||||
|
|
||||||
|
#### Scenario: Coverage gates documented
|
||||||
|
- **WHEN** a new engineer reads 09-testing.md
|
||||||
|
- **THEN** they SHALL know the mandatory gates: >80% statements, >80% branches, >80% functions, >80% lines — and that PRs below these thresholds are blocked
|
||||||
|
|
||||||
|
#### Scenario: How to run the test suite documented
|
||||||
|
- **WHEN** a new engineer wants to run tests
|
||||||
|
- **THEN** the guide SHALL show: `npm test` (unit tests, no services), `npm run test:coverage` (unit tests + coverage report), `npm run test:integration` (requires Docker stack), and `npx jest src/services/agentService.test.ts` (single file)
|
||||||
|
|
||||||
|
#### Scenario: Unit test writing conventions shown
|
||||||
|
- **WHEN** a new engineer writes a new unit test
|
||||||
|
- **THEN** the guide SHALL show a complete example: how to mock a repository with `jest.mock()`, how to structure `describe`/`it` blocks, how to assert on thrown errors, and how to verify mock calls — using an actual test from the codebase as the example
|
||||||
|
|
||||||
|
#### Scenario: Integration test writing conventions shown
|
||||||
|
- **WHEN** a new engineer writes a new integration test
|
||||||
|
- **THEN** the guide SHALL show a complete example using Supertest: how to boot the Express app, how to seed test data, how to make authenticated requests (including getting a JWT first), and how to clean up after the test
|
||||||
|
|
||||||
|
#### Scenario: OWASP security testing reference included
|
||||||
|
- **WHEN** a new engineer writes security-relevant code
|
||||||
|
- **THEN** the guide SHALL include a reference to the OWASP Top 10 checks that are verified in QA sign-off and what each means in the context of this codebase (SQL injection, JWT attacks, credential exposure, etc.)
|
||||||
@@ -0,0 +1,2 @@
|
|||||||
|
schema: spec-driven
|
||||||
|
created: 2026-03-29
|
||||||
50
openspec/specs/api-reference/spec.md
Normal file
@@ -0,0 +1,50 @@
|
|||||||
|
## ADDED Requirements
|
||||||
|
|
||||||
|
### Requirement: API reference exists at docs/developers/api-reference.md
|
||||||
|
The system SHALL provide a human-readable API reference at `docs/developers/api-reference.md` covering all 14 endpoints across the four services: Agent Registry, OAuth 2.0 Token, Credential Management, and Audit Log.
|
||||||
|
|
||||||
|
#### Scenario: Developer finds any endpoint within 10 seconds
|
||||||
|
- **WHEN** the developer opens the API reference
|
||||||
|
- **THEN** they SHALL find a table of contents at the top linking to each of the four service sections
|
||||||
|
|
||||||
|
### Requirement: Every endpoint is documented with method, path, description, and auth requirements
|
||||||
|
For each of the 14 endpoints, the reference SHALL document: HTTP method, path, one-sentence description, and whether Bearer token auth is required.
|
||||||
|
|
||||||
|
#### Scenario: Developer knows which endpoints require authentication
|
||||||
|
- **WHEN** the developer scans the reference
|
||||||
|
- **THEN** they SHALL clearly see which endpoints require a Bearer token (all except POST /token) and which do not
|
||||||
|
|
||||||
|
### Requirement: Every endpoint includes a complete curl example
|
||||||
|
For each endpoint, the reference SHALL include at least one complete, runnable curl example with realistic placeholder values.
|
||||||
|
|
||||||
|
#### Scenario: Developer copies a curl example and runs it
|
||||||
|
- **WHEN** the developer copies a curl example from the reference
|
||||||
|
- **THEN** the command SHALL be complete (no `...` ellipses, no missing flags), requiring only substitution of their own agentId, token, and base URL
|
||||||
|
|
||||||
|
### Requirement: Every endpoint documents all request parameters and body fields
|
||||||
|
For each endpoint that accepts a request body or query parameters, the reference SHALL list every field with: name, type, required/optional, description, and validation constraints.
|
||||||
|
|
||||||
|
#### Scenario: Developer knows what fields are required for POST /agents
|
||||||
|
- **WHEN** the developer reads the POST /agents section
|
||||||
|
- **THEN** they SHALL see a table listing every field, its type, whether it is required, and any constraints (e.g. email format, max length)
|
||||||
|
|
||||||
|
### Requirement: Every endpoint documents all response codes and response body schemas
|
||||||
|
For each endpoint, the reference SHALL document every possible HTTP response code (2xx and 4xx/5xx) with a description and example response body.
|
||||||
|
|
||||||
|
#### Scenario: Developer understands a 429 response
|
||||||
|
- **WHEN** the developer reads the rate limit error documentation
|
||||||
|
- **THEN** they SHALL understand what triggered it, what the X-RateLimit-* headers mean, and when they can retry
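
As a sketch of the "when they can retry" part, a client could derive its wait time from the rate-limit headers. The spec only names the `X-RateLimit-*` family, so the exact header `X-RateLimit-Reset`, its Unix-seconds format, and the helper `msUntilRetry` are assumptions for illustration.

```typescript
// Assumption: X-RateLimit-Reset carries a Unix timestamp in seconds.
// Header keys are expected in lowercase here.
function msUntilRetry(
  headers: Record<string, string | undefined>,
  nowMs: number
): number {
  const raw = headers["x-ratelimit-reset"];
  const reset = raw === undefined || raw.trim() === "" ? NaN : Number(raw);
  if (!Number.isFinite(reset)) {
    return 1000; // fallback delay when the header is missing or malformed
  }
  // Never return a negative delay if the reset time has already passed.
  return Math.max(0, reset * 1000 - nowMs);
}
```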
|
||||||
|
|
||||||
|
### Requirement: API reference includes a base URL and versioning section
|
||||||
|
The reference SHALL include a section at the top explaining the base URL convention, port configuration, and that all endpoints are unversioned in Phase 1.
|
||||||
|
|
||||||
|
#### Scenario: Developer knows where to send requests
|
||||||
|
- **WHEN** the developer reads the base URL section
|
||||||
|
- **THEN** they SHALL see the default base URL (http://localhost:3000), how to change the port via environment variable, and a note that versioning will be introduced in Phase 2
|
||||||
|
|
||||||
|
### Requirement: API reference includes an errors section
|
||||||
|
The reference SHALL include a dedicated errors section listing all standard error response shapes, all custom error codes, and their HTTP status code mappings.
|
||||||
|
|
||||||
|
#### Scenario: Developer handles an AgentNotFoundError
|
||||||
|
- **WHEN** the developer reads the errors section
|
||||||
|
- **THEN** they SHALL see the exact JSON shape of the error response, the error code string, and the HTTP status (404)
|
||||||
43
openspec/specs/core-concepts/spec.md
Normal file
@@ -0,0 +1,43 @@
|
|||||||
|
## ADDED Requirements
|
||||||
|
|
||||||
|
### Requirement: Core concepts guide exists at docs/developers/concepts.md
|
||||||
|
The system SHALL provide a concepts guide at `docs/developers/concepts.md` that explains the AgentIdP model in plain English with no assumed prior knowledge of AGNTCY or OAuth 2.0.
|
||||||
|
|
||||||
|
#### Scenario: Developer understands what AgentIdP is
|
||||||
|
- **WHEN** a developer reads the concepts guide
|
||||||
|
- **THEN** they SHALL be able to explain in one sentence what SentryAgent.ai AgentIdP does and why they need it
|
||||||
|
|
||||||
|
### Requirement: Concepts guide explains what an AI agent identity is
|
||||||
|
The guide SHALL explain in plain English what it means to give an AI agent an identity — how it differs from a human user account and why agents need their own identity model.
|
||||||
|
|
||||||
|
#### Scenario: Agent identity vs human identity distinction is clear
|
||||||
|
- **WHEN** the developer reads the agent identity section
|
||||||
|
- **THEN** they SHALL understand that agents are non-human, machine-operated identities that need persistent, auditable credentials — not session-based logins
|
||||||
|
|
||||||
|
### Requirement: Concepts guide explains the AGNTCY alignment
|
||||||
|
The guide SHALL explain what AGNTCY is (Linux Foundation standard), why SentryAgent.ai aligns to it, and what benefit that gives the developer — without requiring the developer to read the AGNTCY specification.
|
||||||
|
|
||||||
|
#### Scenario: Developer understands AGNTCY without external reading
|
||||||
|
- **WHEN** the developer reads the AGNTCY section
|
||||||
|
- **THEN** they SHALL understand that AGNTCY-aligned agent IDs are interoperable across the AI agent ecosystem, and that SentryAgent.ai implements this for free
|
||||||
|
|
||||||
|
### Requirement: Concepts guide explains the agent lifecycle
|
||||||
|
The guide SHALL explain the three lifecycle states of an agent (active, suspended, decommissioned) and what each state means for credential and token behaviour.
|
||||||
|
|
||||||
|
#### Scenario: Developer understands what happens when an agent is decommissioned
|
||||||
|
- **WHEN** the developer reads the lifecycle section
|
||||||
|
- **THEN** they SHALL understand that decommissioning is irreversible, all credentials are revoked, and no new tokens can be issued
|
||||||
|
|
||||||
|
### Requirement: Concepts guide explains OAuth 2.0 Client Credentials in plain English
|
||||||
|
The guide SHALL explain the Client Credentials grant in plain English — no RFC references, no formal OAuth jargon — focused on how agents use it to authenticate.
|
||||||
|
|
||||||
|
#### Scenario: Developer understands client_id and client_secret without prior OAuth knowledge
|
||||||
|
- **WHEN** the developer reads the OAuth section
|
||||||
|
- **THEN** they SHALL understand that client_id identifies the agent and client_secret proves it — analogous to a username and password for machines
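
The username-and-password analogy can be made concrete with a sketch of the body an agent would send to `POST /token`. The parameter names are the standard OAuth 2.0 ones; whether AgentIdP expects a form-encoded body (as shown) or JSON is an assumption, and `buildTokenRequestBody` is a hypothetical helper.

```typescript
// Builds a form-encoded body for a Client Credentials token request.
// grant_type, client_id and client_secret are the standard OAuth 2.0
// parameter names; the encoding choice is an assumption about AgentIdP.
function buildTokenRequestBody(clientId: string, clientSecret: string): string {
  const params = new URLSearchParams({
    grant_type: "client_credentials",
    client_id: clientId, // identifies the agent
    client_secret: clientSecret, // proves the caller is that agent
  });
  return params.toString();
}
```

The returned string would be sent with a `Content-Type: application/x-www-form-urlencoded` header; the secret should travel only over TLS and never be logged.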
|
||||||
|
|
||||||
|
### Requirement: Concepts guide explains the free-tier limits
|
||||||
|
The guide SHALL document all free-tier limits (100 agents, 10,000 tokens/month, 100 req/min, 90-day audit retention) in a clear table.
|
||||||
|
|
||||||
|
#### Scenario: Developer knows the limits before hitting them
|
||||||
|
- **WHEN** the developer reads the free-tier section
|
||||||
|
- **THEN** they SHALL see a table with all four limits and a note on what happens when each is exceeded
|
||||||
4
openspec/specs/database/spec.md
Normal file
@@ -0,0 +1,4 @@
|
|||||||
|
## ADDED Requirements
|
||||||
|
|
||||||
|
### Requirement: Database doc exists at docs/devops/database.md
|
||||||
|
The system SHALL provide `docs/devops/database.md` documenting the 4-table schema (agents, credentials, audit_events, token_revocations), the migration runner, and exact commands to apply and verify migrations.
|
||||||
@@ -42,3 +42,8 @@ terraform/
|
|||||||
- [ ] PostgreSQL and Redis not publicly accessible — VPC-internal only
|
||||||
- [ ] `docs/devops/deployment.md` — end-to-end deployment walkthrough for AWS and GCP
|
||||||
- [ ] `terraform.tfvars.example` provided for both environments — no secrets in version control
|
||||||
|
|
||||||
|
## ADDED Requirements
|
||||||
|
|
||||||
|
### Requirement: Local development guide exists at docs/devops/local-development.md
|
||||||
|
The system SHALL provide `docs/devops/local-development.md` documenting the complete local setup using docker-compose for infrastructure and npm for the application server, including all service ports, health check verification, and the Dockerfile gap note.
|
||||||
|
|||||||
56
openspec/specs/developer-guides/spec.md
Normal file
@@ -0,0 +1,56 @@
|
|||||||
|
## ADDED Requirements
|
||||||
|
|
||||||
|
### Requirement: Developer guides index exists at docs/developers/guides/README.md
|
||||||
|
The system SHALL provide a guides index at `docs/developers/guides/README.md` listing all available guides with one-line descriptions and links.
|
||||||
|
|
||||||
|
#### Scenario: Developer finds the right guide quickly
|
||||||
|
- **WHEN** the developer opens the guides folder
|
||||||
|
- **THEN** they SHALL see a list of all guides with descriptions so they can choose the one relevant to their task
|
||||||
|
|
||||||
|
### Requirement: Agent registration guide exists at docs/developers/guides/register-an-agent.md
|
||||||
|
The system SHALL provide a step-by-step guide for registering an agent, including all required and optional fields, validation rules, and how to handle the response.
|
||||||
|
|
||||||
|
#### Scenario: Developer registers their first agent
|
||||||
|
- **WHEN** the developer follows the registration guide
|
||||||
|
- **THEN** they SHALL successfully create an agent and understand what `agentId`, `clientId`, and `status` mean in the response
|
||||||
|
|
||||||
|
#### Scenario: Developer understands registration validation errors
|
||||||
|
- **WHEN** the guide covers validation
|
||||||
|
- **THEN** it SHALL show examples of common validation errors (missing required fields, invalid email format) and how to fix them
|
||||||
|
|
||||||
|
### Requirement: Credential management guide exists at docs/developers/guides/manage-credentials.md
|
||||||
|
The system SHALL provide a guide covering all four credential operations: generate, list, rotate, and revoke — with curl examples and explanation of when to use each.
|
||||||
|
|
||||||
|
#### Scenario: Developer rotates a compromised credential
|
||||||
|
- **WHEN** the developer follows the rotation section
|
||||||
|
- **THEN** they SHALL understand that rotation replaces the secret while keeping the same `credentialId`, and the old secret is immediately invalid
|
||||||
|
|
||||||
|
#### Scenario: Developer understands credential revocation vs agent decommission
|
||||||
|
- **WHEN** the developer reads the guide
|
||||||
|
- **THEN** they SHALL understand the difference: revoking a credential leaves the agent active with other credentials; decommissioning the agent revokes everything permanently
|
||||||
|
|
||||||
|
### Requirement: Token guide exists at docs/developers/guides/issue-and-revoke-tokens.md
|
||||||
|
The system SHALL provide a guide covering token issuance, introspection, and revocation — explaining the JWT structure, expiry, and how to use the Bearer token in API requests.
|
||||||
|
|
||||||
|
#### Scenario: Developer uses a token to authenticate a request
|
||||||
|
- **WHEN** the developer follows the token guide
|
||||||
|
- **THEN** they SHALL see an example of using the issued token as a Bearer token in an Authorization header on a subsequent API call
|
||||||
|
|
||||||
|
#### Scenario: Developer introspects a token to check validity
|
||||||
|
- **WHEN** the developer reads the introspection section
|
||||||
|
- **THEN** they SHALL understand what `active: true/false` means and what fields are returned
|
||||||
|
|
||||||
|
#### Scenario: Developer revokes a token
|
||||||
|
- **WHEN** the developer follows the revocation section
|
||||||
|
- **THEN** they SHALL understand that revoked tokens are immediately invalid even if not yet expired
|
||||||
|
|
||||||
|
### Requirement: Audit log guide exists at docs/developers/guides/query-audit-logs.md
|
||||||
|
The system SHALL provide a guide for querying the audit log — covering available filters (agentId, action, outcome, date range), pagination, and how to interpret audit events.
|
||||||
|
|
||||||
|
#### Scenario: Developer queries audit events for a specific agent
|
||||||
|
- **WHEN** the developer follows the audit guide
|
||||||
|
- **THEN** they SHALL see a curl example filtering by `agentId` and understand the structure of each audit event
|
||||||
|
|
||||||
|
#### Scenario: Developer understands audit log retention
|
||||||
|
- **WHEN** the developer reads the guide
|
||||||
|
- **THEN** they SHALL understand that free-tier audit logs are retained for 90 days and what happens after that window
|
||||||
370
openspec/specs/federation/spec.md
Normal file
@@ -0,0 +1,370 @@
|
|||||||
|
# AGNTCY Federation — Specification
|
||||||
|
|
||||||
|
**Workstream**: 4 of 6
|
||||||
|
**Phase**: 3 — Enterprise
|
||||||
|
**Author**: Virtual Architect
|
||||||
|
**Date**: 2026-03-29
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Overview
|
||||||
|
|
||||||
|
Enable cross-instance agent identity federation using signed JWT assertions. Operators register trusted remote AgentIdP instances as federation partners. When an agent presents a token issued by a trusted partner instance, the local AgentIdP can verify it by fetching and caching the partner's JWKS. This enables multi-organization agent identity interoperability aligned with AGNTCY standards.
|
||||||
|
|
||||||
|
Federation is opt-in per organization. Only tokens from explicitly registered, trusted partners are accepted.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## API Endpoints
|
||||||
|
|
||||||
|
### POST /federation/trust
|
||||||
|
|
||||||
|
Register a new federation trust partner. Requires `admin:orgs` scope.
|
||||||
|
|
||||||
|
```yaml
POST /federation/trust
Authorization: Bearer <token with admin:orgs scope>
Content-Type: application/json

Request Body:
  schema:
    type: object
    required: [name, issuer, jwksUri]
    properties:
      name:
        type: string
        minLength: 2
        maxLength: 100
        description: Human-readable name for this federation partner
        example: "Contoso AgentIdP"
      issuer:
        type: string
        format: uri
        description: OIDC issuer URL of the partner instance (must match iss claim in tokens)
        example: "https://agentidp.contoso.com"
      jwksUri:
        type: string
        format: uri
        description: URL of the partner's JWKS endpoint
        example: "https://agentidp.contoso.com/.well-known/jwks.json"
      allowedOrganizations:
        type: array
        items:
          type: string
        description: Optional list of organization IDs in the partner instance whose tokens are accepted. Empty means all partner orgs are trusted.
        example: ["org_contoso_engineering"]
      expiresAt:
        type: string
        format: date-time
        description: Optional expiry for this trust relationship. If omitted, trust does not expire automatically.

Responses:
  201 Created:
    schema:
      $ref: '#/components/schemas/FederationPartner'
    example:
      partnerId: "fed_01HXK7Z9P3FKWABCDEF33333"
      name: "Contoso AgentIdP"
      issuer: "https://agentidp.contoso.com"
      jwksUri: "https://agentidp.contoso.com/.well-known/jwks.json"
      status: "active"
      allowedOrganizations: []
      trustedSince: "2026-03-29T12:00:00Z"
      expiresAt: null
  400 Bad Request:
    schema:
      $ref: '#/components/schemas/ErrorResponse'
    examples:
      duplicate_issuer:
        code: "DUPLICATE_ISSUER"
        message: "A trust relationship with this issuer already exists"
      unreachable_jwks:
        code: "JWKS_UNREACHABLE"
        message: "Could not fetch JWKS from the provided jwksUri"
  401 Unauthorized:
    schema:
      $ref: '#/components/schemas/ErrorResponse'
  403 Forbidden:
    schema:
      $ref: '#/components/schemas/ErrorResponse'
```
|
||||||
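
The request-body constraints for `POST /federation/trust` could be enforced along the following lines. This is a sketch under stated assumptions: `validateTrustRequest` and its error strings are hypothetical, `new URL()` stands in for the `format: uri` check, and `Date.parse` is more lenient than a strict `date-time` validator.

```typescript
interface TrustRequest {
  name: string;
  issuer: string;
  jwksUri: string;
  allowedOrganizations?: string[];
  expiresAt?: string;
}

// Approximates `format: uri` by checking that the value parses as a URL.
function isUri(value: string): boolean {
  try {
    new URL(value);
    return true;
  } catch {
    return false;
  }
}

// Returns a list of problems; an empty list means the body passes
// the checks sketched here. Error strings are illustrative only.
function validateTrustRequest(body: Partial<TrustRequest>): string[] {
  const problems: string[] = [];
  if (typeof body.name !== "string" || body.name.length < 2 || body.name.length > 100) {
    problems.push("name must be a string of 2 to 100 characters");
  }
  if (typeof body.issuer !== "string" || !isUri(body.issuer)) {
    problems.push("issuer must be a URI");
  }
  if (typeof body.jwksUri !== "string" || !isUri(body.jwksUri)) {
    problems.push("jwksUri must be a URI");
  }
  if (body.expiresAt !== undefined && Number.isNaN(Date.parse(body.expiresAt))) {
    problems.push("expiresAt must be a date-time");
  }
  return problems;
}
```

Collecting every problem, rather than stopping at the first, lets a 400 response report all invalid fields in one round trip.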
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### GET /federation/partners
|
||||||
|
|
||||||
|
List all registered federation partners for the caller's organization. Requires `admin:orgs` scope.
|
||||||
|
|
||||||
|
```yaml
GET /federation/partners
Authorization: Bearer <token with admin:orgs scope>

Query Parameters:
  status:
    type: string
    enum: [active, suspended, expired]
  page:
    type: integer
    default: 1
  limit:
    type: integer
    default: 20
    maximum: 100

Responses:
  200 OK:
    schema:
      type: object
      properties:
        data:
          type: array
          items:
            $ref: '#/components/schemas/FederationPartner'
        total:
          type: integer
        page:
          type: integer
        limit:
          type: integer
    example:
      data:
        - partnerId: "fed_01HXK7Z9P3FKWABCDEF33333"
          name: "Contoso AgentIdP"
          issuer: "https://agentidp.contoso.com"
          jwksUri: "https://agentidp.contoso.com/.well-known/jwks.json"
          status: "active"
          trustedSince: "2026-03-29T12:00:00Z"
          expiresAt: null
      total: 1
      page: 1
      limit: 20
  401 Unauthorized:
    schema:
      $ref: '#/components/schemas/ErrorResponse'
  403 Forbidden:
    schema:
      $ref: '#/components/schemas/ErrorResponse'
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### DELETE /federation/partners/:partnerId
|
||||||
|
|
||||||
|
Remove a federation trust relationship. Requires `admin:orgs` scope.
|
||||||
|
|
||||||
|
```yaml
DELETE /federation/partners/{partnerId}
Authorization: Bearer <token with admin:orgs scope>

Path Parameters:
  partnerId:
    type: string

Responses:
  204 No Content: {}
  401 Unauthorized:
    schema:
      $ref: '#/components/schemas/ErrorResponse'
  403 Forbidden:
    schema:
      $ref: '#/components/schemas/ErrorResponse'
  404 Not Found:
    schema:
      $ref: '#/components/schemas/ErrorResponse'
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### POST /federation/verify
|
||||||
|
|
||||||
|
Verify a token issued by a federated partner AgentIdP instance. The caller presents the token; this endpoint resolves the issuer, loads the partner's JWKS (from the Redis cache when present, otherwise by fetching it from the partner), and verifies the signature and claims.
|
||||||
|
|
||||||
|
```yaml
POST /federation/verify
Authorization: Bearer <local access_token with agents:read scope>
Content-Type: application/json

Request Body:
  schema:
    type: object
    required: [token]
    properties:
      token:
        type: string
        description: The JWT token issued by the remote AgentIdP instance to verify
      expectedIssuer:
        type: string
        format: uri
        description: Optional — if provided, verification fails if token issuer does not match
      expectedOrganizationId:
        type: string
        description: Optional — if provided, verification fails if token organization_id does not match

Responses:
  200 OK:
    schema:
      type: object
      properties:
        valid:
          type: boolean
        claims:
          type: object
          description: Decoded JWT claims from the verified token
          properties:
            sub:
              type: string
            iss:
              type: string
            iat:
              type: integer
            exp:
              type: integer
            agent_id:
              type: string
            agent_type:
              type: string
            organization_id:
              type: string
            capabilities:
              type: array
              items:
                type: string
            did:
              type: string
        partner:
          type: object
          description: The federation partner record that vouches for this token
          properties:
            partnerId:
              type: string
            name:
              type: string
            issuer:
              type: string
    example:
      valid: true
      claims:
        sub: "agt_contoso_abc123"
        iss: "https://agentidp.contoso.com"
        iat: 1743249600
        exp: 1743253200
        agent_id: "agt_contoso_abc123"
        agent_type: "classifier"
        organization_id: "org_contoso_engineering"
        capabilities: ["text-classification"]
        did: "did:web:agentidp.contoso.com:agents:agt_contoso_abc123"
      partner:
        partnerId: "fed_01HXK7Z9P3FKWABCDEF33333"
        name: "Contoso AgentIdP"
        issuer: "https://agentidp.contoso.com"

  400 Bad Request:
    schema:
      $ref: '#/components/schemas/ErrorResponse'

  401 Unauthorized (local token invalid):
    schema:
      $ref: '#/components/schemas/ErrorResponse'

  422 Unprocessable Entity (token invalid or untrusted issuer):
    schema:
      type: object
      properties:
        valid:
          type: boolean
          example: false
        reason:
          type: string
          enum:
            - TOKEN_EXPIRED
            - INVALID_SIGNATURE
            - UNTRUSTED_ISSUER
            - JWKS_FETCH_FAILED
            - ORGANIZATION_NOT_ALLOWED
        message:
          type: string
    example:
      valid: false
      reason: "UNTRUSTED_ISSUER"
      message: "No trust relationship registered for issuer https://unknown.example.com"
```
|
||||||
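
The claim-level part of `POST /federation/verify` can be sketched as a pure function that maps a decoded token to one of the `reason` values. Signature verification against the partner JWKS (`INVALID_SIGNATURE`, `JWKS_FETCH_FAILED`) is deliberately left out, and both the function name `checkFederatedClaims` and the order of the checks are assumptions.

```typescript
type FailureReason =
  | "TOKEN_EXPIRED"
  | "UNTRUSTED_ISSUER"
  | "ORGANIZATION_NOT_ALLOWED";

interface Partner {
  issuer: string;
  allowedOrganizations: string[];
}

interface Claims {
  iss: string;
  exp: number; // expiry as Unix seconds
  organization_id: string;
}

// Claim checks only; returns null when every check passes.
function checkFederatedClaims(
  claims: Claims,
  partners: Partner[],
  nowSeconds: number
): FailureReason | null {
  const partner = partners.find((p) => p.issuer === claims.iss);
  if (!partner) {
    return "UNTRUSTED_ISSUER";
  }
  // An empty allow-list means every partner organization is trusted.
  const restricted = partner.allowedOrganizations.length > 0;
  if (restricted && !partner.allowedOrganizations.includes(claims.organization_id)) {
    return "ORGANIZATION_NOT_ALLOWED";
  }
  if (claims.exp <= nowSeconds) {
    return "TOKEN_EXPIRED";
  }
  return null;
}
```

In a real handler these checks would run only after the signature has been verified, since unverified claims cannot be trusted.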
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Database Schema Changes
|
||||||
|
|
||||||
|
### New Table: federation_partners
|
||||||
|
|
||||||
|
```sql
|
||||||
|
CREATE TABLE federation_partners (
|
||||||
|
partner_id VARCHAR(40) PRIMARY KEY,
|
||||||
|
organization_id VARCHAR(40) NOT NULL REFERENCES organizations(organization_id),
|
||||||
|
name VARCHAR(100) NOT NULL,
|
||||||
|
issuer VARCHAR(255) NOT NULL,
|
||||||
|
jwks_uri VARCHAR(255) NOT NULL,
|
||||||
|
allowed_organizations JSONB NOT NULL DEFAULT '[]',
|
||||||
|
status VARCHAR(20) NOT NULL DEFAULT 'active',
|
||||||
|
trusted_since TIMESTAMPTZ NOT NULL DEFAULT NOW(),
|
||||||
|
expires_at TIMESTAMPTZ,
|
||||||
|
last_jwks_fetch TIMESTAMPTZ,
|
||||||
|
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
|
||||||
|
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
|
||||||
|
CONSTRAINT federation_partners_status_check CHECK (status IN ('active', 'suspended', 'expired')),
|
||||||
|
UNIQUE (organization_id, issuer)
|
||||||
|
);
|
||||||
|
|
||||||
|
CREATE INDEX idx_federation_partners_org_id ON federation_partners(organization_id);
|
||||||
|
CREATE INDEX idx_federation_partners_issuer ON federation_partners(issuer);
|
||||||
|
CREATE INDEX idx_federation_partners_status ON federation_partners(status);
|
||||||
|
```
|
||||||
|
|
||||||
|
### Redis: JWKS Cache
|
||||||
|
|
||||||
|
Partner JWKS documents are cached in Redis with a TTL:
|
||||||
|
|
||||||
|
```
|
||||||
|
Key: federation:jwks:<issuer_url_sha256>
|
||||||
|
Value: JSON string of the JWKS document
|
||||||
|
TTL: 1 hour (configurable via FEDERATION_JWKS_CACHE_TTL_SECONDS)
|
||||||
|
```
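
The key layout above can be sketched as a small helper. This is a minimal illustration, not the project's implementation: the function name `jwksCacheKey` is hypothetical, and it assumes the issuer URL is hashed exactly as registered (no normalization) and that the digest is hex-encoded.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: derives the Redis key for a partner's cached JWKS.
// Assumptions: the issuer URL is hashed byte-for-byte as registered and the
// SHA-256 digest is rendered as lowercase hex.
function jwksCacheKey(issuer: string): string {
  const digest = createHash("sha256").update(issuer, "utf8").digest("hex");
  return `federation:jwks:${digest}`;
}
```

Hashing the issuer keeps the key a fixed length and free of characters such as `:` and `/` that would otherwise collide with the key's own separators.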
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Configuration
|
||||||
|
|
||||||
|
| Environment Variable | Description | Default |
|
||||||
|
|---------------------|-------------|---------|
|
||||||
|
| `FEDERATION_ENABLED` | Enable federation endpoints | `true` |
|
||||||
|
| `FEDERATION_JWKS_CACHE_TTL_SECONDS` | Redis TTL for cached partner JWKS | `3600` |
|
||||||
|
| `FEDERATION_JWKS_FETCH_TIMEOUT_MS` | HTTP timeout for fetching partner JWKS | `5000` |
|
||||||
|
| `FEDERATION_MAX_PARTNERS_PER_ORG` | Max federation partners per organization | `50` |
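
One way to read these variables with the defaults from the table is sketched below. The `FederationConfig` shape and `loadFederationConfig` function are hypothetical names, and falling back to the default on a malformed value (rather than failing at startup) is an assumption of this sketch.

```typescript
// Hypothetical config shape mirroring the table above.
interface FederationConfig {
  enabled: boolean;
  jwksCacheTtlSeconds: number;
  jwksFetchTimeoutMs: number;
  maxPartnersPerOrg: number;
}

type Env = Record<string, string | undefined>;

// Parses a positive integer; falls back to the default when the value is
// missing or malformed (an assumption, a real service may prefer to fail fast).
function positiveInt(raw: string | undefined, fallback: number): number {
  if (raw === undefined) return fallback;
  const parsed = Number(raw);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

function loadFederationConfig(env: Env): FederationConfig {
  return {
    // Anything other than the literal "false" keeps federation enabled.
    enabled: (env["FEDERATION_ENABLED"] ?? "true").toLowerCase() !== "false",
    jwksCacheTtlSeconds: positiveInt(env["FEDERATION_JWKS_CACHE_TTL_SECONDS"], 3600),
    jwksFetchTimeoutMs: positiveInt(env["FEDERATION_JWKS_FETCH_TIMEOUT_MS"], 5000),
    maxPartnersPerOrg: positiveInt(env["FEDERATION_MAX_PARTNERS_PER_ORG"], 50),
  };
}
```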
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Dependencies
|
||||||
|
|
||||||
|
No new npm packages. Federation uses `jsonwebtoken` (already present) for JWT verification and the existing HTTP client for JWKS fetches.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Security Considerations
|
||||||
|
|
||||||
|
- Only tokens from explicitly registered, active federation partners are accepted in `POST /federation/verify`
|
||||||
|
- JWKS are cached to prevent JWKS endpoint hammering; cache is invalidated when a partner is updated
|
||||||
|
- Token signature verification uses the partner's JWKS; `alg: none` is always rejected
|
||||||
|
- `allowedOrganizations` field enables fine-grained trust: a partner can be trusted but only for tokens from specific organizations within that partner
|
||||||
|
- Expired federation partners (`expiresAt` in the past) are automatically treated as status `expired` — their tokens are rejected
|
||||||
|
- `POST /federation/verify` does not grant any local permissions — it is a verification-only endpoint. Callers must make their own access control decisions based on the returned claims.
|
||||||
|
- Clock skew tolerance: `exp` claim verification allows 30 seconds of clock skew (standard JWT practice)
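
The non-cryptographic checks in the list above can be sketched as a pure function. This is an illustration under stated assumptions, not the FederationService itself: the type and function names are hypothetical, signature verification against the partner's JWKS is deliberately left out, an empty `allowedOrganizations` list is assumed to mean "all organizations allowed", and mapping a suspended or expired partner to `UNTRUSTED_ISSUER` (and `alg: none` to `INVALID_SIGNATURE`) is a guess at how the spec's reason codes apply.

```typescript
type RejectReason =
  | "TOKEN_EXPIRED"
  | "INVALID_SIGNATURE"
  | "UNTRUSTED_ISSUER"
  | "ORGANIZATION_NOT_ALLOWED";

// Hypothetical in-memory view of a federation_partners row (times in Unix seconds).
interface Partner {
  issuer: string;
  status: "active" | "suspended" | "expired";
  expiresAt: number | null;
  allowedOrganizations: string[];
}

// Hypothetical view of the fields read from the token header and payload.
interface TokenView {
  alg: string;
  iss: string;
  exp: number;
  organizationId: string;
}

const CLOCK_SKEW_SECONDS = 30;

// Returns null when every check passes, otherwise the first failing reason.
function checkFederatedToken(
  partner: Partner | undefined,
  token: TokenView,
  nowSeconds: number,
): RejectReason | null {
  if (partner === undefined || partner.issuer !== token.iss) return "UNTRUSTED_ISSUER";
  // A partner past its expiresAt is treated as expired regardless of stored status.
  const partnerExpired = partner.expiresAt !== null && partner.expiresAt <= nowSeconds;
  if (partner.status !== "active" || partnerExpired) return "UNTRUSTED_ISSUER";
  // Unsigned tokens are always rejected.
  if (token.alg.toLowerCase() === "none") return "INVALID_SIGNATURE";
  // Allow 30 seconds of clock skew on the exp claim.
  if (token.exp + CLOCK_SKEW_SECONDS < nowSeconds) return "TOKEN_EXPIRED";
  // Assumption: an empty allow-list places no restriction on organizations.
  const allowList = partner.allowedOrganizations;
  if (allowList.length > 0 && !allowList.includes(token.organizationId)) {
    return "ORGANIZATION_NOT_ALLOWED";
  }
  return null;
}
```

Keeping these checks free of I/O makes each rejection reason easy to unit test separately from JWKS fetching and signature verification.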
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Acceptance Criteria
|
||||||
|
|
||||||
|
- [ ] `POST /federation/trust` registers a partner and fetches JWKS; returns 400 if JWKS unreachable
|
||||||
|
- [ ] `POST /federation/verify` returns `valid: true` for a correctly signed token from a trusted partner
|
||||||
|
- [ ] `POST /federation/verify` returns `valid: false` with `reason: UNTRUSTED_ISSUER` for unknown issuers
|
||||||
|
- [ ] `POST /federation/verify` returns `valid: false` with `reason: TOKEN_EXPIRED` for expired tokens
|
||||||
|
- [ ] Expired trust relationships (past `expiresAt`) are rejected automatically
|
||||||
|
- [ ] JWKS cache hit is used on second verification request for same issuer (Redis key present)
|
||||||
|
- [ ] TypeScript strict, zero `any`, >80% test coverage on FederationService
|
||||||
444
openspec/specs/multi-tenancy/spec.md
Normal file
@@ -0,0 +1,444 @@
|
|||||||
|
# Multi-Tenancy — Specification
|
||||||
|
|
||||||
|
**Workstream**: 1 of 6
|
||||||
|
**Phase**: 3 — Enterprise
|
||||||
|
**Author**: Virtual Architect
|
||||||
|
**Date**: 2026-03-29
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Overview
|
||||||
|
|
||||||
|
Introduce an Organization model so a single AgentIdP instance serves multiple isolated organizations. Each organization has its own namespace of agents, credentials, audit events, and rate limits. Row-level tenancy in PostgreSQL is enforced by both application-layer `organization_id` filtering and PostgreSQL Row-Level Security (RLS) policies.
|
||||||
|
|
||||||
|
All existing endpoints that operate on agents, credentials, or audit events are augmented to be organization-scoped. A new Admin API provides organization lifecycle management. Organization membership controls which agents a caller can manage.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## API Endpoints
|
||||||
|
|
||||||
|
### POST /organizations
|
||||||
|
|
||||||
|
Create a new organization. Requires system-admin scope (`admin:orgs`).
|
||||||
|
|
||||||
|
```yaml
|
||||||
|
POST /organizations
|
||||||
|
Authorization: Bearer <token with admin:orgs scope>
|
||||||
|
Content-Type: application/json
|
||||||
|
|
||||||
|
Request Body:
|
||||||
|
schema:
|
||||||
|
type: object
|
||||||
|
required: [name, slug]
|
||||||
|
properties:
|
||||||
|
name:
|
||||||
|
type: string
|
||||||
|
minLength: 2
|
||||||
|
maxLength: 100
|
||||||
|
description: Display name of the organization
|
||||||
|
example: "Acme AI Platform"
|
||||||
|
slug:
|
||||||
|
type: string
|
||||||
|
minLength: 2
|
||||||
|
maxLength: 50
|
||||||
|
pattern: "^[a-z0-9-]+$"
|
||||||
|
description: URL-safe unique identifier
|
||||||
|
example: "acme-ai"
|
||||||
|
planTier:
|
||||||
|
type: string
|
||||||
|
enum: [free, pro, enterprise]
|
||||||
|
default: free
|
||||||
|
maxAgents:
|
||||||
|
type: integer
|
||||||
|
minimum: 1
|
||||||
|
default: 100
|
||||||
|
maxTokensPerMonth:
|
||||||
|
type: integer
|
||||||
|
minimum: 1
|
||||||
|
default: 10000
|
||||||
|
|
||||||
|
Responses:
|
||||||
|
201 Created:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/Organization'
|
||||||
|
example:
|
||||||
|
organizationId: "org_01HXK7Z9P3FKWABCDEF12345"
|
||||||
|
name: "Acme AI Platform"
|
||||||
|
slug: "acme-ai"
|
||||||
|
planTier: "free"
|
||||||
|
maxAgents: 100
|
||||||
|
maxTokensPerMonth: 10000
|
||||||
|
status: "active"
|
||||||
|
createdAt: "2026-03-29T12:00:00Z"
|
||||||
|
updatedAt: "2026-03-29T12:00:00Z"
|
||||||
|
400 Bad Request:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
example:
|
||||||
|
code: "VALIDATION_ERROR"
|
||||||
|
message: "slug must be unique"
|
||||||
|
401 Unauthorized:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
403 Forbidden:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
example:
|
||||||
|
code: "INSUFFICIENT_SCOPE"
|
||||||
|
message: "admin:orgs scope required"
|
||||||
|
```
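
The request schema above could be enforced with a validator along these lines. It is a sketch only: `validateCreateOrg` and its result type are hypothetical names, and slug uniqueness is not checked here because it requires a database lookup.

```typescript
const PLAN_TIERS = ["free", "pro", "enterprise"] as const;
type PlanTier = (typeof PLAN_TIERS)[number];

interface CreateOrgRequest {
  name: string;
  slug: string;
  planTier: PlanTier;
  maxAgents: number;
  maxTokensPerMonth: number;
}

type ValidationResult =
  | { ok: true; value: CreateOrgRequest }
  | { ok: false; errors: string[] };

function isPlanTier(v: unknown): v is PlanTier {
  return typeof v === "string" && (PLAN_TIERS as readonly string[]).includes(v);
}

function isPositiveInt(v: unknown): v is number {
  return typeof v === "number" && Number.isInteger(v) && v >= 1;
}

// Hypothetical validator: applies the schema's bounds and defaults.
function validateCreateOrg(body: Record<string, unknown>): ValidationResult {
  const errors: string[] = [];
  const name = body["name"];
  const slug = body["slug"];
  const planTier = body["planTier"] ?? "free";
  const maxAgents = body["maxAgents"] ?? 100;
  const maxTokensPerMonth = body["maxTokensPerMonth"] ?? 10000;

  if (typeof name !== "string" || name.length < 2 || name.length > 100) {
    errors.push("name must be a string of 2-100 characters");
  }
  if (typeof slug !== "string" || slug.length < 2 || slug.length > 50 || !/^[a-z0-9-]+$/.test(slug)) {
    errors.push("slug must be 2-50 characters matching ^[a-z0-9-]+$");
  }
  if (!isPlanTier(planTier)) errors.push("planTier must be free, pro, or enterprise");
  if (!isPositiveInt(maxAgents)) errors.push("maxAgents must be an integer >= 1");
  if (!isPositiveInt(maxTokensPerMonth)) errors.push("maxTokensPerMonth must be an integer >= 1");

  if (errors.length > 0) return { ok: false, errors };
  // The casts are safe because every field was validated above.
  return {
    ok: true,
    value: {
      name: name as string,
      slug: slug as string,
      planTier: planTier as PlanTier,
      maxAgents: maxAgents as number,
      maxTokensPerMonth: maxTokensPerMonth as number,
    },
  };
}
```

Collecting all errors before returning lets a single 400 response report every invalid field at once.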
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### GET /organizations
|
||||||
|
|
||||||
|
List all organizations. Requires `admin:orgs` scope.
|
||||||
|
|
||||||
|
```yaml
|
||||||
|
GET /organizations
|
||||||
|
Authorization: Bearer <token with admin:orgs scope>
|
||||||
|
|
||||||
|
Query Parameters:
|
||||||
|
status:
|
||||||
|
type: string
|
||||||
|
enum: [active, suspended, deleted]
|
||||||
|
page:
|
||||||
|
type: integer
|
||||||
|
minimum: 1
|
||||||
|
default: 1
|
||||||
|
limit:
|
||||||
|
type: integer
|
||||||
|
minimum: 1
|
||||||
|
maximum: 100
|
||||||
|
default: 20
|
||||||
|
|
||||||
|
Responses:
|
||||||
|
200 OK:
|
||||||
|
schema:
|
||||||
|
type: object
|
||||||
|
properties:
|
||||||
|
data:
|
||||||
|
type: array
|
||||||
|
items:
|
||||||
|
$ref: '#/components/schemas/Organization'
|
||||||
|
total:
|
||||||
|
type: integer
|
||||||
|
page:
|
||||||
|
type: integer
|
||||||
|
limit:
|
||||||
|
type: integer
|
||||||
|
example:
|
||||||
|
data:
|
||||||
|
- organizationId: "org_01HXK7Z9P3FKWABCDEF12345"
|
||||||
|
name: "Acme AI Platform"
|
||||||
|
slug: "acme-ai"
|
||||||
|
planTier: "free"
|
||||||
|
status: "active"
|
||||||
|
createdAt: "2026-03-29T12:00:00Z"
|
||||||
|
updatedAt: "2026-03-29T12:00:00Z"
|
||||||
|
total: 1
|
||||||
|
page: 1
|
||||||
|
limit: 20
|
||||||
|
401 Unauthorized:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
403 Forbidden:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### GET /organizations/:orgId
|
||||||
|
|
||||||
|
Get a single organization. Requires `admin:orgs` scope or membership in the organization.
|
||||||
|
|
||||||
|
```yaml
|
||||||
|
GET /organizations/{orgId}
|
||||||
|
Authorization: Bearer <token>
|
||||||
|
|
||||||
|
Path Parameters:
|
||||||
|
orgId:
|
||||||
|
type: string
|
||||||
|
description: Organization ID (org_... prefix)
|
||||||
|
|
||||||
|
Responses:
|
||||||
|
200 OK:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/Organization'
|
||||||
|
401 Unauthorized:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
403 Forbidden:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
404 Not Found:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
example:
|
||||||
|
code: "ORG_NOT_FOUND"
|
||||||
|
message: "Organization not found"
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### PATCH /organizations/:orgId
|
||||||
|
|
||||||
|
Partially update an organization. Requires `admin:orgs` scope.
|
||||||
|
|
||||||
|
```yaml
|
||||||
|
PATCH /organizations/{orgId}
|
||||||
|
Authorization: Bearer <token with admin:orgs scope>
|
||||||
|
Content-Type: application/json
|
||||||
|
|
||||||
|
Request Body:
|
||||||
|
schema:
|
||||||
|
type: object
|
||||||
|
properties:
|
||||||
|
name:
|
||||||
|
type: string
|
||||||
|
minLength: 2
|
||||||
|
maxLength: 100
|
||||||
|
planTier:
|
||||||
|
type: string
|
||||||
|
enum: [free, pro, enterprise]
|
||||||
|
maxAgents:
|
||||||
|
type: integer
|
||||||
|
minimum: 1
|
||||||
|
maxTokensPerMonth:
|
||||||
|
type: integer
|
||||||
|
minimum: 1
|
||||||
|
status:
|
||||||
|
type: string
|
||||||
|
enum: [active, suspended]
|
||||||
|
|
||||||
|
Responses:
|
||||||
|
200 OK:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/Organization'
|
||||||
|
400 Bad Request:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
401 Unauthorized:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
403 Forbidden:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
404 Not Found:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### DELETE /organizations/:orgId
|
||||||
|
|
||||||
|
Soft-delete an organization (sets status to `deleted`). Requires `admin:orgs` scope. Hard deletion is not supported — data is retained for compliance.
|
||||||
|
|
||||||
|
```yaml
|
||||||
|
DELETE /organizations/{orgId}
|
||||||
|
Authorization: Bearer <token with admin:orgs scope>
|
||||||
|
|
||||||
|
Responses:
|
||||||
|
204 No Content: {}
|
||||||
|
401 Unauthorized:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
403 Forbidden:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
404 Not Found:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
409 Conflict:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
example:
|
||||||
|
code: "ORG_HAS_ACTIVE_AGENTS"
|
||||||
|
message: "Organization has active agents; decommission all agents before deleting"
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### POST /organizations/:orgId/members
|
||||||
|
|
||||||
|
Add a member (agent credential) to an organization. Requires `admin:orgs` scope.
|
||||||
|
|
||||||
|
```yaml
|
||||||
|
POST /organizations/{orgId}/members
|
||||||
|
Authorization: Bearer <token with admin:orgs scope>
|
||||||
|
Content-Type: application/json
|
||||||
|
|
||||||
|
Request Body:
|
||||||
|
schema:
|
||||||
|
type: object
|
||||||
|
required: [agentId, role]
|
||||||
|
properties:
|
||||||
|
agentId:
|
||||||
|
type: string
|
||||||
|
description: ID of an already-registered agent to add as a member
|
||||||
|
role:
|
||||||
|
type: string
|
||||||
|
enum: [member, admin]
|
||||||
|
description: Role within the organization
|
||||||
|
|
||||||
|
Responses:
|
||||||
|
201 Created:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/OrgMember'
|
||||||
|
example:
|
||||||
|
memberId: "mem_01HXK7Z9P3FKWABCDEF99999"
|
||||||
|
organizationId: "org_01HXK7Z9P3FKWABCDEF12345"
|
||||||
|
agentId: "agt_01HXK7Z9P3FKWABCDEF67890"
|
||||||
|
role: "member"
|
||||||
|
joinedAt: "2026-03-29T12:00:00Z"
|
||||||
|
400 Bad Request:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
401 Unauthorized:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
403 Forbidden:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
404 Not Found:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
409 Conflict:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
example:
|
||||||
|
code: "ALREADY_MEMBER"
|
||||||
|
message: "Agent is already a member of this organization"
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Modified: All /agents, /audit endpoints
|
||||||
|
|
||||||
|
All existing agent, credential, and audit endpoints now operate within the caller's organization context, extracted from the `organization_id` claim in the JWT. No URLs change; the scoping is transparent to callers already using the API.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Database Schema Changes
|
||||||
|
|
||||||
|
### New Table: organizations
|
||||||
|
|
||||||
|
```sql
|
||||||
|
CREATE TABLE organizations (
|
||||||
|
organization_id VARCHAR(40) PRIMARY KEY, -- org_... prefixed ULID
|
||||||
|
name VARCHAR(100) NOT NULL,
|
||||||
|
slug VARCHAR(50) NOT NULL UNIQUE,
|
||||||
|
plan_tier VARCHAR(20) NOT NULL DEFAULT 'free',
|
||||||
|
max_agents INTEGER NOT NULL DEFAULT 100,
|
||||||
|
max_tokens_per_month INTEGER NOT NULL DEFAULT 10000,
|
||||||
|
status VARCHAR(20) NOT NULL DEFAULT 'active',
|
||||||
|
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
|
||||||
|
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
|
||||||
|
CONSTRAINT organizations_status_check CHECK (status IN ('active', 'suspended', 'deleted')),
|
||||||
|
CONSTRAINT organizations_plan_check CHECK (plan_tier IN ('free', 'pro', 'enterprise'))
|
||||||
|
);
|
||||||
|
|
||||||
|
CREATE INDEX idx_organizations_slug ON organizations(slug);
|
||||||
|
CREATE INDEX idx_organizations_status ON organizations(status);
|
||||||
|
```
|
||||||
|
|
||||||
|
### New Table: organization_members
|
||||||
|
|
||||||
|
```sql
|
||||||
|
CREATE TABLE organization_members (
|
||||||
|
member_id VARCHAR(40) PRIMARY KEY,
|
||||||
|
organization_id VARCHAR(40) NOT NULL REFERENCES organizations(organization_id),
|
||||||
|
agent_id VARCHAR(40) NOT NULL REFERENCES agents(agent_id),
|
||||||
|
role VARCHAR(20) NOT NULL DEFAULT 'member',
|
||||||
|
joined_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
|
||||||
|
CONSTRAINT organization_members_role_check CHECK (role IN ('member', 'admin')),
|
||||||
|
UNIQUE (organization_id, agent_id)
|
||||||
|
);
|
||||||
|
|
||||||
|
CREATE INDEX idx_org_members_org_id ON organization_members(organization_id);
|
||||||
|
CREATE INDEX idx_org_members_agent_id ON organization_members(agent_id);
|
||||||
|
```
|
||||||
|
|
||||||
|
### Modified: agents table
|
||||||
|
|
||||||
|
```sql
|
||||||
|
ALTER TABLE agents
|
||||||
|
ADD COLUMN organization_id VARCHAR(40) NOT NULL REFERENCES organizations(organization_id) DEFAULT 'org_system';
|
||||||
|
|
||||||
|
CREATE INDEX idx_agents_organization_id ON agents(organization_id);
|
||||||
|
|
||||||
|
-- RLS
|
||||||
|
ALTER TABLE agents ENABLE ROW LEVEL SECURITY;
|
||||||
|
CREATE POLICY agents_org_isolation ON agents
|
||||||
|
USING (organization_id = current_setting('app.organization_id', true));
|
||||||
|
```
|
||||||
|
|
||||||
|
### Modified: credentials table
|
||||||
|
|
||||||
|
```sql
|
||||||
|
ALTER TABLE credentials
|
||||||
|
ADD COLUMN organization_id VARCHAR(40) NOT NULL REFERENCES organizations(organization_id) DEFAULT 'org_system';
|
||||||
|
|
||||||
|
CREATE INDEX idx_credentials_organization_id ON credentials(organization_id);
|
||||||
|
ALTER TABLE credentials ENABLE ROW LEVEL SECURITY;
|
||||||
|
CREATE POLICY credentials_org_isolation ON credentials
|
||||||
|
USING (organization_id = current_setting('app.organization_id', true));
|
||||||
|
```
|
||||||
|
|
||||||
|
### Modified: audit_logs table
|
||||||
|
|
||||||
|
```sql
|
||||||
|
ALTER TABLE audit_logs
|
||||||
|
ADD COLUMN organization_id VARCHAR(40) NOT NULL REFERENCES organizations(organization_id) DEFAULT 'org_system';
|
||||||
|
|
||||||
|
CREATE INDEX idx_audit_logs_organization_id ON audit_logs(organization_id);
|
||||||
|
ALTER TABLE audit_logs ENABLE ROW LEVEL SECURITY;
|
||||||
|
CREATE POLICY audit_logs_org_isolation ON audit_logs
|
||||||
|
USING (organization_id = current_setting('app.organization_id', true));
|
||||||
|
```
|
||||||
|
|
||||||
|
### Seed: Default system organization
|
||||||
|
|
||||||
|
```sql
|
||||||
|
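-- Run this seed before the ALTER TABLE statements that add organization_id:
-- the DEFAULT 'org_system' foreign key requires this row to exist.
INSERT INTO organizations (organization_id, name, slug, plan_tier, max_agents, max_tokens_per_month, status)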
|
||||||
|
VALUES ('org_system', 'System', 'system', 'enterprise', 999999, 999999999, 'active');
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Configuration
|
||||||
|
|
||||||
|
| Environment Variable | Description | Default |
|
||||||
|
|---------------------|-------------|---------|
|
||||||
|
| `MULTI_TENANCY_ENABLED` | Enable organization enforcement (set false for single-tenant mode) | `true` |
|
||||||
|
| `DEFAULT_ORG_ID` | Organization ID to assign pre-tenancy data during migration | `org_system` |
|
||||||
|
| `MAX_ORGS_PER_INSTANCE` | Hard cap on number of organizations per instance | `1000` |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Dependencies
|
||||||
|
|
||||||
|
No new npm packages. Row-level tenancy uses the existing PostgreSQL client (`pg`) and query patterns.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Security Considerations
|
||||||
|
|
||||||
|
- PostgreSQL RLS is enabled as defense-in-depth — even accidental omission of `organization_id` filter at application layer is caught by the database
|
||||||
|
- `SET LOCAL app.organization_id` is called at the start of every database transaction
|
||||||
|
- The `admin:orgs` scope is a new privileged scope — only system-level agent credentials carry it
|
||||||
|
- Organization slugs are public-facing but organization IDs are internal — never expose organization IDs in public URLs where avoidable
|
||||||
|
- `DELETE /organizations/:orgId` is soft-delete only; hard deletion requires a separate admin runbook to prevent accidental data loss
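
The per-transaction organization context can be sketched as a wrapper like the one below. This is an assumption-laden illustration: `withOrgContext` and the `Queryable` interface are hypothetical, and a node-postgres client is only assumed to satisfy that interface. The sketch uses `set_config('app.organization_id', $1, true)`, which is transaction-local like `SET LOCAL` but accepts a bind parameter, whereas `SET LOCAL` itself cannot be parameterized.

```typescript
// Minimal structural type so the sketch does not depend on a specific driver.
interface Queryable {
  query(text: string, values?: unknown[]): Promise<unknown>;
}

// Hypothetical wrapper: runs `work` inside a transaction whose RLS context is
// bound to the caller's organization.
async function withOrgContext<T>(
  client: Queryable,
  organizationId: string,
  work: (c: Queryable) => Promise<T>,
): Promise<T> {
  await client.query("BEGIN");
  try {
    // The third argument `true` scopes the setting to this transaction only.
    await client.query("SELECT set_config('app.organization_id', $1, true)", [organizationId]);
    const result = await work(client);
    await client.query("COMMIT");
    return result;
  } catch (err) {
    await client.query("ROLLBACK");
    throw err;
  }
}
```

Because the setting is transaction-local, a pooled connection returned after `COMMIT` or `ROLLBACK` does not carry one tenant's context into the next request. Note also that PostgreSQL does not apply RLS policies to the table owner unless `FORCE ROW LEVEL SECURITY` is set, so the application role is assumed not to own these tables.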
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Acceptance Criteria
|
||||||
|
|
||||||
|
- [ ] Single AgentIdP instance can serve 2+ organizations with zero cross-organization data leakage
|
||||||
|
- [ ] All agent/credential/audit operations are scoped to caller's organization_id from JWT
|
||||||
|
- [ ] PostgreSQL RLS policies verified: direct DB query without app.organization_id setting returns 0 rows
|
||||||
|
- [ ] Organization CRUD endpoints return correct 403 when caller lacks admin:orgs scope
|
||||||
|
- [ ] Pre-existing agents assigned to default system organization without data loss
|
||||||
|
- [ ] TypeScript strict, zero `any`, >80% test coverage on OrgService
|
||||||
366
openspec/specs/oidc/spec.md
Normal file
@@ -0,0 +1,366 @@
|
|||||||
|
# OpenID Connect (OIDC) — Specification
|
||||||
|
|
||||||
|
**Workstream**: 3 of 6
|
||||||
|
**Phase**: 3 — Enterprise
|
||||||
|
**Author**: Virtual Architect
|
||||||
|
**Date**: 2026-03-29
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Overview
|
||||||
|
|
||||||
|
Add a full OIDC 1.0 layer on top of the existing OAuth 2.0 `client_credentials` implementation using the certified `oidc-provider` npm library. The OIDC layer exposes Discovery and JWKS endpoints, extends the token endpoint to return ID tokens with agent claims, and provides an `/agent-info` endpoint (the agent-identity equivalent of OIDC's `/userinfo`).
|
||||||
|
|
||||||
|
The existing `POST /oauth2/token` endpoint is extended, not replaced. Callers that do not request the `openid` scope continue to receive standard OAuth 2.0 responses unchanged.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## API Endpoints
|
||||||
|
|
||||||
|
### GET /.well-known/openid-configuration
|
||||||
|
|
||||||
|
OIDC Discovery document. No authentication required. This is the standard discovery endpoint defined by OpenID Connect Discovery 1.0; RFC 8414 defines the analogous OAuth 2.0 Authorization Server Metadata document.
|
||||||
|
|
||||||
|
```yaml
|
||||||
|
GET /.well-known/openid-configuration
|
||||||
|
No authentication required
|
||||||
|
|
||||||
|
Responses:
|
||||||
|
200 OK:
|
||||||
|
Content-Type: application/json
|
||||||
|
schema:
|
||||||
|
type: object
|
||||||
|
description: OIDC Discovery document per OpenID Connect Discovery 1.0
|
||||||
|
example:
|
||||||
|
issuer: "https://idp.sentryagent.ai"
|
||||||
|
authorization_endpoint: "https://idp.sentryagent.ai/oauth2/authorize"
|
||||||
|
token_endpoint: "https://idp.sentryagent.ai/oauth2/token"
|
||||||
|
jwks_uri: "https://idp.sentryagent.ai/.well-known/jwks.json"
|
||||||
|
userinfo_endpoint: "https://idp.sentryagent.ai/agent-info"
|
||||||
|
introspection_endpoint: "https://idp.sentryagent.ai/oauth2/introspect"
|
||||||
|
revocation_endpoint: "https://idp.sentryagent.ai/oauth2/revoke"
|
||||||
|
response_types_supported:
|
||||||
|
- "token"
|
||||||
|
grant_types_supported:
|
||||||
|
- "client_credentials"
|
||||||
|
subject_types_supported:
|
||||||
|
- "public"
|
||||||
|
id_token_signing_alg_values_supported:
|
||||||
|
- "RS256"
|
||||||
|
- "ES256"
|
||||||
|
scopes_supported:
|
||||||
|
- "openid"
|
||||||
|
- "agents:read"
|
||||||
|
- "agents:write"
|
||||||
|
- "tokens:read"
|
||||||
|
- "audit:read"
|
||||||
|
claims_supported:
|
||||||
|
- "sub"
|
||||||
|
- "iss"
|
||||||
|
- "iat"
|
||||||
|
- "exp"
|
||||||
|
- "agent_id"
|
||||||
|
- "agent_type"
|
||||||
|
- "organization_id"
|
||||||
|
- "capabilities"
|
||||||
|
- "deployment_env"
|
||||||
|
- "owner"
|
||||||
|
token_endpoint_auth_methods_supported:
|
||||||
|
- "client_secret_post"
|
||||||
|
- "client_secret_basic"
|
||||||
|
500 Internal Server Error:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### GET /.well-known/jwks.json
|
||||||
|
|
||||||
|
JSON Web Key Set. Contains the public keys used to sign ID tokens and access tokens. No authentication required. Clients use this endpoint to verify token signatures.
|
||||||
|
|
||||||
|
```yaml
|
||||||
|
GET /.well-known/jwks.json
|
||||||
|
No authentication required
|
||||||
|
|
||||||
|
Responses:
|
||||||
|
200 OK:
|
||||||
|
Content-Type: application/json
|
||||||
|
Cache-Control: public, max-age=3600
|
||||||
|
schema:
|
||||||
|
type: object
|
||||||
|
required: [keys]
|
||||||
|
properties:
|
||||||
|
keys:
|
||||||
|
type: array
|
||||||
|
items:
|
||||||
|
type: object
|
||||||
|
description: JSON Web Key (RFC 7517)
|
||||||
|
properties:
|
||||||
|
kty:
|
||||||
|
type: string
|
||||||
|
example: "RSA"
|
||||||
|
use:
|
||||||
|
type: string
|
||||||
|
example: "sig"
|
||||||
|
kid:
|
||||||
|
type: string
|
||||||
|
description: Key ID — matches `kid` header in issued JWTs
|
||||||
|
alg:
|
||||||
|
type: string
|
||||||
|
example: "RS256"
|
||||||
|
n:
|
||||||
|
type: string
|
||||||
|
description: RSA modulus (base64url)
|
||||||
|
e:
|
||||||
|
type: string
|
||||||
|
description: RSA exponent (base64url)
|
||||||
|
example:
|
||||||
|
keys:
|
||||||
|
- kty: "RSA"
|
||||||
|
use: "sig"
|
||||||
|
kid: "key-2026-03-29-01"
|
||||||
|
alg: "RS256"
|
||||||
|
n: "0vx7agoebGcQSuuPiLJXZptN9nndrQmbXEps2aiAFbWhM78LhWx4cbbfAAt..."
|
||||||
|
e: "AQAB"
|
||||||
|
500 Internal Server Error:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
```
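
A client verifying a token first has to pick the right key out of this document. The sketch below shows only that selection step, matching the token header's `kid` against the key set; the `Jwk`/`Jwks` types and `selectSigningKey` are hypothetical names, and signature verification itself is left to a JWT library.

```typescript
// Subset of RFC 7517 fields used by the JWKS response above.
interface Jwk {
  kty: string;
  use?: string;
  kid?: string;
  alg?: string;
  n?: string;
  e?: string;
}

interface Jwks {
  keys: Jwk[];
}

// Hypothetical helper: finds the signing key for a token whose header carries
// the given kid and alg. Keys without `use` or `alg` are treated as unrestricted.
function selectSigningKey(jwks: Jwks, kid: string, alg: string): Jwk | undefined {
  return jwks.keys.find(
    (k) =>
      k.kid === kid &&
      (k.use === undefined || k.use === "sig") &&
      (k.alg === undefined || k.alg === alg),
  );
}
```

If no key matches, a client would typically refetch the JWKS once (the signing key may have rotated) before rejecting the token.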
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### POST /oauth2/token (extended)
|
||||||
|
|
||||||
|
The existing token endpoint is extended to return an `id_token` when the `openid` scope is requested. All existing behavior is preserved when `openid` is not in the scope list.
|
||||||
|
|
||||||
|
```yaml
|
||||||
|
POST /oauth2/token
|
||||||
|
Content-Type: application/x-www-form-urlencoded
|
||||||
|
|
||||||
|
Request Body:
|
||||||
|
schema:
|
||||||
|
type: object
|
||||||
|
required: [grant_type, client_id, client_secret]
|
||||||
|
properties:
|
||||||
|
grant_type:
|
||||||
|
type: string
|
||||||
|
enum: [client_credentials]
|
||||||
|
client_id:
|
||||||
|
type: string
|
||||||
|
client_secret:
|
||||||
|
type: string
|
||||||
|
scope:
|
||||||
|
type: string
|
||||||
|
description: Space-separated scopes. Include "openid" to receive an id_token.
|
||||||
|
example: "openid agents:read"
|
||||||
|
|
||||||
|
Responses:
|
||||||
|
200 OK (with openid scope):
|
||||||
|
schema:
|
||||||
|
type: object
|
||||||
|
properties:
|
||||||
|
access_token:
|
||||||
|
type: string
|
||||||
|
token_type:
|
||||||
|
type: string
|
||||||
|
example: "Bearer"
|
||||||
|
expires_in:
|
||||||
|
type: integer
|
||||||
|
scope:
|
||||||
|
type: string
|
||||||
|
id_token:
|
||||||
|
type: string
|
||||||
|
description: Signed JWT ID token containing agent identity claims. Only present when openid scope was requested.
|
||||||
|
example:
|
||||||
|
access_token: "eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9..."
|
||||||
|
token_type: "Bearer"
|
||||||
|
expires_in: 3600
|
||||||
|
scope: "openid agents:read"
|
||||||
|
id_token: "eyJhbGciOiJSUzI1NiIsImtpZCI6ImtleS0yMDI2LTAzLTI5LTAxIn0..."
|
||||||
|
|
||||||
|
200 OK (without openid scope):
|
||||||
|
schema:
|
||||||
|
type: object
|
||||||
|
properties:
|
||||||
|
access_token:
|
||||||
|
type: string
|
||||||
|
token_type:
|
||||||
|
type: string
|
||||||
|
expires_in:
|
||||||
|
type: integer
|
||||||
|
scope:
|
||||||
|
type: string
|
||||||
|
example:
|
||||||
|
access_token: "eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9..."
|
||||||
|
token_type: "Bearer"
|
||||||
|
expires_in: 3600
|
||||||
|
scope: "agents:read"
|
||||||
|
|
||||||
|
400 Bad Request:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/OAuthErrorResponse'
|
||||||
|
example:
|
||||||
|
error: "invalid_client"
|
||||||
|
error_description: "Invalid client credentials"
|
||||||
|
|
||||||
|
401 Unauthorized:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/OAuthErrorResponse'
|
||||||
|
```
|
||||||
|
|
||||||
|
#### ID Token Claims
|
||||||
|
|
||||||
|
When `openid` scope is requested, the ID token (a signed JWT) contains the following claims:
|
||||||
|
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"iss": "https://idp.sentryagent.ai",
|
||||||
|
"sub": "agt_01HXK7Z9P3FKWABCDEF67890",
|
||||||
|
"aud": "agt_01HXK7Z9P3FKWABCDEF67890",
|
||||||
|
"iat": 1743249600,
|
||||||
|
"exp": 1743253200,
|
||||||
|
"agent_id": "agt_01HXK7Z9P3FKWABCDEF67890",
|
||||||
|
"agent_type": "orchestrator",
|
||||||
|
"organization_id": "org_01HXK7Z9P3FKWABCDEF12345",
|
||||||
|
"capabilities": ["task-planning", "tool-use"],
|
||||||
|
"deployment_env": "production",
|
||||||
|
"owner": "acme-ai",
|
||||||
|
"did": "did:web:idp.sentryagent.ai:agents:agt_01HXK7Z9P3FKWABCDEF67890"
|
||||||
|
}
|
||||||
|
```
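
As a rough sketch, the claim set above could be assembled from an agent record as follows. The `AgentRecord` shape and `buildIdTokenClaims` function are hypothetical; deriving the `did` from the issuer hostname is inferred from the example values rather than stated by the spec, and issuers with an explicit port are not handled here.

```typescript
// Hypothetical view of the agent fields that feed the ID token.
interface AgentRecord {
  agentId: string;
  agentType: string;
  organizationId: string;
  capabilities: string[];
  deploymentEnv: string;
  owner: string;
}

interface IdTokenClaims {
  iss: string;
  sub: string;
  aud: string;
  iat: number;
  exp: number;
  agent_id: string;
  agent_type: string;
  organization_id: string;
  capabilities: string[];
  deployment_env: string;
  owner: string;
  did: string;
}

// Builds the unsigned claim set; signing is left to the OIDC library.
function buildIdTokenClaims(
  agent: AgentRecord,
  issuer: string,
  nowSeconds: number,
  ttlSeconds = 3600, // matches the OIDC_ID_TOKEN_TTL_SECONDS default
): IdTokenClaims {
  // Assumption: the did:web identifier reuses the issuer hostname (no port).
  const hostname = new URL(issuer).hostname;
  return {
    iss: issuer,
    sub: agent.agentId,
    aud: agent.agentId, // in client_credentials, the agent is its own audience
    iat: nowSeconds,
    exp: nowSeconds + ttlSeconds,
    agent_id: agent.agentId,
    agent_type: agent.agentType,
    organization_id: agent.organizationId,
    capabilities: [...agent.capabilities],
    deployment_env: agent.deploymentEnv,
    owner: agent.owner,
    did: `did:web:${hostname}:agents:${agent.agentId}`,
  };
}
```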
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### GET /agent-info
|
||||||
|
|
||||||
|
Returns claims about the authenticated agent identity. This is the agent-first equivalent of the OIDC `/userinfo` endpoint. Requires authentication with any valid access token.
|
||||||
|
|
||||||
|
```yaml
|
||||||
|
GET /agent-info
|
||||||
|
Authorization: Bearer <access_token>
|
||||||
|
|
||||||
|
Responses:
|
||||||
|
200 OK:
|
||||||
|
Content-Type: application/json
|
||||||
|
schema:
|
||||||
|
type: object
|
||||||
|
description: Agent identity claims (subset of registered agent data)
|
||||||
|
properties:
|
||||||
|
sub:
|
||||||
|
type: string
|
||||||
|
description: Subject — agentId
|
||||||
|
agent_id:
|
||||||
|
type: string
|
||||||
|
agent_type:
|
||||||
|
type: string
|
||||||
|
organization_id:
|
||||||
|
type: string
|
||||||
|
capabilities:
|
||||||
|
type: array
|
||||||
|
items:
|
||||||
|
type: string
|
||||||
|
deployment_env:
|
||||||
|
type: string
|
||||||
|
owner:
|
||||||
|
type: string
|
||||||
|
version:
|
||||||
|
type: string
|
||||||
|
status:
|
||||||
|
type: string
|
||||||
|
did:
|
||||||
|
type: string
|
||||||
|
description: W3C DID for this agent (if DID workstream is active)
|
||||||
|
created_at:
|
||||||
|
type: string
|
||||||
|
format: date-time
|
||||||
|
example:
|
||||||
|
sub: "agt_01HXK7Z9P3FKWABCDEF67890"
|
||||||
|
agent_id: "agt_01HXK7Z9P3FKWABCDEF67890"
|
||||||
|
agent_type: "orchestrator"
|
||||||
|
organization_id: "org_01HXK7Z9P3FKWABCDEF12345"
|
||||||
|
capabilities: ["task-planning", "tool-use"]
|
||||||
|
deployment_env: "production"
|
||||||
|
owner: "acme-ai"
|
||||||
|
version: "1.2.0"
|
||||||
|
status: "active"
|
||||||
|
did: "did:web:idp.sentryagent.ai:agents:agt_01HXK7Z9P3FKWABCDEF67890"
|
||||||
|
created_at: "2026-03-29T12:00:00Z"
|
||||||
|
401 Unauthorized:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
example:
|
||||||
|
code: "UNAUTHORIZED"
|
||||||
|
message: "Invalid or expired access token"
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Database Schema Changes
|
||||||
|
|
||||||
|
### New Table: oidc_keys
|
||||||
|
|
||||||
|
Stores metadata for the RSA/EC keys used for ID token signing. Private keys are stored in Vault (referenced by `vault_key_path`); the public key JWK is stored in PostgreSQL to serve the JWKS endpoint.
|
||||||
|
|
||||||
|
```sql
|
||||||
|
CREATE TABLE oidc_keys (
|
||||||
|
key_id VARCHAR(40) PRIMARY KEY,
|
||||||
|
kid VARCHAR(100) NOT NULL UNIQUE, -- Key ID in JWKS
|
||||||
|
algorithm VARCHAR(10) NOT NULL,
|
||||||
|
use_purpose VARCHAR(10) NOT NULL DEFAULT 'sig',
|
||||||
|
public_key_jwk JSONB NOT NULL,
|
||||||
|
vault_key_path VARCHAR(255) NOT NULL,
|
||||||
|
is_current BOOLEAN NOT NULL DEFAULT TRUE,
|
||||||
|
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
|
||||||
|
retired_at TIMESTAMPTZ,
|
||||||
|
CONSTRAINT oidc_keys_alg_check CHECK (algorithm IN ('RS256', 'ES256')),
|
||||||
|
CONSTRAINT oidc_keys_use_check CHECK (use_purpose IN ('sig', 'enc'))
|
||||||
|
);
|
||||||
|
|
||||||
|
CREATE INDEX idx_oidc_keys_is_current ON oidc_keys(is_current) WHERE is_current = TRUE;
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Configuration
|
||||||
|
|
||||||
|
| Environment Variable | Description | Default |
|---------------------|-------------|---------|
| `OIDC_ISSUER` | OIDC issuer URL (must match token `iss` claim) | `https://${HOST}` |
| `OIDC_ID_TOKEN_TTL_SECONDS` | ID token lifetime | `3600` |
| `OIDC_SIGNING_ALG` | ID token signing algorithm | `RS256` |
| `OIDC_JWKS_CACHE_TTL_SECONDS` | JWKS response cache TTL | `3600` |
| `OIDC_KEY_ROTATION_DAYS` | Days between signing key rotations | `90` |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Dependencies
|
||||||
|
|
||||||
|
| Package | Version | Purpose |
|---------|---------|---------|
| `oidc-provider` | `^8.4.6` | Certified OIDC server library (OpenID Foundation conformant) |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Security Considerations
|
||||||
|
|
||||||
|
- ID token signing keys are stored in Vault; public keys only are served via JWKS
- JWKS endpoint is cached in Redis (`OIDC_JWKS_CACHE_TTL_SECONDS`) to prevent key-hammering
- Key rotation: when a new signing key is created, the old key remains in JWKS until all tokens signed with it have expired
- The `openid` scope is only issued to callers explicitly requesting it — not included by default
- `GET /agent-info` returns the same data as the ID token — no additional sensitive data
- ID tokens for agent credentials must not contain client secrets or internal system paths
- `alg: none` is explicitly rejected — all ID tokens must be signed
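
The key-rotation rule in the list above can be sketched as a filter over `oidc_keys` rows. This is a minimal illustration, not the project's implementation: the `OidcKeyRow` shape and the `buildJwks` name are assumptions, and the retention window is taken to be `retired_at` plus `OIDC_ID_TOKEN_TTL_SECONDS`, since no token signed by a retired key can outlive that point.

```typescript
// Hypothetical row shape mirroring the oidc_keys columns used here.
interface OidcKeyRow {
  kid: string;
  retiredAt: Date | null; // null while the key is still current
  publicKeyJwk: Record<string, unknown>;
}

// Builds the JWKS document: current keys plus retired keys whose last
// possibly-signed ID token has not yet expired.
function buildJwks(
  keys: OidcKeyRow[],
  idTokenTtlSeconds: number,
  now: Date = new Date(),
): { keys: Record<string, unknown>[] } {
  const published = keys.filter((key) => {
    if (key.retiredAt === null) return true;
    const lastPossibleExpiryMs = key.retiredAt.getTime() + idTokenTtlSeconds * 1000;
    return now.getTime() < lastPossibleExpiryMs;
  });
  // Only public material is emitted; vault_key_path never leaves the service.
  return { keys: published.map((key) => ({ ...key.publicKeyJwk, kid: key.kid })) };
}
```

With a one-hour token TTL, a key retired thirty minutes ago would still be published, while one retired a month ago would not.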
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Acceptance Criteria
|
||||||
|
|
||||||
|
- [ ] `/.well-known/openid-configuration` passes OIDC Discovery conformance validation
- [ ] `/.well-known/jwks.json` returns valid JWKS with current signing public key
- [ ] ID token returned when `openid` scope is in token request; not returned otherwise
- [ ] ID token is verifiable against JWKS endpoint using standard JWT libraries
- [ ] ID token claims match agent record (agent_type, capabilities, organization_id, did)
- [ ] `/agent-info` returns correct claims for authenticated agent
- [ ] Key rotation: old JWKS key is kept until all signed tokens expire
- [ ] TypeScript strict, zero `any`, >80% test coverage on OIDCService
|
||||||
7
openspec/specs/operations/spec.md
Normal file
@@ -0,0 +1,7 @@
|
|||||||
|
## ADDED Requirements
|
||||||
|
|
||||||
|
### Requirement: Security guide exists at docs/devops/security.md
|
||||||
|
The system SHALL provide `docs/devops/security.md` documenting RSA keypair generation, key rotation procedure, CORS configuration, and secret storage guidance.
|
||||||
|
|
||||||
|
### Requirement: Operations runbook exists at docs/devops/operations.md
|
||||||
|
The system SHALL provide `docs/devops/operations.md` covering startup procedure, graceful shutdown (SIGTERM/SIGINT), log interpretation, and troubleshooting for the most common operational failures.
|
||||||
45
openspec/specs/quick-start/spec.md
Normal file
@@ -0,0 +1,45 @@
|
|||||||
|
## ADDED Requirements
|
||||||
|
|
||||||
|
### Requirement: Quick-start guide exists at docs/developers/quick-start.md
|
||||||
|
The system SHALL provide a quick-start guide at `docs/developers/quick-start.md` that enables a bedroom developer to register their first agent and issue an OAuth 2.0 access token in under 5 minutes.
|
||||||
|
|
||||||
|
#### Scenario: Developer completes quick-start from zero
|
||||||
|
- **WHEN** a developer with no prior AgentIdP knowledge follows the quick-start guide
|
||||||
|
- **THEN** they SHALL have a registered agent, a valid credential, and a working access token by the end
|
||||||
|
|
||||||
|
### Requirement: Quick-start lists exact prerequisites
|
||||||
|
The quick-start guide SHALL list all prerequisites at the top before any steps, so the developer knows what they need before starting.
|
||||||
|
|
||||||
|
#### Scenario: Prerequisites are minimal and explicit
|
||||||
|
- **WHEN** the developer reads the prerequisites section
|
||||||
|
- **THEN** they SHALL see exactly: Docker (for running PostgreSQL and Redis) and curl (for API calls) — nothing else required
|
||||||
|
|
||||||
|
### Requirement: Quick-start provides a working docker-compose startup command
|
||||||
|
The quick-start guide SHALL include a single command to start the required infrastructure (PostgreSQL + Redis) using the project's `docker-compose.yml`.
|
||||||
|
|
||||||
|
#### Scenario: Developer starts infrastructure
|
||||||
|
- **WHEN** the developer runs the provided docker-compose command
|
||||||
|
- **THEN** the guide SHALL confirm what services are started and what ports they run on
|
||||||
|
|
||||||
|
### Requirement: Quick-start covers the full 4-step workflow
|
||||||
|
The quick-start guide SHALL cover exactly these four steps in order, each with a working curl command and the expected response:
|
||||||
|
|
||||||
|
1. Start the AgentIdP server
2. Register an agent (`POST /agents`)
3. Generate a credential (`POST /agents/{agentId}/credentials`)
4. Issue an access token (`POST /token`)
|
||||||
|
|
||||||
|
#### Scenario: Each step has a copy-pasteable curl command
|
||||||
|
- **WHEN** the developer reads any step
|
||||||
|
- **THEN** they SHALL find a complete curl command with real placeholder values they can substitute
|
||||||
|
|
||||||
|
#### Scenario: Each step shows the expected JSON response
|
||||||
|
- **WHEN** the developer runs a curl command from the guide
|
||||||
|
- **THEN** the guide SHALL show them what a successful response looks like so they can verify their output
|
||||||
|
|
||||||
|
### Requirement: Quick-start ends with a next-steps section
|
||||||
|
The quick-start guide SHALL end with a "What's Next" section linking to: core-concepts.md, developer-guides.md, and api-reference.md.
|
||||||
|
|
||||||
|
#### Scenario: Developer knows where to go after quick-start
|
||||||
|
- **WHEN** the developer reaches the end of the quick-start
|
||||||
|
- **THEN** they SHALL see at least 3 links to deeper documentation
|
||||||
335
openspec/specs/soc2/spec.md
Normal file
@@ -0,0 +1,335 @@
|
|||||||
|
# SOC 2 Type II Preparation — Specification
|
||||||
|
|
||||||
|
**Workstream**: 6 of 6
|
||||||
|
**Phase**: 3 — Enterprise
|
||||||
|
**Author**: Virtual Architect
|
||||||
|
**Date**: 2026-03-29
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Overview
|
||||||
|
|
||||||
|
Implement the technical controls required for SOC 2 Type II audit readiness. SOC 2 Type II certifies that security controls operate continuously over a defined period — not just that they exist. Controls are implemented in code, not just documented.
|
||||||
|
|
||||||
|
This workstream cuts across all other Phase 3 workstreams. It delivers: encryption at rest for sensitive columns, TLS enforcement middleware, automated secrets rotation, security event alerting, and audit log immutability via a Merkle hash chain. A compliance documentation package (controls matrix and runbook) is produced for auditors.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Technical Controls
|
||||||
|
|
||||||
|
### Control C1: Encryption at Rest (Column-Level Encryption)
|
||||||
|
|
||||||
|
Sensitive columns in PostgreSQL are encrypted using `pgcrypto` symmetric encryption. The encryption key is stored in Vault and fetched at application startup, never written to disk.
|
||||||
|
|
||||||
|
**Columns encrypted**:
|
||||||
|
- `credentials.secret_hash` — encrypted with AES-256-CBC
- `credentials.vault_path` — encrypted with AES-256-CBC
- `webhook_subscriptions.vault_secret_path` — encrypted with AES-256-CBC
- `agent_did_keys.vault_key_path` — encrypted with AES-256-CBC
|
||||||
|
|
||||||
|
**Implementation**: An `EncryptionService` wraps `pgcrypto` `pgp_sym_encrypt` / `pgp_sym_decrypt`. The key is a 256-bit symmetric key stored at `secret/agentidp/encryption/column-key` in Vault. All INSERT/SELECT operations for encrypted columns go through `EncryptionService`.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Control C2: TLS Enforcement
|
||||||
|
|
||||||
|
All inbound HTTP connections are rejected in production if TLS is not present. This is enforced at two levels:
|
||||||
|
1. Express middleware: `TLSEnforcementMiddleware`. If `X-Forwarded-Proto` is not `https` and `NODE_ENV=production`, respond with a `301 Moved Permanently` redirect to the HTTPS URL.
2. Terraform: Load balancers (Phase 2 Terraform modules) already enforce TLS; the middleware provides defense-in-depth.
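
The middleware level described above might look like the following sketch. It is written against minimal structural types so it runs without Express installed; Express request and response objects provide these members. The function name `tlsEnforcement`, the `400` response for a missing `Host` header, and the assumption of a single trusted proxy setting `X-Forwarded-Proto` are illustrative choices, not taken from the spec.

```typescript
// Minimal structural types (assumed); Express's Request/Response satisfy them.
interface Req {
  headers: Record<string, string | string[] | undefined>;
  url: string;
}
interface Res {
  redirect(status: number, url: string): void;
  sendStatus(status: number): void;
}

// Returns a middleware that redirects non-HTTPS requests in production
// and passes everything through in other environments.
function tlsEnforcement(nodeEnv: string | undefined) {
  return (req: Req, res: Res, next: () => void): void => {
    const proto = req.headers["x-forwarded-proto"];
    if (nodeEnv !== "production" || proto === "https") {
      next();
      return;
    }
    const host = req.headers["host"];
    if (typeof host !== "string") {
      // Without a Host header there is no safe redirect target.
      res.sendStatus(400);
      return;
    }
    res.redirect(301, `https://${host}${req.url}`);
  };
}
```

Passing `process.env.NODE_ENV` at wiring time keeps the environment check out of the request path and makes the middleware easy to test.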
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Control C3: Automated Secrets Rotation
|
||||||
|
|
||||||
|
A scheduled job (`SecretsRotationJob`) runs on a configurable cron schedule. It:
|
||||||
|
1. Identifies credentials whose `expires_at` is within `ROTATION_WARNING_DAYS` days
2. Emits a Prometheus metric `agentidp_credentials_expiring_soon_total` (labelled by `org_id`, `days_remaining`)
3. Renews Vault leases for all active credentials
4. Sends a webhook event `credential.expiring_soon` to subscribers who have opted in
|
||||||
|
|
||||||
|
This does not automatically rotate credentials without operator action — it alerts and prepares. Forced rotation requires an operator call to the existing `POST /agents/:id/credentials/:credId/rotate` endpoint.
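
Step 1 of the job can be sketched as a pure function over credential rows. The `CredentialRow` shape and the `findExpiringSoon` name are assumptions for illustration; the real job would read rows from PostgreSQL and feed the result into the metric and webhook steps.

```typescript
// Hypothetical projection of the credentials table.
interface CredentialRow {
  credentialId: string;
  orgId: string;
  expiresAt: Date;
}

interface ExpiringCredential extends CredentialRow {
  daysRemaining: number;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Selects credentials that are not yet expired but expire within warningDays.
// daysRemaining is the label value the Prometheus metric would carry.
function findExpiringSoon(
  credentials: CredentialRow[],
  warningDays: number,
  now: Date = new Date(),
): ExpiringCredential[] {
  return credentials
    .map((c) => ({
      ...c,
      daysRemaining: Math.floor((c.expiresAt.getTime() - now.getTime()) / MS_PER_DAY),
    }))
    .filter((c) => c.daysRemaining >= 0 && c.daysRemaining <= warningDays);
}
```

Already-expired credentials are excluded here on the assumption that they are handled elsewhere; the spec does not say how the job treats them.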
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Control C4: Audit Log Immutability (Merkle Hash Chain)
|
||||||
|
|
||||||
|
Every `audit_logs` row carries two new columns:
|
||||||
|
- `hash`: SHA-256 of `(eventId || timestamp.toISOString() || action || outcome || agentId || organizationId || previousHash)`
- `previous_hash`: hash of the immediately preceding `audit_logs` row (by `created_at` order), or the genesis string `"GENESIS"` for the first row
|
||||||
|
|
||||||
|
A PostgreSQL trigger prevents `UPDATE` and `DELETE` on `audit_logs`.
|
||||||
|
|
||||||
|
A new admin endpoint `GET /audit/verify` runs a sequential chain verification pass and returns the integrity status.
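
The hash formula and the sequential verification pass can be sketched as follows, using Node's built-in `crypto` module. The `AuditRow` shape and the function names are assumptions; field order in the concatenation follows the formula above, and the sketch verifies from genesis only (a ranged check would seed the expected previous hash from the row before the range).

```typescript
import { createHash } from "node:crypto";

// Hypothetical in-memory view of an audit_logs row, ordered by created_at.
interface AuditRow {
  eventId: string;
  timestamp: Date;
  action: string;
  outcome: string;
  agentId: string;
  organizationId: string;
  hash: string;
  previousHash: string;
}

const GENESIS = "GENESIS";

// SHA-256 over the concatenated fields, hex-encoded (64 characters).
function computeHash(row: Omit<AuditRow, "hash">): string {
  const payload =
    row.eventId + row.timestamp.toISOString() + row.action + row.outcome +
    row.agentId + row.organizationId + row.previousHash;
  return createHash("sha256").update(payload).digest("hex");
}

// Walks the chain in order and stops at the first row that does not link
// to its predecessor or whose stored hash does not match its contents.
function verifyChain(rows: AuditRow[]): {
  valid: boolean;
  rowsVerified: number;
  brokenAtEventId: string | null;
} {
  let expectedPrevious = GENESIS;
  let rowsVerified = 0;
  for (const row of rows) {
    if (row.previousHash !== expectedPrevious || row.hash !== computeHash(row)) {
      return { valid: false, rowsVerified, brokenAtEventId: row.eventId };
    }
    expectedPrevious = row.hash;
    rowsVerified += 1;
  }
  return { valid: true, rowsVerified, brokenAtEventId: null };
}
```

Because each hash covers the previous one, editing any row invalidates that row and breaks the link for every row after it, which is what lets a single pass locate the first tampered event.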
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Control C5: Security Event Alerting
|
||||||
|
|
||||||
|
Prometheus alerting rules are written for the following security events:
|
||||||
|
|
||||||
|
| Alert | Condition | Severity |
|-------|-----------|---------|
| `AuthFailureSpike` | >50 `auth.failed` events in 5 minutes | Warning |
| `RateLimitExhaustion` | >80% of org rate limit consumed in 1 minute | Warning |
| `AnomalousTokenIssuance` | Token issuance rate exceeds 3x the 7-day average | Warning |
| `WebhookDeadLetterAccumulating` | `agentidp_webhook_dead_letters_total` increases by >10 in 1 hour | Warning |
| `AuditChainIntegrityFailed` | `agentidp_audit_chain_integrity` metric is 0 | Critical |
| `CredentialExpiryApproaching` | `agentidp_credentials_expiring_soon_total{days_remaining="7"}` > 0 | Info |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## API Endpoints
|
||||||
|
|
||||||
|
### GET /audit/verify
|
||||||
|
|
||||||
|
Verify the Merkle hash chain integrity of the audit log. Requires `admin:orgs` scope. This is a potentially expensive operation on large audit logs — it is rate-limited to once per 5 minutes per organization.
|
||||||
|
|
||||||
|
```yaml
GET /audit/verify
Authorization: Bearer <token with admin:orgs scope>

Query Parameters:
  fromDate:
    type: string
    format: date-time
    description: Start of verification range. If omitted, verifies from genesis.
  toDate:
    type: string
    format: date-time
    description: End of verification range. If omitted, verifies to the latest row.

Responses:
  200 OK:
    schema:
      type: object
      properties:
        valid:
          type: boolean
          description: True if the chain is intact across the entire range
        rowsVerified:
          type: integer
          description: Number of audit rows verified
        firstEventId:
          type: string
        lastEventId:
          type: string
        firstTimestamp:
          type: string
          format: date-time
        lastTimestamp:
          type: string
          format: date-time
        verifiedAt:
          type: string
          format: date-time
        brokenAtEventId:
          type: string
          nullable: true
          description: Present only if valid=false — the first eventId where the chain breaks
    example:
      valid: true
      rowsVerified: 15420
      firstEventId: "evt_genesis_00001"
      lastEventId: "evt_01HXK7Z9P3FKWABCDEFZZZZZ"
      firstTimestamp: "2026-01-01T00:00:00Z"
      lastTimestamp: "2026-03-29T12:00:00Z"
      verifiedAt: "2026-03-29T14:00:00Z"
      brokenAtEventId: null
  401 Unauthorized:
    schema:
      $ref: '#/components/schemas/ErrorResponse'
  403 Forbidden:
    schema:
      $ref: '#/components/schemas/ErrorResponse'
  429 Too Many Requests:
    schema:
      $ref: '#/components/schemas/ErrorResponse'
    example:
      code: "RATE_LIMITED"
      message: "Audit verification can be run at most once per 5 minutes"
```
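
The once-per-5-minutes-per-organization limit on `GET /audit/verify` could be enforced with a small gate like the one below. This is an in-memory sketch under stated assumptions: the `VerifyRateLimiter` class is hypothetical, and a deployment with more than one server instance would presumably need shared state (for example in Redis) rather than a local `Map`.

```typescript
// Allows one verification run per organization per window.
class VerifyRateLimiter {
  private readonly windowMs: number;
  private readonly lastRunMs = new Map<string, number>();

  constructor(windowMs: number = 5 * 60 * 1000) {
    this.windowMs = windowMs;
  }

  // Returns true and records the run if the organization may verify now;
  // returns false when the caller should answer 429 RATE_LIMITED.
  tryAcquire(organizationId: string, nowMs: number = Date.now()): boolean {
    const last = this.lastRunMs.get(organizationId);
    if (last !== undefined && nowMs - last < this.windowMs) {
      return false;
    }
    this.lastRunMs.set(organizationId, nowMs);
    return true;
  }
}
```

Rejected attempts do not extend the window, so a client that retries early is still admitted five minutes after its last successful run.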
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### GET /compliance/controls
|
||||||
|
|
||||||
|
Returns the current status of all SOC 2 technical controls. Requires `admin:orgs` scope. Used by auditors and compliance dashboards.
|
||||||
|
|
||||||
|
```yaml
GET /compliance/controls
Authorization: Bearer <token with admin:orgs scope>

Responses:
  200 OK:
    schema:
      type: object
      properties:
        generatedAt:
          type: string
          format: date-time
        controls:
          type: array
          items:
            type: object
            properties:
              controlId:
                type: string
              name:
                type: string
              status:
                type: string
                enum: [pass, fail, warning, not_applicable]
              description:
                type: string
              lastChecked:
                type: string
                format: date-time
    example:
      generatedAt: "2026-03-29T14:00:00Z"
      controls:
        - controlId: "C1"
          name: "Encryption at Rest"
          status: "pass"
          description: "Column-level encryption active for all sensitive columns"
          lastChecked: "2026-03-29T14:00:00Z"
        - controlId: "C2"
          name: "TLS Enforcement"
          status: "pass"
          description: "All non-TLS requests redirected to HTTPS in production"
          lastChecked: "2026-03-29T14:00:00Z"
        - controlId: "C3"
          name: "Secrets Rotation"
          status: "warning"
          description: "3 credentials expiring within 7 days"
          lastChecked: "2026-03-29T14:00:00Z"
        - controlId: "C4"
          name: "Audit Log Immutability"
          status: "pass"
          description: "Merkle chain intact — last verified 2026-03-29T13:55:00Z"
          lastChecked: "2026-03-29T14:00:00Z"
        - controlId: "C5"
          name: "Security Event Alerting"
          status: "pass"
          description: "All 6 alerting rules active in Prometheus"
          lastChecked: "2026-03-29T14:00:00Z"
  401 Unauthorized:
    schema:
      $ref: '#/components/schemas/ErrorResponse'
  403 Forbidden:
    schema:
      $ref: '#/components/schemas/ErrorResponse'
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Database Schema Changes
|
||||||
|
|
||||||
|
### Modified: audit_logs table
|
||||||
|
|
||||||
|
```sql
ALTER TABLE audit_logs
  ADD COLUMN hash VARCHAR(64), -- SHA-256 hex string of chain node
  ADD COLUMN previous_hash VARCHAR(64); -- Hash of preceding row, or "GENESIS"

-- Back-fill genesis hash for existing rows (one-time migration)
-- Migration script computes chain in order of created_at

-- Prevent updates and deletes (immutability trigger)
CREATE OR REPLACE FUNCTION prevent_audit_modification()
RETURNS TRIGGER AS $$
BEGIN
  RAISE EXCEPTION 'audit_logs rows are immutable — modification is not permitted';
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER audit_logs_immutability
  BEFORE UPDATE OR DELETE ON audit_logs
  FOR EACH ROW EXECUTE FUNCTION prevent_audit_modification();
```
|
||||||
|
|
||||||
|
### Modified: credentials table
|
||||||
|
|
||||||
|
```sql
-- Columns remain same type; application now stores encrypted values
-- No DDL change — encryption is transparent at application layer
-- Add comment for documentation
COMMENT ON COLUMN credentials.secret_hash IS 'AES-256-CBC encrypted via EncryptionService (pgcrypto). Not a plain bcrypt hash.';
COMMENT ON COLUMN credentials.vault_path IS 'AES-256-CBC encrypted via EncryptionService.';
```
|
||||||
|
|
||||||
|
### New Table: compliance_check_log
|
||||||
|
|
||||||
|
```sql
CREATE TABLE compliance_check_log (
  check_id VARCHAR(40) PRIMARY KEY,
  organization_id VARCHAR(40) NOT NULL REFERENCES organizations(organization_id),
  control_id VARCHAR(10) NOT NULL,
  status VARCHAR(20) NOT NULL,
  details JSONB NOT NULL DEFAULT '{}',
  checked_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE INDEX idx_compliance_check_org ON compliance_check_log(organization_id, checked_at DESC);
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Configuration
|
||||||
|
|
||||||
|
| Environment Variable | Description | Default |
|---------------------|-------------|---------|
| `SOC2_CONTROLS_ENABLED` | Enable SOC 2 controls enforcement | `true` |
| `TLS_ENFORCEMENT_ENABLED` | Enforce HTTPS in production | `true` in production, `false` in development |
| `COLUMN_ENCRYPTION_KEY_PATH` | Vault path for AES-256 column encryption key | `secret/agentidp/encryption/column-key` |
| `ROTATION_WARNING_DAYS` | Days before expiry to emit rotation warning | `30` |
| `SECRETS_ROTATION_CRON` | Cron schedule for rotation check job | `0 3 * * *` (daily at 3 AM UTC) |
| `AUDIT_CHAIN_VERIFY_CRON` | Cron schedule for automated chain verification | `0 2 * * *` (daily at 2 AM UTC) |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Dependencies
|
||||||
|
|
||||||
|
| Package | Version | Purpose |
|---------|---------|---------|
| `node-forge` | `^1.3.1` | AES-256-CBC column-level encryption primitives |
|
||||||
|
|
||||||
|
Note: `pgcrypto` PostgreSQL extension must be enabled: `CREATE EXTENSION IF NOT EXISTS pgcrypto;`
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Compliance Documentation
|
||||||
|
|
||||||
|
The following documents are produced as part of this workstream:
|
||||||
|
|
||||||
|
| Document | Path | Description |
|----------|------|-------------|
| Controls Matrix | `docs/compliance/soc2-controls-matrix.md` | Maps SOC 2 Trust Services Criteria to implemented controls |
| Encryption Runbook | `docs/compliance/encryption-runbook.md` | Key rotation procedure, Vault key path map |
| Audit Log Runbook | `docs/compliance/audit-log-runbook.md` | How to run chain verification, interpret results |
| Incident Response | `docs/compliance/incident-response.md` | Security event response procedures |
| Secrets Rotation Guide | `docs/compliance/secrets-rotation.md` | Operator guide for credential and key rotation |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Security Considerations
|
||||||
|
|
||||||
|
- Column encryption key is fetched from Vault at startup and held in process memory — never written to disk or logged
- Key rotation: a migration re-encrypts all sensitive columns under the new encryption key; the old key is retained in Vault history
- The immutability trigger on `audit_logs` prevents application-layer modification; a `SUPERUSER` can still bypass triggers — document this in the controls matrix as a residual risk requiring compensating controls (e.g., read-only replica verification)
- `GET /audit/verify` is rate-limited to prevent denial-of-service via repeated expensive sequential scans
- `GET /compliance/controls` never returns raw secrets or key material — only control status
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Acceptance Criteria
|
||||||
|
|
||||||
|
- [ ] `pgcrypto` extension enabled; sensitive columns are encrypted at rest (verified: plaintext not visible in direct DB query)
- [ ] TLS enforcement middleware redirects HTTP to HTTPS in production; passthrough in development
- [ ] `SecretsRotationJob` runs on schedule; emits Prometheus metric for expiring credentials
- [ ] Audit log immutability trigger prevents UPDATE/DELETE on `audit_logs` table
- [ ] `GET /audit/verify` returns `valid: true` for an unmodified chain
- [ ] `GET /audit/verify` returns `valid: false` with `brokenAtEventId` after a row is manually tampered with (test scenario)
- [ ] All 6 Prometheus alerting rules are present in `monitoring/prometheus/alerts.yml`
- [ ] `GET /compliance/controls` returns correct status for all 5 controls
- [ ] Compliance documentation written and reviewed
- [ ] TypeScript strict, zero `any`, >80% test coverage on SOC2 control implementations
|
||||||
10
openspec/specs/system-overview/spec.md
Normal file
@@ -0,0 +1,10 @@
|
|||||||
|
## ADDED Requirements
|
||||||
|
|
||||||
|
### Requirement: System overview exists at docs/devops/README.md
|
||||||
|
The system SHALL provide a `docs/devops/README.md` that serves as the entry point for DevOps engineers, including an index of all DevOps docs and a brief system overview.
|
||||||
|
|
||||||
|
### Requirement: Architecture doc exists at docs/devops/architecture.md
|
||||||
|
The system SHALL provide `docs/devops/architecture.md` documenting all components (Express server, PostgreSQL, Redis), their roles, ports, and data flow.
|
||||||
|
|
||||||
|
### Requirement: Environment variable reference exists at docs/devops/environment-variables.md
|
||||||
|
The system SHALL provide `docs/devops/environment-variables.md` documenting every environment variable with name, type, required/optional, default, and example value.
|
||||||
353
openspec/specs/w3c-dids/spec.md
Normal file
@@ -0,0 +1,353 @@
|
|||||||
|
# W3C Decentralized Identifiers (DIDs) — Specification
|
||||||
|
|
||||||
|
**Workstream**: 2 of 6
|
||||||
|
**Phase**: 3 — Enterprise
|
||||||
|
**Author**: Virtual Architect
|
||||||
|
**Date**: 2026-03-29
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Overview
|
||||||
|
|
||||||
|
Issue a W3C `did:web` identifier for every registered agent and serve DID Documents over HTTPS. The AgentIdP instance itself has a root DID Document at `/.well-known/did.json`. Each agent has an individual DID Document at `/agents/:id/did`. A DID resolution endpoint wraps the standard resolution workflow. Agent cards in AGNTCY format are derivable from DID Documents.
|
||||||
|
|
||||||
|
The `did:web` method resolves to `https://<host>/.well-known/did.json` (instance) and `https://<host>/agents/<agentId>/did` (per-agent). All DID Documents are W3C DID Core 1.0 compliant.
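
The mapping between agent IDs, DIDs, and document URLs described above can be sketched as two small functions. The names `agentDid` and `didDocumentUrl` are hypothetical. The DID form follows the examples in this spec (`did:web:<host>:agents:<agentId>`), and the per-agent URL follows this spec's `/agents/<agentId>/did` route; note that generic `did:web` resolvers append `/did.json` to path-based DIDs, so the sketch reflects this document's routes rather than the generic algorithm.

```typescript
// Builds the per-agent DID. did:web percent-encodes the colon of an
// optional port, which encodeURIComponent handles ("host:3000" -> "host%3A3000").
function agentDid(host: string, agentId: string): string {
  return `did:web:${encodeURIComponent(host)}:agents:${agentId}`;
}

// Maps a did:web identifier to the URL this spec serves its document from.
function didDocumentUrl(did: string): string {
  const [scheme, method, encodedHost, ...segments] = did.split(":");
  if (scheme !== "did" || method !== "web" || encodedHost === undefined || encodedHost === "") {
    throw new Error(`not a did:web identifier: ${did}`);
  }
  const host = decodeURIComponent(encodedHost);
  if (segments.length === 0) {
    // Instance-level DID: the root document lives under /.well-known.
    return `https://${host}/.well-known/did.json`;
  }
  const path = segments.map((segment) => decodeURIComponent(segment)).join("/");
  return `https://${host}/${path}/did`;
}
```

Keeping the two directions in one module makes it harder for the DID stored on an agent record and the route that serves its document to drift apart.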
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## API Endpoints
|
||||||
|
|
||||||
|
### GET /.well-known/did.json
|
||||||
|
|
||||||
|
Root DID Document for the AgentIdP instance. No authentication required — this is a public discovery endpoint.
|
||||||
|
|
||||||
|
```yaml
GET /.well-known/did.json
No authentication required

Responses:
  200 OK:
    Content-Type: application/json
    schema:
      type: object
      description: W3C DID Core 1.0 compliant DID Document
      required: [id, "@context", verificationMethod, authentication]
      properties:
        "@context":
          type: array
          items:
            type: string
          example:
            - "https://www.w3.org/ns/did/v1"
            - "https://w3id.org/security/suites/jws-2020/v1"
        id:
          type: string
          description: DID for this AgentIdP instance
          example: "did:web:idp.sentryagent.ai"
        controller:
          type: string
          example: "did:web:idp.sentryagent.ai"
        verificationMethod:
          type: array
          items:
            $ref: '#/components/schemas/VerificationMethod'
        authentication:
          type: array
          items:
            type: string
          description: References to verification methods for authentication
        assertionMethod:
          type: array
          items:
            type: string
        service:
          type: array
          items:
            $ref: '#/components/schemas/DIDService'
    example:
      "@context":
        - "https://www.w3.org/ns/did/v1"
      id: "did:web:idp.sentryagent.ai"
      controller: "did:web:idp.sentryagent.ai"
      verificationMethod:
        - id: "did:web:idp.sentryagent.ai#key-1"
          type: "JsonWebKey2020"
          controller: "did:web:idp.sentryagent.ai"
          publicKeyJwk:
            kty: "EC"
            crv: "P-256"
            x: "f83OJ3D2xF1Bg8vub9tLe1gHMzV76e8Tus9uPHvRVEU"
            y: "x_FEzRu9m36HLN_tue659LNpXW6pCyStikYjKIWI5a0"
      authentication:
        - "did:web:idp.sentryagent.ai#key-1"
      service:
        - id: "did:web:idp.sentryagent.ai#agent-registry"
          type: "AgentIdentityProvider"
          serviceEndpoint: "https://idp.sentryagent.ai"
  500 Internal Server Error:
    schema:
      $ref: '#/components/schemas/ErrorResponse'
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### GET /agents/:id/did
|
||||||
|
|
||||||
|
Per-agent DID Document. No authentication required — DID Documents are public.
|
||||||
|
|
||||||
|
```yaml
GET /agents/{agentId}/did
No authentication required

Path Parameters:
  agentId:
    type: string
    description: Agent ID

Responses:
  200 OK:
    Content-Type: application/json
    schema:
      type: object
      description: W3C DID Core 1.0 compliant per-agent DID Document
    example:
      "@context":
        - "https://www.w3.org/ns/did/v1"
        - "https://w3id.org/agntcy/v1"
      id: "did:web:idp.sentryagent.ai:agents:agt_01HXK7Z9P3FKWABCDEF67890"
      controller: "did:web:idp.sentryagent.ai"
      verificationMethod:
        - id: "did:web:idp.sentryagent.ai:agents:agt_01HXK7Z9P3FKWABCDEF67890#key-1"
          type: "JsonWebKey2020"
          controller: "did:web:idp.sentryagent.ai:agents:agt_01HXK7Z9P3FKWABCDEF67890"
          publicKeyJwk:
            kty: "EC"
            crv: "P-256"
            x: "abc123"
            y: "def456"
      authentication:
        - "did:web:idp.sentryagent.ai:agents:agt_01HXK7Z9P3FKWABCDEF67890#key-1"
      service:
        - id: "did:web:idp.sentryagent.ai:agents:agt_01HXK7Z9P3FKWABCDEF67890#agent-card"
          type: "AgentCard"
          serviceEndpoint: "https://idp.sentryagent.ai/agents/agt_01HXK7Z9P3FKWABCDEF67890/did/card"
      agntcy:
        agentId: "agt_01HXK7Z9P3FKWABCDEF67890"
        agentType: "orchestrator"
        capabilities:
          - "task-planning"
          - "tool-use"
        deploymentEnv: "production"
        owner: "acme-ai"
        version: "1.2.0"
  404 Not Found:
    schema:
      $ref: '#/components/schemas/ErrorResponse'
    example:
      code: "AGENT_NOT_FOUND"
      message: "Agent not found"
  410 Gone:
    schema:
      $ref: '#/components/schemas/ErrorResponse'
    example:
      code: "AGENT_DECOMMISSIONED"
      message: "Agent has been decommissioned — DID Document is no longer active"
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### GET /agents/:id/did/resolve
|
||||||
|
|
||||||
|
DID resolution endpoint: resolves the `did:web` DID of the agent identified by `agentId` and returns the DID resolution result in W3C DID Resolution format. This enables external systems to use AgentIdP as a resolver for agent DIDs. Authentication required (`agents:read` scope).
|
||||||
|
|
||||||
|
```yaml
GET /agents/{agentId}/did/resolve
Authorization: Bearer <token with agents:read scope>

Path Parameters:
  agentId:
    type: string

Responses:
  200 OK:
    Content-Type: application/ld+json;profile="https://w3id.org/did-resolution"
    schema:
      type: object
      required: [didDocument, didDocumentMetadata, didResolutionMetadata]
      properties:
        didDocument:
          type: object
          description: The resolved DID Document
        didDocumentMetadata:
          type: object
          properties:
            created:
              type: string
              format: date-time
            updated:
              type: string
              format: date-time
            deactivated:
              type: boolean
        didResolutionMetadata:
          type: object
          properties:
            contentType:
              type: string
              example: "application/did+ld+json"
            retrieved:
              type: string
              format: date-time
    example:
      didDocument:
        "@context": ["https://www.w3.org/ns/did/v1"]
        id: "did:web:idp.sentryagent.ai:agents:agt_01HXK7Z9P3FKWABCDEF67890"
      didDocumentMetadata:
        created: "2026-03-29T12:00:00Z"
        updated: "2026-03-29T12:00:00Z"
        deactivated: false
      didResolutionMetadata:
        contentType: "application/did+ld+json"
        retrieved: "2026-03-29T14:00:00Z"
  401 Unauthorized:
    schema:
      $ref: '#/components/schemas/ErrorResponse'
  404 Not Found:
    schema:
      $ref: '#/components/schemas/ErrorResponse'
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### GET /agents/:id/did/card
|
||||||
|
|
||||||
|
AGNTCY-format agent card derived from DID Document. Returns a JSON object representing the agent's identity and capabilities in the AGNTCY agent card format. No authentication required.
|
||||||
|
|
||||||
|
```yaml
|
||||||
|
GET /agents/{agentId}/did/card
|
||||||
|
No authentication required
|
||||||
|
|
||||||
|
Responses:
|
||||||
|
200 OK:
|
||||||
|
Content-Type: application/json
|
||||||
|
schema:
|
||||||
|
type: object
|
||||||
|
description: AGNTCY-format agent card
|
||||||
|
properties:
|
||||||
|
did:
|
||||||
|
type: string
|
||||||
|
name:
|
||||||
|
type: string
|
||||||
|
agentType:
|
||||||
|
type: string
|
||||||
|
capabilities:
|
||||||
|
type: array
|
||||||
|
items:
|
||||||
|
type: string
|
||||||
|
owner:
|
||||||
|
type: string
|
||||||
|
version:
|
||||||
|
type: string
|
||||||
|
deploymentEnv:
|
||||||
|
type: string
|
||||||
|
identityProvider:
|
||||||
|
type: string
|
||||||
|
description: DID of the issuing AgentIdP instance
|
||||||
|
issuedAt:
|
||||||
|
type: string
|
||||||
|
format: date-time
|
||||||
|
example:
|
||||||
|
did: "did:web:idp.sentryagent.ai:agents:agt_01HXK7Z9P3FKWABCDEF67890"
|
||||||
|
name: "acme-orchestrator"
|
||||||
|
agentType: "orchestrator"
|
||||||
|
capabilities: ["task-planning", "tool-use"]
|
||||||
|
owner: "acme-ai"
|
||||||
|
version: "1.2.0"
|
||||||
|
deploymentEnv: "production"
|
||||||
|
identityProvider: "did:web:idp.sentryagent.ai"
|
||||||
|
issuedAt: "2026-03-29T12:00:00Z"
|
||||||
|
404 Not Found:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
```
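The card fields above map closely onto an agent registration, so deriving a card can be sketched as a pure function. This is a minimal illustration, not the actual DIDService code: `AgentRecord`, `buildAgentCard`, and the choice to take `issuedAt` from a `didCreatedAt` value are assumptions made for the example.

```typescript
// Agent card shape, mirroring the properties in the schema above.
interface AgentCard {
  did: string;
  name: string;
  agentType: string;
  capabilities: string[];
  owner: string;
  version: string;
  deploymentEnv: string;
  identityProvider: string; // DID of the issuing AgentIdP instance
  issuedAt: string; // ISO 8601 date-time
}

// Hypothetical registration record; the real agent model may use other names.
interface AgentRecord {
  agentId: string;
  name: string;
  agentType: string;
  capabilities: string[];
  owner: string;
  version: string;
  deploymentEnv: string;
  didCreatedAt: string; // assumed source for issuedAt
}

// Builds the card from data already present in the registration.
function buildAgentCard(agent: AgentRecord, didWebDomain: string): AgentCard {
  const identityProvider = `did:web:${didWebDomain}`;
  return {
    did: `${identityProvider}:agents:${agent.agentId}`,
    name: agent.name,
    agentType: agent.agentType,
    capabilities: [...agent.capabilities],
    owner: agent.owner,
    version: agent.version,
    deploymentEnv: agent.deploymentEnv,
    identityProvider,
    issuedAt: agent.didCreatedAt,
  };
}
```

Because the function only copies registration fields, it cannot leak anything that is not already part of the agent record.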
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Database Schema Changes
|
||||||
|
|
||||||
|
### New Table: agent_did_keys
|
||||||
|
|
||||||
|
Stores the public key and Vault reference for the key pair used to sign each agent's DID Document. The private key is stored in Vault; only the public key JWK and the Vault path are stored in PostgreSQL.
|
||||||
|
|
||||||
|
```sql
|
||||||
|
CREATE TABLE agent_did_keys (
|
||||||
|
key_id VARCHAR(40) PRIMARY KEY,
|
||||||
|
agent_id VARCHAR(40) NOT NULL UNIQUE REFERENCES agents(agent_id),
|
||||||
|
organization_id VARCHAR(40) NOT NULL REFERENCES organizations(organization_id),
|
||||||
|
public_key_jwk JSONB NOT NULL,
|
||||||
|
vault_key_path VARCHAR(255) NOT NULL, -- Vault path where private key is stored
|
||||||
|
key_type VARCHAR(20) NOT NULL DEFAULT 'EC',
|
||||||
|
curve VARCHAR(10) NOT NULL DEFAULT 'P-256',
|
||||||
|
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
|
||||||
|
rotated_at TIMESTAMPTZ,
|
||||||
|
CONSTRAINT agent_did_keys_key_type_check CHECK (key_type IN ('EC', 'RSA'))
|
||||||
|
);
|
||||||
|
|
||||||
|
-- agent_id is already indexed by its UNIQUE constraint, so no separate index is created for it
|
||||||
|
CREATE INDEX idx_agent_did_keys_org_id ON agent_did_keys(organization_id);
|
||||||
|
```
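To illustrate how a row for this table could be produced, the sketch below generates a P-256 key pair with Node's built-in `crypto` module and separates the public JWK (for PostgreSQL) from the private key (for Vault). The `key_` ID format, the Vault path layout, and the function name are hypothetical; only the column names come from the schema above.

```typescript
import { generateKeyPairSync, randomUUID } from "node:crypto";
import type { JsonWebKey } from "node:crypto";

// Mirrors the agent_did_keys columns that are set at creation time.
interface AgentDidKeyRow {
  keyId: string;
  agentId: string;
  organizationId: string;
  publicKeyJwk: JsonWebKey;
  vaultKeyPath: string;
  keyType: "EC";
  curve: "P-256";
}

function generateAgentDidKey(
  agentId: string,
  organizationId: string,
): { row: AgentDidKeyRow; privateKeyPem: string } {
  const { publicKey, privateKey } = generateKeyPairSync("ec", { namedCurve: "P-256" });

  // Only the public half is exported as a JWK for the database row.
  const publicKeyJwk = publicKey.export({ format: "jwk" });

  // The private key is returned separately so the caller can write it to Vault.
  const privateKeyPem = privateKey.export({ format: "pem", type: "pkcs8" }).toString();

  const row: AgentDidKeyRow = {
    keyId: `key_${randomUUID().replace(/-/g, "")}`, // hypothetical ID format, 36 chars
    agentId,
    organizationId,
    publicKeyJwk,
    vaultKeyPath: `secret/agentidp/did-keys/${agentId}`, // hypothetical Vault path
    keyType: "EC",
    curve: "P-256",
  };
  return { row, privateKeyPem };
}
```

Keeping `privateKeyPem` out of the row type makes it harder to persist the private key to PostgreSQL by accident.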
|
||||||
|
|
||||||
|
### New Column: agents.did
|
||||||
|
|
||||||
|
```sql
|
||||||
|
ALTER TABLE agents
|
||||||
|
ADD COLUMN did VARCHAR(255),
|
||||||
|
ADD COLUMN did_created_at TIMESTAMPTZ;
|
||||||
|
|
||||||
|
-- Populated automatically on agent creation
|
||||||
|
-- Example value: "did:web:idp.sentryagent.ai:agents:agt_01HXK7Z9P3FKWABCDEF67890"
|
||||||
|
```
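The `did` value follows the `did:web` method, where colons separate path segments and the identifier maps to an HTTPS URL. The sketch below shows that mapping under the rules of the `did:web` method specification; the function names are illustrative, and whether AgentIdP serves per-agent documents at exactly the derived path is an assumption not confirmed by this section.

```typescript
// Builds the DID stored in agents.did from the configured domain and agent ID.
function buildAgentDid(domain: string, agentId: string): string {
  // did:web requires a port separator in the domain to be percent-encoded.
  const encodedDomain = domain.replace(/:/g, "%3A");
  return `did:web:${encodedDomain}:agents:${agentId}`;
}

// Maps a did:web identifier to the URL where its DID Document is expected.
function didWebToUrl(did: string): string {
  const prefix = "did:web:";
  if (!did.startsWith(prefix)) {
    throw new Error(`Not a did:web identifier: ${did}`);
  }
  const segments = did
    .slice(prefix.length)
    .split(":")
    .map((segment) => decodeURIComponent(segment));
  const host = segments[0];
  if (!host) {
    throw new Error(`Missing domain in DID: ${did}`);
  }
  const path = segments.slice(1);
  // A bare domain resolves to the well-known location; a path resolves to <path>/did.json.
  return path.length === 0
    ? `https://${host}/.well-known/did.json`
    : `https://${host}/${path.join("/")}/did.json`;
}
```

For the example value above, `didWebToUrl` yields `https://idp.sentryagent.ai/agents/agt_01HXK7Z9P3FKWABCDEF67890/did.json`.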
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Configuration
|
||||||
|
|
||||||
|
| Environment Variable | Description | Default |
|
||||||
|
|---------------------|-------------|---------|
|
||||||
|
| `DID_WEB_DOMAIN` | Domain name for `did:web` construction | Required (derived from `HOST` if not set) |
|
||||||
|
| `DID_KEY_TYPE` | Cryptographic key type for DID keys | `EC` |
|
||||||
|
| `DID_KEY_CURVE` | Elliptic curve for EC keys | `P-256` |
|
||||||
|
| `DID_DOCUMENT_CACHE_TTL_SECONDS` | How long to cache DID Documents in Redis | `300` |
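
As a rough illustration of how these variables and defaults might be read at startup, the sketch below parses them from an environment map. `DidConfig` and `loadDidConfig` are hypothetical names, and rejecting invalid values with an error is an assumed policy rather than documented behaviour.

```typescript
interface DidConfig {
  webDomain: string;
  keyType: "EC" | "RSA";
  keyCurve: string;
  documentCacheTtlSeconds: number;
}

function loadDidConfig(env: Record<string, string | undefined>): DidConfig {
  // DID_WEB_DOMAIN falls back to HOST; one of the two must be present.
  const webDomain = env["DID_WEB_DOMAIN"] || env["HOST"];
  if (!webDomain) {
    throw new Error("Set DID_WEB_DOMAIN (or HOST) to construct did:web identifiers");
  }

  const keyType = env["DID_KEY_TYPE"] || "EC";
  if (keyType !== "EC" && keyType !== "RSA") {
    throw new Error(`Unsupported DID_KEY_TYPE: ${keyType}`);
  }

  const ttl = Number(env["DID_DOCUMENT_CACHE_TTL_SECONDS"] || "300");
  if (!Number.isInteger(ttl) || ttl < 0) {
    throw new Error("DID_DOCUMENT_CACHE_TTL_SECONDS must be a non-negative integer");
  }

  return {
    webDomain,
    keyType,
    keyCurve: env["DID_KEY_CURVE"] || "P-256",
    documentCacheTtlSeconds: ttl,
  };
}
```

In a Node service this would typically be called once with `process.env`.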
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Dependencies
|
||||||
|
|
||||||
|
| Package | Version | Purpose |
|
||||||
|
|---------|---------|---------|
|
||||||
|
| `did-resolver` | `^4.1.0` | W3C DID resolution interface |
|
||||||
|
| `web-did-resolver` | `^2.0.27` | DID:WEB method resolver |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Security Considerations
|
||||||
|
|
||||||
|
- DID Document endpoints are public: no authentication is required, and no sensitive data is exposed
|
||||||
|
- Private keys for DID signing are stored in Vault; never written to PostgreSQL
|
||||||
|
- DID Document cache in Redis has a TTL — stale documents are evicted automatically
|
||||||
|
- Decommissioned agents return HTTP 410 Gone with `deactivated: true` in DID Document metadata
|
||||||
|
- DID rotation: when a credential is rotated, the DID Document key can optionally be rotated; the old key is retained in history
|
||||||
|
- `GET /agents/:id/did/card` exposes only data already present in the agent registration — no new sensitive fields
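
The decommissioning rule in the list above can be expressed as a small mapping from agent status to HTTP status and DID Document metadata. This is a sketch only: the status names are inferred from the lifecycle events elsewhere in the spec, the function name is hypothetical, and treating suspended agents as resolvable (HTTP 200) is an assumption.

```typescript
type AgentStatus = "active" | "suspended" | "decommissioned";

interface DidDocumentMetadata {
  created: string;
  updated: string;
  deactivated: boolean;
}

// Decides the HTTP status and metadata for a DID Document request.
function resolveDidResponse(
  status: AgentStatus,
  created: string,
  updated: string,
): { httpStatus: 200 | 410; didDocumentMetadata: DidDocumentMetadata } {
  const deactivated = status === "decommissioned";
  return {
    httpStatus: deactivated ? 410 : 200, // 410 Gone for decommissioned agents
    didDocumentMetadata: { created, updated, deactivated },
  };
}
```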
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Acceptance Criteria
|
||||||
|
|
||||||
|
- [ ] Every new agent registration automatically generates a `did:web` DID and key pair
|
||||||
|
- [ ] Root DID Document at `/.well-known/did.json` is W3C DID Core 1.0 compliant (validated by `did-resolver`)
|
||||||
|
- [ ] Per-agent DID Document returns correct `did:web` identifier and public key JWK
|
||||||
|
- [ ] DID resolution endpoint returns W3C DID Resolution format
|
||||||
|
- [ ] Decommissioned agent DID Document returns 410 Gone with `deactivated: true`
|
||||||
|
- [ ] Agent card at `/agents/:id/did/card` matches AGNTCY agent card format
|
||||||
|
- [ ] Private keys never appear in any API response or log
|
||||||
|
- [ ] TypeScript strict, zero `any`, >80% test coverage on DIDService
|
||||||
476
openspec/specs/webhooks/spec.md
Normal file
@@ -0,0 +1,476 @@
|
|||||||
|
# Webhooks and Event Streaming — Specification
|
||||||
|
|
||||||
|
**Workstream**: 5 of 6
|
||||||
|
**Phase**: 3 — Enterprise
|
||||||
|
**Author**: Virtual Architect
|
||||||
|
**Date**: 2026-03-29
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Overview
|
||||||
|
|
||||||
|
AgentIdP delivers real-time event notifications for agent lifecycle events via HTTP webhooks. Operators create webhook subscriptions specifying a target URL, the events they want to receive, and a secret for HMAC-SHA256 signature verification. Delivery is asynchronous via a Redis-backed `bull` queue; failed deliveries are retried on an increasing backoff schedule (max 10 attempts). All deliveries are logged for observability.
|
||||||
|
|
||||||
|
Supported events: `agent.created`, `agent.updated`, `agent.suspended`, `agent.reactivated`, `agent.decommissioned`, `credential.generated`, `credential.rotated`, `credential.revoked`, `token.issued`, `token.revoked`.
|
||||||
|
|
||||||
|
An optional Kafka/NATS adapter enables high-throughput event streaming alongside webhook delivery.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## API Endpoints
|
||||||
|
|
||||||
|
### POST /webhooks
|
||||||
|
|
||||||
|
Create a new webhook subscription. Requires `agents:write` scope.
|
||||||
|
|
||||||
|
```yaml
|
||||||
|
POST /webhooks
|
||||||
|
Authorization: Bearer <token with agents:write scope>
|
||||||
|
Content-Type: application/json
|
||||||
|
|
||||||
|
Request Body:
|
||||||
|
schema:
|
||||||
|
type: object
|
||||||
|
required: [url, events, secret]
|
||||||
|
properties:
|
||||||
|
url:
|
||||||
|
type: string
|
||||||
|
format: uri
|
||||||
|
description: HTTPS endpoint to deliver events to
|
||||||
|
example: "https://app.example.com/hooks/agentidp"
|
||||||
|
events:
|
||||||
|
type: array
|
||||||
|
items:
|
||||||
|
type: string
|
||||||
|
enum:
|
||||||
|
- agent.created
|
||||||
|
- agent.updated
|
||||||
|
- agent.suspended
|
||||||
|
- agent.reactivated
|
||||||
|
- agent.decommissioned
|
||||||
|
- credential.generated
|
||||||
|
- credential.rotated
|
||||||
|
- credential.revoked
|
||||||
|
- token.issued
|
||||||
|
- token.revoked
|
||||||
|
- "*"
|
||||||
|
minItems: 1
|
||||||
|
description: List of event types to subscribe to. Use ["*"] to subscribe to all events.
|
||||||
|
example: ["agent.created", "credential.rotated"]
|
||||||
|
secret:
|
||||||
|
type: string
|
||||||
|
minLength: 16
|
||||||
|
description: Secret used to compute the HMAC-SHA256 signature. Store it securely; it is not returned by the API after creation.
|
||||||
|
example: "whsec_super_secret_value_here"
|
||||||
|
description:
|
||||||
|
type: string
|
||||||
|
maxLength: 255
|
||||||
|
description: Optional human-readable description for this subscription
|
||||||
|
active:
|
||||||
|
type: boolean
|
||||||
|
default: true
|
||||||
|
|
||||||
|
Responses:
|
||||||
|
201 Created:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/WebhookSubscription'
|
||||||
|
example:
|
||||||
|
subscriptionId: "wh_01HXK7Z9P3FKWABCDEF55555"
|
||||||
|
organizationId: "org_01HXK7Z9P3FKWABCDEF12345"
|
||||||
|
url: "https://app.example.com/hooks/agentidp"
|
||||||
|
events: ["agent.created", "credential.rotated"]
|
||||||
|
description: "Production event sink"
|
||||||
|
active: true
|
||||||
|
createdAt: "2026-03-29T12:00:00Z"
|
||||||
|
updatedAt: "2026-03-29T12:00:00Z"
|
||||||
|
400 Bad Request:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
examples:
|
||||||
|
invalid_url:
|
||||||
|
code: "VALIDATION_ERROR"
|
||||||
|
message: "url must be a valid HTTPS URI"
|
||||||
|
invalid_event:
|
||||||
|
code: "VALIDATION_ERROR"
|
||||||
|
message: "Unknown event type: agent.unknown"
|
||||||
|
401 Unauthorized:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
403 Forbidden:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
```
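The request constraints above (HTTPS URL, known event types, minimum secret length) lend themselves to a small validation function. The following is a sketch under assumptions: the function and type names are hypothetical, and only the two error messages shown in the 400 examples are taken from the spec; the others are invented for illustration.

```typescript
const WEBHOOK_EVENTS: readonly string[] = [
  "agent.created", "agent.updated", "agent.suspended", "agent.reactivated",
  "agent.decommissioned", "credential.generated", "credential.rotated",
  "credential.revoked", "token.issued", "token.revoked", "*",
];

interface CreateWebhookRequest {
  url: string;
  events: string[];
  secret: string;
  description?: string;
  active?: boolean;
}

// Returns a list of validation messages; an empty list means the body is acceptable.
function validateCreateWebhook(body: CreateWebhookRequest): string[] {
  const errors: string[] = [];

  let parsed: URL | undefined;
  try {
    parsed = new URL(body.url);
  } catch {
    parsed = undefined;
  }
  if (!parsed || parsed.protocol !== "https:") {
    errors.push("url must be a valid HTTPS URI");
  }

  if (body.events.length < 1) {
    errors.push("events must contain at least one event type"); // illustrative message
  }
  for (const event of body.events) {
    if (!WEBHOOK_EVENTS.includes(event)) {
      errors.push(`Unknown event type: ${event}`);
    }
  }

  if (body.secret.length < 16) {
    errors.push("secret must be at least 16 characters"); // illustrative message
  }
  if (body.description !== undefined && body.description.length > 255) {
    errors.push("description must be at most 255 characters"); // illustrative message
  }
  return errors;
}
```

A handler could map a non-empty result to a 400 response with code `VALIDATION_ERROR`.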
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### GET /webhooks
|
||||||
|
|
||||||
|
List webhook subscriptions for the caller's organization. Requires `agents:read` scope.
|
||||||
|
|
||||||
|
```yaml
|
||||||
|
GET /webhooks
|
||||||
|
Authorization: Bearer <token with agents:read scope>
|
||||||
|
|
||||||
|
Query Parameters:
|
||||||
|
active:
|
||||||
|
type: boolean
|
||||||
|
description: Filter by active/inactive subscriptions
|
||||||
|
page:
|
||||||
|
type: integer
|
||||||
|
default: 1
|
||||||
|
limit:
|
||||||
|
type: integer
|
||||||
|
default: 20
|
||||||
|
maximum: 100
|
||||||
|
|
||||||
|
Responses:
|
||||||
|
200 OK:
|
||||||
|
schema:
|
||||||
|
type: object
|
||||||
|
properties:
|
||||||
|
data:
|
||||||
|
type: array
|
||||||
|
items:
|
||||||
|
$ref: '#/components/schemas/WebhookSubscription'
|
||||||
|
total:
|
||||||
|
type: integer
|
||||||
|
page:
|
||||||
|
type: integer
|
||||||
|
limit:
|
||||||
|
type: integer
|
||||||
|
401 Unauthorized:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
403 Forbidden:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
```
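The `page` and `limit` parameters translate directly into a SQL offset. A minimal sketch of that translation follows; `parsePagination` is a hypothetical helper, and clamping out-of-range values (rather than returning a 400) is an assumption, since the spec does not say how invalid values are handled.

```typescript
interface PageParams {
  page: number;
  limit: number;
  offset: number;
}

// Parses raw query-string values, applying the documented defaults and maximum.
function parsePagination(
  query: { page?: string; limit?: string },
  defaultLimit = 20,
  maxLimit = 100,
): PageParams {
  const page = Math.max(1, Number.parseInt(query.page ?? "1", 10) || 1);
  const requested = Number.parseInt(query.limit ?? String(defaultLimit), 10) || defaultLimit;
  const limit = Math.min(maxLimit, Math.max(1, requested));
  return { page, limit, offset: (page - 1) * limit };
}
```

The same helper would cover `GET /webhooks/:id/deliveries` by passing that endpoint's default of 50 and maximum of 200.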
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### GET /webhooks/:id
|
||||||
|
|
||||||
|
Get a single webhook subscription. Requires `agents:read` scope.
|
||||||
|
|
||||||
|
```yaml
|
||||||
|
GET /webhooks/{subscriptionId}
|
||||||
|
Authorization: Bearer <token with agents:read scope>
|
||||||
|
|
||||||
|
Path Parameters:
|
||||||
|
subscriptionId:
|
||||||
|
type: string
|
||||||
|
|
||||||
|
Responses:
|
||||||
|
200 OK:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/WebhookSubscription'
|
||||||
|
401 Unauthorized:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
403 Forbidden:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
404 Not Found:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
example:
|
||||||
|
code: "WEBHOOK_NOT_FOUND"
|
||||||
|
message: "Webhook subscription not found"
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### PATCH /webhooks/:id
|
||||||
|
|
||||||
|
Update a webhook subscription (e.g., pause/resume, change events). Requires `agents:write` scope.
|
||||||
|
|
||||||
|
```yaml
|
||||||
|
PATCH /webhooks/{subscriptionId}
|
||||||
|
Authorization: Bearer <token with agents:write scope>
|
||||||
|
Content-Type: application/json
|
||||||
|
|
||||||
|
Request Body:
|
||||||
|
schema:
|
||||||
|
type: object
|
||||||
|
properties:
|
||||||
|
url:
|
||||||
|
type: string
|
||||||
|
format: uri
|
||||||
|
events:
|
||||||
|
type: array
|
||||||
|
items:
|
||||||
|
type: string
|
||||||
|
description:
|
||||||
|
type: string
|
||||||
|
maxLength: 255
|
||||||
|
active:
|
||||||
|
type: boolean
|
||||||
|
|
||||||
|
Responses:
|
||||||
|
200 OK:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/WebhookSubscription'
|
||||||
|
400 Bad Request:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
401 Unauthorized:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
403 Forbidden:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
404 Not Found:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### DELETE /webhooks/:id
|
||||||
|
|
||||||
|
Delete a webhook subscription. Requires `agents:write` scope.
|
||||||
|
|
||||||
|
```yaml
|
||||||
|
DELETE /webhooks/{subscriptionId}
|
||||||
|
Authorization: Bearer <token with agents:write scope>
|
||||||
|
|
||||||
|
Responses:
|
||||||
|
204 No Content: {}
|
||||||
|
401 Unauthorized:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
403 Forbidden:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
404 Not Found:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### GET /webhooks/:id/deliveries
|
||||||
|
|
||||||
|
List delivery attempts for a specific webhook subscription. Requires `agents:read` scope.
|
||||||
|
|
||||||
|
```yaml
|
||||||
|
GET /webhooks/{subscriptionId}/deliveries
|
||||||
|
Authorization: Bearer <token with agents:read scope>
|
||||||
|
|
||||||
|
Query Parameters:
|
||||||
|
status:
|
||||||
|
type: string
|
||||||
|
enum: [pending, success, failed, dead_letter]
|
||||||
|
eventType:
|
||||||
|
type: string
|
||||||
|
description: Filter by event type
|
||||||
|
fromDate:
|
||||||
|
type: string
|
||||||
|
format: date-time
|
||||||
|
toDate:
|
||||||
|
type: string
|
||||||
|
format: date-time
|
||||||
|
page:
|
||||||
|
type: integer
|
||||||
|
default: 1
|
||||||
|
limit:
|
||||||
|
type: integer
|
||||||
|
default: 50
|
||||||
|
maximum: 200
|
||||||
|
|
||||||
|
Responses:
|
||||||
|
200 OK:
|
||||||
|
schema:
|
||||||
|
type: object
|
||||||
|
properties:
|
||||||
|
data:
|
||||||
|
type: array
|
||||||
|
items:
|
||||||
|
$ref: '#/components/schemas/WebhookDelivery'
|
||||||
|
total:
|
||||||
|
type: integer
|
||||||
|
page:
|
||||||
|
type: integer
|
||||||
|
limit:
|
||||||
|
type: integer
|
||||||
|
example:
|
||||||
|
data:
|
||||||
|
- deliveryId: "del_01HXK7Z9P3FKWABCDEF77777"
|
||||||
|
subscriptionId: "wh_01HXK7Z9P3FKWABCDEF55555"
|
||||||
|
eventType: "agent.created"
|
||||||
|
eventId: "evt_01HXK7Z9P3FKWABCDEF99999"
|
||||||
|
status: "success"
|
||||||
|
httpStatusCode: 200
|
||||||
|
attemptCount: 1
|
||||||
|
nextRetryAt: null
|
||||||
|
deliveredAt: "2026-03-29T12:00:05Z"
|
||||||
|
createdAt: "2026-03-29T12:00:00Z"
|
||||||
|
total: 1
|
||||||
|
page: 1
|
||||||
|
limit: 50
|
||||||
|
401 Unauthorized:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
404 Not Found:
|
||||||
|
schema:
|
||||||
|
$ref: '#/components/schemas/ErrorResponse'
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Webhook Payload Format
|
||||||
|
|
||||||
|
Every webhook delivery uses this envelope format:
|
||||||
|
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"id": "evt_01HXK7Z9P3FKWABCDEF99999",
|
||||||
|
"type": "agent.created",
|
||||||
|
"organizationId": "org_01HXK7Z9P3FKWABCDEF12345",
|
||||||
|
"timestamp": "2026-03-29T12:00:00Z",
|
||||||
|
"data": {
|
||||||
|
"agentId": "agt_01HXK7Z9P3FKWABCDEF67890",
|
||||||
|
"agentType": "orchestrator",
|
||||||
|
"status": "active",
|
||||||
|
"owner": "acme-ai",
|
||||||
|
"version": "1.0.0",
|
||||||
|
"deploymentEnv": "production"
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
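The envelope can be typed once and reused by both the delivery worker and receivers. The sketch below also shows how a subscription's `events` list, including the `"*"` wildcard, might be matched against an incoming event. The type and function names are hypothetical, and checking `active` and `organizationId` at match time is an assumption about where that filtering happens.

```typescript
// Envelope shared by every webhook delivery; `data` varies by event type.
interface WebhookEnvelope<T = Record<string, unknown>> {
  id: string;
  type: string;
  organizationId: string;
  timestamp: string;
  data: T;
}

// Minimal view of a subscription needed for matching.
interface SubscriptionFilter {
  organizationId: string;
  events: readonly string[];
  active: boolean;
}

// True when the subscription should receive the event.
function shouldDeliver(subscription: SubscriptionFilter, event: WebhookEnvelope): boolean {
  if (!subscription.active) return false;
  if (subscription.organizationId !== event.organizationId) return false;
  return subscription.events.includes("*") || subscription.events.includes(event.type);
}
```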
|
||||||
|
|
||||||
|
### HMAC-SHA256 Signature
|
||||||
|
|
||||||
|
Every delivery includes the following HTTP headers:
|
||||||
|
|
||||||
|
```
|
||||||
|
X-AgentIdP-Event: agent.created
|
||||||
|
X-AgentIdP-Delivery-Id: del_01HXK7Z9P3FKWABCDEF77777
|
||||||
|
X-AgentIdP-Timestamp: 1774785600
|
||||||
|
X-AgentIdP-Signature-256: sha256=<HMAC-SHA256 of timestamp.payload using subscription secret>
|
||||||
|
```
|
||||||
|
|
||||||
|
Signature computation:
|
||||||
|
```
|
||||||
|
signed_content = timestamp + "." + JSON.stringify(payload)
|
||||||
|
signature = HMAC-SHA256(secret, signed_content)
|
||||||
|
header_value = "sha256=" + hex(signature)
|
||||||
|
```
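A receiver can check the signature with Node's built-in `crypto` module. The sketch below follows the computation above and the five-minute freshness window recommended under Security Considerations; the function names and parameter layout are illustrative. It signs the raw request body string rather than a re-serialized object, since re-serializing JSON can change key order or whitespace and break the comparison.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Computes the X-AgentIdP-Signature-256 header value for a raw body.
function signPayload(secret: string, timestamp: number, rawBody: string): string {
  const digest = createHmac("sha256", secret)
    .update(`${timestamp}.${rawBody}`)
    .digest("hex");
  return `sha256=${digest}`;
}

// Verifies a delivery: timestamp freshness first, then a constant-time signature check.
function verifySignature(
  secret: string,
  timestampHeader: string,
  rawBody: string,
  signatureHeader: string,
  nowSeconds: number = Math.floor(Date.now() / 1000),
  toleranceSeconds = 300,
): boolean {
  const timestamp = Number(timestampHeader);
  // Rejecting timestamps too far in the future as well is an added assumption.
  if (!Number.isInteger(timestamp) || Math.abs(nowSeconds - timestamp) > toleranceSeconds) {
    return false;
  }
  const expected = Buffer.from(signPayload(secret, timestamp, rawBody));
  const received = Buffer.from(signatureHeader);
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

In an HTTP handler, `rawBody` should be the exact bytes received, captured before any JSON parsing middleware runs.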
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Database Schema Changes
|
||||||
|
|
||||||
|
### New Table: webhook_subscriptions
|
||||||
|
|
||||||
|
```sql
|
||||||
|
CREATE TABLE webhook_subscriptions (
|
||||||
|
subscription_id VARCHAR(40) PRIMARY KEY,
|
||||||
|
organization_id VARCHAR(40) NOT NULL REFERENCES organizations(organization_id),
|
||||||
|
url VARCHAR(2048) NOT NULL,
|
||||||
|
events JSONB NOT NULL DEFAULT '[]',
|
||||||
|
secret_hash VARCHAR(255) NOT NULL, -- bcrypt hash of secret; plain text stored in Vault
|
||||||
|
vault_secret_path VARCHAR(255) NOT NULL,
|
||||||
|
description VARCHAR(255),
|
||||||
|
active BOOLEAN NOT NULL DEFAULT TRUE,
|
||||||
|
failure_count INTEGER NOT NULL DEFAULT 0,
|
||||||
|
last_delivery_at TIMESTAMPTZ,
|
||||||
|
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
|
||||||
|
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
|
||||||
|
);
|
||||||
|
|
||||||
|
CREATE INDEX idx_webhook_subs_org_id ON webhook_subscriptions(organization_id);
|
||||||
|
CREATE INDEX idx_webhook_subs_active ON webhook_subscriptions(active) WHERE active = TRUE;
|
||||||
|
```
|
||||||
|
|
||||||
|
### New Table: webhook_deliveries
|
||||||
|
|
||||||
|
```sql
|
||||||
|
CREATE TABLE webhook_deliveries (
|
||||||
|
delivery_id VARCHAR(40) PRIMARY KEY,
|
||||||
|
subscription_id VARCHAR(40) NOT NULL REFERENCES webhook_subscriptions(subscription_id),
|
||||||
|
organization_id VARCHAR(40) NOT NULL REFERENCES organizations(organization_id),
|
||||||
|
event_id VARCHAR(40) NOT NULL,
|
||||||
|
event_type VARCHAR(100) NOT NULL,
|
||||||
|
payload JSONB NOT NULL,
|
||||||
|
status VARCHAR(20) NOT NULL DEFAULT 'pending',
|
||||||
|
http_status_code SMALLINT,
|
||||||
|
response_body TEXT,
|
||||||
|
attempt_count SMALLINT NOT NULL DEFAULT 0,
|
||||||
|
next_retry_at TIMESTAMPTZ,
|
||||||
|
delivered_at TIMESTAMPTZ,
|
||||||
|
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
|
||||||
|
CONSTRAINT webhook_deliveries_status_check CHECK (status IN ('pending', 'success', 'failed', 'dead_letter'))
|
||||||
|
);
|
||||||
|
|
||||||
|
CREATE INDEX idx_webhook_deliveries_sub_id ON webhook_deliveries(subscription_id);
|
||||||
|
CREATE INDEX idx_webhook_deliveries_status ON webhook_deliveries(status);
|
||||||
|
CREATE INDEX idx_webhook_deliveries_org_id ON webhook_deliveries(organization_id);
|
||||||
|
CREATE INDEX idx_webhook_deliveries_created ON webhook_deliveries(created_at);
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Retry Schedule
|
||||||
|
|
||||||
|
```
|
||||||
|
Attempt 1: immediate
|
||||||
|
Attempt 2: 1 minute after failure
|
||||||
|
Attempt 3: 5 minutes after failure
|
||||||
|
Attempt 4: 15 minutes after failure
|
||||||
|
Attempt 5: 1 hour after failure
|
||||||
|
Attempt 6: 4 hours after failure
|
||||||
|
Attempt 7: 12 hours after failure
|
||||||
|
Attempt 8: 24 hours after failure
|
||||||
|
Attempt 9: 48 hours after failure
|
||||||
|
Attempt 10: 72 hours after failure
|
||||||
|
After attempt 10: status = dead_letter; operator alerted via Prometheus metric
|
||||||
|
```
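The schedule can be encoded as a lookup table keyed by the number of failed attempts. This sketch assumes each delay is measured from the preceding failed attempt, which is one reading of the table above; `nextRetryDelayMs` is a hypothetical helper name.

```typescript
const MINUTE_MS = 60_000;
const HOUR_MS = 60 * MINUTE_MS;

// Delay before attempts 2 through 10, in the order listed in the schedule.
const RETRY_DELAYS_MS: readonly number[] = [
  1 * MINUTE_MS,
  5 * MINUTE_MS,
  15 * MINUTE_MS,
  1 * HOUR_MS,
  4 * HOUR_MS,
  12 * HOUR_MS,
  24 * HOUR_MS,
  48 * HOUR_MS,
  72 * HOUR_MS,
];

// Returns the delay before the next attempt, or null when the delivery
// should be marked dead_letter (all 10 attempts have failed).
function nextRetryDelayMs(failedAttempts: number): number | null {
  if (!Number.isInteger(failedAttempts) || failedAttempts < 1) {
    throw new RangeError("failedAttempts must be a positive integer");
  }
  if (failedAttempts > RETRY_DELAYS_MS.length) {
    return null;
  }
  return RETRY_DELAYS_MS[failedAttempts - 1] ?? null;
}
```

One possible way to use this with `bull` is through a custom backoff strategy registered in the queue settings, with the `null` case handled by the worker when it records the final failure.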
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Configuration
|
||||||
|
|
||||||
|
| Environment Variable | Description | Default |
|
||||||
|
|---------------------|-------------|---------|
|
||||||
|
| `WEBHOOKS_ENABLED` | Enable webhook functionality | `true` |
|
||||||
|
| `WEBHOOK_DELIVERY_TIMEOUT_MS` | HTTP delivery request timeout | `10000` |
|
||||||
|
| `WEBHOOK_MAX_RETRIES` | Maximum delivery attempts before dead-letter | `10` |
|
||||||
|
| `WEBHOOK_WORKER_CONCURRENCY` | Number of concurrent delivery workers | `5` |
|
||||||
|
| `KAFKA_BROKERS` | Comma-separated Kafka broker list (optional; activates Kafka adapter) | `""` |
|
||||||
|
| `KAFKA_TOPIC_PREFIX` | Prefix for Kafka topic names | `agentidp` |
|
||||||
|
| `NATS_URL` | NATS server URL (optional; activates NATS adapter) | `""` |
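
The table implies that the Kafka and NATS adapters are switched on simply by providing a non-empty value. The sketch below shows one way to parse these settings with that rule; the names are hypothetical, and treating any value other than `false` as enabled for `WEBHOOKS_ENABLED` is an assumption.

```typescript
type Env = Record<string, string | undefined>;

interface WebhookConfig {
  enabled: boolean;
  deliveryTimeoutMs: number;
  maxRetries: number;
  workerConcurrency: number;
  kafkaBrokers: string[]; // empty list means the Kafka adapter is inactive
  kafkaTopicPrefix: string;
  natsUrl: string | null; // null means the NATS adapter is inactive
}

// Reads a positive integer, falling back to the documented default.
function intFromEnv(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value <= 0) {
    throw new Error(`${name} must be a positive integer`);
  }
  return value;
}

function loadWebhookConfig(env: Env): WebhookConfig {
  const kafkaBrokers = (env["KAFKA_BROKERS"] ?? "")
    .split(",")
    .map((broker) => broker.trim())
    .filter((broker) => broker.length > 0);
  const natsUrl = env["NATS_URL"] ?? "";
  return {
    enabled: (env["WEBHOOKS_ENABLED"] ?? "true").toLowerCase() !== "false",
    deliveryTimeoutMs: intFromEnv(env, "WEBHOOK_DELIVERY_TIMEOUT_MS", 10000),
    maxRetries: intFromEnv(env, "WEBHOOK_MAX_RETRIES", 10),
    workerConcurrency: intFromEnv(env, "WEBHOOK_WORKER_CONCURRENCY", 5),
    kafkaBrokers,
    kafkaTopicPrefix: env["KAFKA_TOPIC_PREFIX"] || "agentidp",
    natsUrl: natsUrl === "" ? null : natsUrl,
  };
}
```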
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Dependencies
|
||||||
|
|
||||||
|
| Package | Version | Purpose |
|
||||||
|
|---------|---------|---------|
|
||||||
|
| `bull` | `^4.16.3` | Redis-backed async job queue for webhook delivery |
|
||||||
|
| `kafkajs` | `^2.2.4` | Kafka producer adapter (optional) |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Security Considerations
|
||||||
|
|
||||||
|
- Webhook secrets are stored in Vault; PostgreSQL holds only a bcrypt hash, used for in-memory comparison
|
||||||
|
- All deliveries must be to HTTPS endpoints — HTTP endpoints are rejected at subscription creation
|
||||||
|
- Private/internal IP ranges (RFC 1918, loopback) are blocked at delivery time to prevent SSRF
|
||||||
|
- HMAC signature allows the receiving server to verify the delivery is authentic
|
||||||
|
- Replay attacks are mitigated by including a timestamp in the signed content; receivers should reject deliveries with timestamps older than 5 minutes
|
||||||
|
- Dead-letter events generate a Prometheus metric `agentidp_webhook_dead_letters_total` for alerting
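
The SSRF rule in the list above amounts to checking the resolved destination address against the blocked ranges before connecting. The sketch below covers only the ranges the spec names (RFC 1918 and loopback) for an IP literal that has already been resolved; the function name is hypothetical, and a production check would likely need further ranges (for example link-local or IPv4-mapped IPv6 addresses), which are not handled here.

```typescript
import { isIP } from "node:net";

// Returns true when a resolved IP address must not receive webhook deliveries.
function isBlockedAddress(address: string): boolean {
  const version = isIP(address);
  if (version === 6) {
    return address.toLowerCase() === "::1"; // IPv6 loopback
  }
  if (version !== 4) {
    return true; // not an IP literal: refuse until it has been resolved
  }
  const octets = address.split(".").map(Number);
  const first = octets[0] ?? -1;
  const second = octets[1] ?? -1;
  if (first === 127) return true; // loopback 127.0.0.0/8
  if (first === 10) return true; // RFC 1918 10.0.0.0/8
  if (first === 172 && second >= 16 && second <= 31) return true; // RFC 1918 172.16.0.0/12
  if (first === 192 && second === 168) return true; // RFC 1918 192.168.0.0/16
  return false;
}
```

Running the check on the address actually used for the connection, rather than on the hostname at subscription time, guards against DNS records that change after the subscription is created.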
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Acceptance Criteria
|
||||||
|
|
||||||
|
- [ ] `POST /webhooks` creates a subscription; secret stored in Vault, not returned after creation
|
||||||
|
- [ ] Webhook delivery occurs within 30 seconds of event generation for healthy subscribers
|
||||||
|
- [ ] Delivery includes correct `X-AgentIdP-Signature-256` header verifiable with provided secret
|
||||||
|
- [ ] Failed delivery is retried per schedule; status updates in `webhook_deliveries` table
|
||||||
|
- [ ] After max retries, status is `dead_letter` and metric is incremented
|
||||||
|
- [ ] Delivery to HTTP (non-HTTPS) URL is rejected at subscription creation
|
||||||
|
- [ ] Delivery to private IP range is rejected (SSRF protection)
|
||||||
|
- [ ] `GET /webhooks/:id/deliveries` returns accurate delivery history
|
||||||
|
- [ ] TypeScript strict, zero `any`, >80% test coverage on WebhookService
|
||||||