chore(openspec): archive all completed changes, sync 14 new specs to library

Archived 4 completed OpenSpec changes (2026-04-02):
- phase-3-enterprise (100/100 tasks) — 6 Phase 3 capabilities synced
- devops-documentation (48/48 tasks) — 3 new + 1 merged capability
- bedroom-developer-docs (33/33 tasks) — 4 new capabilities synced
- engineering-docs (superseded by 2026-03-29 archive) — no tasks

Main spec library grows from 21 → 35 capabilities (+14 new):
federation, multi-tenancy, oidc, soc2, w3c-dids, webhooks,
database, operations, system-overview, api-reference, core-concepts,
developer-guides, quick-start + deployment (merged additive requirements)

Active changes: 0 — project board is clear for Phase 4 planning.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
SentryAgent.ai Developer
2026-04-02 03:50:47 +00:00
parent ceec22f714
commit f1fbe0e29a
53 changed files with 3019 additions and 0 deletions

openspec/specs/soc2/spec.md Normal file

@@ -0,0 +1,335 @@
# SOC 2 Type II Preparation — Specification
**Workstream**: 6 of 6
**Phase**: 3 — Enterprise
**Author**: Virtual Architect
**Date**: 2026-03-29
---
## Overview
Implement the technical controls required for SOC 2 Type II audit readiness. A SOC 2 Type II report attests that security controls operated effectively over a defined period, not just that they exist at a point in time. Controls are implemented in code, not just documented.
This workstream cuts across all other Phase 3 workstreams. It delivers: encryption at rest for sensitive columns, TLS enforcement middleware, automated secrets rotation, security event alerting, and audit log immutability via a Merkle hash chain. A compliance documentation package (controls matrix and runbook) is produced for auditors.
---
## Technical Controls
### Control C1: Encryption at Rest (Column-Level Encryption)
Sensitive columns in PostgreSQL are encrypted using `pgcrypto` symmetric encryption. The encryption key is stored in Vault and fetched at application startup, never written to disk.
**Columns encrypted**:
- `credentials.secret_hash` — encrypted with AES-256-CBC
- `credentials.vault_path` — encrypted with AES-256-CBC
- `webhook_subscriptions.vault_secret_path` — encrypted with AES-256-CBC
- `agent_did_keys.vault_key_path` — encrypted with AES-256-CBC
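**Implementation**: An `EncryptionService` wraps `pgcrypto` `pgp_sym_encrypt` / `pgp_sym_decrypt`. The key is a 256-bit symmetric key stored at `secret/agentidp/encryption/column-key` in Vault. All INSERT/SELECT operations for encrypted columns go through `EncryptionService`. Note that `pgp_sym_encrypt` follows OpenPGP (CFB mode) and defaults to AES-128; AES-256 requires the `cipher-algo=aes256` option, and CBC mode is only available through pgcrypto's raw `encrypt()` / `decrypt()` functions, so the cipher named above must be reconciled with the function actually used.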
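As a rough illustration of how `EncryptionService` might build its SQL, the sketch below generates parameterized fragments around `pgp_sym_encrypt` / `pgp_sym_decrypt`. The helper names (`encryptExpression`, `buildDecryptSelect`, `SqlQuery`) are hypothetical, and it assumes ciphertext is stored base64-encoded so the existing text columns can keep their type. The key is always passed as a bind parameter, never interpolated.

```typescript
// Hypothetical helpers; only the PostgreSQL/pgcrypto functions are real.
interface SqlQuery {
  text: string;
  values: string[];
}

// Allowlist copied from the "Columns encrypted" list. Identifiers come only
// from this fixed set, so interpolating them into SQL text is safe here.
const ENCRYPTED_COLUMNS: ReadonlySet<string> = new Set([
  "credentials.secret_hash",
  "credentials.vault_path",
  "webhook_subscriptions.vault_secret_path",
  "agent_did_keys.vault_key_path",
]);

function assertEncryptedColumn(table: string, column: string): void {
  if (!ENCRYPTED_COLUMNS.has(`${table}.${column}`)) {
    throw new Error(`${table}.${column} is not an encrypted column`);
  }
}

// SQL expression that encrypts bind parameter $valueParam with the key in
// $keyParam, then base64-encodes the bytea result (assumption: text columns).
function encryptExpression(valueParam: number, keyParam: number): string {
  return `encode(pgp_sym_encrypt($${valueParam}, $${keyParam}, 'cipher-algo=aes256'), 'base64')`;
}

// SELECT that decrypts one encrypted column; the key is bind parameter $1.
function buildDecryptSelect(table: string, column: string, key: string): SqlQuery {
  assertEncryptedColumn(table, column);
  return {
    text: `SELECT pgp_sym_decrypt(decode(${column}, 'base64'), $1) AS ${column} FROM ${table}`,
    values: [key],
  };
}
```

A caller would pass `text` and `values` to its database client unchanged; rejecting columns outside the allowlist keeps unencrypted columns from being routed through the decrypt path by mistake.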
---
### Control C2: TLS Enforcement
In production, inbound requests that arrive without TLS are never served over plain HTTP; they are redirected to HTTPS. This is enforced at two levels:
1. Express middleware: `TLSEnforcementMiddleware`. If `X-Forwarded-Proto` is not `https` and `NODE_ENV=production`, it responds with `301 Moved Permanently`, redirecting to the HTTPS URL.
2. Terraform: Load balancers (Phase 2 Terraform modules) already enforce TLS; TLS enforcement middleware provides defense-in-depth.
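The middleware check described above could look roughly like the following. This is a sketch, not the project's implementation: the `RequestLike` / `ResponseLike` shapes are stand-ins so the example compiles without Express installed (Express's own request and response objects provide `headers`, `hostname`, `originalUrl`, and `redirect(status, url)`), and taking `nodeEnv` as an argument is an assumption made for testability.

```typescript
// Minimal structural types; Express's Request/Response satisfy these shapes.
interface RequestLike {
  headers: Record<string, string | string[] | undefined>;
  hostname: string;
  originalUrl: string;
}
interface ResponseLike {
  redirect(status: number, url: string): void;
}
type Next = () => void;

// Returns a middleware that redirects non-HTTPS requests in production.
function tlsEnforcementMiddleware(nodeEnv: string | undefined) {
  return (req: RequestLike, res: ResponseLike, next: Next): void => {
    const header = req.headers["x-forwarded-proto"];
    // Some proxies send a comma-separated list; the first entry is taken
    // as the scheme the client used.
    const raw = Array.isArray(header) ? header[0] : header;
    const proto = raw?.split(",")[0]?.trim().toLowerCase();

    if (nodeEnv === "production" && proto !== "https") {
      res.redirect(301, `https://${req.hostname}${req.originalUrl}`);
      return;
    }
    next(); // development, or already HTTPS: pass through
  };
}
```

In an Express app this would be registered early, for example `app.use(tlsEnforcementMiddleware(process.env.NODE_ENV))`. Trusting `X-Forwarded-Proto` is only sound when the service is reachable solely through the load balancer that sets it.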
---
### Control C3: Automated Secrets Rotation
A scheduled job (`SecretsRotationJob`) runs on a configurable cron schedule. It:
1. Identifies credentials whose `expires_at` is within `ROTATION_WARNING_DAYS` days
2. Emits a Prometheus metric `agentidp_credentials_expiring_soon_total` (labelled by `org_id`, `days_remaining`)
3. Renews Vault leases for all active credentials
4. Sends a webhook event `credential.expiring_soon` to subscribers who have opted in
The job does not rotate credentials on its own; it only alerts and prepares. Forced rotation requires an operator call to the existing `POST /agents/:id/credentials/:credId/rotate` endpoint.
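Steps 1 and 2 of the job can be sketched as a pure function over credential rows. The `Credential` shape and the function names below are illustrative assumptions, as is rounding `daysRemaining` down to whole days; the spec does not define either.

```typescript
// Hypothetical row shape; only expires_at and org_id are named in the spec.
interface Credential {
  credentialId: string;
  orgId: string;
  expiresAt: Date;
}
interface ExpiringCredential extends Credential {
  daysRemaining: number;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Step 1: credentials that expire within `warningDays` of `now`.
// Already-expired credentials are excluded.
function findExpiringCredentials(
  credentials: Credential[],
  now: Date,
  warningDays: number,
): ExpiringCredential[] {
  const result: ExpiringCredential[] = [];
  for (const cred of credentials) {
    const msLeft = cred.expiresAt.getTime() - now.getTime();
    if (msLeft < 0 || msLeft > warningDays * MS_PER_DAY) continue;
    result.push({ ...cred, daysRemaining: Math.floor(msLeft / MS_PER_DAY) });
  }
  return result;
}

// Step 2: counts per (org_id, days_remaining) label pair, ready to be
// written to the metric. Keys have the form "orgId|daysRemaining".
function countByOrgAndDays(expiring: ExpiringCredential[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const cred of expiring) {
    const key = `${cred.orgId}|${cred.daysRemaining}`;
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}
```

Passing `now` in explicitly keeps the selection logic deterministic under test; the scheduled job would supply `new Date()` and `ROTATION_WARNING_DAYS`.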
---
### Control C4: Audit Log Immutability (Merkle Hash Chain)
Every `audit_logs` row carries two new columns:
- `hash`: SHA-256 of `(eventId || timestamp.toISOString() || action || outcome || agentId || organizationId || previousHash)`
- `previous_hash`: hash of the immediately preceding `audit_logs` row (by `created_at` order), or the genesis string `"GENESIS"` for the first row
A PostgreSQL trigger prevents `UPDATE` and `DELETE` on `audit_logs`.
A new admin endpoint `GET /audit/verify` runs a sequential chain verification pass and returns the integrity status.
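To make the chain construction concrete, here is a minimal in-memory sketch of hashing and sequential verification. It follows the field order given for `hash` and reads `||` as plain string concatenation, which is an assumption; the type and function names are hypothetical, and a real implementation would stream rows from `audit_logs` in `created_at` order rather than hold them in an array.

```typescript
import { createHash } from "node:crypto";

interface AuditEvent {
  eventId: string;
  timestamp: Date;
  action: string;
  outcome: string;
  agentId: string;
  organizationId: string;
}
interface ChainedAuditEvent extends AuditEvent {
  hash: string;
  previousHash: string;
}
interface VerifyResult {
  valid: boolean;
  rowsVerified: number;
  brokenAtEventId: string | null;
}

const GENESIS = "GENESIS";

// SHA-256 over the concatenated fields, hex-encoded (64 characters).
function computeHash(e: AuditEvent, previousHash: string): string {
  const payload =
    e.eventId + e.timestamp.toISOString() + e.action + e.outcome +
    e.agentId + e.organizationId + previousHash;
  return createHash("sha256").update(payload, "utf8").digest("hex");
}

// Links a new event to the current tail of the chain.
function appendEvent(chain: ChainedAuditEvent[], e: AuditEvent): ChainedAuditEvent[] {
  const tail = chain[chain.length - 1];
  const previousHash = tail ? tail.hash : GENESIS;
  return [...chain, { ...e, previousHash, hash: computeHash(e, previousHash) }];
}

// Walks the chain in order and stops at the first row whose link or
// recomputed hash does not match.
function verifyChain(chain: ChainedAuditEvent[]): VerifyResult {
  let expectedPrevious = GENESIS;
  let rowsVerified = 0;
  for (const row of chain) {
    const linkOk = row.previousHash === expectedPrevious;
    const hashOk = computeHash(row, row.previousHash) === row.hash;
    if (!linkOk || !hashOk) {
      return { valid: false, rowsVerified, brokenAtEventId: row.eventId };
    }
    expectedPrevious = row.hash;
    rowsVerified++;
  }
  return { valid: true, rowsVerified, brokenAtEventId: null };
}
```

Because each hash covers the previous one, editing any stored field invalidates that row and every row after it. Concatenating fields without a delimiter leaves field boundaries ambiguous, so a production version may want a separator or length prefix.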
---
### Control C5: Security Event Alerting
Prometheus alerting rules are written for the following security events:
| Alert | Condition | Severity |
|-------|-----------|---------|
| `AuthFailureSpike` | >50 `auth.failed` events in 5 minutes | Warning |
| `RateLimitExhaustion` | >80% of org rate limit consumed in 1 minute | Warning |
| `AnomalousTokenIssuance` | Token issuance rate exceeds 3x the 7-day average | Warning |
| `WebhookDeadLetterAccumulating` | `agentidp_webhook_dead_letters_total` increases by >10 in 1 hour | Warning |
| `AuditChainIntegrityFailed` | `agentidp_audit_chain_integrity` metric is 0 | Critical |
| `CredentialExpiryApproaching` | `agentidp_credentials_expiring_soon_total{days_remaining="7"}` > 0 | Info |
---
## API Endpoints
### GET /audit/verify
Verify the Merkle hash chain integrity of the audit log. Requires `admin:orgs` scope. This is a potentially expensive operation on large audit logs — it is rate-limited to once per 5 minutes per organization.
```yaml
GET /audit/verify
Authorization: Bearer <token with admin:orgs scope>
Query Parameters:
  fromDate:
    type: string
    format: date-time
    description: Start of verification range. If omitted, verifies from genesis.
  toDate:
    type: string
    format: date-time
    description: End of verification range. If omitted, verifies to the latest row.
Responses:
  200 OK:
    schema:
      type: object
      properties:
        valid:
          type: boolean
          description: True if the chain is intact across the entire range
        rowsVerified:
          type: integer
          description: Number of audit rows verified
        firstEventId:
          type: string
        lastEventId:
          type: string
        firstTimestamp:
          type: string
          format: date-time
        lastTimestamp:
          type: string
          format: date-time
        verifiedAt:
          type: string
          format: date-time
        brokenAtEventId:
          type: string
          nullable: true
          description: Present only if valid=false (the first eventId where the chain breaks)
    example:
      valid: true
      rowsVerified: 15420
      firstEventId: "evt_genesis_00001"
      lastEventId: "evt_01HXK7Z9P3FKWABCDEFZZZZZ"
      firstTimestamp: "2026-01-01T00:00:00Z"
      lastTimestamp: "2026-03-29T12:00:00Z"
      verifiedAt: "2026-03-29T14:00:00Z"
      brokenAtEventId: null
  401 Unauthorized:
    schema:
      $ref: '#/components/schemas/ErrorResponse'
  403 Forbidden:
    schema:
      $ref: '#/components/schemas/ErrorResponse'
  429 Too Many Requests:
    schema:
      $ref: '#/components/schemas/ErrorResponse'
    example:
      code: "RATE_LIMITED"
      message: "Audit verification can be run at most once per 5 minutes"
```
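The once-per-5-minutes limit on `GET /audit/verify` could be enforced with a small per-organization gate like the one below. This is an in-memory sketch under stated assumptions: the class name and method are hypothetical, and a deployment with more than one application instance would need a shared store instead of a local `Map`.

```typescript
// Allows at most one verification per organization per window.
class VerifyRateLimiter {
  private readonly lastRun = new Map<string, number>();
  private readonly windowMs: number;
  private readonly now: () => number;

  constructor(windowMs: number = 5 * 60 * 1000, now: () => number = () => Date.now()) {
    this.windowMs = windowMs;
    this.now = now; // injectable clock, so tests need not wait
  }

  // Returns true and records the run if the organization may verify now;
  // returns false if the caller should respond with 429 RATE_LIMITED.
  tryAcquire(organizationId: string): boolean {
    const current = this.now();
    const last = this.lastRun.get(organizationId);
    if (last !== undefined && current - last < this.windowMs) {
      return false;
    }
    this.lastRun.set(organizationId, current);
    return true;
  }
}
```

A handler would call `tryAcquire` before starting the scan and return the `429` body shown above when it yields `false`. Rejected attempts do not extend the window, so a client that retries early is not penalized further.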
---
### GET /compliance/controls
Returns the current status of all SOC 2 technical controls. Requires `admin:orgs` scope. Used by auditors and compliance dashboards.
```yaml
GET /compliance/controls
Authorization: Bearer <token with admin:orgs scope>
Responses:
  200 OK:
    schema:
      type: object
      properties:
        generatedAt:
          type: string
          format: date-time
        controls:
          type: array
          items:
            type: object
            properties:
              controlId:
                type: string
              name:
                type: string
              status:
                type: string
                enum: [pass, fail, warning, not_applicable]
              description:
                type: string
              lastChecked:
                type: string
                format: date-time
    example:
      generatedAt: "2026-03-29T14:00:00Z"
      controls:
        - controlId: "C1"
          name: "Encryption at Rest"
          status: "pass"
          description: "Column-level encryption active for all sensitive columns"
          lastChecked: "2026-03-29T14:00:00Z"
        - controlId: "C2"
          name: "TLS Enforcement"
          status: "pass"
          description: "All non-TLS requests redirected to HTTPS in production"
          lastChecked: "2026-03-29T14:00:00Z"
        - controlId: "C3"
          name: "Secrets Rotation"
          status: "warning"
          description: "3 credentials expiring within 7 days"
          lastChecked: "2026-03-29T14:00:00Z"
        - controlId: "C4"
          name: "Audit Log Immutability"
          status: "pass"
          description: "Merkle chain intact; last verified 2026-03-29T13:55:00Z"
          lastChecked: "2026-03-29T14:00:00Z"
        - controlId: "C5"
          name: "Security Event Alerting"
          status: "pass"
          description: "All 6 alerting rules active in Prometheus"
          lastChecked: "2026-03-29T14:00:00Z"
  401 Unauthorized:
    schema:
      $ref: '#/components/schemas/ErrorResponse'
  403 Forbidden:
    schema:
      $ref: '#/components/schemas/ErrorResponse'
```
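For consumers such as a compliance dashboard, the response schema above maps onto TypeScript types fairly directly. The sketch below also adds a hypothetical `worstStatus` roll-up; the severity ordering it uses (`fail` over `warning` over `pass` over `not_applicable`) is an assumption, not something the spec defines.

```typescript
type ControlStatus = "pass" | "fail" | "warning" | "not_applicable";

interface ControlResult {
  controlId: string;
  name: string;
  status: ControlStatus;
  description: string;
  lastChecked: string; // ISO 8601 date-time
}

interface ComplianceReport {
  generatedAt: string; // ISO 8601 date-time
  controls: ControlResult[];
}

// Assumed severity ranking; higher means worse.
const SEVERITY: Record<ControlStatus, number> = {
  not_applicable: 0,
  pass: 1,
  warning: 2,
  fail: 3,
};

// Returns the most severe status present, or "not_applicable" when the
// report contains no controls.
function worstStatus(report: ComplianceReport): ControlStatus {
  let worst: ControlStatus = "not_applicable";
  for (const control of report.controls) {
    if (SEVERITY[control.status] > SEVERITY[worst]) {
      worst = control.status;
    }
  }
  return worst;
}
```

Applied to the example response, where only C3 reports `warning`, the roll-up evaluates to `warning`.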
---
## Database Schema Changes
### Modified: audit_logs table
```sql
ALTER TABLE audit_logs
  ADD COLUMN hash VARCHAR(64),           -- SHA-256 hex string of chain node
  ADD COLUMN previous_hash VARCHAR(64);  -- Hash of preceding row, or "GENESIS"

-- Back-fill genesis hash for existing rows (one-time migration)
-- Migration script computes chain in order of created_at

-- Prevent updates and deletes (immutability trigger)
CREATE OR REPLACE FUNCTION prevent_audit_modification()
RETURNS TRIGGER AS $$
BEGIN
  RAISE EXCEPTION 'audit_logs rows are immutable — modification is not permitted';
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER audit_logs_immutability
  BEFORE UPDATE OR DELETE ON audit_logs
  FOR EACH ROW EXECUTE FUNCTION prevent_audit_modification();

-- Row-level triggers do not fire on TRUNCATE, so block it at statement level too
-- (trigger name is illustrative)
CREATE TRIGGER audit_logs_no_truncate
  BEFORE TRUNCATE ON audit_logs
  FOR EACH STATEMENT EXECUTE FUNCTION prevent_audit_modification();
```
### Modified: credentials table
```sql
-- Columns remain same type; application now stores encrypted values
-- No DDL change — encryption is transparent at application layer
-- Add comment for documentation
COMMENT ON COLUMN credentials.secret_hash IS 'AES-256-CBC encrypted via EncryptionService (pgcrypto). Not a plain bcrypt hash.';
COMMENT ON COLUMN credentials.vault_path IS 'AES-256-CBC encrypted via EncryptionService.';
```
### New Table: compliance_check_log
```sql
CREATE TABLE compliance_check_log (
  check_id        VARCHAR(40)  PRIMARY KEY,
  organization_id VARCHAR(40)  NOT NULL REFERENCES organizations(organization_id),
  control_id      VARCHAR(10)  NOT NULL,
  status          VARCHAR(20)  NOT NULL,
  details         JSONB        NOT NULL DEFAULT '{}',
  checked_at      TIMESTAMPTZ  NOT NULL DEFAULT NOW()
);

CREATE INDEX idx_compliance_check_org ON compliance_check_log(organization_id, checked_at DESC);
```
---
## Configuration
| Environment Variable | Description | Default |
|---------------------|-------------|---------|
| `SOC2_CONTROLS_ENABLED` | Enable SOC 2 controls enforcement | `true` |
| `TLS_ENFORCEMENT_ENABLED` | Enforce HTTPS in production | `true` in production, `false` in development |
| `COLUMN_ENCRYPTION_KEY_PATH` | Vault path for AES-256 column encryption key | `secret/agentidp/encryption/column-key` |
| `ROTATION_WARNING_DAYS` | Days before expiry to emit rotation warning | `30` |
| `SECRETS_ROTATION_CRON` | Cron schedule for rotation check job | `0 3 * * *` (daily at 3 AM UTC) |
| `AUDIT_CHAIN_VERIFY_CRON` | Cron schedule for automated chain verification | `0 2 * * *` (daily at 2 AM UTC) |
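
One way to read these variables with the defaults from the table is sketched below. The `Soc2Config` shape and `loadSoc2Config` name are hypothetical, and the strict `"true"` / `"false"` parsing is an assumption about accepted values. The environment is passed in as a plain record so the function can be exercised without touching `process.env`.

```typescript
interface Soc2Config {
  controlsEnabled: boolean;
  tlsEnforcementEnabled: boolean;
  columnEncryptionKeyPath: string;
  rotationWarningDays: number;
  secretsRotationCron: string;
  auditChainVerifyCron: string;
}

type Env = Record<string, string | undefined>;

function parseBoolean(name: string, raw: string | undefined, fallback: boolean): boolean {
  if (raw === undefined || raw === "") return fallback;
  if (raw === "true") return true;
  if (raw === "false") return false;
  throw new Error(`${name} must be "true" or "false", got "${raw}"`);
}

function parsePositiveInt(name: string, raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value <= 0) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return value;
}

// Applies the defaults from the configuration table; TLS enforcement
// defaults to on only when NODE_ENV is "production".
function loadSoc2Config(env: Env): Soc2Config {
  const isProduction = env["NODE_ENV"] === "production";
  return {
    controlsEnabled: parseBoolean("SOC2_CONTROLS_ENABLED", env["SOC2_CONTROLS_ENABLED"], true),
    tlsEnforcementEnabled: parseBoolean("TLS_ENFORCEMENT_ENABLED", env["TLS_ENFORCEMENT_ENABLED"], isProduction),
    columnEncryptionKeyPath: env["COLUMN_ENCRYPTION_KEY_PATH"] ?? "secret/agentidp/encryption/column-key",
    rotationWarningDays: parsePositiveInt("ROTATION_WARNING_DAYS", env["ROTATION_WARNING_DAYS"], 30),
    secretsRotationCron: env["SECRETS_ROTATION_CRON"] ?? "0 3 * * *",
    auditChainVerifyCron: env["AUDIT_CHAIN_VERIFY_CRON"] ?? "0 2 * * *",
  };
}
```

At startup this would be called once as `loadSoc2Config(process.env)`; failing fast on malformed values avoids silently running with a control disabled.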
---
## Dependencies
| Package | Version | Purpose |
|---------|---------|---------|
| `node-forge` | `^1.3.1` | AES-256-CBC column-level encryption primitives |
Note: `pgcrypto` PostgreSQL extension must be enabled: `CREATE EXTENSION IF NOT EXISTS pgcrypto;`
---
## Compliance Documentation
The following documents are produced as part of this workstream:
| Document | Path | Description |
|----------|------|-------------|
| Controls Matrix | `docs/compliance/soc2-controls-matrix.md` | Maps SOC 2 Trust Services Criteria to implemented controls |
| Encryption Runbook | `docs/compliance/encryption-runbook.md` | Key rotation procedure, Vault key path map |
| Audit Log Runbook | `docs/compliance/audit-log-runbook.md` | How to run chain verification, interpret results |
| Incident Response | `docs/compliance/incident-response.md` | Security event response procedures |
| Secrets Rotation Guide | `docs/compliance/secrets-rotation.md` | Operator guide for credential and key rotation |
---
## Security Considerations
- Column encryption key is fetched from Vault at startup and held in process memory — never written to disk or logged
- Key rotation: when a new encryption key is issued, a migration re-encrypts all sensitive columns under it; the old key is retained in Vault history
- The immutability trigger on `audit_logs` prevents application-layer modification; a `SUPERUSER` can still bypass triggers — document this in the controls matrix as a residual risk requiring compensating controls (e.g., read-only replica verification)
- `GET /audit/verify` is rate-limited to prevent denial-of-service via repeated expensive sequential scans
- `GET /compliance/controls` never returns raw secrets or key material — only control status
---
## Acceptance Criteria
- [ ] `pgcrypto` extension enabled; sensitive columns are encrypted at rest (verified: plaintext not visible in direct DB query)
- [ ] TLS enforcement middleware redirects HTTP to HTTPS in production; passthrough in development
- [ ] `SecretsRotationJob` runs on schedule; emits Prometheus metric for expiring credentials
- [ ] Audit log immutability trigger prevents UPDATE/DELETE on `audit_logs` table
- [ ] `GET /audit/verify` returns `valid: true` for an unmodified chain
- [ ] `GET /audit/verify` returns `valid: false` with `brokenAtEventId` after a row is manually tampered with (test scenario)
- [ ] All 6 Prometheus alerting rules are present in `monitoring/prometheus/alerts.yml`
- [ ] `GET /compliance/controls` returns correct status for all 5 controls
- [ ] Compliance documentation written and reviewed
- [ ] TypeScript strict, zero `any`, >80% test coverage on SOC2 control implementations