docs: DevOps documentation — complete docs/devops/ set

Adds the full devops-documentation OpenSpec change implementation.
Separate from docs/developers/ — serves a different audience (operators,
not API consumers).

docs/devops/:
- README.md          — index and system overview
- architecture.md    — components, ports, data flow, Redis key patterns
- environment-variables.md — all 7 env vars (required + optional, formats, .env example)
- database.md        — 4-table schema, indexes, constraints, migration runner
- local-development.md — docker-compose setup, health checks, startup, Dockerfile gap noted
- security.md        — RSA key generation/rotation, CORS, bcrypt, secret storage guidance
- operations.md      — startup order, graceful shutdown, log reference, troubleshooting

QA gates: 48/48 tasks complete. All env vars verified against source.
All table names verified against migrations. All ports verified against
docker-compose.yml. All internal links resolve.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
SentryAgent.ai Developer
2026-03-28 14:28:55 +00:00
parent 61ea975c79
commit d94a8cedc0
15 changed files with 1353 additions and 0 deletions

docs/devops/database.md

@@ -0,0 +1,219 @@
# Database
AgentIdP uses PostgreSQL 14+ as its primary data store. The schema consists of four tables managed by a custom migration runner.
---
## Schema Overview
```
agents
└── credentials (FK: client_id → agents.agent_id, CASCADE DELETE)
audit_events (no FK — append-only, agent_id is informational)
token_revocations (no FK — independent revocation store)
```
---
## Tables
### `agents`
The Agent Registry. One row per registered AI agent identity.
| Column | Type | Nullable | Description |
|--------|------|----------|-------------|
| `agent_id` | `UUID` | No | Primary key — system-assigned, immutable |
| `email` | `VARCHAR(255)` | No | Unique email-format identifier |
| `agent_type` | `VARCHAR(32)` | No | Enum: `screener`, `classifier`, `orchestrator`, `extractor`, `summarizer`, `router`, `monitor`, `custom` |
| `version` | `VARCHAR(64)` | No | Semantic version string |
| `capabilities` | `TEXT[]` | No | Array of `resource:action` strings |
| `owner` | `VARCHAR(128)` | No | Owning team or organisation |
| `deployment_env` | `VARCHAR(16)` | No | Enum: `development`, `staging`, `production` |
| `status` | `VARCHAR(24)` | No | Enum: `active`, `suspended`, `decommissioned`. Default: `active` |
| `created_at` | `TIMESTAMPTZ` | No | Registration timestamp. Default: `NOW()` |
| `updated_at` | `TIMESTAMPTZ` | No | Last update timestamp. Default: `NOW()` |
**Indexes:**
| Index | Column | Purpose |
|-------|--------|---------|
| `idx_agents_email` | `email` | Unique lookup on registration and conflict check |
| `idx_agents_status` | `status` | Filter by lifecycle status |
| `idx_agents_owner` | `owner` | Filter by owner |
| `idx_agents_agent_type` | `agent_type` | Filter by type |
| `idx_agents_created_at` | `created_at DESC` | Default sort for list queries |
**Constraints:**
- `email` is UNIQUE — one registration per email address
- `agent_type`, `deployment_env`, and `status` each have a CHECK constraint enforcing the enum values listed above
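
The CHECK constraints reject invalid rows at insert time. A service can surface the same error earlier by validating before it writes. The following is a minimal sketch of such a check, not the application's actual validation code: the `AgentEnumFields` shape and the `findEnumViolations` name are hypothetical, and only the enum values come from the table above.

```typescript
// Enum values copied from the `agents` table description above.
const AGENT_TYPES = [
  "screener", "classifier", "orchestrator", "extractor",
  "summarizer", "router", "monitor", "custom",
] as const;
const DEPLOYMENT_ENVS = ["development", "staging", "production"] as const;
const AGENT_STATUSES = ["active", "suspended", "decommissioned"] as const;

// Hypothetical input shape; the real application types are not shown here.
interface AgentEnumFields {
  agentType: string;
  deploymentEnv: string;
  status: string;
}

// True when `value` is one of the allowed enum strings.
function isAllowed(allowed: readonly string[], value: string): boolean {
  return allowed.indexOf(value) !== -1;
}

// Returns the column names whose value would violate a CHECK constraint.
function findEnumViolations(fields: AgentEnumFields): string[] {
  const violations: string[] = [];
  if (!isAllowed(AGENT_TYPES, fields.agentType)) violations.push("agent_type");
  if (!isAllowed(DEPLOYMENT_ENVS, fields.deploymentEnv)) violations.push("deployment_env");
  if (!isAllowed(AGENT_STATUSES, fields.status)) violations.push("status");
  return violations;
}
```

An empty result means the row should pass all three constraints. The database constraints remain the source of truth.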
---
### `credentials`
OAuth 2.0 client credentials. One agent can have multiple credentials.
| Column | Type | Nullable | Description |
|--------|------|----------|-------------|
| `credential_id` | `UUID` | No | Primary key — system-assigned |
| `client_id` | `UUID` | No | FK → `agents.agent_id` (CASCADE DELETE) |
| `secret_hash` | `VARCHAR(255)` | No | bcrypt hash of the client secret. Plaintext is never stored. |
| `status` | `VARCHAR(16)` | No | Enum: `active`, `revoked`. Default: `active` |
| `created_at` | `TIMESTAMPTZ` | No | Creation timestamp |
| `expires_at` | `TIMESTAMPTZ` | Yes | Optional expiry. NULL = no expiry. |
| `revoked_at` | `TIMESTAMPTZ` | Yes | Revocation timestamp. NULL = not revoked. |
**Indexes:**
| Index | Column | Purpose |
|-------|--------|---------|
| `idx_credentials_client_id` | `client_id` | List credentials for an agent |
| `idx_credentials_status` | `status` | Filter active/revoked |
| `idx_credentials_created_at` | `created_at DESC` | Default sort |
**Cascade behaviour:** Deleting an agent row also deletes all of its credentials. In practice, agents are soft-deleted (status → `decommissioned`) rather than hard-deleted, so this cascade is only a safety net.
---
### `audit_events`
Immutable audit log. Append-only by design — no application-layer UPDATE or DELETE is ever issued against this table.
| Column | Type | Nullable | Description |
|--------|------|----------|-------------|
| `event_id` | `UUID` | No | Primary key — system-assigned |
| `agent_id` | `UUID` | No | Agent that triggered the event (informational, no FK) |
| `action` | `VARCHAR(32)` | No | Enum — see values below |
| `outcome` | `VARCHAR(16)` | No | Enum: `success`, `failure` |
| `ip_address` | `VARCHAR(64)` | No | Client IP address (IPv4 or IPv6) |
| `user_agent` | `TEXT` | No | HTTP User-Agent from the request |
| `metadata` | `JSONB` | No | Action-specific data. Default: `{}` |
| `timestamp` | `TIMESTAMPTZ` | No | Event timestamp. Default: `NOW()` |
**`action` enum values:** `agent.created`, `agent.updated`, `agent.decommissioned`, `agent.suspended`, `agent.reactivated`, `token.issued`, `token.revoked`, `token.introspected`, `credential.generated`, `credential.rotated`, `credential.revoked`, `auth.failed`
**Indexes:**
| Index | Column | Purpose |
|-------|--------|---------|
| `idx_audit_events_agent_id` | `agent_id` | Filter events by agent |
| `idx_audit_events_action` | `action` | Filter by action type |
| `idx_audit_events_outcome` | `outcome` | Filter successes/failures |
| `idx_audit_events_timestamp` | `timestamp DESC` | Default sort, date range queries |
**Why no FK on `agent_id`?** Audit records must be retained even after an agent is decommissioned or its row is removed. A FK would either block deletion of the agent row or, with a cascade, delete the agent's audit history along with it. The `agent_id` is therefore stored as an informational reference only.
**Free tier retention:** The application enforces a 90-day retention window at the query layer only: older rows are excluded from query results but remain in the table. Purging old records is not yet automated; it is a Phase 2 task.
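
As an illustration of a query-layer retention window, the sketch below computes the 90-day cutoff and builds a parameterised query that ignores older rows. It is an assumption about how such a filter could look, not the application's actual query code; `retentionCutoff` and `buildAuditQuery` are hypothetical names.

```typescript
const RETENTION_DAYS = 90; // retention window stated above
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Earliest timestamp that is still inside the retention window.
function retentionCutoff(now: Date, retentionDays: number = RETENTION_DAYS): Date {
  return new Date(now.getTime() - retentionDays * MS_PER_DAY);
}

// Hypothetical query builder. Returns SQL text plus parameter values, the
// shape accepted by node-postgres `query(text, values)`.
function buildAuditQuery(agentId: string, now: Date): { text: string; values: [string, Date] } {
  return {
    text:
      'SELECT * FROM audit_events ' +
      'WHERE agent_id = $1 AND "timestamp" >= $2 ' +
      'ORDER BY "timestamp" DESC',
    values: [agentId, retentionCutoff(now)],
  };
}
```

Rows older than the cutoff stay on disk until a purge job exists, so table size is not bounded by this filter.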
---
### `token_revocations`
Durable record of revoked JWTs. Supplements Redis so that revocations survive a Redis restart.
| Column | Type | Nullable | Description |
|--------|------|----------|-------------|
| `jti` | `UUID` | No | Primary key — the JWT ID claim from the revoked token |
| `expires_at` | `TIMESTAMPTZ` | No | When the token would have expired naturally |
| `revoked_at` | `TIMESTAMPTZ` | No | When the token was revoked. Default: `NOW()` |
**Indexes:**
| Index | Column | Purpose |
|-------|--------|---------|
| `idx_token_revocations_expires_at` | `expires_at` | Enables future cleanup of expired revocation records |
**Dual-store design:** When a token is revoked, the `jti` is written to both:
1. Redis key `revoked:<jti>` with TTL set to the token's remaining lifetime — fast O(1) lookup on every authenticated request
2. This PostgreSQL table — durable record if Redis is restarted
**Note:** On Redis restart, the in-memory revocation cache is cold. Tokens that were revoked before the restart and have not yet expired will pass auth again. Phase 2 is expected to close this gap with a warm-up that loads active revocations from PostgreSQL into Redis on startup.
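
The dual write can be sketched as follows. This is an illustrative outline under stated assumptions, not the application's code: `CacheStore` and `DurableStore` are hypothetical synchronous stand-ins for the Redis and PostgreSQL clients (the real clients are asynchronous), and the write order shown is an assumption. Only the `revoked:<jti>` key pattern and the TTL rule come from the description above.

```typescript
// Hypothetical stand-ins so the sketch runs without Redis or PostgreSQL.
interface CacheStore {
  setWithTtl(key: string, ttlSeconds: number): void;
}
interface DurableStore {
  insert(jti: string, expiresAt: Date): void;
}

// Redis key pattern from the dual-store description.
function revocationKey(jti: string): string {
  return `revoked:${jti}`;
}

// Seconds until the token would have expired naturally, never negative.
function remainingTtlSeconds(expiresAt: Date, now: Date): number {
  return Math.max(0, Math.ceil((expiresAt.getTime() - now.getTime()) / 1000));
}

// Write the revocation to both stores.
function revokeToken(
  jti: string,
  expiresAt: Date,
  now: Date,
  cache: CacheStore,
  db: DurableStore,
): void {
  db.insert(jti, expiresAt); // durable record (assumed to be written first)
  const ttl = remainingTtlSeconds(expiresAt, now);
  if (ttl > 0) {
    // An already-expired token needs no cache entry.
    cache.setWithTtl(revocationKey(jti), ttl);
  }
}
```

Because the cache TTL matches the token's remaining lifetime, Redis entries clean themselves up, while the PostgreSQL rows persist until a cleanup job removes them.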
---
## Migration Runner
Migrations are managed by `scripts/migrate.ts`. It reads `.sql` files from `src/db/migrations/` in alphabetical order, tracks applied migrations in a `schema_migrations` table, and executes only unapplied migrations — each in its own transaction.
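
The selection step described above can be sketched as a pure function. This is a simplified illustration, not the contents of `scripts/migrate.ts`; the `pendingMigrations` name is hypothetical, and the file reading, transaction handling, and `schema_migrations` bookkeeping are omitted.

```typescript
// Given the filenames found in the migrations directory and the names already
// recorded in schema_migrations, return the migrations that still need to
// run, in alphabetical order.
function pendingMigrations(filesOnDisk: string[], applied: string[]): string[] {
  return filesOnDisk
    .filter((file) => file.slice(-4) === ".sql") // ignore non-SQL files
    .filter((file) => applied.indexOf(file) === -1) // skip applied migrations
    .sort(); // default sort orders ASCII filenames alphabetically
}
```

For example, with `001_a.sql` already applied and `002_b.sql` and `003_c.sql` on disk, the function returns the latter two in that order.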
### `schema_migrations` table
Created automatically on first run if it does not exist.
| Column | Type | Description |
|--------|------|-------------|
| `name` | `VARCHAR(255)` | Migration filename (primary key) |
| `applied_at` | `TIMESTAMPTZ` | When the migration was applied |
### Running migrations
```bash
# Set DATABASE_URL in environment or .env first
npm run db:migrate
```
Expected output (first run):
```
Running database migrations...
✓ Applied: 001_create_agents.sql
✓ Applied: 002_create_credentials.sql
✓ Applied: 003_create_audit_events.sql
✓ Applied: 004_create_tokens.sql
Migrations complete. 4 migration(s) applied.
```
Expected output (already applied):
```
Running database migrations...
- Skipped (already applied): 001_create_agents.sql
- Skipped (already applied): 002_create_credentials.sql
- Skipped (already applied): 003_create_audit_events.sql
- Skipped (already applied): 004_create_tokens.sql
Migrations complete. 0 migration(s) applied.
```
### Verifying applied migrations
```bash
psql "$DATABASE_URL" -c "SELECT name, applied_at FROM schema_migrations ORDER BY name;"
```
Expected output:
```
            name             |          applied_at
-----------------------------+-------------------------------
 001_create_agents.sql       | 2026-03-28 09:00:00.000000+00
 002_create_credentials.sql  | 2026-03-28 09:00:00.000000+00
 003_create_audit_events.sql | 2026-03-28 09:00:00.000000+00
 004_create_tokens.sql       | 2026-03-28 09:00:00.000000+00
(4 rows)
```
### Adding a new migration
1. Create a new `.sql` file in `src/db/migrations/` with the next numeric prefix (e.g. `005_add_column.sql`)
2. Write idempotent SQL using `IF NOT EXISTS` / `IF EXISTS` guards where possible
3. Run `npm run db:migrate`
Migrations run in alphabetical filename order, so the zero-padded numeric prefix is what guarantees the correct sequence. Keep every prefix the same width.
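
Because ordering is alphabetical rather than numeric, an unpadded prefix would sort incorrectly (`10_x.sql` comes before `2_y.sql`). The helper below is a hypothetical sketch of how the next three-digit prefix could be derived from the existing filenames; it is not part of the migration runner.

```typescript
// Hypothetical helper: find the highest numeric prefix among existing
// migration filenames and return the next one, left-padded to three digits.
function nextMigrationPrefix(existing: string[]): string {
  let highest = 0;
  for (const file of existing) {
    const match = /^(\d+)_/.exec(file);
    if (match) {
      highest = Math.max(highest, parseInt(match[1], 10));
    }
  }
  const next = String(highest + 1);
  // "000".slice(n) yields the zeros still needed; empty once next has 3+ digits.
  return "000".slice(next.length) + next;
}
```

With the four migrations listed earlier, this returns `005`, which matches the `005_add_column.sql` example above.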
### Rollback
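There is no automated rollback. To undo a migration, do one of the following:
1. Write and apply a compensating migration with the next numeric prefix (e.g. `006_rollback_add_column.sql` to undo `005_add_column.sql`)
2. Connect directly to PostgreSQL and run the reverse SQL manually. The original migration's row stays in `schema_migrations`, so the runner will not re-apply it.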
---
## Connection Pool
The application uses `pg.Pool` with default settings (max 10 connections). The pool is a singleton — one pool per process instance.
To change the pool size, modify `src/db/pool.ts`. In production, if you use PgBouncer or a managed connection pooler, ensure `DATABASE_URL` includes the connection parameters that pooler requires.
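
One way to make the pool size adjustable without editing code each time is to derive the settings from the environment. The sketch below is an assumption, not the current contents of `src/db/pool.ts`: `PG_POOL_MAX` is a hypothetical variable that is not among the documented env vars, and `PoolSettings` only mirrors the `connectionString` and `max` options that `pg.Pool` accepts.

```typescript
// Subset of the node-postgres pool options used in this sketch.
interface PoolSettings {
  connectionString: string;
  max: number;
}

const DEFAULT_POOL_MAX = 10; // node-postgres default, as noted above

// PG_POOL_MAX is hypothetical; it is not one of the documented env vars.
function poolSettingsFromEnv(env: { [key: string]: string | undefined }): PoolSettings {
  const url = env["DATABASE_URL"];
  if (!url) {
    throw new Error("DATABASE_URL is required");
  }
  const parsed = parseInt(env["PG_POOL_MAX"] || "", 10);
  // A missing or non-numeric value parses to NaN, which fails the comparison
  // and falls back to the default.
  const max = parsed > 0 ? parsed : DEFAULT_POOL_MAX;
  return { connectionString: url, max };
}
```

The result could be passed to the pool constructor, for example `new Pool(poolSettingsFromEnv(process.env))`, while keeping the pool a per-process singleton.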