Vault is optional: the server falls back to bcrypt (Phase 1 behaviour) when VAULT_ADDR is not set. Full coexistence: existing bcrypt credentials continue to work until rotated.

Changes:
- src/vault/VaultClient.ts: wraps node-vault KV v2; writeSecret, readSecret, verifySecret (constant-time), deleteSecret
- src/db/migrations/005_add_vault_path.sql: vault_path column on credentials
- CredentialRepository: createWithVaultPath, updateVaultPath methods
- CredentialService: routes generate/rotate through Vault when configured; bcrypt path unchanged
- OAuth2Service: verifies via Vault when vaultPath set, bcrypt otherwise
- src/app.ts: createVaultClientFromEnv() wired into service layer
- ICredentialRow: vaultPath field added
- docs/devops/environment-variables.md: VAULT_ADDR, VAULT_TOKEN, VAULT_MOUNT
- docs/devops/vault-setup.md: dev quickstart, production config, migration guide
- tests: 33/33 unit tests pass (VaultClient + CredentialService Vault path)
- node-vault + @types/node-vault installed

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

# Phase 2: Production-Ready — Tasks
**Status**: In progress — Workstream 1 complete.
## CEO Approval Gates (required before implementation)
- [x] A0.1 Approve dependency: `node-vault` (Vault integration)
- [x] A0.2 Approve dependency: `@openpolicyagent/opa-wasm` (OPA policy engine)
- [x] A0.3 Approve dependency: React 18 + Vite 5 (web dashboard)
- [x] A0.4 Approve dependency: `prom-client` (Prometheus metrics)
- [x] A0.5 Approve dependency: Terraform (infrastructure as code)
---
## Workstream 1: HashiCorp Vault Integration
- [x] 1.1 Write `src/vault/VaultClient.ts` — wraps `node-vault`; methods: writeSecret, readSecret, deleteSecret, verifySecret
- [x] 1.2 Write `src/db/migrations/005_add_vault_path.sql` — add `vault_path` column to `credentials`
- [x] 1.3 Update `CredentialService.ts` — new credentials use Vault; existing bcrypt credentials continue to work
- [x] 1.4 Update `docs/devops/environment-variables.md` — add VAULT_ADDR, VAULT_TOKEN, VAULT_MOUNT
- [x] 1.5 Write `docs/devops/vault-setup.md` — Vault dev server setup, production Vault config, migration guide
- [x] 1.6 Write unit tests for VaultClient (mocked Vault) and updated CredentialService
- [x] 1.7 QA sign-off: zero `any`, TypeScript strict, >80% coverage, coexistence verified
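
Tasks 1.1 and 1.3 rest on two behaviours: a constant-time `verifySecret` and the fallback to bcrypt when `VAULT_ADDR` is not set. The following is a minimal sketch of both using only Node's built-in `crypto` module. The names `constantTimeEquals`, `selectBackend`, and `SecretBackend` are hypothetical and not taken from the codebase; the real `VaultClient` may well be structured differently.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Hypothetical helper (not the project's VaultClient.verifySecret).
// Both inputs are hashed first so the buffers have equal length,
// which crypto.timingSafeEqual requires.
export function constantTimeEquals(presented: string, stored: string): boolean {
  const a = createHash("sha256").update(presented, "utf8").digest();
  const b = createHash("sha256").update(stored, "utf8").digest();
  return timingSafeEqual(a, b);
}

// Hypothetical backend selector: Vault only when VAULT_ADDR is set and
// non-empty, bcrypt (Phase 1 behaviour) otherwise.
export type SecretBackend = "vault" | "bcrypt";

export function selectBackend(
  env: Readonly<Record<string, string | undefined>>,
): SecretBackend {
  const addr = env["VAULT_ADDR"];
  return addr !== undefined && addr.trim() !== "" ? "vault" : "bcrypt";
}
```

Hashing before comparison avoids leaking the stored secret's length through an early length check, at the cost of two SHA-256 computations per verification.
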
## Workstream 2: Python SDK
- [ ] 2.1 Create `sdk-python/` with `pyproject.toml` — name: sentryagent-idp, python>=3.9
- [ ] 2.2 Write `sdk-python/src/sentryagent_idp/types.py` — all request/response dataclasses
- [ ] 2.3 Write `sdk-python/src/sentryagent_idp/errors.py` — AgentIdPError exception
- [ ] 2.4 Write `sdk-python/src/sentryagent_idp/token_manager.py` — sync TokenManager
- [ ] 2.5 Write `sdk-python/src/sentryagent_idp/async_token_manager.py` — async TokenManager (httpx)
- [ ] 2.6 Write `sdk-python/src/sentryagent_idp/services/agents.py` — AgentRegistryClient (sync + async)
- [ ] 2.7 Write `sdk-python/src/sentryagent_idp/services/credentials.py` — CredentialClient (sync + async)
- [ ] 2.8 Write `sdk-python/src/sentryagent_idp/services/token.py` — TokenClient (sync + async)
- [ ] 2.9 Write `sdk-python/src/sentryagent_idp/services/audit.py` — AuditClient (sync + async)
- [ ] 2.10 Write `sdk-python/src/sentryagent_idp/client.py` — AgentIdPClient (sync) + AsyncAgentIdPClient
- [ ] 2.11 Write `sdk-python/src/sentryagent_idp/__init__.py` — barrel exports
- [ ] 2.12 Write `sdk-python/README.md`
- [ ] 2.13 QA: `mypy --strict` clean, all 14 endpoints, AgentIdPError on all failure paths, pytest >80%
## Workstream 3: Go SDK
- [ ] 3.1 Create `sdk-go/` with `go.mod` — module: github.com/sentryagent/idp-sdk-go, go 1.21
- [ ] 3.2 Write `sdk-go/types.go` — all request/response structs
- [ ] 3.3 Write `sdk-go/errors.go` — AgentIdPError type implementing error interface
- [ ] 3.4 Write `sdk-go/token_manager.go` — mutex-guarded TokenManager
- [ ] 3.5 Write `sdk-go/services/agents.go` — AgentRegistryClient
- [ ] 3.6 Write `sdk-go/services/credentials.go` — CredentialClient
- [ ] 3.7 Write `sdk-go/services/token.go` — TokenClient
- [ ] 3.8 Write `sdk-go/services/audit.go` — AuditClient
- [ ] 3.9 Write `sdk-go/client.go` — AgentIdPClient
- [ ] 3.10 Write `sdk-go/README.md`
- [ ] 3.11 QA: `go vet` clean, `staticcheck` clean, all 14 endpoints, goroutine-safe, `go test ./...` >80%
## Workstream 4: Java SDK
- [ ] 4.1 Create `sdk-java/` with `pom.xml` — groupId: ai.sentryagent, artifactId: idp-sdk, Java 17
- [ ] 4.2 Write all POJO request/response model classes
- [ ] 4.3 Write `AgentIdPException.java` extending RuntimeException
- [ ] 4.4 Write `TokenManager.java` — synchronized cache with 60s refresh buffer
- [ ] 4.5 Write `AgentRegistryClient.java` — sync + CompletableFuture methods
- [ ] 4.6 Write `CredentialClient.java` — sync + CompletableFuture methods
- [ ] 4.7 Write `TokenClient.java` — sync + CompletableFuture methods
- [ ] 4.8 Write `AuditClient.java` — sync + CompletableFuture methods
- [ ] 4.9 Write `AgentIdPClient.java` — composes all service clients
- [ ] 4.10 Write `sdk-java/README.md`
- [ ] 4.11 QA: `mvn verify` passes, all 14 endpoints, AgentIdPException on all failure paths, JUnit 5 >80%
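
Task 4.4 asks for a token cache with a 60s refresh buffer. The sketch below illustrates only the caching rule and is written in TypeScript to match the server codebase; the Java `TokenManager` would additionally need `synchronized` access. `SketchTokenCache`, `CachedToken`, and the injected `fetchToken` and `now` callbacks are hypothetical names, and the synchronous fetch is a simplification.

```typescript
// Hypothetical token shape: an opaque access token plus its absolute expiry.
interface CachedToken {
  accessToken: string;
  expiresAtMs: number;
}

// Refresh once the token is within 60 seconds of expiring.
const REFRESH_BUFFER_MS = 60_000;

class SketchTokenCache {
  private cached: CachedToken | undefined = undefined;
  private readonly fetchToken: () => CachedToken;
  private readonly now: () => number;

  // `now` is injectable so the expiry rule can be tested without real time.
  constructor(fetchToken: () => CachedToken, now: () => number = () => Date.now()) {
    this.fetchToken = fetchToken;
    this.now = now;
  }

  getToken(): string {
    const current = this.cached;
    // Reuse the cached token only while it is outside the refresh buffer.
    if (current !== undefined && this.now() < current.expiresAtMs - REFRESH_BUFFER_MS) {
      return current.accessToken;
    }
    const fresh = this.fetchToken();
    this.cached = fresh;
    return fresh.accessToken;
  }
}
```

Refreshing before the real expiry means a token handed to a caller should still be valid for roughly a minute, which covers request latency and modest clock skew.
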
## Workstream 5: OPA Policy Engine
- [ ] 5.1 Write `policies/authz.rego` — allow/deny rules matching all current scope checks
- [ ] 5.2 Write `policies/data/scopes.json` — scope to endpoint permission mapping
- [ ] 5.3 Write `src/middleware/opa.ts` — OpaMiddleware: loads Wasm, evaluates input, returns allow/deny
- [ ] 5.4 Replace static scope check in `src/middleware/auth.ts` with OpaMiddleware
- [ ] 5.5 Add SIGHUP handler in `src/server.ts` to hot-reload policy files
- [ ] 5.6 Update `docs/devops/environment-variables.md` — add POLICY_DIR
- [ ] 5.7 QA: all existing auth tests pass unchanged, new OPA unit tests, hot-reload verified
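
Tasks 5.1 to 5.3 describe an allow/deny decision driven by a scope-to-endpoint mapping. As a rough illustration of the decision the policy has to reproduce, here is a plain TypeScript sketch with no OPA involved. The mapping shape (`"METHOD /path"` to a required scope), the scope names, and the `isAllowed` function are all assumptions made for this example; the actual `scopes.json` layout and Rego rules are not specified here.

```typescript
// Assumed input shape, loosely mirroring what a middleware might pass to a policy.
interface AuthzInput {
  method: string;
  path: string;
  scopes: readonly string[];
}

// Hypothetical mapping: "METHOD /path" -> scope required to call it.
const exampleScopeMap: ReadonlyMap<string, string> = new Map([
  ["GET /agents", "agents:read"],
  ["POST /agents", "agents:write"],
]);

// Deny by default: unknown endpoints and missing scopes both yield false.
function isAllowed(input: AuthzInput, scopeMap: ReadonlyMap<string, string>): boolean {
  const key = `${input.method.toUpperCase()} ${input.path}`;
  const required = scopeMap.get(key);
  if (required === undefined) {
    return false;
  }
  return input.scopes.includes(required);
}
```

A function like this could also serve as a reference oracle in the 5.7 tests: the Rego policy and the static check should agree on every input.
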
## Workstream 6: Web Dashboard UI
- [ ] 6.1 Create `dashboard/` with Vite 5 + React 18 + TypeScript strict configuration
- [ ] 6.2 Set up shadcn/ui with Tailwind CSS
- [ ] 6.3 Write `dashboard/src/lib/auth.ts` — credential entry, TokenManager, sessionStorage
- [ ] 6.4 Write `dashboard/src/lib/client.ts` — wraps @sentryagent/idp-sdk AgentIdPClient
- [ ] 6.5 Write Login page (`/dashboard/login`)
- [ ] 6.6 Write Agents page (`/dashboard/agents`) — list, search, filter by status
- [ ] 6.7 Write Agent Detail page (`/dashboard/agents/:id`) — suspend/reactivate with confirm dialog
- [ ] 6.8 Write Credentials page (`/dashboard/agents/:id/credentials`) — rotate/revoke with confirm
- [ ] 6.9 Write Audit Log page (`/dashboard/audit`) — filters, pagination
- [ ] 6.10 Write Health page (`/dashboard/health`) — PostgreSQL + Redis connectivity status
- [ ] 6.11 Configure AgentIdP Express app to serve `dashboard/dist/` at `/dashboard`
- [ ] 6.12 Write `dashboard/README.md`
- [ ] 6.13 QA: TypeScript strict, zero `any`, OWASP Top 10 review, responsive layout verified
## Workstream 7: Prometheus + Grafana Monitoring
- [ ] 7.1 Add `prom-client` to dependencies (after CEO approval A0.4)
- [ ] 7.2 Write `src/metrics/registry.ts` — shared Prometheus Registry with all 7 metric definitions
- [ ] 7.3 Instrument `OAuth2Service.ts` — increment `agentidp_tokens_issued_total`
- [ ] 7.4 Instrument `AgentService.ts` — increment `agentidp_agents_registered_total`
- [ ] 7.5 Instrument `src/middleware/` — HTTP request counter and duration histogram
- [ ] 7.6 Instrument `src/db/pool.ts` — DB query duration histogram
- [ ] 7.7 Instrument `src/cache/redis.ts` — Redis command duration histogram
- [ ] 7.8 Add `GET /metrics` route (unauthenticated, Prometheus text format)
- [ ] 7.9 Write `monitoring/prometheus/prometheus.yml` — scrape config
- [ ] 7.10 Write `monitoring/grafana/provisioning/` — datasource + dashboard provisioning
- [ ] 7.11 Write `monitoring/grafana/dashboards/agentidp.json` — pre-built Grafana dashboard
- [ ] 7.12 Write `docker-compose.monitoring.yml` overlay
- [ ] 7.13 Update `docs/devops/operations.md` — monitoring section
- [ ] 7.14 QA: all 7 metrics verified under load, Grafana auto-provisions, no auth leak on /metrics
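
Task 7.8 requires `GET /metrics` to respond in the Prometheus text format. To show roughly what that body looks like for a counter such as `agentidp_tokens_issued_total`, here is a dependency-free sketch. In the real implementation `prom-client` would produce this output; `SketchCounter` and the help string are hypothetical and exist only to illustrate the format.

```typescript
// Minimal in-memory counter rendered in the Prometheus text exposition format.
// Illustration only: the project plans to use prom-client for this.
class SketchCounter {
  private value = 0;
  private readonly name: string;
  private readonly help: string;

  constructor(name: string, help: string) {
    this.name = name;
    this.help = help;
  }

  // Counters are monotonic, so negative increments are rejected.
  inc(by: number = 1): void {
    if (by < 0) {
      throw new RangeError("counters can only increase");
    }
    this.value += by;
  }

  // One HELP line, one TYPE line, then the sample itself.
  render(): string {
    return (
      `# HELP ${this.name} ${this.help}\n` +
      `# TYPE ${this.name} counter\n` +
      `${this.name} ${this.value}\n`
    );
  }
}
```

A 7.14 check could scrape `/metrics` and assert that each of the 7 metric names appears with the expected `# TYPE` line.
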
## Workstream 8: Multi-Region Deployment (Terraform)
- [ ] 8.1 Write `terraform/modules/agentidp/main.tf` + `variables.tf` + `outputs.tf`
- [ ] 8.2 Write `terraform/modules/rds/` — managed PostgreSQL module
- [ ] 8.3 Write `terraform/modules/redis/` — managed Redis module
- [ ] 8.4 Write `terraform/modules/lb/` — load balancer + TLS module
- [ ] 8.5 Write `terraform/environments/aws/main.tf` + `variables.tf` + `terraform.tfvars.example`
- [ ] 8.6 Write `terraform/environments/gcp/main.tf` + `variables.tf` + `terraform.tfvars.example`
- [ ] 8.7 Write `docs/devops/deployment.md` — end-to-end AWS and GCP deployment walkthrough
- [ ] 8.8 QA: `terraform validate` passes, secrets not hardcoded, TLS enforced, DB/Redis VPC-internal
---
## Phase 2 Completion Criteria
All 8 workstreams done. All tasks checked. All QA gates passed. CEO reviewed.