SentryAgent.ai Developer d42c653eea chore(openspec): archive engineering-docs and phase-2-production-ready changes
- engineering-docs → archive/2026-03-29-engineering-docs (63/63 tasks complete)
- phase-2-production-ready → archive/2026-03-29-phase-2-production-ready (89/89 tasks complete)
- openspec/specs/ synced with all Phase 1 + Phase 2 + engineering-docs capabilities (22 specs total)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-29 12:41:53 +00:00

Phase 2: Production-Ready — Tasks

Status: In progress — Workstreams 1, 2, 3, 4 complete.

CEO Approval Gates (required before implementation)

  • A0.1 Approve dependency: node-vault (Vault integration)
  • A0.2 Approve dependency: @openpolicyagent/opa-wasm (OPA policy engine)
  • A0.3 Approve dependency: React 18 + Vite 5 (web dashboard)
  • A0.4 Approve dependency: prom-client (Prometheus metrics)
  • A0.5 Approve dependency: Terraform (infrastructure as code)

Workstream 1: HashiCorp Vault Integration

  • 1.1 Write src/vault/VaultClient.ts — wraps node-vault; methods: writeSecret, readSecret, deleteSecret, verifySecret
  • 1.2 Write src/db/migrations/005_add_vault_path.sql — add vault_path column to credentials
  • 1.3 Update CredentialService.ts — new credentials use Vault; existing bcrypt credentials continue to work
  • 1.4 Update docs/devops/environment-variables.md — add VAULT_ADDR, VAULT_TOKEN, VAULT_MOUNT
  • 1.5 Write docs/devops/vault-setup.md — Vault dev server setup, production Vault config, migration guide
  • 1.6 Write unit tests for VaultClient (mocked Vault) and updated CredentialService
  • 1.7 QA sign-off: zero uses of the any type, TypeScript strict mode, >80% test coverage, Vault/bcrypt coexistence verified
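
Note on 1.1 and 1.3: the coexistence rule (new credentials verified through Vault, existing bcrypt credentials still accepted) can be pictured as a dispatch on the vault_path column added in 1.2. The following is a minimal sketch under stated assumptions, not the project's implementation. The names SecretStore, InMemorySecretStore, CredentialRow, secretHash, LegacyVerify, and verifyCredential are hypothetical, the in-memory store stands in for node-vault, and the legacy verifier stands in for bcrypt.

```typescript
// Hypothetical interface mirroring the four method names listed in task 1.1.
interface SecretStore {
  writeSecret(path: string, secret: string): Promise<void>;
  readSecret(path: string): Promise<string | null>;
  deleteSecret(path: string): Promise<void>;
  verifySecret(path: string, candidate: string): Promise<boolean>;
}

// Compares two strings without exiting on the first mismatch.
// Illustrative only: a Node service would more likely use crypto.timingSafeEqual.
function constantTimeEqual(a: string, b: string): boolean {
  let diff = a.length ^ b.length;
  const len = Math.max(a.length, b.length);
  for (let i = 0; i < len; i++) {
    diff |= (a.charCodeAt(i) || 0) ^ (b.charCodeAt(i) || 0);
  }
  return diff === 0;
}

// In-memory stand-in for Vault, for illustration and unit tests only.
class InMemorySecretStore implements SecretStore {
  private readonly secrets = new Map<string, string>();

  async writeSecret(path: string, secret: string): Promise<void> {
    this.secrets.set(path, secret);
  }

  async readSecret(path: string): Promise<string | null> {
    const value = this.secrets.get(path);
    return value === undefined ? null : value;
  }

  async deleteSecret(path: string): Promise<void> {
    this.secrets.delete(path);
  }

  async verifySecret(path: string, candidate: string): Promise<boolean> {
    const stored = await this.readSecret(path);
    return stored !== null && constantTimeEqual(stored, candidate);
  }
}

// Assumed row shape: vaultPath is null for credentials created before migration 005.
interface CredentialRow {
  id: string;
  secretHash: string | null;
  vaultPath: string | null;
}

// Stand-in signature for a bcrypt-style comparison.
type LegacyVerify = (candidate: string, hash: string) => Promise<boolean>;

// Vault-backed rows are checked against Vault; older rows fall back to the hash.
async function verifyCredential(
  row: CredentialRow,
  candidate: string,
  store: SecretStore,
  legacyVerify: LegacyVerify,
): Promise<boolean> {
  if (row.vaultPath !== null) {
    return store.verifySecret(row.vaultPath, candidate);
  }
  if (row.secretHash !== null) {
    return legacyVerify(candidate, row.secretHash);
  }
  return false;
}
```

Keeping the store behind an interface is one way to satisfy 1.6, since unit tests can inject the in-memory stand-in instead of a running Vault server.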

Workstream 2: Python SDK

  • 2.1 Create sdk-python/ with pyproject.toml — name: sentryagent-idp, python>=3.9
  • 2.2 Write sdk-python/src/sentryagent_idp/types.py — all request/response dataclasses
  • 2.3 Write sdk-python/src/sentryagent_idp/errors.py — AgentIdPError exception
  • 2.4 Write sdk-python/src/sentryagent_idp/token_manager.py — sync TokenManager
  • 2.5 Write sdk-python/src/sentryagent_idp/async_token_manager.py — async TokenManager (httpx)
  • 2.6 Write sdk-python/src/sentryagent_idp/services/agents.py — AgentRegistryClient (sync + async)
  • 2.7 Write sdk-python/src/sentryagent_idp/services/credentials.py — CredentialClient (sync + async)
  • 2.8 Write sdk-python/src/sentryagent_idp/services/token.py — TokenClient (sync + async)
  • 2.9 Write sdk-python/src/sentryagent_idp/services/audit.py — AuditClient (sync + async)
  • 2.10 Write sdk-python/src/sentryagent_idp/client.py — AgentIdPClient (sync) + AsyncAgentIdPClient
  • 2.11 Write sdk-python/src/sentryagent_idp/__init__.py — barrel exports
  • 2.12 Write sdk-python/README.md
  • 2.13 QA: mypy --strict clean, all 14 endpoints, AgentIdPError on all failure paths, pytest >80%
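
Note on 2.3 and 2.13: the "AgentIdPError on all failure paths" gate amounts to normalizing every failure, whether an HTTP error body or a transport exception, into one error type. The sketch below shows the idea in TypeScript (the language of the server code) rather than Python, and it is only an illustration. The fields code and httpStatus, the helper names errorFromResponse and toAgentIdPError, and the code values HTTP_ERROR and NETWORK_ERROR are assumptions, as is the JSON error body shape.

```typescript
// One error type for every failure path. Field names are hypothetical.
class AgentIdPError extends Error {
  readonly code: string;
  readonly httpStatus: number | null;

  constructor(message: string, code: string, httpStatus: number | null) {
    super(message);
    // Keeps instanceof working if the code is compiled to an ES5 target.
    Object.setPrototypeOf(this, AgentIdPError.prototype);
    this.name = "AgentIdPError";
    this.code = code;
    this.httpStatus = httpStatus;
  }
}

// Builds an error from a non-2xx response. Assumes (hypothetically) that the
// API returns JSON like {"code": "...", "message": "..."} and degrades
// gracefully when the body is not JSON.
function errorFromResponse(status: number, body: string): AgentIdPError {
  try {
    const parsed: unknown = JSON.parse(body);
    if (typeof parsed === "object" && parsed !== null) {
      const rec = parsed as Record<string, unknown>;
      const code = typeof rec.code === "string" ? rec.code : "HTTP_ERROR";
      const message = typeof rec.message === "string" ? rec.message : `HTTP ${status}`;
      return new AgentIdPError(message, code, status);
    }
  } catch {
    // Body was not valid JSON; fall through to the generic error below.
  }
  return new AgentIdPError(`HTTP ${status}`, "HTTP_ERROR", status);
}

// Wraps anything thrown by the transport layer (timeouts, DNS failures, ...).
function toAgentIdPError(err: unknown): AgentIdPError {
  if (err instanceof AgentIdPError) {
    return err;
  }
  const message = err instanceof Error ? err.message : String(err);
  return new AgentIdPError(message, "NETWORK_ERROR", null);
}
```

With both helpers in place, each service client can route every non-success branch through one of them, which makes the QA gate checkable by asserting on the thrown type alone.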

Workstream 3: Go SDK

  • 3.1 Create sdk-go/ with go.mod — module: github.com/sentryagent/idp-sdk-go, go 1.21
  • 3.2 Write sdk-go/types.go — all request/response structs
  • 3.3 Write sdk-go/errors.go — AgentIdPError type implementing error interface
  • 3.4 Write sdk-go/token_manager.go — mutex-guarded TokenManager
  • 3.5 Write sdk-go/agents.go — AgentRegistryClient (flat package; see ADR below)
  • 3.6 Write sdk-go/credentials.go — CredentialClient
  • 3.7 Write sdk-go/token_service.go — TokenServiceClient
  • 3.8 Write sdk-go/audit.go — AuditClient
  • 3.9 Write sdk-go/client.go — AgentIdPClient
  • 3.10 Write sdk-go/README.md
  • 3.11 QA: go vet clean, staticcheck clean, all 14 endpoints, goroutine-safe, go test ./... >80%

Workstream 4: Java SDK

  • 4.1 Create sdk-java/ with pom.xml — groupId: ai.sentryagent, artifactId: idp-sdk, Java 17
  • 4.2 Write all POJO request/response model classes
  • 4.3 Write AgentIdPException.java extending RuntimeException
  • 4.4 Write TokenManager.java — synchronized cache with 60s refresh buffer
  • 4.5 Write AgentRegistryClient.java — sync + CompletableFuture methods
  • 4.6 Write CredentialClient.java — sync + CompletableFuture methods
  • 4.7 Write TokenClient.java — sync + CompletableFuture methods
  • 4.8 Write AuditClient.java — sync + CompletableFuture methods
  • 4.9 Write AgentIdPClient.java — composes all service clients
  • 4.10 Write sdk-java/README.md
  • 4.11 QA: mvn verify passes, all 14 endpoints, AgentIdPException on all failure paths, JUnit 5 >80%
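
Note on 4.4 (and the TokenManager tasks 2.4, 2.5, and 3.4): a token cache with a 60s refresh buffer returns the cached token until it is within 60 seconds of expiry, then fetches a new one. The sketch below illustrates that rule in TypeScript rather than Java and is an assumption-laden illustration, not the SDK's code. TokenResponse, FetchToken, and the injectable clock are hypothetical. Sharing one in-flight promise is the single-threaded analogue of the synchronized and mutex-guarded variants named in the tasks.

```typescript
// Hypothetical shape of what the token endpoint returns.
interface TokenResponse {
  accessToken: string;
  expiresInSeconds: number;
}

type FetchToken = () => Promise<TokenResponse>;

// Refresh this long before the token actually expires (60s, per task 4.4).
const REFRESH_BUFFER_MS = 60 * 1000;

class TokenManager {
  private cached: { token: string; expiresAtMs: number } | null = null;
  private inFlight: Promise<string> | null = null;
  private readonly fetchToken: FetchToken;
  private readonly now: () => number;

  // The clock is injectable so tests can move time forward deterministically.
  constructor(fetchToken: FetchToken, now: () => number = Date.now) {
    this.fetchToken = fetchToken;
    this.now = now;
  }

  async getToken(): Promise<string> {
    const c = this.cached;
    if (c !== null && this.now() < c.expiresAtMs - REFRESH_BUFFER_MS) {
      return c.token; // Still comfortably valid.
    }
    // Concurrent callers share one refresh instead of each hitting the server.
    if (this.inFlight === null) {
      this.inFlight = this.refresh().finally(() => {
        this.inFlight = null;
      });
    }
    return this.inFlight;
  }

  private async refresh(): Promise<string> {
    const res = await this.fetchToken();
    this.cached = {
      token: res.accessToken,
      expiresAtMs: this.now() + res.expiresInSeconds * 1000,
    };
    return res.accessToken;
  }
}
```

With a one-hour token issued at time 0, this sketch keeps serving the cached token until 3,540 seconds have elapsed and refreshes on the first call at or after that point.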

Workstream 5: OPA Policy Engine

  • 5.1 Write policies/authz.rego — allow/deny rules matching all current scope checks
  • 5.2 Write policies/data/scopes.json — scope to endpoint permission mapping
  • 5.3 Write src/middleware/opa.ts — OpaMiddleware: loads Wasm, evaluates input, returns allow/deny
  • 5.4 Replace static scope check in src/middleware/auth.ts with OpaMiddleware
  • 5.5 Add SIGHUP handler in src/server.ts to hot-reload policy files
  • 5.6 Update docs/devops/environment-variables.md — add POLICY_DIR
  • 5.7 QA: all existing auth tests pass unchanged, new OPA unit tests, hot-reload verified
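
Note on 5.2 through 5.5: the decision the policy has to reproduce is "does any scope on the token permit this method and path". The sketch below models that decision and a reloadable policy holder in plain TypeScript. It is a stand-in for illustration only: it does not use Rego or @openpolicyagent/opa-wasm, and the ScopeMap shape, the scope name agents:read, the "METHOD /path" rule syntax, and the names isAllowed and PolicyHolder are all assumptions. Keeping the previous policy when a reload fails is likewise an assumed design choice.

```typescript
// Hypothetical shape for policies/data/scopes.json:
// scope name -> list of "METHOD /path/pattern" rules.
type ScopeMap = Record<string, string[]>;

interface AuthzInput {
  method: string;
  path: string;
  scopes: string[];
}

// Matches patterns such as "/agents/:id" segment by segment.
function pathMatches(pattern: string, path: string): boolean {
  const p = pattern.split("/").filter((s) => s.length > 0);
  const a = path.split("/").filter((s) => s.length > 0);
  if (p.length !== a.length) {
    return false;
  }
  return p.every((seg, i) => seg.startsWith(":") || seg === a[i]);
}

// Default deny: allow only if some granted scope has a matching rule.
function isAllowed(input: AuthzInput, scopeMap: ScopeMap): boolean {
  const method = input.method.toUpperCase();
  for (const scope of input.scopes) {
    const rules = scopeMap[scope];
    if (rules === undefined) {
      continue;
    }
    for (const rule of rules) {
      const sep = rule.indexOf(" ");
      if (sep < 0) {
        continue; // Malformed rule; ignore it.
      }
      const ruleMethod = rule.slice(0, sep);
      const rulePattern = rule.slice(sep + 1);
      if (ruleMethod === method && pathMatches(rulePattern, input.path)) {
        return true;
      }
    }
  }
  return false;
}

// Holds the active policy data and swaps it on reload.
class PolicyHolder {
  private current: ScopeMap;
  private readonly load: () => ScopeMap;

  constructor(load: () => ScopeMap) {
    this.load = load;
    this.current = load();
  }

  // Returns true if the new policy was applied, false if the old one was kept.
  reload(): boolean {
    try {
      this.current = this.load();
      return true;
    } catch {
      return false;
    }
  }

  decide(input: AuthzInput): boolean {
    return isAllowed(input, this.current);
  }
}
```

In a Node server, the hot-reload in 5.5 could then be wired with process.on("SIGHUP", () => holder.reload()), where holder is a PolicyHolder whose loader reads the files under POLICY_DIR.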

Workstream 6: Web Dashboard UI

  • 6.1 Create dashboard/ with Vite 5 + React 18 + TypeScript strict configuration
  • 6.2 Set up shadcn/ui with Tailwind CSS
  • 6.3 Write dashboard/src/lib/auth.ts — credential entry, TokenManager, sessionStorage
  • 6.4 Write dashboard/src/lib/client.ts — wraps @sentryagent/idp-sdk AgentIdPClient
  • 6.5 Write Login page (/dashboard/login)
  • 6.6 Write Agents page (/dashboard/agents) — list, search, filter by status
  • 6.7 Write Agent Detail page (/dashboard/agents/:id) — suspend/reactivate with confirm dialog
  • 6.8 Write Credentials page (/dashboard/agents/:id/credentials) — rotate/revoke with confirm
  • 6.9 Write Audit Log page (/dashboard/audit) — filters, pagination
  • 6.10 Write Health page (/dashboard/health) — PostgreSQL + Redis connectivity status
  • 6.11 Configure AgentIdP Express app to serve dashboard/dist/ at /dashboard
  • 6.12 Write dashboard/README.md
  • 6.13 QA: TypeScript strict mode, zero uses of the any type, OWASP Top 10 review, responsive layout verified

Workstream 7: Prometheus + Grafana Monitoring

  • 7.1 Add prom-client to dependencies (after CEO approval A0.4)
  • 7.2 Write src/metrics/registry.ts — shared Prometheus Registry with all 7 metric definitions
  • 7.3 Instrument OAuth2Service.ts — increment agentidp_tokens_issued_total
  • 7.4 Instrument AgentService.ts — increment agentidp_agents_registered_total
  • 7.5 Instrument src/middleware/ — HTTP request counter and duration histogram
  • 7.6 Instrument src/db/pool.ts — DB query duration histogram
  • 7.7 Instrument src/cache/redis.ts — Redis command duration histogram
  • 7.8 Add GET /metrics route (unauthenticated, Prometheus text format)
  • 7.9 Write monitoring/prometheus/prometheus.yml — scrape config
  • 7.10 Write monitoring/grafana/provisioning/ — datasource + dashboard provisioning
  • 7.11 Write monitoring/grafana/dashboards/agentidp.json — pre-built Grafana dashboard
  • 7.12 Write docker-compose.monitoring.yml overlay
  • 7.13 Update docs/devops/operations.md — monitoring section
  • 7.14 QA: all 7 metrics verified under load, Grafana auto-provisions, no auth leak on /metrics
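
Note on 7.2, 7.3, and 7.8: what GET /metrics has to return is the Prometheus text exposition format, with one HELP line, one TYPE line, and one sample line per label set. The sketch below hand-rolls a counter to show that output and is not a replacement for prom-client, whose API is not used here. The Counter class, the help text, and the grant_type label are hypothetical. Only the metric name agentidp_tokens_issued_total comes from the task list.

```typescript
type Labels = Record<string, string>;

// Escapes label values as the text exposition format requires:
// backslash, double quote, and newline.
function escapeLabel(v: string): string {
  return v.replace(/\\/g, "\\\\").replace(/"/g, '\\"').replace(/\n/g, "\\n");
}

// Sorts label pairs by name so the same label set always maps to the same key.
function sortedPairs(labels: Labels): Array<[string, string]> {
  return Object.entries(labels).sort((x, y) => (x[0] < y[0] ? -1 : x[0] > y[0] ? 1 : 0));
}

// Minimal illustrative counter, not the prom-client Counter.
class Counter {
  readonly name: string;
  readonly help: string;
  private readonly samples = new Map<string, { labels: Labels; value: number }>();

  constructor(name: string, help: string) {
    this.name = name;
    this.help = help;
  }

  inc(labels: Labels = {}, amount: number = 1): void {
    const key = JSON.stringify(sortedPairs(labels));
    const existing = this.samples.get(key);
    if (existing === undefined) {
      this.samples.set(key, { labels, value: amount });
    } else {
      existing.value += amount;
    }
  }

  // Renders this metric in Prometheus text exposition format.
  render(): string {
    const lines: string[] = [
      `# HELP ${this.name} ${this.help}`,
      `# TYPE ${this.name} counter`,
    ];
    this.samples.forEach((sample) => {
      const pairs = sortedPairs(sample.labels).map(
        (kv) => `${kv[0]}="${escapeLabel(kv[1])}"`,
      );
      const suffix = pairs.length > 0 ? `{${pairs.join(",")}}` : "";
      lines.push(`${this.name}${suffix} ${sample.value}`);
    });
    return lines.join("\n") + "\n";
  }
}

// Usage: the service increments; the /metrics handler returns render().
const tokensIssued = new Counter("agentidp_tokens_issued_total", "Total tokens issued");
tokensIssued.inc({ grant_type: "client_credentials" });
tokensIssued.inc({ grant_type: "client_credentials" });
const metricsBody = tokensIssued.render();
```

Because the route in 7.8 is unauthenticated, label values in a sketch like this should stay low-cardinality and free of identifiers such as client IDs or tokens, which is the concern behind the "no auth leak" check in 7.14.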

Workstream 8: Multi-Region Deployment (Terraform)

  • 8.1 Write terraform/modules/agentidp/main.tf + variables.tf + outputs.tf
  • 8.2 Write terraform/modules/rds/ — managed PostgreSQL module
  • 8.3 Write terraform/modules/redis/ — managed Redis module
  • 8.4 Write terraform/modules/lb/ — load balancer + TLS module
  • 8.5 Write terraform/environments/aws/main.tf + variables.tf + terraform.tfvars.example
  • 8.6 Write terraform/environments/gcp/main.tf + variables.tf + terraform.tfvars.example
  • 8.7 Write docs/devops/deployment.md — end-to-end AWS and GCP deployment walkthrough
  • 8.8 QA: secrets not hardcoded, TLS enforced, DB/Redis VPC-internal (static review passed; terraform validate was not run because the Terraform CLI is not available in this environment)

Phase 2 Complete Criteria

All 8 workstreams done. All tasks checked. All QA gates passed. CEO reviewed.