Single-package agentidp SDK in sdk-go/:
- AgentIdPClient composing AgentRegistryClient, CredentialClient, TokenServiceClient, AuditClient; all 14 endpoints covered
- Goroutine-safe TokenManager (sync.Mutex) with 60s refresh buffer
- AgentIdPError implementing error interface with Code/HTTPStatus/Details
- Context-aware: all service methods take context.Context as first arg
- doRequest shared helper; token endpoints use form-encoded POST directly
- go vet: 0 warnings | staticcheck: 0 warnings
- go test ./...: 37/37 passed | coverage: 81.0% (>80% gate)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Phase 2: Production-Ready — Tasks
Status: In progress — Workstreams 1, 2, 3 complete.
CEO Approval Gates (required before implementation)
- A0.1 Approve dependency: `node-vault` (Vault integration)
- A0.2 Approve dependency: `@openpolicyagent/opa-wasm` (OPA policy engine)
- A0.3 Approve dependency: React 18 + Vite 5 (web dashboard)
- A0.4 Approve dependency: `prom-client` (Prometheus metrics)
- A0.5 Approve dependency: Terraform (infrastructure as code)
Workstream 1: HashiCorp Vault Integration
- 1.1 Write `src/vault/VaultClient.ts`: wraps `node-vault`; methods: writeSecret, readSecret, deleteSecret, verifySecret
- 1.2 Write `src/db/migrations/005_add_vault_path.sql`: add `vault_path` column to `credentials`
- 1.3 Update `CredentialService.ts`: new credentials use Vault; existing bcrypt credentials continue to work
- 1.4 Update `docs/devops/environment-variables.md`: add VAULT_ADDR, VAULT_TOKEN, VAULT_MOUNT
- 1.5 Write `docs/devops/vault-setup.md`: Vault dev server setup, production Vault config, migration guide
- 1.6 Write unit tests for VaultClient (mocked Vault) and updated CredentialService
- 1.7 QA sign-off: zero `any`, TypeScript strict, >80% coverage, coexistence verified
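The coexistence rule in task 1.3 could be pictured as a dispatch on whether a credential row has a `vault_path`. The sketch below is in Python purely for illustration (the service itself is TypeScript). `StoredCredential`, `secret_hash`, `verify_credential`, and the two verifier callables are hypothetical names, not the actual CredentialService API.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class StoredCredential:
    # Hypothetical row shape: vault_path is set only for credentials
    # created after the migration; older rows keep their bcrypt hash.
    credential_id: str
    secret_hash: Optional[str] = None
    vault_path: Optional[str] = None


def verify_credential(
    cred: StoredCredential,
    presented_secret: str,
    verify_in_vault: Callable[[str, str], bool],
    verify_bcrypt: Callable[[str, str], bool],
) -> bool:
    """Route verification to Vault or bcrypt based on where the secret lives."""
    if cred.vault_path is not None:
        # New-style credential: compare against the secret stored in Vault.
        return verify_in_vault(cred.vault_path, presented_secret)
    if cred.secret_hash is not None:
        # Legacy credential: compare against the stored bcrypt hash.
        return verify_bcrypt(presented_secret, cred.secret_hash)
    # Neither storage location is populated, so nothing can match.
    return False
```

Passing the verifiers in as callables keeps the routing logic testable with a mocked Vault, which is the kind of check the "coexistence verified" QA item calls for.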
Workstream 2: Python SDK
- 2.1 Create `sdk-python/` with `pyproject.toml`: name: sentryagent-idp, python>=3.9
- 2.2 Write `sdk-python/src/sentryagent_idp/types.py`: all request/response dataclasses
- 2.3 Write `sdk-python/src/sentryagent_idp/errors.py`: AgentIdPError exception
- 2.4 Write `sdk-python/src/sentryagent_idp/token_manager.py`: sync TokenManager
- 2.5 Write `sdk-python/src/sentryagent_idp/async_token_manager.py`: async TokenManager (httpx)
- 2.6 Write `sdk-python/src/sentryagent_idp/services/agents.py`: AgentRegistryClient (sync + async)
- 2.7 Write `sdk-python/src/sentryagent_idp/services/credentials.py`: CredentialClient (sync + async)
- 2.8 Write `sdk-python/src/sentryagent_idp/services/token.py`: TokenClient (sync + async)
- 2.9 Write `sdk-python/src/sentryagent_idp/services/audit.py`: AuditClient (sync + async)
- 2.10 Write `sdk-python/src/sentryagent_idp/client.py`: AgentIdPClient (sync) + AsyncAgentIdPClient
- 2.11 Write `sdk-python/src/sentryagent_idp/__init__.py`: barrel exports
- 2.12 Write `sdk-python/README.md`
- 2.13 QA: `mypy --strict` clean, all 14 endpoints, AgentIdPError on all failure paths, pytest >80%
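As a rough illustration of task 2.4, a thread-safe synchronous token cache might look like the sketch below. It assumes a 60-second refresh buffer, the figure this plan gives for the Go and Java SDKs; the Python task does not state one. The `fetch_token` callable, its `(access_token, expires_in)` return shape, and the `clock` parameter are hypothetical names introduced for the example, not the SDK's actual API.

```python
import threading
import time
from typing import Callable, Optional, Tuple

# Assumed value: the plan states a 60s buffer for the Go and Java SDKs only.
REFRESH_BUFFER_SECONDS = 60.0


class TokenManager:
    """Caches an access token and refreshes it shortly before expiry."""

    def __init__(
        self,
        fetch_token: Callable[[], Tuple[str, float]],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        # fetch_token is a hypothetical callable returning
        # (access_token, expires_in_seconds), e.g. from the token endpoint.
        self._fetch_token = fetch_token
        self._clock = clock
        self._lock = threading.Lock()
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get_token(self) -> str:
        # Hold the lock for the whole check-and-refresh so concurrent
        # callers never trigger more than one fetch for the same expiry.
        with self._lock:
            now = self._clock()
            if self._token is None or now >= self._expires_at - REFRESH_BUFFER_SECONDS:
                token, expires_in = self._fetch_token()
                self._token = token
                self._expires_at = now + expires_in
            return self._token
```

Injecting the clock is a design choice for testability: a unit test can advance time across the refresh boundary without sleeping. Holding the lock during the fetch serializes callers, which is simple but means a slow token endpoint blocks every thread waiting on `get_token`.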
Workstream 3: Go SDK
- 3.1 Create `sdk-go/` with `go.mod`: module: github.com/sentryagent/idp-sdk-go, go 1.21
- 3.2 Write `sdk-go/types.go`: all request/response structs
- 3.3 Write `sdk-go/errors.go`: AgentIdPError type implementing error interface
- 3.4 Write `sdk-go/token_manager.go`: mutex-guarded TokenManager
- 3.5 Write `sdk-go/agents.go`: AgentRegistryClient (flat package; see ADR below)
- 3.6 Write `sdk-go/credentials.go`: CredentialClient
- 3.7 Write `sdk-go/token_service.go`: TokenServiceClient
- 3.8 Write `sdk-go/audit.go`: AuditClient
- 3.9 Write `sdk-go/client.go`: AgentIdPClient
- 3.10 Write `sdk-go/README.md`
- 3.11 QA: `go vet` clean, `staticcheck` clean, all 14 endpoints, goroutine-safe, `go test ./...` >80%
Workstream 4: Java SDK
- 4.1 Create `sdk-java/` with `pom.xml`: groupId: ai.sentryagent, artifactId: idp-sdk, Java 17
- 4.2 Write all POJO request/response model classes
- 4.3 Write `AgentIdPException.java` extending RuntimeException
- 4.4 Write `TokenManager.java`: synchronized cache with 60s refresh buffer
- 4.5 Write `AgentRegistryClient.java`: sync + CompletableFuture methods
- 4.6 Write `CredentialClient.java`: sync + CompletableFuture methods
- 4.7 Write `TokenClient.java`: sync + CompletableFuture methods
- 4.8 Write `AuditClient.java`: sync + CompletableFuture methods
- 4.9 Write `AgentIdPClient.java`: composes all service clients
- 4.10 Write `sdk-java/README.md`
- 4.11 QA: `mvn verify` passes, all 14 endpoints, AgentIdPException on all failure paths, JUnit 5 >80%
Workstream 5: OPA Policy Engine
- 5.1 Write `policies/authz.rego`: allow/deny rules matching all current scope checks
- 5.2 Write `policies/data/scopes.json`: scope to endpoint permission mapping
- 5.3 Write `src/middleware/opa.ts`: OpaMiddleware: loads Wasm, evaluates input, returns allow/deny
- 5.4 Replace static scope check in `src/middleware/auth.ts` with OpaMiddleware
- 5.5 Add SIGHUP handler in `src/server.ts` to hot-reload policy files
- 5.6 Update `docs/devops/environment-variables.md`: add POLICY_DIR
- 5.7 QA: all existing auth tests pass unchanged, new OPA unit tests, hot-reload verified
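To make the intended allow/deny decision concrete, the sketch below evaluates a scope-to-endpoint mapping in plain Python. It is neither Rego nor the OPA Wasm API, only an illustration of the default-deny logic such a policy would likely express. The mapping shape, the scope names, and the paths are invented for the example and may not match `policies/data/scopes.json`.

```python
from typing import Dict, List

# Hypothetical stand-in for policies/data/scopes.json: each scope lists
# the "METHOD path" pairs it permits. Real scope names may differ.
SCOPES: Dict[str, List[str]] = {
    "agents:read": ["GET /agents", "GET /agents/:id"],
    "agents:write": ["POST /agents", "PATCH /agents/:id"],
    "audit:read": ["GET /audit"],
}


def matches(pattern: str, path: str) -> bool:
    """Match a path against a pattern where ':name' segments are wildcards."""
    p_parts = pattern.strip("/").split("/")
    a_parts = path.strip("/").split("/")
    if len(p_parts) != len(a_parts):
        return False
    return all(p.startswith(":") or p == a for p, a in zip(p_parts, a_parts))


def allow(token_scopes: List[str], method: str, path: str) -> bool:
    """Default deny: allow only if some granted scope permits the request."""
    for scope in token_scopes:
        for rule in SCOPES.get(scope, []):
            rule_method, rule_path = rule.split(" ", 1)
            if rule_method == method and matches(rule_path, path):
                return True
    return False
```

Keeping the mapping as data, separate from the decision function, mirrors the split between `scopes.json` and `authz.rego`: permissions can change on reload without touching the rule logic.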
Workstream 6: Web Dashboard UI
- 6.1 Create `dashboard/` with Vite 5 + React 18 + TypeScript strict configuration
- 6.2 Set up shadcn/ui with Tailwind CSS
- 6.3 Write `dashboard/src/lib/auth.ts`: credential entry, TokenManager, sessionStorage
- 6.4 Write `dashboard/src/lib/client.ts`: wraps @sentryagent/idp-sdk AgentIdPClient
- 6.5 Write Login page (`/dashboard/login`)
- 6.6 Write Agents page (`/dashboard/agents`): list, search, filter by status
- 6.7 Write Agent Detail page (`/dashboard/agents/:id`): suspend/reactivate with confirm dialog
- 6.8 Write Credentials page (`/dashboard/agents/:id/credentials`): rotate/revoke with confirm
- 6.9 Write Audit Log page (`/dashboard/audit`): filters, pagination
- 6.10 Write Health page (`/dashboard/health`): PostgreSQL + Redis connectivity status
- 6.11 Configure AgentIdP Express app to serve `dashboard/dist/` at `/dashboard`
- 6.12 Write `dashboard/README.md`
- 6.13 QA: TypeScript strict, zero `any`, OWASP Top 10 review, responsive layout verified
Workstream 7: Prometheus + Grafana Monitoring
- 7.1 Add `prom-client` to dependencies (after CEO approval A0.4)
- 7.2 Write `src/metrics/registry.ts`: shared Prometheus Registry with all 7 metric definitions
- 7.3 Instrument `OAuth2Service.ts`: increment `agentidp_tokens_issued_total`
- 7.4 Instrument `AgentService.ts`: increment `agentidp_agents_registered_total`
- 7.5 Instrument `src/middleware/`: HTTP request counter and duration histogram
- 7.6 Instrument `src/db/pool.ts`: DB query duration histogram
- 7.7 Instrument `src/cache/redis.ts`: Redis command duration histogram
- 7.8 Add `GET /metrics` route (unauthenticated, Prometheus text format)
- 7.9 Write `monitoring/prometheus/prometheus.yml`: scrape config
- 7.10 Write `monitoring/grafana/provisioning/`: datasource + dashboard provisioning
- 7.11 Write `monitoring/grafana/dashboards/agentidp.json`: pre-built Grafana dashboard
- 7.12 Write `docker-compose.monitoring.yml` overlay
- 7.13 Update `docs/devops/operations.md`: monitoring section
- 7.14 QA: all 7 metrics verified under load, Grafana auto-provisions, no auth leak on /metrics
Workstream 8: Multi-Region Deployment (Terraform)
- 8.1 Write `terraform/modules/agentidp/main.tf` + `variables.tf` + `outputs.tf`
- 8.2 Write `terraform/modules/rds/`: managed PostgreSQL module
- 8.3 Write `terraform/modules/redis/`: managed Redis module
- 8.4 Write `terraform/modules/lb/`: load balancer + TLS module
- 8.5 Write `terraform/environments/aws/main.tf` + `variables.tf` + `terraform.tfvars.example`
- 8.6 Write `terraform/environments/gcp/main.tf` + `variables.tf` + `terraform.tfvars.example`
- 8.7 Write `docs/devops/deployment.md`: end-to-end AWS and GCP deployment walkthrough
- 8.8 QA: `terraform validate` passes, secrets not hardcoded, TLS enforced, DB/Redis VPC-internal
Phase 2 Complete Criteria
All 8 workstreams done. All tasks checked. All QA gates passed. CEO reviewed.