SentryAgent.ai Developer 7593bfe1c1 chore: Phase 2 OpenSpec scoping — proposal, design, specs, tasks

8 workstreams scoped per OpenSpec standards:
1. HashiCorp Vault integration (secret management)
2. Python SDK (sentryagent-idp)
3. Go SDK (idp-sdk-go)
4. Java SDK (ai.sentryagent:idp-sdk)
5. OPA policy engine (dynamic ABAC, hot-reload Rego)
6. Web Dashboard UI (React 18 + TypeScript)
7. Prometheus + Grafana monitoring (7 metrics, pre-built dashboard)
8. Multi-region Terraform deployment (AWS + GCP)

Status: proposed — awaiting CEO dependency approvals (A0.1–A0.5)
before any implementation begins.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-03-28 14:53:09 +00:00

Phase 2: Production-Ready — Technical Design

Date: 2026-03-28
Author: Virtual Architect
Status: Draft, pending CEO approval of proposal

1. HashiCorp Vault Integration

Architecture

AgentIdP Server
  └── CredentialService
        └── VaultClient (new)
              └── HashiCorp Vault (sidecar or external)
                    └── KV Secrets Engine v2

Design Decisions

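ADR-001: Vault over AWS KMS/GCP Secret Manager
Vault is cloud-agnostic, self-hostable, and already standard in enterprise environments (note that it has been distributed under the Business Source License rather than an open-source license since 2023). Using Vault keeps Phase 2 independent of any single cloud provider.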

ADR-002: KV Secrets Engine v2
KV v2 provides versioned secrets and metadata. When a credential is rotated, the old version is retained in Vault history, enabling audit-grade secret lifecycle tracking.

ADR-003: AgentIdP stores Vault path, not secret
credentials.vault_path stores the Vault KV path (e.g. secret/agentidp/agents/{agentId}/credentials/{credentialId}). The secret itself is never written to PostgreSQL.

New environment variables

| Variable | Description |
|---|---|
| `VAULT_ADDR` | Vault server address |
| `VAULT_TOKEN` | Vault root/service token |
| `VAULT_MOUNT` | KV mount path (default: `secret`) |
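
As an illustration of ADR-003 and the variables above, the following sketch shows how a stored vault_path might map onto a KV v2 read request. The KV v2 HTTP API places `data/` between the mount and the secret path and authenticates with the `X-Vault-Token` header. The helper names (`buildVaultPath`, `buildReadRequest`) and the `VaultConfig` shape are hypothetical and not taken from the AgentIdP codebase.

```typescript
// Hypothetical config shape mirroring the three environment variables.
interface VaultConfig {
  addr: string;  // VAULT_ADDR
  token: string; // VAULT_TOKEN
  mount: string; // VAULT_MOUNT (default "secret")
}

interface VaultReadRequest {
  url: string;
  headers: Record<string, string>;
}

// Builds the logical path that would be stored in credentials.vault_path.
function buildVaultPath(mount: string, agentId: string, credentialId: string): string {
  return `${mount}/agentidp/agents/${agentId}/credentials/${credentialId}`;
}

// Turns a logical path into a KV v2 read request.
// KV v2 reads go to /v1/<mount>/data/<path>; the secret payload is nested
// under data.data in the JSON response.
function buildReadRequest(cfg: VaultConfig, vaultPath: string): VaultReadRequest {
  const prefix = `${cfg.mount}/`;
  if (!vaultPath.startsWith(prefix)) {
    throw new Error(`vault_path "${vaultPath}" is not under mount "${cfg.mount}"`);
  }
  const relative = vaultPath.slice(prefix.length);
  const base = cfg.addr.replace(/\/+$/, ""); // tolerate a trailing slash in VAULT_ADDR
  return {
    url: `${base}/v1/${cfg.mount}/data/${relative}`,
    headers: { "X-Vault-Token": cfg.token },
  };
}
```

The returned descriptor can be handed to whichever HTTP client the VaultClient ends up using; keeping URL construction pure makes the mount handling easy to unit test.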

Migration

Add a vault_path column to the credentials table (migration 005_add_vault_path.sql). Existing credentials retain their bcrypt hashes; new credentials use Vault. Both code paths coexist until all credentials have been rotated (a migration guide is provided).
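
The coexistence of the two code paths could be expressed as a small selector, sketched below. Only vault_path is named in this design; the `secret_hash` column name, the `CredentialRow` shape, and the injected `Verifiers` interface are assumptions made for illustration.

```typescript
// Hypothetical row shape; only vault_path comes from the design.
interface CredentialRow {
  id: string;
  secret_hash: string | null; // legacy bcrypt hash (column name assumed)
  vault_path: string | null;  // set for credentials created after migration 005
}

type VerificationPath = "vault" | "bcrypt" | "none";

// Prefers Vault when a path is present, falls back to the legacy hash.
function selectVerificationPath(row: CredentialRow): VerificationPath {
  if (row.vault_path !== null) return "vault";
  if (row.secret_hash !== null) return "bcrypt";
  return "none";
}

// Verifiers are injected so the sketch does not depend on any bcrypt or Vault library.
interface Verifiers {
  compareBcrypt(presented: string, hash: string): Promise<boolean>;
  compareVault(presented: string, vaultPath: string): Promise<boolean>;
}

async function verifyCredential(
  row: CredentialRow,
  presented: string,
  verifiers: Verifiers,
): Promise<boolean> {
  switch (selectVerificationPath(row)) {
    case "vault":
      return verifiers.compareVault(presented, row.vault_path as string);
    case "bcrypt":
      return verifiers.compareBcrypt(presented, row.secret_hash as string);
    default:
      return false; // neither path available: fail closed
  }
}
```

Once every credential has a vault_path, the "bcrypt" branch becomes dead code and can be removed together with the legacy column.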

2. Multi-Language SDKs

Shared contract (all SDKs implement identically)

AgentIdPClient(baseUrl, clientId, clientSecret, scopes?)
  .agents     → AgentRegistryClient   (5 methods)
  .credentials → CredentialClient     (4 methods)
  .tokens     → TokenClient           (2 methods)
  .audit      → AuditClient           (2 methods)
  .clearTokenCache()

TokenManager — auto-refresh 60s before expiry
AgentIdPError — code, message, httpStatus, details
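
To make the "auto-refresh 60s before expiry" rule concrete, here is a minimal TypeScript sketch of a TokenManager. It is not the SDK's actual implementation: the `Token` shape, the injected `fetchToken` function, and the injectable clock are assumptions chosen so the refresh rule can be tested in isolation.

```typescript
const REFRESH_SKEW_MS = 60_000; // refresh 60s before expiry, per the shared contract

// Hypothetical token shape for this sketch.
interface Token {
  accessToken: string;
  expiresAtMs: number; // absolute expiry, epoch milliseconds
}

// True when there is no token or it is within the refresh window.
function needsRefresh(token: Token | null, nowMs: number, skewMs: number = REFRESH_SKEW_MS): boolean {
  return token === null || nowMs >= token.expiresAtMs - skewMs;
}

class TokenManager {
  private cached: Token | null = null;
  private inFlight: Promise<Token> | null = null;
  private readonly fetchToken: () => Promise<Token>;
  private readonly now: () => number;

  constructor(fetchToken: () => Promise<Token>, now: () => number = () => Date.now()) {
    this.fetchToken = fetchToken;
    this.now = now;
  }

  async getToken(): Promise<string> {
    const current = this.cached;
    if (current !== null && !needsRefresh(current, this.now())) {
      return current.accessToken;
    }
    // Concurrent callers share a single token request.
    if (this.inFlight === null) {
      this.inFlight = this.fetchToken();
    }
    const pending = this.inFlight;
    try {
      const fresh = await pending;
      this.cached = fresh;
      return fresh.accessToken;
    } finally {
      if (this.inFlight === pending) this.inFlight = null;
    }
  }

  clearTokenCache(): void {
    this.cached = null;
  }
}
```

Refreshing lazily on `getToken()` rather than with a background timer avoids keeping idle processes alive; the other SDKs could apply the same rule with their own concurrency primitives.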

Python SDK (`sentryagent-idp`)

Python 3.9+ (httpx for async, requests for sync)
Both sync and async client variants
PyPI package: sentryagent-idp
Type hints throughout (mypy --strict clean)

Go SDK (`github.com/sentryagent/idp-sdk-go`)

Go 1.21+, standard library net/http
Context-aware methods (context.Context first arg)
Idiomatic Go error handling (error return, no panic)
Go module: github.com/sentryagent/idp-sdk-go

Java SDK (`ai.sentryagent:idp-sdk`)

Java 17+, Apache HttpClient 5
Synchronous and CompletableFuture async variants
Maven Central: ai.sentryagent:idp-sdk
Fully typed with generics

3. OPA Policy Engine

Architecture

HTTP Request
  → Auth Middleware (JWT verify) — unchanged
  → OPA Middleware (new) — evaluates policy
      → OPA Wasm (embedded, no network call)
          → Rego policy files (hot-reloadable)
  → Controller

Design Decisions

ADR-004: OPA Wasm over OPA sidecar
Embedding OPA as Wasm in the Node.js process eliminates a network hop and removes a runtime dependency. Policy files are loaded from the policies/ directory at startup and reloaded on SIGHUP.

ADR-005: Policy replaces, does not wrap, scope check
The existing static scope check in auth.ts is replaced by an OPA policy evaluation. This keeps the policy as the single source of truth for access control.

Policy structure (`policies/`)

policies/
  authz.rego          — main policy: allow/deny
  data/
    scopes.json       — scope → permission mapping
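
The hot-reload and default-deny behaviour implied by ADR-004 and ADR-005 might look like the sketch below. Every name here (`AuthzInput`, `PolicyEvaluator`, `PolicyHolder`, `scopeTableEvaluator`) is hypothetical, and the table-driven evaluator only stands in for a real Wasm-backed policy. Note that OPA's Wasm target evaluates a compiled bundle (for example one produced with `opa build -t wasm`), so a reload would probably involve a compile step before the swap.

```typescript
// Input document passed to the policy; field names are assumptions.
interface AuthzInput {
  method: string;
  path: string;
  scopes: string[];
}

// Minimal evaluator contract. A Wasm-backed implementation would wrap the compiled policy.
interface PolicyEvaluator {
  evaluate(input: AuthzInput): boolean;
}

class PolicyHolder {
  private current: PolicyEvaluator | null = null;

  // Swaps in a newly loaded policy; requests already being evaluated keep the old reference.
  swap(next: PolicyEvaluator): void {
    this.current = next;
  }

  // Default-deny: no policy loaded, or an evaluator error, rejects the request.
  isAllowed(input: AuthzInput): boolean {
    const policy = this.current;
    if (policy === null) return false;
    try {
      return policy.evaluate(input) === true;
    } catch {
      return false;
    }
  }
}

// Stand-in evaluator mirroring data/scopes.json: scope -> "METHOD /path-prefix" rules.
function scopeTableEvaluator(table: Record<string, string[]>): PolicyEvaluator {
  return {
    evaluate(input: AuthzInput): boolean {
      return input.scopes.some((scope) =>
        (table[scope] ?? []).some((rule) => {
          const sep = rule.indexOf(" ");
          const method = rule.slice(0, sep);
          const prefix = rule.slice(sep + 1);
          return method === input.method && input.path.startsWith(prefix);
        }),
      );
    },
  };
}
```

In the server, a SIGHUP handler would load the new bundle and call `swap()` only after the load succeeds, so a broken policy file never replaces a working one.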

4. Web Dashboard UI

Architecture

dashboard/            (new — separate from sdk/)
  src/
    components/       — reusable UI components
    pages/            — Agents, Credentials, Audit, Health
    hooks/            — useAgents, useCredentials, useAudit
    lib/
      client.ts       — wraps @sentryagent/idp-sdk
      auth.ts         — credential entry and storage

Tech Stack

React 18 + TypeScript strict
Vite 5 (build tool)
TanStack Query v5 (server state)
shadcn/ui components (Radix UI + Tailwind CSS)

Pages

| Page | Scope Required | Features |
|---|---|---|
| Agents | `agents:read` | List, search, view detail, suspend/reactivate |
| Credentials | `agents:read` | List credentials per agent, rotate, revoke |
| Audit Log | `audit:read` | Filter by agent/action/outcome/date, paginate |
| Health | None | Server uptime, Redis/PostgreSQL connectivity |

Authentication

The dashboard accepts clientId + clientSecret via a login form. The @sentryagent/idp-sdk TokenManager handles token acquisition and caching in sessionStorage. No backend session — all state is client-side.

5. Prometheus + Grafana Monitoring

Metrics exposed at `GET /metrics`

| Metric | Type | Description |
|---|---|---|
| `agentidp_tokens_issued_total` | Counter | Tokens issued, labelled by outcome |
| `agentidp_agents_registered_total` | Counter | Agent registrations |
| `agentidp_http_requests_total` | Counter | All requests, labelled by method/path/status |
| `agentidp_http_request_duration_seconds` | Histogram | Request latency |
| `agentidp_rate_limit_rejections_total` | Counter | 429 responses |
| `agentidp_db_query_duration_seconds` | Histogram | PostgreSQL query latency |
| `agentidp_redis_command_duration_seconds` | Histogram | Redis command latency |
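
To show what a scrape of one of these counters would return, the sketch below hand-rolls a labelled counter and renders it in the Prometheus text exposition format. The interaction map names prom-client for the real server, so this class (`LabelledCounter`) is purely illustrative and is not a proposal to replace that library.

```typescript
type Labels = Record<string, string>;

// Escapes backslash, double quote, and newline as the exposition format requires.
function escapeLabelValue(value: string): string {
  return value.replace(/\\/g, "\\\\").replace(/"/g, '\\"').replace(/\n/g, "\\n");
}

// Renders labels in sorted key order so the same label set always maps to one series.
function renderLabels(labels: Labels): string {
  const keys = Object.keys(labels).sort();
  if (keys.length === 0) return "";
  const pairs = keys.map((k) => `${k}="${escapeLabelValue(labels[k])}"`);
  return `{${pairs.join(",")}}`;
}

class LabelledCounter {
  private readonly series: Record<string, number> = {};
  private readonly name: string;
  private readonly help: string;

  constructor(name: string, help: string) {
    this.name = name;
    this.help = help;
  }

  // Counters only go up; negative increments are rejected.
  inc(labels: Labels = {}, by: number = 1): void {
    if (by < 0) throw new Error("counters cannot decrease");
    const key = renderLabels(labels);
    this.series[key] = (this.series[key] ?? 0) + by;
  }

  render(): string {
    const lines = [`# HELP ${this.name} ${this.help}`, `# TYPE ${this.name} counter`];
    for (const key of Object.keys(this.series)) {
      lines.push(`${this.name}${key} ${this.series[key]}`);
    }
    return lines.join("\n") + "\n";
  }
}
```

For `agentidp_http_requests_total`, labelling by route template (for example `/agents/:id`) rather than the raw request path keeps the number of series bounded.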

Grafana dashboard

Pre-built JSON dashboard shipped in monitoring/grafana/dashboards/agentidp.json. Auto-provisioned via monitoring/grafana/provisioning/.

Docker Compose extension

Add prometheus and grafana services to a docker-compose.monitoring.yml overlay — keeps the base docker-compose.yml clean for developers who don't need monitoring.

6. Multi-Region Deployment (Terraform)

Structure

terraform/
  modules/
    agentidp/         — reusable module: compute + networking
    rds/              — managed PostgreSQL
    redis/            — managed Redis
    lb/               — load balancer + TLS
  environments/
    aws/              — AWS-specific config (ECS + RDS + ElastiCache)
    gcp/              — GCP-specific config (Cloud Run + Cloud SQL + Memorystore)

Design Decisions

ADR-006: Two provider targets (AWS + GCP) in Phase 2
AWS and GCP cover the majority of developer deployments. An Azure module is deferred to Phase 3. Each environment is a thin wrapper over the shared agentidp module.

ADR-007: Terraform over Pulumi/CDK
Terraform is the most widely used IaC tool and is familiar to most DevOps teams. Its HCL syntax is also simpler to present in documentation.

Component Interaction Map (Phase 2)

                      ┌────────────────────┐
                      │   Web Dashboard    │
                      │  (React + Vite)    │
                      └────────┬───────────┘
                               │ HTTPS
              ┌────────────────▼────────────────┐
              │         AgentIdP Server         │
              │  Auth MW → OPA MW → Controllers │
              │  /metrics (prom-client)         │
              └──┬──────────┬──────────┬────────┘
                 │          │          │
           ┌─────▼──┐  ┌────▼───┐  ┌──▼───────┐
           │Postgres│  │ Redis  │  │  Vault   │
           └────────┘  └────────┘  └──────────┘
                 │
        ┌────────▼────────┐
        │   Prometheus    │
        └────────┬────────┘
                 │
        ┌────────▼────────┐
        │    Grafana      │
        └─────────────────┘
