sentryagent-idp/openspec/changes/phase-4-developer-growth/design.md

## Context

SentryAgent.ai has completed three phases of development: Phase 1 (MVP — core agent registry, OAuth 2.0, audit log), Phase 2 (Production-Ready — Vault, 4 SDKs, OPA, React dashboard, Prometheus, Terraform), and Phase 3 (Enterprise — multi-tenancy, W3C DIDs, OIDC, AGNTCY federation, webhooks, SOC 2). The product is technically complete and enterprise-grade.

Phase 4 is constrained by the existing layout: the codebase is a single Express + TypeScript monorepo (`src/`) with a co-located React dashboard (`dashboard/`). The new developer portal and CLI are independent packages that must not couple to the existing API codebase beyond HTTP calls to the public API.

Known technical debt: the `GET /audit/verify` rate limiter is process-local (`express-rate-limit` in-memory store), so its limits are not enforced consistently under horizontal scaling. This must be fixed before public launch.

## Goals / Non-Goals

**Goals:**
- Eliminate the in-memory rate limiter gap — all rate limiting is Redis-backed and horizontally safe
- Give developers a public portal to discover, learn, and onboard onto SentryAgent.ai
- Ship a CLI that lets developers manage agents from their terminal without writing code
- Create a public agent marketplace powered by existing agent registry + DID infrastructure
- Enable CI/CD-native agent identity via GitHub Actions OIDC federation
- Lay the monetization foundation — usage metering, Stripe billing, free/paid tier enforcement

**Non-Goals:**
- Multi-cloud or self-hosted billing (Stripe only)
- Full SaaS admin panel (beyond existing React dashboard additions)
- Mobile apps
- WebSocket-based real-time CLI tail (polling is acceptable for MVP)
- Marketplace payments or agent listings with pricing (discovery only, no transactions)

## Decisions

### ADR-1: ioredis replaces express-rate-limit in-memory store
**Decision:** Switch from `express-rate-limit` (default memory store) to a Redis-backed sliding window using `ioredis` + `rate-limiter-flexible`.
**Rationale:** The in-memory store is process-local: with multiple Express instances behind a load balancer, each process keeps its own rate limit window, so the effective limit grows with the number of instances and stops being enforceable. `ioredis` is already the project's preferred Redis client (promise-native, cluster-aware). `rate-limiter-flexible` is widely used and performs atomic increments in Redis; its core limiters count in fixed windows, so the sliding-window behavior named in the decision should be confirmed against the library's documentation or implemented on top of it.
**Alternatives considered:** `redis` (official client): less ergonomic, and cluster support was not available out of the box in older major versions. `express-rate-limit` with the `rate-limit-redis` store: an additional dependency on top of ioredis, and less control.
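
To make the horizontal-scaling argument concrete, here is a minimal sketch of a sliding-window limiter whose state lives in a shared store. `WindowStore`, `InMemorySharedStore`, and `SlidingWindowLimiter` are hypothetical names used only for illustration; the in-memory store merely stands in for Redis, and none of this reflects the actual `rate-limiter-flexible` API.

```typescript
// Hypothetical store contract: in production this role would be played by Redis,
// so that every Express process reads and writes the same window.
interface WindowStore {
  // Records a hit at `nowMs`, drops hits older than `windowMs`, and returns
  // the number of hits left in the window (including the new one).
  hit(key: string, nowMs: number, windowMs: number): number;
}

// Stand-in for Redis, used here only so the sketch runs without dependencies.
class InMemorySharedStore implements WindowStore {
  private readonly hits = new Map<string, number[]>();

  hit(key: string, nowMs: number, windowMs: number): number {
    const cutoff = nowMs - windowMs;
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
    recent.push(nowMs); // rejected requests still count against the window
    this.hits.set(key, recent);
    return recent.length;
  }
}

class SlidingWindowLimiter {
  private readonly store: WindowStore;
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(store: WindowStore, limit: number, windowMs: number) {
    this.store = store;
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true when the request is allowed.
  allow(key: string, nowMs: number): boolean {
    return this.store.hit(key, nowMs, this.windowMs) <= this.limit;
  }
}

// Two limiter instances stand in for two Express processes behind a load balancer.
const shared = new InMemorySharedStore();
const processA = new SlidingWindowLimiter(shared, 3, 60_000);
const processB = new SlidingWindowLimiter(shared, 3, 60_000);

const decisions = [
  processA.allow("tenant-1", 0),
  processB.allow("tenant-1", 1_000),
  processA.allow("tenant-1", 2_000),
  processB.allow("tenant-1", 3_000), // fourth hit inside the window: rejected
  processA.allow("tenant-1", 61_500), // hits at 0 and 1_000 have aged out
];
```

Because both instances share one store, the fourth request is rejected regardless of which process receives it. With a per-process store, each instance would have allowed three requests of its own.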

### ADR-2: Developer portal is a separate Next.js 14 app in `portal/`
**Decision:** The developer portal lives at `portal/` — a standalone Next.js 14 application — not inside the existing `dashboard/` React app.
**Rationale:** The portal is a public-facing marketing/onboarding site (unauthenticated), not an internal management dashboard (authenticated). Mixing public and authenticated surfaces in one bundle increases attack surface and deployment complexity. `portal/` can be deployed independently (Vercel, Cloudflare Pages) while the dashboard remains behind the API.
**Alternatives considered:** Single React app with public/private routing — increases bundle size and complicates auth guards. Embedding portal in existing Express static serving — prevents CDN-edge deployment.

### ADR-3: CLI is a standalone npm package in `cli/`
**Decision:** The `sentryagent` CLI lives at `cli/` with its own `package.json` and is published separately to npm as `sentryagent`.
**Rationale:** CLI users install globally (`npm i -g sentryagent`). Bundling into the API monorepo would force users to install all API dependencies. Separate package = minimal install surface + independent versioning + dedicated README on npm.
**Alternatives considered:** Monorepo workspace — possible but adds tooling complexity for a single-package CLI.

### ADR-4: Agent Marketplace is implemented as new routes in the existing Express API
**Decision:** Marketplace endpoints (`GET /marketplace/agents`, `GET /marketplace/agents/:id`) are added to the existing Express API, not a separate service.
**Rationale:** Marketplace data is derived from the existing `agents` table + DID infrastructure — it is a read-only projection of existing data with public access controls. No new persistence layer needed. Adding routes to Express is the simplest, lowest-risk approach.
**Alternatives considered:** Separate microservice — unnecessary complexity for read-only projections of existing data.
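
The "read-only projection" idea can be sketched as a pure mapping from stored agent rows to public listings. The `AgentRecord` fields below (including `isPublic` and `clientSecretHash`) are assumptions, since this document does not specify the `agents` table schema; the point is the explicit allow-list of public fields.

```typescript
// Hypothetical row shape; the real `agents` table columns are not given here.
interface AgentRecord {
  id: string;
  tenantId: string;
  name: string;
  description: string;
  did: string;
  isPublic: boolean; // assumed visibility flag
  clientSecretHash: string; // assumed sensitive column that must never leak
}

// Public shape returned by the marketplace endpoints in this sketch.
interface MarketplaceListing {
  id: string;
  name: string;
  description: string;
  did: string;
}

// Copies fields through an explicit allow-list, so new sensitive columns
// added to AgentRecord later are not exposed by accident.
function toListing(agent: AgentRecord): MarketplaceListing {
  return {
    id: agent.id,
    name: agent.name,
    description: agent.description,
    did: agent.did,
  };
}

// Would back GET /marketplace/agents.
function listMarketplaceAgents(agents: AgentRecord[]): MarketplaceListing[] {
  return agents.filter((a) => a.isPublic).map(toListing);
}

// Would back GET /marketplace/agents/:id; undefined maps to a 404.
function getMarketplaceAgent(
  agents: AgentRecord[],
  id: string,
): MarketplaceListing | undefined {
  const match = agents.find((a) => a.id === id && a.isPublic);
  return match ? toListing(match) : undefined;
}
```

A private agent is treated as not found rather than forbidden in this sketch, which avoids confirming that a given ID exists.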

### ADR-5: GitHub Actions use OIDC token exchange (not stored secrets)
**Decision:** `sentryagent/register-agent` and `sentryagent/issue-token` Actions use GitHub's OIDC provider to exchange a GitHub-issued JWT for a SentryAgent.ai agent token — no API keys stored in GitHub Secrets.
**Rationale:** Long-lived API keys stored in GitHub Secrets are a credential leak risk: they can be exposed through workflow logs or a compromised workflow, and they stay valid until rotated. OIDC token exchange is keyless: credentials are ephemeral and scoped to the workflow run. The existing OIDC Provider (Phase 3 WS3) already supports external OIDC federation.
**Alternatives considered:** API key in GitHub Secrets — simpler but credential leak risk. GitHub App installation tokens — more complex, not needed when OIDC already exists.
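
As an illustration of what a per-repo trust policy might check during the exchange, the sketch below evaluates already-decoded claims from a GitHub-issued token. It assumes signature verification against GitHub's published keys has happened earlier and is not shown. `TrustPolicy`, `evaluateTrust`, and the example audience value are hypothetical; the claim names (`iss`, `aud`, `repository`, `ref`, `exp`) follow GitHub's OIDC token format.

```typescript
// Subset of the claims GitHub places in its Actions OIDC token.
interface GitHubOidcClaims {
  iss: string;
  aud: string;
  repository: string; // e.g. "octo-org/octo-repo"
  ref: string; // e.g. "refs/heads/main"
  exp: number; // expiry, seconds since epoch
}

// Hypothetical policy a tenant would register for one repository.
interface TrustPolicy {
  audience: string;
  repository: string;
  allowedRefs: string[];
}

interface TrustDecision {
  ok: boolean;
  reason: string;
}

const GITHUB_OIDC_ISSUER = "https://token.actions.githubusercontent.com";

// Checks decoded claims only; the JWT signature must be verified beforehand.
function evaluateTrust(
  claims: GitHubOidcClaims,
  policy: TrustPolicy,
  nowSeconds: number,
): TrustDecision {
  if (claims.iss !== GITHUB_OIDC_ISSUER) {
    return { ok: false, reason: "unexpected issuer" };
  }
  if (claims.aud !== policy.audience) {
    return { ok: false, reason: "unexpected audience" };
  }
  if (claims.exp <= nowSeconds) {
    return { ok: false, reason: "token expired" };
  }
  if (claims.repository !== policy.repository) {
    return { ok: false, reason: "repository not trusted" };
  }
  if (policy.allowedRefs.indexOf(claims.ref) === -1) {
    return { ok: false, reason: "ref not trusted" };
  }
  return { ok: true, reason: "trusted" };
}
```

Only when the decision is `ok` would the server go on to mint a short-lived agent token for the workflow run.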

### ADR-6: Billing uses Stripe with webhook-driven state synchronization
**Decision:** Stripe Checkout + Stripe Webhooks drive subscription state. SentryAgent.ai does not poll Stripe — it receives webhook events (`customer.subscription.created`, `invoice.payment_succeeded`, `customer.subscription.deleted`) to update a `tenant_subscriptions` table.
**Rationale:** Polling Stripe for subscription status introduces latency and API rate limit risk. Webhook-driven state is the Stripe-recommended pattern. Tenant subscription state is stored locally to avoid Stripe API calls on every request.
**Alternatives considered:** Paddle — less developer familiarity, smaller ecosystem. Lemon Squeezy — less mature. Manual invoicing — not scalable.
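
The webhook-driven synchronization can be pictured as a small state transition applied to the local `tenant_subscriptions` row. The sketch below is an assumption-laden simplification: `BillingEvent` is a hypothetical, already-verified and flattened view of a Stripe event (not Stripe's actual payload shape), and the `TenantSubscription` columns are illustrative.

```typescript
type SubscriptionStatus = "none" | "active" | "canceled";

// Hypothetical columns of a `tenant_subscriptions` row.
interface TenantSubscription {
  tenantId: string;
  status: SubscriptionStatus;
  lastPaymentAt: number | null; // seconds since epoch
}

// Hypothetical flattened event, produced after signature verification and
// after the Stripe customer has been resolved to a tenant.
interface BillingEvent {
  id: string;
  type: string;
  tenantId: string;
  createdAt: number; // seconds since epoch
}

// Pure transition: returns the row that should be persisted for the tenant.
function applyBillingEvent(
  current: TenantSubscription,
  event: BillingEvent,
): TenantSubscription {
  switch (event.type) {
    case "customer.subscription.created":
      return { ...current, status: "active" };
    case "invoice.payment_succeeded":
      return { ...current, lastPaymentAt: event.createdAt };
    case "customer.subscription.deleted":
      return { ...current, status: "canceled" };
    default:
      // Unhandled event types leave local state untouched.
      return current;
  }
}
```

Keeping the transition pure makes it straightforward to unit test and to replay events. Deduplicating redelivered events by `id` is not handled here and would sit in the calling handler.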

### ADR-7: Usage metering uses in-request counters flushed to PostgreSQL
**Decision:** Per-request middleware increments in-memory counters per tenant per metric type (api_calls, token_issuances). A 60-second flush interval writes aggregated counts to a `usage_events` table in PostgreSQL. Free tier limits are checked at request time against a cached summary.
**Rationale:** Synchronous database writes on every API request would add latency and DB load. Async aggregation + periodic flush gives near-real-time metering with minimal overhead. Redis could buffer these, but PostgreSQL is sufficient for MVP flush intervals.
**Alternatives considered:** Stripe Metered Billing API (report per-unit usage to Stripe) — locked to Stripe, adds latency on usage reporting, complex to roll back. ClickHouse/TimescaleDB — overkill for MVP scale.
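
A minimal sketch of the counter-and-flush pattern described above, assuming a hypothetical `UsageMeter` class and `UsageRow` shape. The timer and the PostgreSQL write are deliberately left out; `drain()` returns the rows that a periodic flush would insert into `usage_events`.

```typescript
type Metric = "api_calls" | "token_issuances";

// Hypothetical aggregated row written to `usage_events` on each flush.
interface UsageRow {
  tenantId: string;
  metric: Metric;
  count: number;
}

class UsageMeter {
  // tenantId -> metric -> count accumulated since the last drain
  private counts = new Map<string, Map<Metric, number>>();

  // Called from per-request middleware; does no I/O.
  increment(tenantId: string, metric: Metric, by: number = 1): void {
    let perTenant = this.counts.get(tenantId);
    if (!perTenant) {
      perTenant = new Map<Metric, number>();
      this.counts.set(tenantId, perTenant);
    }
    perTenant.set(metric, (perTenant.get(metric) ?? 0) + by);
  }

  // Swaps in an empty buffer and returns the aggregated rows, so increments
  // that arrive while a flush is being written are not lost.
  drain(): UsageRow[] {
    const snapshot = this.counts;
    this.counts = new Map<string, Map<Metric, number>>();
    const rows: UsageRow[] = [];
    snapshot.forEach((perTenant, tenantId) => {
      perTenant.forEach((count, metric) => {
        rows.push({ tenantId, metric, count });
      });
    });
    return rows;
  }
}

// A flush loop would look roughly like this (not executed in the sketch;
// `writeUsageRows` is a hypothetical function that inserts into PostgreSQL):
// setInterval(() => writeUsageRows(meter.drain()), 60_000);
```

One trade-off worth noting: counts held in memory since the last flush are lost if the process exits before draining, so a shutdown hook that performs a final flush would be a reasonable addition.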

## Risks / Trade-offs

- **[Risk] Portal deployment is separate from API** → Mitigation: Document CORS configuration clearly. Portal calls the public API via `NEXT_PUBLIC_API_URL` env var. Deployments are independent.
- **[Risk] CLI polling for audit tail adds API load** → Mitigation: Polling interval defaults to 5s with exponential backoff. Document this limitation. Real-time tail via WebSockets is a Phase 5 enhancement.
- **[Risk] Stripe webhook signature verification must be enforced** → Mitigation: All webhook handlers verify the `stripe-signature` header using `stripe.webhooks.constructEvent()` before processing. Requests that fail verification are rejected.
- **[Risk] GitHub Actions OIDC requires trust policy configuration per repo** → Mitigation: Document trust policy setup in the Action README and provide a quickstart template at `.github/workflows/sentryagent-setup.yml`.
- **[Risk] Free tier limit checks add latency on every request** → Mitigation: Limit summaries are cached in Redis with a 60s TTL. A stale cache entry allows a brief over-limit grace window, which is acceptable for MVP.
- **[Risk] ioredis migration may break existing Redis usage** → Mitigation: Existing Redis usage (Bull queue, session) already uses `ioredis` under the hood (Bull requires it). Migration is additive — replace rate-limiter middleware only, no existing code removed.
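
For the CLI polling mitigation above, one possible reading of "5s with exponential backoff" is sketched here. It assumes backoff is driven by consecutive failed polls, and the doubling factor and 60s cap are illustrative values not specified in this document. `PollingOptions` and `nextDelayMs` are hypothetical names.

```typescript
interface PollingOptions {
  baseMs: number; // delay when the previous poll succeeded
  maxMs: number; // upper bound on the delay (assumed value)
  factor: number; // growth factor per consecutive failure (assumed value)
}

const DEFAULT_POLLING: PollingOptions = {
  baseMs: 5_000,
  maxMs: 60_000,
  factor: 2,
};

// Returns the wait before the next poll, given how many polls in a row failed.
function nextDelayMs(
  consecutiveFailures: number,
  opts: PollingOptions = DEFAULT_POLLING,
): number {
  const failures = Math.max(0, consecutiveFailures);
  const delay = opts.baseMs * Math.pow(opts.factor, failures);
  return Math.min(delay, opts.maxMs);
}

// Delays for 0..4 consecutive failures under the default options.
const delays = [0, 1, 2, 3, 4].map((n) => nextDelayMs(n));
```

Under these assumptions the delay resets to 5s after any successful poll and never exceeds the cap, which bounds the extra API load a misbehaving client can add.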

## Migration Plan

1. **WS1 first** (before any public traffic): deploy ioredis rate limiter, connection pool tuning, and detailed health endpoint. Run k6 load tests. Only proceed to WS2+ after load tests pass.
2. **WS2 + WS3 in parallel**: portal and CLI are independent. Portal deployed to CDN/Vercel. CLI published to npm.
3. **WS4**: Marketplace routes added to Express API behind feature flag (`MARKETPLACE_ENABLED=true`). Enable after WS1 hardening is confirmed stable.
4. **WS5**: GitHub Actions published to GitHub Actions Marketplace after OIDC trust policy documentation is complete.
5. **WS6 last**: Billing affects all tenants. Stripe webhooks registered in Stripe dashboard. `tenant_subscriptions` table migration applied. Free tier limits initially set generously; tightened after monitoring confirms limit logic is correct.

**Rollback strategy per workstream:**
- WS1: Rate limiter is middleware — revert to in-memory store by toggling env var (`REDIS_RATE_LIMIT_ENABLED=false`)
- WS2: Portal is separate deployment — roll back independently
- WS3: npm package, so deprecate the affected version with `npm deprecate`, or unpublish it if npm's unpublish policy still allows
- WS4: Feature flag `MARKETPLACE_ENABLED=false`
- WS5: GitHub Actions are versioned — pin to prior release tag
- WS6: Feature flag `BILLING_ENABLED=false` — disables enforcement, metering continues
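
The env-var rollback switches listed above could be read through one small helper, sketched below. The flag names come from this document; the `RolloutFlags` shape, the helper names, and the defaults used when a variable is unset are assumptions.

```typescript
interface RolloutFlags {
  redisRateLimit: boolean; // REDIS_RATE_LIMIT_ENABLED
  marketplace: boolean; // MARKETPLACE_ENABLED
  billing: boolean; // BILLING_ENABLED
}

type Env = Record<string, string | undefined>;

// Only the literal string "true" (case-insensitive, trimmed) enables a flag;
// an unset variable falls back to the supplied default.
function readFlag(env: Env, name: string, fallback: boolean): boolean {
  const raw = env[name];
  if (raw === undefined) {
    return fallback;
  }
  return raw.trim().toLowerCase() === "true";
}

function loadRolloutFlags(env: Env): RolloutFlags {
  return {
    // Assumed default: Redis-backed limiting stays on unless explicitly disabled.
    redisRateLimit: readFlag(env, "REDIS_RATE_LIMIT_ENABLED", true),
    // Assumed defaults: new surfaces stay off until explicitly enabled.
    marketplace: readFlag(env, "MARKETPLACE_ENABLED", false),
    billing: readFlag(env, "BILLING_ENABLED", false),
  };
}

// In the API process this would typically be called once at startup:
// const flags = loadRolloutFlags(process.env);
```

Treating anything other than `true` as disabled means a typo in a deployment manifest fails closed for the marketplace and billing flags.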

## Open Questions

- **Portal domain**: Will `portal/` be served from `sentryagent.ai` (marketing site) or `app.sentryagent.ai` (portal subdomain)? Affects CORS and Next.js `basePath` config. Recommend: `sentryagent.ai` for portal, `app.sentryagent.ai` for dashboard.
- **Free tier limits**: Are 10 agents and 1,000 API calls/day the final limits, or placeholders? If they are placeholders, billing enforcement should stay gated behind the `BILLING_ENABLED` flag until the limits are confirmed.
- **Marketplace moderation**: Will agent marketplace listings be auto-published on registration, or require manual approval? Recommend: auto-publish for MVP, flag-based moderation later.