chore(openspec): archive phase-4-developer-growth change

All 90 tasks complete. Phase 4 — Developer Growth & Go-to-Market
fully delivered and archived per OpenSpec protocol.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
SentryAgent.ai Developer
2026-04-02 15:17:18 +00:00
parent af630b43d4
commit 831e91c467
12 changed files with 0 additions and 0 deletions

@@ -0,0 +1,2 @@
schema: spec-driven
created: 2026-04-02

@@ -0,0 +1,92 @@
## Context
SentryAgent.ai has completed three phases of development: Phase 1 (MVP — core agent registry, OAuth 2.0, audit log), Phase 2 (Production-Ready — Vault, 4 SDKs, OPA, React dashboard, Prometheus, Terraform), and Phase 3 (Enterprise — multi-tenancy, W3C DIDs, OIDC, AGNTCY federation, webhooks, SOC 2). The product is technically complete and enterprise-grade.
Phase 4's constraint is that the codebase is a single Express + TypeScript monorepo (`src/`) with a co-located React dashboard (`dashboard/`). The new developer portal and CLI are independent packages that must not couple into the existing API codebase beyond HTTP calls to the public API.
Known technical debt to resolve before launch: the `GET /audit/verify` rate limiter is process-local (`express-rate-limit` in-memory store), which breaks under horizontal scaling. This must be fixed before public launch.
## Goals / Non-Goals
**Goals:**
- Eliminate the in-memory rate limiter gap — all rate limiting is Redis-backed and horizontally safe
- Give developers a public portal to discover, learn, and onboard onto SentryAgent.ai
- Ship a CLI that lets developers manage agents from their terminal without writing code
- Create a public agent marketplace powered by existing agent registry + DID infrastructure
- Enable CI/CD-native agent identity via GitHub Actions OIDC federation
- Lay the monetization foundation — usage metering, Stripe billing, free/paid tier enforcement
**Non-Goals:**
- Multi-cloud or self-hosted billing (Stripe only)
- Full SaaS admin panel (beyond existing React dashboard additions)
- Mobile apps
- WebSocket-based real-time CLI tail (polling is acceptable for MVP)
- Marketplace payments or agent listings with pricing (discovery only, no transactions)
## Decisions
### ADR-1: ioredis replaces express-rate-limit in-memory store
**Decision:** Switch from `express-rate-limit` (default memory store) to a Redis-backed sliding window using `ioredis` + `rate-limiter-flexible`.
**Rationale:** The in-memory store is process-local — horizontal scaling (multiple Express instances behind a load balancer) produces independent rate limit windows per process, making limits meaningless. `ioredis` is already the preferred Redis client (faster, promises-native, cluster-aware). `rate-limiter-flexible` is battle-tested and supports sliding window, fixed window, and token bucket algorithms in Redis.
**Alternatives considered:** `redis` (official client) — less ergonomic, no cluster support out of box. `express-rate-limit` with `rate-limit-redis` store — additional dependency on top of ioredis, less control.
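A minimal sketch of why the store has to be shared, assuming a sliding-window log algorithm. `HitStore`, `InMemoryHitStore`, and `allowRequest` are hypothetical names, and the Map-based store only stands in for Redis so the example runs without a server. It is not the `rate-limiter-flexible` API.

```typescript
// Hypothetical store contract. In production this would be backed by Redis;
// the Map-based class below only stands in so the sketch runs without a server.
interface HitStore {
  get(key: string): number[];
  set(key: string, hits: number[]): void;
}

class InMemoryHitStore implements HitStore {
  private hits = new Map<string, number[]>();
  get(key: string): number[] {
    return this.hits.get(key) ?? [];
  }
  set(key: string, hits: number[]): void {
    this.hits.set(key, hits);
  }
}

// Sliding-window log: drop hits older than the window, then allow the
// request only if fewer than `limit` hits remain.
function allowRequest(
  store: HitStore,
  key: string,
  limit: number,
  windowMs: number,
  nowMs: number,
): boolean {
  const recent = store.get(key).filter((t) => t > nowMs - windowMs);
  const allowed = recent.length < limit;
  if (allowed) recent.push(nowMs);
  store.set(key, recent);
  return allowed;
}

// Two Express instances that share one store enforce a single limit of 3 per minute.
const sharedStore = new InMemoryHitStore();
const instanceA = (now: number) => allowRequest(sharedStore, "client-1", 3, 60000, now);
const instanceB = (now: number) => allowRequest(sharedStore, "client-1", 3, 60000, now);
const decisions = [instanceA(0), instanceB(1000), instanceA(2000), instanceB(3000)];
console.log(decisions); // the fourth request exceeds the shared limit
```

The read-modify-write in `allowRequest` is not atomic. Against a real Redis instance the same steps would likely need to run as one atomic operation, which is one reason to lean on an existing library rather than hand-rolling the limiter.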
### ADR-2: Developer portal is a separate Next.js 14 app in `portal/`
**Decision:** The developer portal lives at `portal/` — a standalone Next.js 14 application — not inside the existing `dashboard/` React app.
**Rationale:** The portal is a public-facing marketing/onboarding site (unauthenticated), not an internal management dashboard (authenticated). Mixing public and authenticated surfaces in one bundle increases attack surface and deployment complexity. `portal/` can be deployed independently (Vercel, Cloudflare Pages) while the dashboard remains behind the API.
**Alternatives considered:** Single React app with public/private routing — increases bundle size and complicates auth guards. Embedding portal in existing Express static serving — prevents CDN-edge deployment.
### ADR-3: CLI is a standalone npm package in `cli/`
**Decision:** The `sentryagent` CLI lives at `cli/` with its own `package.json` and is published separately to npm as `sentryagent`.
**Rationale:** CLI users install globally (`npm i -g sentryagent`). Bundling into the API monorepo would force users to install all API dependencies. Separate package = minimal install surface + independent versioning + dedicated README on npm.
**Alternatives considered:** Monorepo workspace — possible but adds tooling complexity for a single-package CLI.
### ADR-4: Agent Marketplace is implemented as new routes in the existing Express API
**Decision:** Marketplace endpoints (`GET /marketplace/agents`, `GET /marketplace/agents/:id`) are added to the existing Express API, not a separate service.
**Rationale:** Marketplace data is derived from the existing `agents` table + DID infrastructure — it is a read-only projection of existing data with public access controls. No new persistence layer needed. Adding routes to Express is the simplest, lowest-risk approach.
**Alternatives considered:** Separate microservice — unnecessary complexity for read-only projections of existing data.
### ADR-5: GitHub Actions use OIDC token exchange (not stored secrets)
**Decision:** `sentryagent/register-agent` and `sentryagent/issue-token` Actions use GitHub's OIDC provider to exchange a GitHub-issued JWT for a SentryAgent.ai agent token — no API keys stored in GitHub Secrets.
**Rationale:** Storing long-lived API keys in GitHub Secrets creates a credential leak risk: a key can be printed to logs or exfiltrated by a compromised workflow step, and it stays valid until someone rotates it. OIDC token exchange is keyless: credentials are ephemeral and scoped to the workflow run. The existing OIDC Provider (Phase 3 WS3) already supports external OIDC federation.
**Alternatives considered:** API key in GitHub Secrets — simpler but credential leak risk. GitHub App installation tokens — more complex, not needed when OIDC already exists.
### ADR-6: Billing uses Stripe with webhook-driven state synchronization
**Decision:** Stripe Checkout + Stripe Webhooks drive subscription state. SentryAgent.ai does not poll Stripe — it receives webhook events (`customer.subscription.created`, `invoice.payment_succeeded`, `customer.subscription.deleted`) to update a `tenant_subscriptions` table.
**Rationale:** Polling Stripe for subscription status introduces latency and API rate limit risk. Webhook-driven state is the Stripe-recommended pattern. Tenant subscription state is stored locally to avoid Stripe API calls on every request.
**Alternatives considered:** Paddle — less developer familiarity, smaller ecosystem. Lemon Squeezy — less mature. Manual invoicing — not scalable.
### ADR-7: Usage metering uses in-memory counters flushed to PostgreSQL
**Decision:** Per-request middleware increments in-memory counters per tenant per metric type (api_calls, token_issuances). A 60-second flush interval writes aggregated counts to a `usage_events` table in PostgreSQL. Free tier limits are checked at request time against a cached summary.
**Rationale:** Synchronous database writes on every API request would add latency and DB load. Async aggregation + periodic flush gives near-real-time metering with minimal overhead. Redis could buffer these, but PostgreSQL is sufficient for MVP flush intervals.
**Alternatives considered:** Stripe Metered Billing API (report per-unit usage to Stripe) — locked to Stripe, adds latency on usage reporting, complex to roll back. ClickHouse/TimescaleDB — overkill for MVP scale.
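The aggregate-then-flush pattern in ADR-7 could look roughly like the sketch below. `UsageBuffer`, `startFlushLoop`, and `writeRows` are hypothetical names; the actual `usage_events` write is left to the caller.

```typescript
type UsageMetric = "api_calls" | "token_issuances";

interface UsageRow {
  tenantId: string;
  metric: UsageMetric;
  count: number;
}

class UsageBuffer {
  private rows = new Map<string, UsageRow>();

  // Called from per-request middleware; no I/O happens here.
  increment(tenantId: string, metric: UsageMetric, by: number = 1): void {
    const key = `${tenantId}|${metric}`;
    const row = this.rows.get(key);
    if (row) {
      row.count += by;
    } else {
      this.rows.set(key, { tenantId, metric, count: by });
    }
  }

  // Returns the aggregated rows and resets all counters to zero.
  drain(): UsageRow[] {
    const drained: UsageRow[] = [];
    this.rows.forEach((row) => drained.push(row));
    this.rows.clear();
    return drained;
  }
}

// Hypothetical wiring: `writeRows` would insert into the `usage_events` table.
function startFlushLoop(
  buffer: UsageBuffer,
  writeRows: (rows: UsageRow[]) => void,
  intervalMs: number = 60000,
): ReturnType<typeof setInterval> {
  return setInterval(() => {
    const rows = buffer.drain();
    if (rows.length > 0) writeRows(rows);
  }, intervalMs);
}

const usageBuffer = new UsageBuffer();
usageBuffer.increment("tenant-a", "api_calls");
usageBuffer.increment("tenant-a", "api_calls");
usageBuffer.increment("tenant-a", "token_issuances");
const flushedRows = usageBuffer.drain();
console.log(flushedRows);
```

Two consequences of this design are worth keeping in mind: counts accumulated since the last flush are lost if the process exits, and each Express instance flushes only its own partial counts, so the rows it writes need to be summed rather than treated as totals.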
## Risks / Trade-offs
- **[Risk] Portal deployment is separate from API** → Mitigation: Document CORS configuration clearly. Portal calls the public API via `NEXT_PUBLIC_API_URL` env var. Deployments are independent.
- **[Risk] CLI polling for audit tail adds API load** → Mitigation: Polling interval defaults to 5s with exponential backoff. Document this limitation. Real-time tail via WebSockets is a Phase 5 enhancement.
- **[Risk] Stripe webhook signature verification must be enforced** → Mitigation: All webhook handlers verify `stripe-signature` header using `stripe.webhooks.constructEvent()` before processing. Reject without verification.
- **[Risk] GitHub Actions OIDC requires trust policy configuration per repo** → Mitigation: Document trust policy setup clearly in Action README. Provide a quickstart template for `/.github/workflows/sentryagent-setup.yml`.
- **[Risk] Free tier limit checks add latency on every request** → Mitigation: Limit summaries are cached in Redis with a 60s TTL. Stale cache means brief over-limit grace window — acceptable for MVP.
- **[Risk] ioredis migration may break existing Redis usage** → Mitigation: Existing Redis usage (Bull queue, session) already uses `ioredis` under the hood (Bull requires it). Migration is additive — replace rate-limiter middleware only, no existing code removed.
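For the CLI polling mitigation listed above, the backoff could be computed as in this sketch. The 5-second base comes from the text; the doubling factor, the 60-second cap, and the name `nextPollDelayMs` are assumptions.

```typescript
// Delay before the next poll: the base interval doubles after each
// consecutive failure and is capped at maxMs. A successful poll resets
// the failure count to 0, which restores the base interval.
function nextPollDelayMs(
  consecutiveFailures: number,
  baseMs: number = 5000,
  maxMs: number = 60000,
): number {
  const failures = Math.max(0, Math.floor(consecutiveFailures));
  return Math.min(maxMs, baseMs * Math.pow(2, failures));
}

const pollDelays = [0, 1, 2, 3, 4].map((n) => nextPollDelayMs(n));
console.log(pollDelays);
```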
## Migration Plan
1. **WS1 first** (before any public traffic): deploy ioredis rate limiter, connection pool tuning, and detailed health endpoint. Run k6 load tests. Only proceed to WS2+ after load tests pass.
2. **WS2 + WS3 in parallel**: portal and CLI are independent. Portal deployed to CDN/Vercel. CLI published to npm.
3. **WS4**: Marketplace routes added to Express API behind feature flag (`MARKETPLACE_ENABLED=true`). Enable after WS1 hardening is confirmed stable.
4. **WS5**: GitHub Actions published to GitHub Actions Marketplace after OIDC trust policy documentation is complete.
5. **WS6 last**: Billing affects all tenants. Stripe webhooks registered in Stripe dashboard. `tenant_subscriptions` table migration applied. Free tier limits initially set generously; tightened after monitoring confirms limit logic is correct.
**Rollback strategy per workstream:**
- WS1: Rate limiter is middleware — revert to in-memory store by toggling env var (`REDIS_RATE_LIMIT_ENABLED=false`)
- WS2: Portal is separate deployment — roll back independently
- WS3: npm package — unpublish or yank specific version
- WS4: Feature flag `MARKETPLACE_ENABLED=false`
- WS5: GitHub Actions are versioned — pin to prior release tag
- WS6: Feature flag `BILLING_ENABLED=false` — disables enforcement, metering continues
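The three environment toggles named in the rollback list could be read in one place, as in this sketch. `FeatureFlags` and `readFeatureFlags` are hypothetical names, and the defaults (rate limiter on unless set to `false`, marketplace and billing off unless set to `true`) are assumptions rather than something the plan specifies.

```typescript
interface FeatureFlags {
  redisRateLimit: boolean;
  marketplace: boolean;
  billing: boolean;
}

// Takes the environment as a plain object so it can be unit tested;
// in the API process the caller would pass process.env.
function readFeatureFlags(env: Record<string, string | undefined>): FeatureFlags {
  return {
    // Assumed default: on, so only an explicit "false" triggers the WS1 rollback.
    redisRateLimit: env["REDIS_RATE_LIMIT_ENABLED"] !== "false",
    // Assumed default: off until explicitly enabled.
    marketplace: env["MARKETPLACE_ENABLED"] === "true",
    billing: env["BILLING_ENABLED"] === "true",
  };
}

const rollbackFlags = readFeatureFlags({
  REDIS_RATE_LIMIT_ENABLED: "false",
  MARKETPLACE_ENABLED: "true",
});
console.log(rollbackFlags);
```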
## Open Questions
- **Portal domain**: Will `portal/` be served from `sentryagent.ai` (marketing site) or `app.sentryagent.ai` (portal subdomain)? Affects CORS and Next.js `basePath` config. Recommend: `sentryagent.ai` for portal, `app.sentryagent.ai` for dashboard.
- **Free tier limits**: Are 10 agents and 1,000 API calls/day the final limits, or placeholders? If placeholder, billing enforcement should be gated behind `BILLING_ENABLED` flag until limits are confirmed.
- **Marketplace moderation**: Will agent marketplace listings be auto-published on registration, or require manual approval? Recommend: auto-publish for MVP, flag-based moderation later.

@@ -0,0 +1,50 @@
## Why
Phases 1–3 delivered a complete, enterprise-grade AgentIdP: authenticated, federated, multi-tenanted, and SOC 2 prepared. The product now needs to reach developers: Phase 4 shifts from building infrastructure to growing the ecosystem by making SentryAgent.ai frictionless to discover, adopt, and operate at scale in production.
## What Changes
- **Production Hardening**: Replace in-memory rate limiter with Redis-backed distributed limiter; tune database connection pooling; add detailed health endpoint; introduce k6 load test suite
- **Public Developer Portal**: Next.js 14 public website with interactive API explorer (Swagger UI), guided agent registration wizard, free tier docs, and SDK links
- **CLI Tool** (`sentryagent`): npm-installable CLI for register-agent, list-agents, issue-token, rotate-credentials, and tail-audit-log with `~/.sentryagent/config.json` and shell completion
- **Agent Marketplace**: Searchable public registry of AGNTCY-compliant agents with DID documents, capabilities, and publisher profiles — powered by existing agent registry and DID infrastructure
- **GitHub Actions Integration**: `sentryagent/register-agent` and `sentryagent/issue-token` Actions using OIDC federation with GitHub's OIDC provider — published to the GitHub Actions Marketplace
- **Billing & Usage Metering**: Stripe integration for paid tier; per-tenant usage tracking (API calls, active agents, token issuances); free tier limits enforced; usage dashboard in existing web dashboard
## Capabilities
### New Capabilities
- `production-hardening`: Redis-backed rate limiting, connection pooling, detailed health endpoint, and k6 load test suite
- `developer-portal`: Next.js 14 public website with Swagger UI API explorer, onboarding wizard, and SDK links
- `cli-tool`: `sentryagent` npm CLI with full agent lifecycle commands and shell completion
- `agent-marketplace`: Searchable public registry of published AGNTCY-compliant agents with DID documents
- `github-actions`: `register-agent` and `issue-token` GitHub Actions using OIDC federation
- `billing-metering`: Stripe-backed paid tier, per-tenant usage tracking, free tier enforcement, and usage dashboard
### Modified Capabilities
- `web-dashboard`: Usage metering panel added to existing dashboard (new billing/usage tab)
- `monitoring`: New Prometheus metrics for rate limiter hits, connection pool saturation, and per-tenant API call counters
## Impact
**Code affected:**
- `src/middleware/rateLimiter.ts` — replace express-rate-limit (in-memory) with ioredis-backed limiter
- `src/infrastructure/database.ts` — pg connection pool tuning
- `src/routes/health.ts` — add `/health/detailed` endpoint
- `src/services/UsageService.ts` — new service for per-tenant metering
- `src/controllers/BillingController.ts` — new controller for Stripe webhooks and subscription management
- `portal/` — new Next.js 14 application (separate directory)
- `cli/` — new CLI package (separate directory)
- `marketplace/` — new marketplace routes added to existing Express API
- `.github/actions/` — two new GitHub Actions
**New dependencies (CEO approved):**
- `ioredis` — Redis-backed rate limiting (WS1)
- `next` + `tailwindcss` — Developer portal (WS2)
- `swagger-ui-react` — Interactive API explorer (WS2)
- `commander` + `chalk` — CLI framework (WS3)
- `stripe` — Billing (WS6)
**Delivery sequence:** WS1 → WS2 + WS3 (parallel) → WS4 → WS5 → WS6

@@ -0,0 +1,45 @@
## ADDED Requirements
### Requirement: Marketplace listing endpoint returns public agent registry
The system SHALL expose `GET /marketplace/agents` returning a paginated list of publicly visible agents. Each listing SHALL include: `agentId`, `name`, `description`, `capabilities` (array of strings), `publisherName`, `did`, `createdAt`. The endpoint SHALL be unauthenticated (public access). Agents are included in the marketplace when their `isPublic` flag is `true`.
#### Scenario: Unauthenticated user lists marketplace agents
- **WHEN** an unauthenticated client calls `GET /marketplace/agents`
- **THEN** the response is HTTP 200 with a paginated list of public agents
#### Scenario: Pagination works correctly
- **WHEN** a client calls `GET /marketplace/agents?page=2&limit=20`
- **THEN** the response returns the correct page of results with `totalCount`, `page`, and `totalPages` in the response envelope
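A sketch of the pagination envelope described in the scenario above. `totalCount`, `page`, and `totalPages` come from the requirement; the `items` field name, the clamping of `limit` to 1..100, and the function name `paginate` are assumptions.

```typescript
interface PageEnvelope<T> {
  items: T[]; // field name is an assumption; the spec names only the three fields below
  totalCount: number;
  page: number;
  totalPages: number;
}

function paginate<T>(all: T[], page: number, limit: number): PageEnvelope<T> {
  // Fall back to 1 for missing, zero, or non-numeric values.
  const safeLimit = Math.min(100, Math.max(1, Math.floor(limit) || 1));
  const safePage = Math.max(1, Math.floor(page) || 1);
  const start = (safePage - 1) * safeLimit;
  return {
    items: all.slice(start, start + safeLimit),
    totalCount: all.length,
    page: safePage,
    totalPages: Math.ceil(all.length / safeLimit),
  };
}

const agentNames: string[] = [];
for (let i = 1; i <= 45; i++) agentNames.push(`agent-${i}`);

// Equivalent of GET /marketplace/agents?page=2&limit=20 over 45 public agents.
const secondPage = paginate(agentNames, 2, 20);
console.log(secondPage.items[0], secondPage.items.length, secondPage.totalPages);
```

In the real endpoint the slice would presumably be pushed down into the database query rather than applied to an in-memory array.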
### Requirement: Marketplace search filters agents by capability, publisher, or name
The system SHALL support `GET /marketplace/agents?q=<search>` performing a case-insensitive search across agent name, description, and capabilities. The system SHALL also support `GET /marketplace/agents?capability=<cap>` and `GET /marketplace/agents?publisher=<name>` for structured filtering.
#### Scenario: Full-text search returns relevant agents
- **WHEN** a client calls `GET /marketplace/agents?q=translation`
- **THEN** agents whose name, description, or capabilities contain "translation" are returned
#### Scenario: Capability filter returns matching agents
- **WHEN** a client calls `GET /marketplace/agents?capability=nlp`
- **THEN** only agents with "nlp" in their capabilities array are returned
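The filtering rules above can be expressed as a single predicate. This is an in-memory sketch: `MarketplaceListing`, `MarketplaceQuery`, and `filterListings` are hypothetical names, and treating `capability` and `publisher` as exact matches is an assumption.

```typescript
interface MarketplaceListing {
  name: string;
  description: string;
  capabilities: string[];
  publisherName: string;
  isPublic: boolean;
}

interface MarketplaceQuery {
  q?: string;
  capability?: string;
  publisher?: string;
}

function filterListings(
  listings: MarketplaceListing[],
  query: MarketplaceQuery,
): MarketplaceListing[] {
  const needle = query.q ? query.q.toLowerCase() : undefined;
  return listings.filter((agent) => {
    // Private agents never appear in the marketplace.
    if (!agent.isPublic) return false;
    if (needle !== undefined) {
      // Case-insensitive substring match across name, description, and capabilities.
      const haystack = [agent.name, agent.description, ...agent.capabilities]
        .join("\n")
        .toLowerCase();
      if (haystack.indexOf(needle) === -1) return false;
    }
    if (query.capability !== undefined && agent.capabilities.indexOf(query.capability) === -1) {
      return false;
    }
    if (query.publisher !== undefined && agent.publisherName !== query.publisher) {
      return false;
    }
    return true;
  });
}

const sampleListings: MarketplaceListing[] = [
  { name: "Polyglot", description: "Translation between languages", capabilities: ["translation", "nlp"], publisherName: "Acme", isPublic: true },
  { name: "Brief", description: "Summarizes documents", capabilities: ["nlp"], publisherName: "Globex", isPublic: true },
  { name: "Secret Translator", description: "Internal only", capabilities: ["translation"], publisherName: "Acme", isPublic: false },
];

console.log(filterListings(sampleListings, { q: "TRANSLATION" }).map((a) => a.name));
console.log(filterListings(sampleListings, { capability: "nlp" }).map((a) => a.name));
```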
### Requirement: Marketplace detail endpoint returns agent with DID document
The system SHALL expose `GET /marketplace/agents/:agentId` returning the full public agent profile including the W3C DID document and AGNTCY agent card. The endpoint SHALL be unauthenticated.
#### Scenario: Agent detail includes DID document
- **WHEN** a client calls `GET /marketplace/agents/:agentId` for a public agent
- **THEN** the response includes `agentId`, `name`, `description`, `capabilities`, `did`, `didDocument`, `agentCard`, and `publisherName`
#### Scenario: Private agent returns 404 on marketplace
- **WHEN** a client calls `GET /marketplace/agents/:agentId` for an agent with `isPublic: false`
- **THEN** the response is HTTP 404
### Requirement: Agents can be published to or withdrawn from the marketplace
The system SHALL allow authenticated tenant users to set `isPublic: true` on an agent via `PATCH /agents/:agentId` (`{ "isPublic": true }`), making it appear in the marketplace. Setting `isPublic: false` removes it from marketplace listings without deleting the agent.
#### Scenario: Agent published to marketplace
- **WHEN** an authenticated user calls `PATCH /agents/:agentId` with `{ "isPublic": true }`
- **THEN** the agent appears in `GET /marketplace/agents` results
#### Scenario: Agent withdrawn from marketplace
- **WHEN** an authenticated user calls `PATCH /agents/:agentId` with `{ "isPublic": false }`
- **THEN** the agent no longer appears in `GET /marketplace/agents` results

@@ -0,0 +1,60 @@
## ADDED Requirements
### Requirement: Per-tenant usage is tracked for API calls, active agents, and token issuances
The system SHALL track the following usage metrics per tenant per day: `api_calls` (every authenticated API request), `token_issuances` (every successful `POST /oauth2/token`), `active_agents` (count of non-revoked agents at end of day). Usage SHALL be aggregated in memory and flushed to a `usage_events` PostgreSQL table every 60 seconds.
#### Scenario: API call increments usage counter
- **WHEN** an authenticated tenant makes any API request
- **THEN** the tenant's `api_calls` counter for the current day is incremented
#### Scenario: Usage is persisted to database on flush interval
- **WHEN** 60 seconds elapse since the last flush
- **THEN** all in-memory counters are written to the `usage_events` table and reset to zero
### Requirement: Free tier limits are enforced per tenant
The system SHALL enforce free tier limits: 10 active agents maximum, 1,000 API calls per day. When a limit is exceeded, the offending request SHALL be rejected with HTTP 429 and a response body indicating which limit was reached and how to upgrade. Limit summaries SHALL be cached in Redis with a 60-second TTL.
#### Scenario: Agent registration blocked at free tier limit
- **WHEN** a free-tier tenant with 10 active agents calls `POST /agents`
- **THEN** the response is HTTP 429 with `{ "error": "free_tier_limit", "limit": "agents", "max": 10, "upgradeUrl": "..." }`
#### Scenario: API call blocked after daily limit
- **WHEN** a free-tier tenant has made 1,000 API calls today and makes another request
- **THEN** the response is HTTP 429 with `{ "error": "free_tier_limit", "limit": "api_calls", "max": 1000, "upgradeUrl": "..." }`
#### Scenario: Paid tenant is not rate limited by usage tiers
- **WHEN** a paid-tier tenant exceeds free tier thresholds
- **THEN** the request is processed normally with no usage-based rejection
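The three scenarios above reduce to one decision function. In this sketch the limits and response body fields follow the requirement, while `TenantUsage`, `checkFreeTier`, the check order (daily calls before agent count), and the placeholder upgrade URL are assumptions.

```typescript
interface TenantUsage {
  tier: "free" | "paid";
  activeAgents: number;
  apiCallsToday: number;
}

interface LimitRejection {
  status: 429;
  body: {
    error: "free_tier_limit";
    limit: "agents" | "api_calls";
    max: number;
    upgradeUrl: string;
  };
}

const FREE_TIER = { maxAgents: 10, maxApiCallsPerDay: 1000 };

// Returns null when the request may proceed, or the 429 payload otherwise.
function checkFreeTier(
  usage: TenantUsage,
  isAgentRegistration: boolean,
  billingEnabled: boolean,
  upgradeUrl: string,
): LimitRejection | null {
  // Paid tenants, and all tenants while billing is disabled, are never rejected.
  if (!billingEnabled || usage.tier === "paid") return null;
  if (usage.apiCallsToday >= FREE_TIER.maxApiCallsPerDay) {
    return {
      status: 429,
      body: { error: "free_tier_limit", limit: "api_calls", max: FREE_TIER.maxApiCallsPerDay, upgradeUrl },
    };
  }
  if (isAgentRegistration && usage.activeAgents >= FREE_TIER.maxAgents) {
    return {
      status: 429,
      body: { error: "free_tier_limit", limit: "agents", max: FREE_TIER.maxAgents, upgradeUrl },
    };
  }
  return null;
}

// Placeholder URL for illustration only.
const placeholderUpgradeUrl = "https://example.invalid/upgrade";
const blockedRegistration = checkFreeTier(
  { tier: "free", activeAgents: 10, apiCallsToday: 5 },
  true,
  true,
  placeholderUpgradeUrl,
);
console.log(JSON.stringify(blockedRegistration));
```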
### Requirement: Stripe Checkout initiates paid tier subscription
The system SHALL expose `POST /billing/checkout` (authenticated) that creates a Stripe Checkout session for the paid tier plan and returns a `checkoutUrl`. The tenant is redirected to Stripe Checkout to complete payment. On success, Stripe sends a `customer.subscription.created` webhook event.
#### Scenario: Checkout session created
- **WHEN** an authenticated tenant calls `POST /billing/checkout`
- **THEN** the response is HTTP 200 with `{ "checkoutUrl": "https://checkout.stripe.com/..." }`
#### Scenario: Duplicate subscription prevented
- **WHEN** a tenant with an active paid subscription calls `POST /billing/checkout`
- **THEN** the response is HTTP 409 with `{ "error": "already_subscribed" }`
### Requirement: Stripe webhooks update tenant subscription state
The system SHALL expose `POST /billing/webhook` (Stripe webhook endpoint) that verifies the `stripe-signature` header using `stripe.webhooks.constructEvent()` and processes: `customer.subscription.created` (set tenant to paid), `invoice.payment_succeeded` (extend subscription period), `customer.subscription.deleted` (revert tenant to free tier). All events without valid signatures SHALL be rejected with HTTP 400.
#### Scenario: Webhook without valid signature is rejected
- **WHEN** `POST /billing/webhook` is called with an invalid or missing `stripe-signature` header
- **THEN** the response is HTTP 400 and no state is changed
#### Scenario: Subscription created webhook activates paid tier
- **WHEN** Stripe sends a valid `customer.subscription.created` event for a tenant
- **THEN** the tenant's `subscriptionStatus` is updated to `active` and free tier limits no longer apply
#### Scenario: Subscription deleted webhook reverts to free tier
- **WHEN** Stripe sends a valid `customer.subscription.deleted` event
- **THEN** the tenant's `subscriptionStatus` is updated to `cancelled` and free tier limits are re-enforced
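Once a webhook has passed signature verification, the state transitions above amount to a small reducer. This sketch assumes a simplified event shape; `TenantSubscription`, `VerifiedBillingEvent`, `applyBillingEvent`, the `free` status value, and the `currentPeriodEnd` field are hypothetical and do not mirror the Stripe payload.

```typescript
type SubscriptionStatus = "free" | "active" | "cancelled";

interface TenantSubscription {
  tenantId: string;
  subscriptionStatus: SubscriptionStatus;
  currentPeriodEnd: number | null; // Unix seconds; field name is an assumption
}

// Simplified, already-verified event (assumption); real Stripe payloads are richer.
interface VerifiedBillingEvent {
  type: string;
  periodEnd?: number;
}

function applyBillingEvent(
  sub: TenantSubscription,
  event: VerifiedBillingEvent,
): TenantSubscription {
  switch (event.type) {
    case "customer.subscription.created":
      return {
        ...sub,
        subscriptionStatus: "active",
        currentPeriodEnd: event.periodEnd ?? sub.currentPeriodEnd,
      };
    case "invoice.payment_succeeded":
      // Extends the paid period without changing the status.
      return { ...sub, currentPeriodEnd: event.periodEnd ?? sub.currentPeriodEnd };
    case "customer.subscription.deleted":
      return { ...sub, subscriptionStatus: "cancelled" };
    default:
      return sub; // unhandled event types leave state unchanged
  }
}

const initialSub: TenantSubscription = {
  tenantId: "tenant-a",
  subscriptionStatus: "free",
  currentPeriodEnd: null,
};
const afterCreate = applyBillingEvent(initialSub, { type: "customer.subscription.created", periodEnd: 1000 });
const afterRenewal = applyBillingEvent(afterCreate, { type: "invoice.payment_succeeded", periodEnd: 2000 });
const afterDelete = applyBillingEvent(afterRenewal, { type: "customer.subscription.deleted" });
console.log(afterCreate.subscriptionStatus, afterRenewal.currentPeriodEnd, afterDelete.subscriptionStatus);
```

Keeping the transition logic pure like this makes it straightforward to test separately from the HTTP handler that performs signature verification.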
### Requirement: Billing is feature-flag gated
All billing enforcement and Stripe integration SHALL be gated behind the `BILLING_ENABLED` environment variable. When `BILLING_ENABLED=false`, free tier limits are not enforced, all tenants have paid-tier access, and the Stripe webhook endpoint returns HTTP 200 without processing the event. Usage metering continues regardless of this flag.
#### Scenario: Billing disabled — no limits enforced
- **WHEN** `BILLING_ENABLED=false` and a free-tier tenant has 11 active agents
- **THEN** agent registration succeeds without HTTP 429

@@ -0,0 +1,65 @@
## ADDED Requirements
### Requirement: sentryagent CLI is an installable npm package
The system SHALL provide a `sentryagent` CLI at `cli/` with its own `package.json`, built with `commander` and `chalk`, and published to npm as `sentryagent`. The CLI SHALL be installable globally via `npm install -g sentryagent`. The CLI binary SHALL be named `sentryagent`.
#### Scenario: CLI installs and shows help
- **WHEN** a user runs `npm install -g sentryagent` and then `sentryagent --help`
- **THEN** the command displays available subcommands and global options without errors
#### Scenario: CLI version flag works
- **WHEN** a user runs `sentryagent --version`
- **THEN** the CLI outputs its version number matching `package.json`
### Requirement: CLI persists configuration in ~/.sentryagent/config.json
The CLI SHALL store API endpoint (`apiUrl`) and credentials (`clientId`, `clientSecret`) in `~/.sentryagent/config.json`. The `sentryagent configure` command SHALL prompt for these values interactively and write them to the config file. All other commands SHALL read from this config file automatically.
#### Scenario: Configure command saves credentials
- **WHEN** a user runs `sentryagent configure` and enters an API URL, client ID, and client secret
- **THEN** `~/.sentryagent/config.json` is created or updated with the entered values
#### Scenario: Command fails gracefully when not configured
- **WHEN** a user runs any command before running `sentryagent configure`
- **THEN** the CLI outputs a human-readable error: "Not configured. Run `sentryagent configure` first."
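A sketch of how the CLI might validate the config file contents before running a command. Reading the file from disk is left out so the example stays self-contained; `CliConfig`, `ConfigResult`, and `parseCliConfig` are hypothetical names, and treating a malformed or incomplete file the same as a missing one is an assumption.

```typescript
interface CliConfig {
  apiUrl: string;
  clientId: string;
  clientSecret: string;
}

type ConfigResult =
  | { ok: true; config: CliConfig }
  | { ok: false; message: string };

const NOT_CONFIGURED = "Not configured. Run `sentryagent configure` first.";

function isNonEmptyString(value: unknown): value is string {
  return typeof value === "string" && value.length > 0;
}

// `raw` is the file contents, or null when ~/.sentryagent/config.json does not exist.
function parseCliConfig(raw: string | null): ConfigResult {
  if (raw === null) return { ok: false, message: NOT_CONFIGURED };
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch (_err) {
    return { ok: false, message: NOT_CONFIGURED };
  }
  if (typeof parsed !== "object" || parsed === null) {
    return { ok: false, message: NOT_CONFIGURED };
  }
  const { apiUrl, clientId, clientSecret } = parsed as Record<string, unknown>;
  if (isNonEmptyString(apiUrl) && isNonEmptyString(clientId) && isNonEmptyString(clientSecret)) {
    return { ok: true, config: { apiUrl, clientId, clientSecret } };
  }
  return { ok: false, message: NOT_CONFIGURED };
}

console.log(parseCliConfig(null));
console.log(
  parseCliConfig('{"apiUrl":"https://api.sentryagent.ai","clientId":"abc","clientSecret":"s3cret"}'),
);
```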
### Requirement: register-agent command registers a new agent
The CLI SHALL provide `sentryagent register-agent --name <name> [--description <desc>]` that calls `POST /agents` and outputs the created agent's ID and name.
#### Scenario: Agent registered successfully
- **WHEN** a user runs `sentryagent register-agent --name "my-agent"`
- **THEN** the CLI outputs the new agent ID and confirms registration
### Requirement: list-agents command lists all agents
The CLI SHALL provide `sentryagent list-agents` that calls `GET /agents` and outputs a formatted table of agent ID, name, status, and creation date.
#### Scenario: Agents listed in table format
- **WHEN** a user runs `sentryagent list-agents`
- **THEN** the CLI outputs a formatted table with all agents for the authenticated tenant
### Requirement: issue-token command issues an OAuth2 token
The CLI SHALL provide `sentryagent issue-token --agent-id <id>` that calls `POST /oauth2/token` and outputs the access token and expiry.
#### Scenario: Token issued successfully
- **WHEN** a user runs `sentryagent issue-token --agent-id <id>`
- **THEN** the CLI outputs the access token and its expiry timestamp
### Requirement: rotate-credentials command rotates agent credentials
The CLI SHALL provide `sentryagent rotate-credentials --agent-id <id>` that calls `POST /agents/:id/credentials/rotate` and outputs the new client secret.
#### Scenario: Credentials rotated with confirmation prompt
- **WHEN** a user runs `sentryagent rotate-credentials --agent-id <id>`
- **THEN** the CLI prompts for confirmation ("This will invalidate the current secret. Continue? [y/N]") before rotating
### Requirement: tail-audit-log command polls and streams audit events
The CLI SHALL provide `sentryagent tail-audit-log [--agent-id <id>]` that polls `GET /audit/logs` every 5 seconds and streams new events to stdout in a human-readable format. The command SHALL run until the user presses Ctrl+C.
#### Scenario: Audit log events stream to stdout
- **WHEN** a user runs `sentryagent tail-audit-log`
- **THEN** new audit events appear in the terminal as they are created, one per line
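Because each poll may return events that were already printed, the tail loop needs some way to emit only unseen ones. The sketch below tracks event ids in a set; the `AuditEvent` fields, the output format, and the function names are assumptions, since the spec does not define the `GET /audit/logs` response shape.

```typescript
interface AuditEvent {
  id: string;
  timestamp: string;
  action: string;
  agentId: string;
}

// Returns the events not seen in earlier polls and records their ids.
function takeNewEvents(batch: AuditEvent[], seenIds: Set<string>): AuditEvent[] {
  const fresh: AuditEvent[] = [];
  for (const event of batch) {
    if (!seenIds.has(event.id)) {
      seenIds.add(event.id);
      fresh.push(event);
    }
  }
  return fresh;
}

// One event per line, as the scenario requires.
function formatAuditEvent(event: AuditEvent): string {
  return `${event.timestamp} ${event.action} agent=${event.agentId}`;
}

const seenEventIds = new Set<string>();
const firstPoll: AuditEvent[] = [
  { id: "evt-1", timestamp: "2026-04-02T15:00:00Z", action: "token.issued", agentId: "agent-1" },
];
const secondPoll: AuditEvent[] = [
  { id: "evt-1", timestamp: "2026-04-02T15:00:00Z", action: "token.issued", agentId: "agent-1" },
  { id: "evt-2", timestamp: "2026-04-02T15:00:04Z", action: "agent.created", agentId: "agent-2" },
];

const firstLines = takeNewEvents(firstPoll, seenEventIds).map(formatAuditEvent);
const secondLines = takeNewEvents(secondPoll, seenEventIds).map(formatAuditEvent);
console.log(firstLines.concat(secondLines).join("\n"));
```

The set grows for as long as the command runs. If the API offers a cursor or a `since` parameter, passing that on each poll would likely be a better fit than client-side deduplication.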
### Requirement: CLI supports bash and zsh shell completion
The CLI SHALL provide `sentryagent completion bash` and `sentryagent completion zsh` commands that output shell completion scripts. Installation instructions SHALL be included in the CLI README.
#### Scenario: Bash completion script is generated
- **WHEN** a user runs `sentryagent completion bash`
- **THEN** a valid bash completion script is output to stdout

@@ -0,0 +1,48 @@
## ADDED Requirements
### Requirement: Public developer portal is a standalone Next.js 14 application
The system SHALL provide a public developer portal at `portal/` — a standalone Next.js 14 application with Tailwind CSS. The portal SHALL be deployable independently of the API (to Vercel, Cloudflare Pages, or any static host). The portal SHALL communicate with the API exclusively via the public REST API at `NEXT_PUBLIC_API_URL`. No server-side API secrets SHALL be stored in the portal.
#### Scenario: Portal builds and exports successfully
- **WHEN** `npm run build` is executed in `portal/`
- **THEN** the build completes without errors and produces a deployable `out/` or `.next/` directory
#### Scenario: API URL is configurable via environment variable
- **WHEN** `NEXT_PUBLIC_API_URL=https://api.sentryagent.ai` is set and the portal is built
- **THEN** all API calls in the portal use that base URL
### Requirement: Interactive API explorer is embedded in the portal
The portal SHALL embed a Swagger UI (`swagger-ui-react`) loaded from the existing OpenAPI spec at `/openapi.json` (served by the Express API). The explorer SHALL allow unauthenticated browsing of all endpoints and authenticated execution using a Bearer token entered by the user.
#### Scenario: API explorer loads the OpenAPI spec
- **WHEN** a user visits `/api-explorer`
- **THEN** Swagger UI loads and renders all endpoints from the OpenAPI spec without errors
#### Scenario: User executes authenticated request in explorer
- **WHEN** a user enters a Bearer token in the Swagger UI Authorize dialog and executes `GET /agents`
- **THEN** the request is sent with `Authorization: Bearer <token>` and the response is displayed
### Requirement: Agent registration onboarding wizard guides new users
The portal SHALL provide a guided wizard at `/get-started` covering: (1) sign up / log in, (2) create first agent, (3) generate credentials, (4) copy code snippet for their preferred SDK (Node.js, Python, Go, Java). The wizard SHALL complete in ≤ 4 steps.
#### Scenario: Wizard completes agent registration
- **WHEN** a new user completes all wizard steps
- **THEN** an agent is registered via the API, credentials are generated, and a ready-to-run code snippet is displayed
#### Scenario: SDK code snippet matches selected language
- **WHEN** a user selects "Python" as their preferred SDK in step 4
- **THEN** the displayed code snippet uses the Python SDK (`sentryagent-idp`) syntax
### Requirement: Free tier documentation page explains limits and upgrade path
The portal SHALL include a `/pricing` page documenting free tier limits (10 agents, 1,000 API calls/day), the paid tier capabilities (unlimited agents, higher rate limits, SLA), and a clear call-to-action to upgrade via Stripe Checkout. The page SHALL NOT require authentication to view.
#### Scenario: Pricing page is publicly accessible
- **WHEN** an unauthenticated user visits `/pricing`
- **THEN** the page renders with free tier limits and upgrade CTA without requiring login
### Requirement: Portal links to all four SDKs
The portal SHALL include an `/sdks` page listing all four officially supported SDKs (Node.js, Python, Go, Java) with: package name, installation command, a minimal working example, and a link to the SDK repository.
#### Scenario: SDK page displays all four SDKs
- **WHEN** a user visits `/sdks`
- **THEN** all four SDKs are listed with installation commands and code examples

@@ -0,0 +1,41 @@
## ADDED Requirements
### Requirement: register-agent Action registers an agent in CI using OIDC
The system SHALL provide a GitHub Action at `.github/actions/register-agent/action.yml` (`sentryagent/register-agent@v1`) that registers a new agent via the SentryAgent.ai API using GitHub OIDC token exchange. The Action SHALL accept inputs: `api-url` (required), `agent-name` (required), `agent-description` (optional). The Action SHALL output: `agent-id`. No long-lived API credentials SHALL be required.
#### Scenario: Agent registered in CI workflow
- **WHEN** a GitHub Actions workflow includes `uses: sentryagent/register-agent@v1` with valid `api-url` and `agent-name` inputs
- **THEN** the step completes successfully, an agent is registered in SentryAgent.ai, and `steps.<id>.outputs.agent-id` is populated
#### Scenario: OIDC exchange fails — action fails with clear message
- **WHEN** the GitHub OIDC token cannot be exchanged (e.g., trust policy not configured)
- **THEN** the action fails with an error message explaining how to configure the OIDC trust policy
### Requirement: issue-token Action issues an OAuth2 token in CI using OIDC
The system SHALL provide a GitHub Action at `.github/actions/issue-token/action.yml` (`sentryagent/issue-token@v1`) that issues an OAuth2 access token for an agent via OIDC exchange. The Action SHALL accept inputs: `api-url` (required), `agent-id` (required). The Action SHALL output: `access-token`, `expires-at`. The access token SHALL be masked in GitHub Actions logs.
#### Scenario: Token issued in CI workflow
- **WHEN** a GitHub Actions workflow includes `uses: sentryagent/issue-token@v1` with `api-url` and `agent-id`
- **THEN** the step completes and `steps.<id>.outputs.access-token` contains a valid Bearer token
#### Scenario: Access token is masked in logs
- **WHEN** the action issues a token
- **THEN** the token value is registered with `core.setSecret()` and does not appear in plaintext in the workflow log
### Requirement: GitHub OIDC trust policy is configurable via API
The system SHALL allow tenants to register a GitHub OIDC trust policy via `POST /oidc/trust-policies` specifying: `provider: "github"`, `repository` (e.g., `org/repo`), `branch` (optional), and `agentId`. Only workflows matching the trust policy SHALL be permitted to exchange GitHub OIDC tokens for SentryAgent.ai agent tokens.
#### Scenario: Trust policy restricts token exchange to specified repo
- **WHEN** a trust policy is registered for `org/repo-a` and a GitHub OIDC token from `org/repo-b` is presented
- **THEN** the token exchange is rejected with HTTP 403
#### Scenario: Trust policy permits token exchange for matching repo
- **WHEN** a trust policy is registered for `org/repo-a` and a valid GitHub OIDC token from `org/repo-a` is presented
- **THEN** the token exchange succeeds and an agent access token is returned
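The matching rule in the two scenarios above can be sketched as a pure function. This is a minimal illustration, not the project's implementation: the `TrustPolicy` and `GitHubOidcClaims` types and both function names are hypothetical, the policy fields follow the request body named in the requirement, and the `repository` and `ref` claims are the ones GitHub documents for its OIDC tokens. The sketch assumes the token signature has already been verified and that an omitted `branch` means any branch is accepted.

```typescript
// Hypothetical shapes: policy fields mirror the POST /oidc/trust-policies body,
// claim fields mirror GitHub's documented OIDC claims.
interface TrustPolicy {
  provider: "github";
  repository: string; // e.g. "org/repo-a"
  branch?: string; // e.g. "main"; assumed to mean "any branch" when omitted
  agentId: string;
}

interface GitHubOidcClaims {
  repository: string; // "owner/repo"
  ref: string; // e.g. "refs/heads/main"
}

// Returns true when already-verified claims satisfy one policy.
function matchesTrustPolicy(policy: TrustPolicy, claims: GitHubOidcClaims): boolean {
  if (policy.repository !== claims.repository) {
    return false;
  }
  if (policy.branch === undefined) {
    return true;
  }
  return claims.ref === `refs/heads/${policy.branch}`;
}

// Maps the policy check to the HTTP status the scenarios describe.
function exchangeStatus(policies: TrustPolicy[], claims: GitHubOidcClaims): 200 | 403 {
  return policies.some((p) => matchesTrustPolicy(p, claims)) ? 200 : 403;
}
```

With a single policy for `org/repo-a`, a token from `org/repo-b` maps to 403 and a token from `org/repo-a` maps to 200, which matches the two scenarios.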
### Requirement: Both Actions include README with setup instructions
Each Action directory SHALL include a `README.md` with: purpose, prerequisites (OIDC trust policy setup), inputs table, outputs table, a minimal workflow example, and a link to full documentation on the developer portal.
#### Scenario: README is present and complete
- **WHEN** a developer reads `register-agent/README.md`
- **THEN** they can configure the OIDC trust policy and add the action to their workflow without external documentation

@@ -0,0 +1,29 @@
## ADDED Requirements
### Requirement: Rate limiter hit counter is exposed as Prometheus metric
The system SHALL expose an `agentidp_rate_limit_hits_total` counter (labels: `endpoint`, `tenant_id`) incremented each time a request is rejected by the Redis-backed rate limiter (HTTP 429). This metric SHALL be available at `GET /metrics` alongside existing metrics.
#### Scenario: Rate limit rejection increments counter
- **WHEN** a client is rejected by the rate limiter on `POST /oauth2/token`
- **THEN** `agentidp_rate_limit_hits_total{endpoint="/oauth2/token"}` is incremented by 1
### Requirement: Database connection pool saturation is exposed as Prometheus metric
The system SHALL expose `agentidp_db_pool_active_connections` (gauge) and `agentidp_db_pool_waiting_requests` (gauge) reflecting the current number of active database connections and queued requests waiting for a free connection.
#### Scenario: Pool metrics reflect current state
- **WHEN** 15 of 20 pool connections are in use and 2 requests are queued
- **THEN** `agentidp_db_pool_active_connections` reads 15 and `agentidp_db_pool_waiting_requests` reads 2
### Requirement: Per-tenant API call rate is exposed as Prometheus metric
The system SHALL expose `agentidp_tenant_api_calls_total` counter (label: `tenant_id`) incremented on every authenticated API request. This enables per-tenant traffic analysis in Grafana.
#### Scenario: Per-tenant counter increments on authenticated request
- **WHEN** tenant `org-abc` makes an authenticated API call
- **THEN** `agentidp_tenant_api_calls_total{tenant_id="org-abc"}` is incremented by 1
### Requirement: Usage tier enforcement rejections are tracked as Prometheus metric
The system SHALL expose `agentidp_billing_limit_rejections_total` counter (labels: `tenant_id`, `limit_type`) incremented each time a request is rejected due to a free tier limit (`agents` or `api_calls`).
#### Scenario: Agent limit rejection increments counter
- **WHEN** a free-tier tenant is rejected from creating an agent due to the 10-agent limit
- **THEN** `agentidp_billing_limit_rejections_total{limit_type="agents"}` is incremented by 1

@@ -0,0 +1,53 @@
## ADDED Requirements
### Requirement: Redis-backed distributed rate limiting replaces in-memory limiter
The system SHALL use `ioredis` + `rate-limiter-flexible` to enforce rate limits across all Express instances using a Redis sliding window algorithm. The in-memory `express-rate-limit` store SHALL be removed. Rate limit configuration SHALL be injectable via environment variables (`RATE_LIMIT_WINDOW_MS`, `RATE_LIMIT_MAX_REQUESTS`). When `REDIS_RATE_LIMIT_ENABLED=false`, the system SHALL fall back to an in-memory limiter for local development.
#### Scenario: Rate limit enforced across multiple instances
- **WHEN** two Express instances are running behind a load balancer and a client sends requests alternating between instances
- **THEN** the rate limit counter is shared across both instances via Redis and the client is rejected after the combined limit is reached
#### Scenario: Redis unavailable — graceful fallback
- **WHEN** Redis is unreachable and `REDIS_RATE_LIMIT_ENABLED=true`
- **THEN** the system SHALL log a warning and fall back to in-memory limiting rather than rejecting all requests
#### Scenario: Rate limit exceeded
- **WHEN** a client exceeds the configured request limit within the window
- **THEN** the system SHALL respond with HTTP 429 and a `Retry-After` header indicating when the window resets
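To illustrate the sliding-window behaviour and the `Retry-After` calculation described above, here is a minimal in-memory sketch. It is not how `rate-limiter-flexible` works internally, and the `SlidingWindowLimiter` class and `LimitResult` type are hypothetical names introduced for this example. A distributed version would keep the same per-key state in Redis so that all instances share one counter.

```typescript
// Sliding-window log limiter kept in process memory for illustration only.
interface LimitResult {
  allowed: boolean;
  retryAfterSeconds: number; // suggested Retry-After value when rejected
}

class SlidingWindowLimiter {
  private readonly hits = new Map<string, number[]>();
  private readonly windowMs: number;
  private readonly maxRequests: number;

  constructor(windowMs: number, maxRequests: number) {
    this.windowMs = windowMs;
    this.maxRequests = maxRequests;
  }

  // `nowMs` is injected so the behaviour is deterministic in tests.
  consume(key: string, nowMs: number): LimitResult {
    const windowStart = nowMs - this.windowMs;
    // Drop timestamps that have slid out of the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => t > windowStart);

    if (recent.length >= this.maxRequests) {
      this.hits.set(key, recent);
      // The oldest recorded hit leaves the window at oldest + windowMs.
      const oldest = recent[0] ?? nowMs;
      const waitMs = oldest + this.windowMs - nowMs;
      return { allowed: false, retryAfterSeconds: Math.max(1, Math.ceil(waitMs / 1000)) };
    }

    recent.push(nowMs);
    this.hits.set(key, recent);
    return { allowed: true, retryAfterSeconds: 0 };
  }
}
```

In a middleware, the constructor arguments would come from `RATE_LIMIT_WINDOW_MS` and `RATE_LIMIT_MAX_REQUESTS`, and a rejected result would translate to HTTP 429 with `Retry-After` set to `retryAfterSeconds`.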
### Requirement: Database connection pool is explicitly configured
The system SHALL configure `pg` connection pool with explicit `max`, `min`, `idleTimeoutMillis`, and `connectionTimeoutMillis` parameters via environment variables (`DB_POOL_MAX`, `DB_POOL_MIN`, `DB_POOL_IDLE_TIMEOUT_MS`, `DB_POOL_CONNECTION_TIMEOUT_MS`). Defaults SHALL be: max=20, min=2, idleTimeout=30000ms, connectionTimeout=5000ms.
#### Scenario: Pool exhaustion under load
- **WHEN** all pool connections are in use and a new query is requested
- **THEN** the system SHALL queue the request and resolve it within `DB_POOL_CONNECTION_TIMEOUT_MS`, or reject with a 503 if timeout is exceeded
#### Scenario: Idle connections are reaped
- **WHEN** a connection has been idle for longer than `DB_POOL_IDLE_TIMEOUT_MS`
- **THEN** the pool SHALL close the connection and reduce active pool size toward `min`
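A possible way to derive the pool settings from the environment variables and defaults listed above is sketched below. The `PoolSettings` type and both helper names are hypothetical. The output field names are chosen to line up with the `max`, `min`, `idleTimeoutMillis`, and `connectionTimeoutMillis` parameters named in the requirement. Falling back to the default on a malformed value is an assumption; a stricter implementation might refuse to start instead.

```typescript
interface PoolSettings {
  max: number;
  min: number;
  idleTimeoutMillis: number;
  connectionTimeoutMillis: number;
}

// Parses a non-negative integer, falling back to the default when the
// variable is unset, empty, or malformed (assumed behaviour).
function intFromEnv(value: string | undefined, fallback: number): number {
  if (value === undefined || value.trim() === "") {
    return fallback;
  }
  const parsed = Number(value);
  return Number.isInteger(parsed) && parsed >= 0 ? parsed : fallback;
}

// Builds pool settings from an env-like record using the spec's defaults.
function poolSettingsFromEnv(env: Record<string, string | undefined>): PoolSettings {
  const settings: PoolSettings = {
    max: intFromEnv(env["DB_POOL_MAX"], 20),
    min: intFromEnv(env["DB_POOL_MIN"], 2),
    idleTimeoutMillis: intFromEnv(env["DB_POOL_IDLE_TIMEOUT_MS"], 30000),
    connectionTimeoutMillis: intFromEnv(env["DB_POOL_CONNECTION_TIMEOUT_MS"], 5000),
  };
  if (settings.min > settings.max) {
    throw new Error("DB_POOL_MIN must not exceed DB_POOL_MAX");
  }
  return settings;
}
```

Passing `process.env` at startup would yield an object that can be spread into the pool constructor options.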
### Requirement: Detailed health endpoint reports per-service status
The system SHALL expose `GET /health/detailed` returning a JSON object with individual status for each dependency: `database`, `redis`, `vault` (if configured), `opa` (if configured). Each service SHALL report `status` (`healthy` | `degraded` | `unreachable`), `latencyMs`, and an optional `message`. The overall response status SHALL be HTTP 200 if all services are healthy, HTTP 207 if any are degraded, and HTTP 503 if any are unreachable.
#### Scenario: All services healthy
- **WHEN** all dependencies respond within acceptable latency
- **THEN** `GET /health/detailed` returns HTTP 200 with all services reporting `status: "healthy"`
#### Scenario: Redis unreachable
- **WHEN** Redis does not respond within 2000ms
- **THEN** `GET /health/detailed` returns HTTP 503 with `redis.status: "unreachable"` and overall `status: "unhealthy"`
#### Scenario: Vault degraded
- **WHEN** Vault responds but with latency exceeding 1000ms
- **THEN** `GET /health/detailed` returns HTTP 207 with `vault.status: "degraded"` and a latency measurement
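The status rules above can be expressed as two small pure functions. This is a sketch under stated assumptions: the type and function names are hypothetical, the 1000ms degraded threshold is taken from the Vault scenario, and `unreachable` is assumed to take precedence over `degraded` when both occur, which the requirement does not state explicitly.

```typescript
type ServiceStatus = "healthy" | "degraded" | "unreachable";

interface ServiceHealth {
  status: ServiceStatus;
  latencyMs: number;
  message?: string;
}

// Classifies a single probe. `failed` covers both errors and timeouts.
function classifyProbe(latencyMs: number, failed: boolean, degradedAfterMs = 1000): ServiceStatus {
  if (failed) {
    return "unreachable";
  }
  return latencyMs > degradedAfterMs ? "degraded" : "healthy";
}

// Picks the overall HTTP status; unreachable is assumed to win over degraded.
function httpStatusFor(services: Record<string, ServiceHealth>): 200 | 207 | 503 {
  const names = Object.keys(services);
  if (names.some((n) => services[n]?.status === "unreachable")) {
    return 503;
  }
  if (names.some((n) => services[n]?.status === "degraded")) {
    return 207;
  }
  return 200;
}
```

For example, a Vault probe that answers in 1200ms classifies as `degraded` and yields HTTP 207, while a Redis probe that times out yields HTTP 503 regardless of the other services.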
### Requirement: k6 load test suite validates production readiness
The system SHALL include a k6 load test suite at `tests/load/` covering: agent registration under load (100 virtual users, 60s), token issuance under load (1000 virtual users, 60s), and credential rotation under load (50 virtual users, 60s). Each scenario SHALL define pass/fail thresholds: p95 response time < 500ms, error rate < 1%.
#### Scenario: Token issuance load test passes thresholds
- **WHEN** the k6 load test `token-issuance.js` runs with 1000 virtual users for 60 seconds
- **THEN** p95 response time SHALL be below 500ms and error rate SHALL be below 1%
#### Scenario: Load test threshold failure surfaces clearly
- **WHEN** a k6 threshold is breached during the load test run
- **THEN** the k6 process SHALL exit with a non-zero exit code, making CI failure explicit
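As an illustration of how the thresholds above might be declared, the fragment below shows a k6 `options` object for the token issuance scenario. It is only a fragment: the default function that issues requests is omitted, and the actual contents of `token-issuance.js` are not shown in this change. `http_req_duration` and `http_req_failed` are k6's built-in metrics, and k6 exits with a non-zero code when a threshold expression fails.

```typescript
// Options fragment for a k6 script; the request logic is intentionally omitted.
export const options = {
  vus: 1000, // virtual users for the token issuance scenario
  duration: "60s",
  thresholds: {
    // p95 response time must stay below 500ms
    http_req_duration: ["p(95)<500"],
    // fewer than 1% of requests may fail
    http_req_failed: ["rate<0.01"],
  },
};
```

The registration and rotation scenarios would presumably differ only in `vus` (100 and 50 respectively).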

@@ -0,0 +1,23 @@
## ADDED Requirements
### Requirement: Usage dashboard tab displays per-tenant metering data
The web dashboard SHALL include a "Usage" tab in the main navigation displaying the current billing period's usage: API calls used / daily limit, active agents count / agent limit, token issuances this period. Usage data SHALL be fetched from `GET /billing/usage` (new authenticated endpoint). The tab SHALL update on page load and on a 60-second polling interval.
#### Scenario: Usage tab shows current metrics
- **WHEN** an authenticated user navigates to the Usage tab
- **THEN** the dashboard displays current API call count, agent count, and token issuance count for the current billing period
#### Scenario: Free tier warning shown when approaching limit
- **WHEN** a free-tier tenant has used ≥ 80% of their daily API call limit
- **THEN** a warning banner is displayed with a link to the upgrade/pricing page
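The 80% warning rule can be sketched as a small helper. The function name and the `warnAt` parameter are hypothetical, and treating a non-positive limit as "no limit configured" is an assumption.

```typescript
// Returns true once usage reaches the warning fraction of the limit.
function shouldShowLimitWarning(used: number, limit: number, warnAt = 0.8): boolean {
  if (limit <= 0) {
    return false; // assumed: no limit configured, so nothing to warn about
  }
  return used / limit >= warnAt;
}
```

With a daily limit of 1,000 API calls, the banner would appear at 800 calls and stay hidden at 799.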
### Requirement: Billing status panel shows subscription tier and upgrade CTA
The web dashboard Usage tab SHALL include a billing status panel showing: current tier (Free / Paid), subscription status (active / cancelled / trial), and, for free-tier tenants, an "Upgrade" button that starts the `POST /billing/checkout` flow.
#### Scenario: Free tier tenant sees upgrade CTA
- **WHEN** a free-tier tenant views the Usage tab
- **THEN** an "Upgrade to Paid" button is visible and initiates Stripe Checkout when clicked
#### Scenario: Paid tier tenant sees subscription status
- **WHEN** a paid-tier tenant views the Usage tab
- **THEN** the panel shows "Paid" tier with subscription status and next renewal date, with no upgrade CTA

@@ -0,0 +1,122 @@
## 1. WS1: Production Hardening — Redis Rate Limiting
- [x] 1.1 Install `ioredis` and `rate-limiter-flexible` — add to package.json dependencies
- [x] 1.2 Create `src/infrastructure/redisClient.ts` — singleton ioredis client with connection error handling and `REDIS_RATE_LIMIT_ENABLED` env var guard
- [x] 1.3 Replace in-memory `express-rate-limit` with `RateLimiterRedis` from `rate-limiter-flexible` — sliding window, configurable via `RATE_LIMIT_WINDOW_MS` and `RATE_LIMIT_MAX_REQUESTS`
- [x] 1.4 Implement graceful fallback to `RateLimiterMemory` when Redis is unreachable
- [x] 1.5 Add `agentidp_rate_limit_hits_total` Prometheus counter (labels: `endpoint`) — increment on HTTP 429
- [x] 1.6 Update rate limiter middleware to set `Retry-After` header on rejection
- [x] 1.7 Write unit tests for rate limiter middleware — Redis path, fallback path, 429 response shape
## 2. WS1: Production Hardening — Database Pool & Health
- [x] 2.1 Add `DB_POOL_MAX`, `DB_POOL_MIN`, `DB_POOL_IDLE_TIMEOUT_MS`, `DB_POOL_CONNECTION_TIMEOUT_MS` env vars to `.env.example` and database config
- [x] 2.2 Configure `pg.Pool` with explicit pool parameters; defaults: max=20, min=2, idle=30000ms, conn timeout=5000ms
- [x] 2.3 Expose `agentidp_db_pool_active_connections` gauge and `agentidp_db_pool_waiting_requests` gauge — update on pool events
- [x] 2.4 Create `GET /health/detailed` route and controller — check database, Redis, Vault (if configured), OPA (if configured)
- [x] 2.5 Implement per-service health checks with latency measurement — `healthy` / `degraded` (>1000ms) / `unreachable` (timeout/error)
- [x] 2.6 Return HTTP 200 (all healthy), HTTP 207 (any degraded), HTTP 503 (any unreachable)
- [x] 2.7 Write unit tests for health controller — all healthy, degraded, unreachable scenarios
## 3. WS1: Production Hardening — Load Tests
- [x] 3.1 Install k6 and create `tests/load/` directory with `README.md` explaining how to run tests
- [x] 3.2 Write `tests/load/agent-registration.js` — 100 VUs, 60s, threshold: p95 < 500ms, error rate < 1%
- [x] 3.3 Write `tests/load/token-issuance.js` — 1000 VUs, 60s, threshold: p95 < 500ms, error rate < 1%
- [x] 3.4 Write `tests/load/credential-rotation.js` — 50 VUs, 60s, threshold: p95 < 500ms, error rate < 1%
- [x] 3.5 Add `npm run load-test` script to package.json running all three k6 scenarios sequentially
## 4. WS2: Developer Portal — Setup & Core Pages
- [x] 4.1 Scaffold `portal/` as a standalone Next.js 14 app with Tailwind CSS — `npx create-next-app@latest portal --typescript --tailwind`
- [x] 4.2 Add `NEXT_PUBLIC_API_URL` env var support — create `portal/.env.example`
- [x] 4.3 Create portal home page (`portal/app/page.tsx`) — hero, product description, CTA to `/get-started`
- [x] 4.4 Create `/pricing` page with free tier limits table (10 agents, 1,000 calls/day) and paid tier CTA
- [x] 4.5 Create `/sdks` page listing all 4 SDKs with installation commands and minimal code examples
- [x] 4.6 Create shared nav component with links to: Home, API Explorer, Get Started, SDKs, Pricing
## 5. WS2: Developer Portal — API Explorer & Onboarding Wizard
- [x] 5.1 Install `swagger-ui-react` in `portal/` — add to portal package.json
- [x] 5.2 Create `/api-explorer` page embedding Swagger UI loaded from `NEXT_PUBLIC_API_URL/openapi.json`
- [x] 5.3 Configure Swagger UI with `persistAuthorization: true` and Bearer token auth scheme
- [x] 5.4 Create `/get-started` wizard — Step 1: account setup instructions
- [x] 5.5 Create wizard Step 2: agent name input → calls `POST /agents` via API → displays agent ID
- [x] 5.6 Create wizard Step 3: generate credentials → calls credentials endpoint → displays client ID/secret with copy buttons
- [x] 5.7 Create wizard Step 4: SDK selection → displays ready-to-run code snippet for chosen SDK (Node.js / Python / Go / Java)
- [x] 5.8 Wizard state management using React `useState` — no external state library needed
- [x] 5.9 Build `portal/`: `npm run build` passes with no build or TypeScript errors
## 6. WS3: CLI Tool — Setup & Configuration
- [x] 6.1 Scaffold `cli/` directory with `package.json` (name: `sentryagent`, bin: `sentryagent`) — TypeScript with `commander` and `chalk`
- [x] 6.2 Create `cli/src/config.ts` — read/write `~/.sentryagent/config.json` with `apiUrl`, `clientId`, `clientSecret`
- [x] 6.3 Implement `sentryagent configure` command — prompts for API URL, client ID, client secret using `readline` — writes to config file
- [x] 6.4 Implement config validation helper — fail with "Not configured. Run `sentryagent configure` first." if config missing
- [x] 6.5 Implement `sentryagent --version` outputting version from package.json
- [x] 6.6 Implement `sentryagent --help` showing all available commands
## 7. WS3: CLI Tool — Agent Commands
- [x] 7.1 Implement `sentryagent register-agent --name <name> [--description <desc>]` — calls `POST /agents`, outputs agent ID
- [x] 7.2 Implement `sentryagent list-agents` — calls `GET /agents`, outputs formatted table with chalk
- [x] 7.3 Implement `sentryagent issue-token --agent-id <id>` — calls `POST /oauth2/token`, outputs access token and expiry
- [x] 7.4 Implement `sentryagent rotate-credentials --agent-id <id>` — prompts for confirmation, calls rotate endpoint, outputs new secret
- [x] 7.5 Implement `sentryagent tail-audit-log [--agent-id <id>]` — polls `GET /audit/logs` every 5s, streams new events to stdout, runs until Ctrl+C
- [x] 7.6 Implement `sentryagent completion bash` and `sentryagent completion zsh` — output shell completion scripts
- [x] 7.7 Write `cli/README.md` — installation, configuration, all commands with examples, shell completion setup
- [x] 7.8 Build CLI — `npm run build` in `cli/` passes; `node dist/index.js --help` works
## 8. WS4: Agent Marketplace
- [x] 8.1 Add `is_public` boolean column (default false) to `agents` table — create migration `006_add_agent_marketplace.sql`
- [x] 8.2 Update `PATCH /agents/:id` to accept `isPublic` field — update AgentService and AgentController
- [x] 8.3 Create `MarketplaceService` with `listPublicAgents(filters, pagination)` and `getPublicAgent(agentId)` methods
- [x] 8.4 Create `GET /marketplace/agents` endpoint — unauthenticated, paginated, supports `?q=`, `?capability=`, `?publisher=` filters
- [x] 8.5 Create `GET /marketplace/agents/:agentId` endpoint — unauthenticated, returns agent with DID document and agent card
- [x] 8.6 Add `agentidp_tenant_api_calls_total` Prometheus counter (label: `tenant_id`) — increment on authenticated requests
- [x] 8.7 Add `MARKETPLACE_ENABLED` feature flag — return 404 on all marketplace routes when disabled
- [x] 8.8 Write unit tests for MarketplaceService — list, filter, get, public/private visibility
- [x] 8.9 Update OpenAPI spec to document `/marketplace/agents` endpoints
## 9. WS5: GitHub Actions
- [x] 9.1 Create `.github/actions/register-agent/action.yml` — inputs: `api-url`, `agent-name`, `agent-description`; outputs: `agent-id`
- [x] 9.2 Implement register-agent Action script (`action.js`) — exchange GitHub OIDC token via `POST /oidc/token`, then call `POST /agents`
- [x] 9.3 Implement OIDC token exchange error handling in register-agent — clear error message with trust policy setup link
- [x] 9.4 Create `.github/actions/issue-token/action.yml` — inputs: `api-url`, `agent-id`; outputs: `access-token`, `expires-at`
- [x] 9.5 Implement issue-token Action script — exchange GitHub OIDC token, call `POST /oauth2/token`, mask token with `core.setSecret()`
- [x] 9.6 Create `POST /oidc/trust-policies` endpoint — accepts `provider`, `repository`, `branch`, `agentId` — stores trust policy
- [x] 9.7 Enforce trust policy on GitHub OIDC token exchange — reject tokens from repos not matching a registered policy with HTTP 403
- [x] 9.8 Write `register-agent/README.md` — purpose, OIDC trust policy setup, inputs, outputs, example workflow
- [x] 9.9 Write `issue-token/README.md` — same structure as register-agent README
## 10. WS6: Billing & Usage Metering
- [x] 10.1 Create migration `007_add_billing.sql`: `tenant_subscriptions` table (tenant_id, status, stripe_customer_id, stripe_subscription_id, current_period_end) and `usage_events` table (tenant_id, date, metric_type, count)
- [x] 10.2 Install `stripe` npm package — add to package.json
- [x] 10.3 Create `UsageMeteringMiddleware` — increments in-memory per-tenant counters on every authenticated request; flushes to `usage_events` every 60s
- [x] 10.4 Create `UsageService` with `getDailyUsage(tenantId, date)` and `getActiveAgentCount(tenantId)` methods
- [x] 10.5 Create `FreeTierEnforcementMiddleware` — checks usage cache (Redis, 60s TTL) before agent creation and API calls; rejects with HTTP 429 when limit exceeded; skips when `BILLING_ENABLED=false`
- [x] 10.6 Add `agentidp_billing_limit_rejections_total` Prometheus counter (labels: `tenant_id`, `limit_type`)
- [x] 10.7 Create `BillingService` with `createCheckoutSession(tenantId)`, `handleWebhookEvent(event)`, `getSubscriptionStatus(tenantId)` methods
- [x] 10.8 Create `POST /billing/checkout` endpoint — creates Stripe Checkout session, returns `checkoutUrl`
- [x] 10.9 Create `POST /billing/webhook` endpoint — verifies Stripe signature, processes subscription events, updates `tenant_subscriptions`
- [x] 10.10 Create `GET /billing/usage` endpoint (authenticated) — returns current period usage summary for tenant
- [x] 10.11 Add `BILLING_ENABLED` env var — disable enforcement and Stripe processing when false; document in `.env.example`
- [x] 10.12 Write unit tests for UsageService, BillingService, FreeTierEnforcementMiddleware — free tier block, paid tier pass-through, webhook processing
- [x] 10.13 Update web dashboard — add "Usage" tab to navigation with billing status panel and usage metrics from `GET /billing/usage`
## 11. QA & Release
- [x] 11.1 Run full TypeScript check across all packages (`tsc --noEmit`) — zero errors
- [x] 11.2 Run all unit tests (`npm test`) — all pass, coverage ≥ 80%
- [x] 11.3 Run k6 load tests — all thresholds pass (p95 < 500ms, error rate < 1%)
- [x] 11.4 Verify `GET /health/detailed` returns correct status for all dependency states
- [x] 11.5 Verify marketplace endpoints are unauthenticated and return correct data
- [x] 11.6 Verify Stripe webhook signature rejection on invalid signature
- [x] 11.7 Verify free tier limit enforcement with `BILLING_ENABLED=true`
- [x] 11.8 Verify `BILLING_ENABLED=false` disables enforcement without breaking metering
- [x] 11.9 Build portal — `npm run build` passes in `portal/`
- [x] 11.10 Build CLI — `npm run build` passes in `cli/`; `sentryagent --help` works
- [x] 11.11 Commit all Phase 4 work on `main` — conventional commit message per workstream