sentryagent-idp/openspec/changes/phase-5-scale-ecosystem/specs/api-gateway-tiers/spec.md
SentryAgent.ai Developer 389a764e8d feat(openspec): propose phase-5-scale-ecosystem change
6 workstreams, 119 tasks — Scale & Ecosystem:
- WS1: Rust SDK
- WS2: Agent-to-Agent (A2A) Authorization
- WS3: Advanced Analytics Dashboard
- WS4: Public API Gateway & Rate Limiting SaaS
- WS5: Developer Experience (DX) improvements
- WS6: AGNTCY Compliance Certification Package

Awaiting CEO approval to begin implementation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-02 15:33:08 +00:00


WS4: Public API Gateway & Rate Limiting SaaS

Purpose

Replace the single flat rate limit from Phase 4 with a multi-tier enforcement model in which each tenant's rate limits are determined by its subscription tier (free | pro | enterprise). Expose the tier definitions publicly via GET /tiers so developers can understand the limits before registering. Add POST /billing/upgrade so tenants can upgrade their tier themselves without contacting support.

This workstream closes the gap between Phase 4's flat rate limiter and a proper commercial SaaS gateway model.

New Endpoints

GET /tiers

Summary: Return the current tier definitions including rate limits, feature flags, and pricing.

Authentication: None (public endpoint).

Response 200 (application/json):

{
  "tiers": [
    {
      "id": "free",
      "name": "Free",
      "price": {
        "monthly": 0,
        "currency": "USD"
      },
      "limits": {
        "registeredAgents": 10,
        "apiCallsPerDay": 1000,
        "tokenIssuancesPerDay": 200,
        "rateLimitPerMinute": 60,
        "rateLimitBurst": 10,
        "auditLogRetentionDays": 30
      },
      "features": {
        "marketplace": true,
        "githubActions": true,
        "analytics": false,
        "webhooks": false,
        "sso": false,
        "sla": false,
        "customDomain": false,
        "prioritySupport": false
      }
    },
    {
      "id": "pro",
      "name": "Pro",
      "price": {
        "monthly": 49,
        "currency": "USD"
      },
      "limits": {
        "registeredAgents": 100,
        "apiCallsPerDay": 50000,
        "tokenIssuancesPerDay": 10000,
        "rateLimitPerMinute": 600,
        "rateLimitBurst": 100,
        "auditLogRetentionDays": 90
      },
      "features": {
        "marketplace": true,
        "githubActions": true,
        "analytics": true,
        "webhooks": true,
        "sso": false,
        "sla": false,
        "customDomain": false,
        "prioritySupport": false
      }
    },
    {
      "id": "enterprise",
      "name": "Enterprise",
      "price": {
        "monthly": null,
        "currency": "USD",
        "note": "Contact sales"
      },
      "limits": {
        "registeredAgents": null,
        "apiCallsPerDay": null,
        "tokenIssuancesPerDay": null,
        "rateLimitPerMinute": 6000,
        "rateLimitBurst": 1000,
        "auditLogRetentionDays": 365
      },
      "features": {
        "marketplace": true,
        "githubActions": true,
        "analytics": true,
        "webhooks": true,
        "sso": true,
        "sla": true,
        "customDomain": true,
        "prioritySupport": true
      }
    }
  ]
}

Error Responses:

| Status | Code | Description |
|--------|------|-------------|
| 429 | RATE_LIMITED | Rate limit exceeded (even unauthenticated endpoints have a global IP-based limit) |

Notes:

  • A null limit means the tier has no cap on that resource (unlimited)
  • Tier definitions are sourced from a static configuration object in the codebase, not a database table
  • The response is cached at the HTTP layer with Cache-Control: public, max-age=3600
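
Because a null limit means unlimited, client code that consumes GET /tiers should handle null explicitly instead of comparing it numerically. The following is a minimal sketch of that check; the helper names isWithinLimit and describeLimit are hypothetical and not part of the API.

```typescript
// Hypothetical client-side helpers; the names are illustrative only.
type Limit = number | null; // null means unlimited, per the notes above

// Returns true while `used` is still below the limit.
// An unlimited (null) limit always passes.
function isWithinLimit(limit: Limit, used: number): boolean {
  return limit === null || used < limit;
}

// Renders a limit for display in a pricing table or CLI.
function describeLimit(limit: Limit): string {
  return limit === null ? "unlimited" : String(limit);
}
```

For example, an enterprise tenant with registeredAgents set to null passes isWithinLimit for any agent count, while a free tenant with 10 registered agents does not.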

POST /billing/upgrade

Summary: Initiate a self-service tier upgrade for the authenticated tenant. Creates a Stripe Checkout session for the target tier.

Authentication: Bearer token (tenant-scoped).

Request Body (application/json):

{
  "targetTier": "pro"
}

| Field | Type | Required | Constraints |
|-------|------|----------|-------------|
| targetTier | string | yes | Enum: pro, enterprise. Downgrades are not possible via this endpoint. |

Response 200 (application/json):

{
  "checkoutUrl": "https://checkout.stripe.com/pay/cs_...",
  "sessionId": "cs_...",
  "targetTier": "pro",
  "expiresAt": "string (ISO 8601)"
}

Error Responses:

| Status | Code | Description |
|--------|------|-------------|
| 400 | ALREADY_ON_TIER | Tenant is already subscribed to targetTier |
| 400 | INVALID_TARGET_TIER | targetTier is not a valid upgradeable tier |
| 400 | DOWNGRADE_NOT_SUPPORTED | targetTier is lower than the tenant's current tier |
| 401 | UNAUTHORIZED | Missing or invalid Bearer token |
| 422 | STRIPE_ERROR | Stripe API returned an error creating the Checkout session |
| 429 | RATE_LIMITED | Rate limit exceeded |
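
The three 400 responses can be derived from a comparison of the tenant's current tier and the requested one. The sketch below shows one way to do that. The function name validateUpgrade, the numeric ranking, and the order in which the checks run are assumptions; the spec does not define which error takes precedence.

```typescript
type TierName = "free" | "pro" | "enterprise";
type UpgradeTarget = "pro" | "enterprise";
type UpgradeErrorCode =
  | "ALREADY_ON_TIER"
  | "INVALID_TARGET_TIER"
  | "DOWNGRADE_NOT_SUPPORTED";

// Rank used to compare tiers; a higher number means a larger plan.
const TIER_RANK: Record<TierName, number> = { free: 0, pro: 1, enterprise: 2 };

// Type guard for the targetTier enum (pro, enterprise).
function isUpgradeTarget(value: string): value is UpgradeTarget {
  return value === "pro" || value === "enterprise";
}

// Returns the error code for a 400 response, or null when the upgrade may proceed.
// Check order is an assumption: enum membership first, then same tier, then downgrade.
function validateUpgrade(current: TierName, target: string): UpgradeErrorCode | null {
  if (!isUpgradeTarget(target)) return "INVALID_TARGET_TIER";
  if (target === current) return "ALREADY_ON_TIER";
  if (TIER_RANK[target] < TIER_RANK[current]) return "DOWNGRADE_NOT_SUPPORTED";
  return null;
}
```

Under this ordering, a free tenant requesting free receives INVALID_TARGET_TIER rather than ALREADY_ON_TIER, because free is not in the targetTier enum.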

Business Rules:

  • This endpoint extends the existing BillingService — a new upgradeTier(tenantId, targetTier) method creates a Stripe Checkout session with the correct Stripe Price ID for the target tier
  • The Stripe Price IDs per tier are configured via environment variables: STRIPE_PRICE_ID_PRO, STRIPE_PRICE_ID_ENTERPRISE
  • After payment, Stripe sends the customer.subscription.created webhook, and the existing webhook handler updates tenant_subscriptions
  • The TierRateLimiter picks up the new tier within 60 seconds, once the cached tier entry in Redis (TTL: 60 seconds) expires and is reloaded from tenant_subscriptions
  • Downgrade is handled via the existing Stripe customer portal — not exposed as an API endpoint
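
The Price ID lookup in the rules above could be sketched as follows. This is an illustration under stated assumptions, not the BillingService implementation: resolvePriceId and buildCheckoutParams are hypothetical helpers, the redirect paths are placeholders, and the parameter names in the returned object follow Stripe's Checkout Session create parameters.

```typescript
type UpgradeTarget = "pro" | "enterprise";

// Maps each upgradeable tier to the environment variable named in the rules above.
const PRICE_ENV_VAR: Record<UpgradeTarget, string> = {
  pro: "STRIPE_PRICE_ID_PRO",
  enterprise: "STRIPE_PRICE_ID_ENTERPRISE",
};

// Resolves the Stripe Price ID for a tier, failing fast when the
// variable is unset or empty instead of sending an empty ID to Stripe.
function resolvePriceId(
  env: Record<string, string | undefined>,
  tier: UpgradeTarget,
): string {
  const name = PRICE_ENV_VAR[tier];
  const value = env[name];
  if (!value) throw new Error(name + " is not configured");
  return value;
}

// Builds the parameters for a subscription-mode Checkout session.
// The success and cancel paths are placeholders (assumptions).
function buildCheckoutParams(priceId: string, tenantId: string, baseUrl: string) {
  return {
    mode: "subscription" as const,
    line_items: [{ price: priceId, quantity: 1 }],
    client_reference_id: tenantId, // lets the webhook handler identify the tenant
    success_url: baseUrl + "/billing/success",
    cancel_url: baseUrl + "/billing/cancel",
  };
}
```

Failing fast on a missing variable is a design choice of this sketch: a misconfigured deployment is reported at the call site rather than surfacing later as a STRIPE_ERROR.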

TierRateLimiter Middleware

This replaces the single RateLimiterRedis middleware for all authenticated routes. It reads the tenant's current tier, looks up the tier rate limit configuration, and enforces it using per-tenant Redis keys via rate-limiter-flexible.

Middleware behavior:

  1. Extract tenantId from the authenticated request context
  2. Look up tier from Redis cache key tier:{tenantId} (TTL: 60 seconds)
  3. On cache miss: query tenant_subscriptions for tenantId, cache result for 60s
  4. Look up rate limit configuration for the tier from the static tier config
  5. Apply rate-limiter-flexible with key rl:{tier}:{tenantId} and tier-specific limits
  6. On rate limit exceeded: return HTTP 429 with headers:
    • X-RateLimit-Limit: <limit>
    • X-RateLimit-Remaining: <remaining>
    • X-RateLimit-Reset: <unix timestamp>
    • Retry-After: <seconds>
  7. When a request is rejected, increment the agentidp_rate_limit_hits_total counter (labels: tier, tenant_id, endpoint)
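
The steps above can be sketched without Redis to show the control flow. Everything here is an in-memory stand-in: the production middleware would use Redis and rate-limiter-flexible, and the fixed one-minute window, the function names, and the loadTier callback are assumptions made for illustration. Burst handling and the metrics counter are omitted.

```typescript
type TierName = "free" | "pro" | "enterprise";

// Per-minute limits copied from the tier definitions in this spec.
const PER_MINUTE: Record<TierName, number> = { free: 60, pro: 600, enterprise: 6000 };
const TIER_TTL_MS = 60 * 1000; // tier cache TTL (60 seconds)
const WINDOW_MS = 60 * 1000;   // fixed window length (assumption)

interface Decision {
  allowed: boolean;
  status: 200 | 429;
  headers: Record<string, string>;
}

// In-memory stand-ins for the Redis tier cache and rate limit counters.
const tierCache: Record<string, { tier: TierName; expiresAt: number }> = {};
const rateWindows: Record<string, { count: number; resetAt: number }> = {};

// Steps 2-3: read the tier from the cache, falling back to the subscription lookup.
function lookupTier(tenantId: string, now: number, loadTier: (id: string) => TierName): TierName {
  const key = "tier:" + tenantId;
  const hit = tierCache[key];
  if (hit && hit.expiresAt > now) return hit.tier;
  const tier = loadTier(tenantId); // stands in for the tenant_subscriptions query
  tierCache[key] = { tier: tier, expiresAt: now + TIER_TTL_MS };
  return tier;
}

// Steps 4-6: enforce the tier's per-minute limit under the key rl:{tier}:{tenantId}.
function checkRateLimit(tenantId: string, now: number, loadTier: (id: string) => TierName): Decision {
  const tier = lookupTier(tenantId, now, loadTier);
  const limit = PER_MINUTE[tier];
  const key = "rl:" + tier + ":" + tenantId;
  let win = rateWindows[key];
  if (!win || win.resetAt <= now) {
    win = { count: 0, resetAt: now + WINDOW_MS };
    rateWindows[key] = win;
  }
  win.count += 1;
  if (win.count <= limit) return { allowed: true, status: 200, headers: {} };
  return {
    allowed: false,
    status: 429,
    headers: {
      "X-RateLimit-Limit": String(limit),
      "X-RateLimit-Remaining": "0",
      "X-RateLimit-Reset": String(Math.ceil(win.resetAt / 1000)), // unix seconds
      "Retry-After": String(Math.ceil((win.resetAt - now) / 1000)),
    },
  };
}
```

Because the tier is part of the rate limit key, a tenant that upgrades starts a fresh counter under the new tier's key as soon as the cached tier entry is refreshed.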

Unauthenticated routes: Continue to use the existing flat RateLimiterRedis with IP-based keys (unchanged from Phase 4).

Tier Configuration Object

Centralized in src/config/tiers.ts — this is the single source of truth for all tier limits and features. Both GET /tiers and TierRateLimiter read from this same object.

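import type { TierDefinition, TierName } from '../types/tiers';

// Single source of truth for tier limits and features.
// A null limit means unlimited; a null stripePriceId means the tier is not sold via Stripe.
export const TIER_CONFIG: Record<TierName, TierDefinition> = {
    free: {
        id: 'free',
        name: 'Free',
        price: { monthly: 0, currency: 'USD' },
        limits: {
            registeredAgents: 10,
            apiCallsPerDay: 1000,
            tokenIssuancesPerDay: 200,
            rateLimitPerMinute: 60,
            rateLimitBurst: 10,
            auditLogRetentionDays: 30,
        },
        features: {
            marketplace: true, githubActions: true, analytics: false, webhooks: false,
            sso: false, sla: false, customDomain: false, prioritySupport: false,
        },
        stripePriceId: null,
    },
    pro: {
        id: 'pro',
        name: 'Pro',
        price: { monthly: 49, currency: 'USD' },
        limits: {
            registeredAgents: 100,
            apiCallsPerDay: 50000,
            tokenIssuancesPerDay: 10000,
            rateLimitPerMinute: 600,
            rateLimitBurst: 100,
            auditLogRetentionDays: 90,
        },
        features: {
            marketplace: true, githubActions: true, analytics: true, webhooks: true,
            sso: false, sla: false, customDomain: false, prioritySupport: false,
        },
        // Stripe Price ID, read from STRIPE_PRICE_ID_PRO
        stripePriceId: process.env.STRIPE_PRICE_ID_PRO ?? '',
    },
    enterprise: {
        id: 'enterprise',
        name: 'Enterprise',
        price: { monthly: null, currency: 'USD', note: 'Contact sales' },
        limits: {
            registeredAgents: null,
            apiCallsPerDay: null,
            tokenIssuancesPerDay: null,
            rateLimitPerMinute: 6000,
            rateLimitBurst: 1000,
            auditLogRetentionDays: 365,
        },
        features: {
            marketplace: true, githubActions: true, analytics: true, webhooks: true,
            sso: true, sla: true, customDomain: true, prioritySupport: true,
        },
        // Stripe Price ID, read from STRIPE_PRICE_ID_ENTERPRISE
        stripePriceId: process.env.STRIPE_PRICE_ID_ENTERPRISE ?? '',
    },
};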

New Source Files

| File | Description |
|------|-------------|
| src/config/tiers.ts | Static tier configuration: single source of truth for limits and features |
| src/middleware/tierRateLimiter.ts | TierRateLimiter middleware: reads tenant tier, enforces tier-specific limits |
| src/routes/tiers.ts | Express router for GET /tiers |
| src/types/tiers.ts | TypeScript interfaces: TierDefinition, TierName, TierLimits, TierFeatures |

Modified Source Files

| File | Change |
|------|--------|
| src/middleware/rateLimiter.ts | Retain for unauthenticated routes; authenticated routes switch to tierRateLimiter |
| src/services/BillingService.ts | Add upgradeTier(tenantId, targetTier) method |
| src/controllers/BillingController.ts | Add handler for POST /billing/upgrade |
| src/routes/billing.ts | Register POST /billing/upgrade route |
| src/routes/index.ts | Register tiers router |
| .env.example | Add STRIPE_PRICE_ID_PRO, STRIPE_PRICE_ID_ENTERPRISE, TIER_RATE_LIMITING_ENABLED |
| docs/openapi.yaml | Add GET /tiers and POST /billing/upgrade endpoints |

Prometheus Metrics

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| agentidp_rate_limit_hits_total | Counter | tier, tenant_id, endpoint | Rate limit rejections per tier (replaces old flat counter) |
| agentidp_tier_cache_hits_total | Counter | (none) | Tier Redis cache hits |
| agentidp_tier_cache_misses_total | Counter | (none) | Tier Redis cache misses |
| agentidp_billing_upgrades_total | Counter | from_tier, to_tier | Self-service upgrade checkout sessions created |

Feature Flag

TIER_RATE_LIMITING_ENABLED (default: true). When false, the system uses the old flat RateLimiterRedis middleware — this is the rollback mechanism.
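
One way to read the flag with its default of true is shown below. The helper names are hypothetical, and treating only the string "false" (case-insensitive) as off is an assumption; the spec does not say how other values are parsed.

```typescript
// Returns true unless the flag is explicitly set to "false".
// An unset variable falls back to the documented default (true).
function isTierRateLimitingEnabled(env: Record<string, string | undefined>): boolean {
  const raw = env["TIER_RATE_LIMITING_ENABLED"];
  return raw === undefined || raw.trim().toLowerCase() !== "false";
}

// Chooses which limiter to mount on authenticated routes:
// "tier" for TierRateLimiter, "flat" for the Phase 4 RateLimiterRedis.
function selectLimiter(env: Record<string, string | undefined>): "tier" | "flat" {
  return isTierRateLimitingEnabled(env) ? "tier" : "flat";
}
```

With this parsing, rolling back means setting TIER_RATE_LIMITING_ENABLED=false and restarting the service so the flat limiter is mounted again.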

Acceptance Criteria

  • GET /tiers returns all three tier definitions matching TIER_CONFIG exactly, performs no database query, and is served with Cache-Control: public, max-age=3600
  • POST /billing/upgrade creates a Stripe Checkout session and returns checkoutUrl
  • POST /billing/upgrade returns HTTP 400 ALREADY_ON_TIER when tenant is already on the target tier
  • POST /billing/upgrade returns HTTP 400 DOWNGRADE_NOT_SUPPORTED when target tier is lower than current
  • TierRateLimiter enforces free tier limits (60 req/min) for free tenants
  • TierRateLimiter enforces pro tier limits (600 req/min) for pro tenants
  • Tier lookup is cached in Redis — second request does not query tenant_subscriptions
  • Rate limit response includes X-RateLimit-* headers and Retry-After
  • After a Stripe webhook updates tenant_subscriptions to pro, TierRateLimiter applies pro limits within 60 seconds (next cache refresh)
  • Unit tests cover: tier lookup (cached), tier lookup (miss), free limit enforcement, pro limit enforcement, upgrade (success), upgrade (already on tier), upgrade (downgrade rejected)