feat(phase-4): WS1 — Production Hardening (Redis rate limiting, DB pool, health endpoint, k6)
Rate limiting:
- Replace in-memory express-rate-limit with ioredis + rate-limiter-flexible (sliding window)
- Graceful fallback to RateLimiterMemory when Redis unreachable
- RATE_LIMIT_WINDOW_MS / RATE_LIMIT_MAX_REQUESTS env var config
- Retry-After header on 429 responses
- agentidp_rate_limit_hits_total Prometheus counter

Database pool:
- Explicit pg.Pool config via DB_POOL_MAX/MIN/IDLE_TIMEOUT_MS/CONNECTION_TIMEOUT_MS
- Defaults: max=20, min=2, idle=30s, conn timeout=5s
- agentidp_db_pool_active_connections + agentidp_db_pool_waiting_requests gauges

Health endpoint:
- GET /health/detailed: per-service status (database, Redis, Vault, OPA)
- healthy / degraded (>1000ms) / unreachable classification
- HTTP 200 (all healthy) / 207 (any degraded) / 503 (any unreachable)

Load tests:
- tests/load/ with k6 scenarios for agent registration (100 VUs), token issuance (1000 VUs), credential rotation (50 VUs)
- npm run load-test script

Tests: 586 passing, zero TypeScript errors

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
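
The health classification described in the message can be sketched roughly as follows. This is an illustration only, not the code from this commit: `ProbeResult`, `classify`, and `overallHttpStatus` are hypothetical names, and the rule that an unreachable service takes precedence over a degraded one is an assumption, since the message does not say which status wins when both occur.

```typescript
type ServiceStatus = "healthy" | "degraded" | "unreachable";

// Hypothetical shape of one dependency probe (database, Redis, Vault, OPA).
interface ProbeResult {
  ok: boolean;       // false when the dependency could not be reached
  latencyMs: number; // round-trip time of the probe in milliseconds
}

// Threshold taken from the commit message: degraded means slower than 1000 ms.
const DEGRADED_THRESHOLD_MS = 1000;

function classify(probe: ProbeResult): ServiceStatus {
  if (!probe.ok) return "unreachable";
  return probe.latencyMs > DEGRADED_THRESHOLD_MS ? "degraded" : "healthy";
}

// Assumption: "unreachable" outranks "degraded" when both are present.
function overallHttpStatus(statuses: ServiceStatus[]): 200 | 207 | 503 {
  if (statuses.some((s) => s === "unreachable")) return 503;
  if (statuses.some((s) => s === "degraded")) return 207;
  return 200;
}
```

Under these assumptions a probe that takes exactly 1000 ms still counts as healthy, because the message gives the degraded boundary as strictly greater than 1000 ms.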
@@ -116,3 +116,34 @@ export const auditChainIntegrity = new Gauge({
  help: 'Binary gauge: 1 = most recent audit chain verification passed, 0 = failed.',
  registers: [metricsRegistry],
});

/**
 * Total number of HTTP 429 responses returned by the rate limiter.
 * Labels: endpoint (req.path at time of rejection)
 */
export const rateLimitHitsTotal = new Counter({
  name: 'agentidp_rate_limit_hits_total',
  help: 'Total number of HTTP 429 responses returned by the rate limiter.',
  labelNames: ['endpoint'] as const,
  registers: [metricsRegistry],
});

/**
 * Current number of active (checked-out) PostgreSQL pool connections.
 * Updated on pool `acquire` and `remove` events.
 */
export const dbPoolActiveConnections = new Gauge({
  name: 'agentidp_db_pool_active_connections',
  help: 'Current number of active (checked-out) PostgreSQL pool connections.',
  registers: [metricsRegistry],
});

/**
 * Current number of waiting client requests in the PostgreSQL pool queue.
 * Updated whenever the pool queue length changes.
 */
export const dbPoolWaitingRequests = new Gauge({
  name: 'agentidp_db_pool_waiting_requests',
  help: 'Current number of requests waiting for a PostgreSQL connection.',
  registers: [metricsRegistry],
});
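
One way the two pool gauges above could be kept up to date is sketched below. It is not the wiring from this commit, which is not shown in this hunk. `PoolCounts` mirrors the `totalCount`, `idleCount`, and `waitingCount` properties that `pg.Pool` exposes, and `GaugeLike` mirrors the `set` method of a prom-client `Gauge`. Both interfaces, along with `syncPoolGauges` and `RecordingGauge`, are hypothetical stand-ins so the example runs without either library installed.

```typescript
// Structural stand-in for the counters a pg.Pool instance exposes.
interface PoolCounts {
  totalCount: number;   // all clients the pool currently holds
  idleCount: number;    // clients sitting idle in the pool
  waitingCount: number; // callers queued for a client
}

// Structural stand-in for a prom-client Gauge.
interface GaugeLike {
  set(value: number): void;
}

// Copies the pool's current counters into the two gauges.
// Active (checked-out) connections are assumed to be total minus idle.
function syncPoolGauges(pool: PoolCounts, active: GaugeLike, waiting: GaugeLike): void {
  active.set(pool.totalCount - pool.idleCount);
  waiting.set(pool.waitingCount);
}

// Hypothetical in-memory gauge used only to demonstrate the function.
class RecordingGauge implements GaugeLike {
  value = 0;
  set(value: number): void {
    this.value = value;
  }
}

const activeGauge = new RecordingGauge();
const waitingGauge = new RecordingGauge();
syncPoolGauges({ totalCount: 5, idleCount: 2, waitingCount: 1 }, activeGauge, waitingGauge);
```

In a real service the same function might be called from pool event listeners such as the `acquire` and `remove` events named in the doc comment, passing the actual pool along with `dbPoolActiveConnections` and `dbPoolWaitingRequests`.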