## WS3: Advanced Analytics Dashboard

### Purpose

Give paying tenants actionable visibility into their agent usage patterns. Analytics surface four dimensions: agent activity over time (heatmap), token issuance frequency and volume (trends), credential rotation frequency (rotation frequency table), and per-endpoint API call patterns (call patterns breakdown). Data is pre-aggregated nightly from the existing `usage_events` table into a new `analytics_daily_aggregates` table. Analytics are rendered in a new Analytics tab in the existing React web dashboard.

### New Endpoints

#### `GET /analytics/usage-summary`

**Summary:** Return a high-level usage summary for the authenticated tenant over a date range.

**Authentication:** Bearer token (tenant-scoped).

**Query Parameters:**

| Parameter | Type | Required | Default | Constraints |
|---|---|---|---|---|
| `from` | string (YYYY-MM-DD) | no | 30 days ago | Must be <= `to` |
| `to` | string (YYYY-MM-DD) | no | today | Must be <= today; max range: 365 days |

**Response 200** (`application/json`):

```json
{
  "tenantId": "string (UUID)",
  "period": {
    "from": "string (YYYY-MM-DD)",
    "to": "string (YYYY-MM-DD)"
  },
  "summary": {
    "totalApiCalls": 84320,
    "totalTokenIssuances": 12400,
    "totalCredentialRotations": 48,
    "activeAgentCount": 23,
    "averageDailyApiCalls": 2810,
    "peakDailyApiCalls": 5102,
    "peakDate": "2026-03-28"
  }
}
```

**Error Responses:**

| Status | Code | Description |
|---|---|---|
| 400 | `INVALID_DATE_RANGE` | `from` > `to`, or date range exceeds 365 days |
| 401 | `UNAUTHORIZED` | Missing or invalid Bearer token |
| 403 | `ANALYTICS_NOT_AVAILABLE` | Tenant is on free tier; analytics require Pro or Enterprise |
| 429 | `RATE_LIMITED` | Rate limit exceeded |

---

#### `GET /analytics/agent-activity`

**Summary:** Return per-agent daily activity counts for heatmap rendering.

**Authentication:** Bearer token (tenant-scoped).

**Query Parameters:**

| Parameter | Type | Required | Default | Constraints |
|---|---|---|---|---|
| `from` | string (YYYY-MM-DD) | no | 30 days ago | Must be <= `to` |
| `to` | string (YYYY-MM-DD) | no | today | Max range: 90 days |
| `agentId` | string (UUID) | no | (all agents) | Filter to a single agent |

**Response 200** (`application/json`):

```json
{
  "tenantId": "string (UUID)",
  "period": {
    "from": "string (YYYY-MM-DD)",
    "to": "string (YYYY-MM-DD)"
  },
  "agents": [
    {
      "agentId": "string (UUID)",
      "agentName": "string",
      "dailyActivity": [
        {
          "date": "2026-03-01",
          "apiCalls": 342,
          "tokenIssuances": 12,
          "credentialRotations": 0
        }
      ]
    }
  ]
}
```

**Error Responses:**

| Status | Code | Description |
|---|---|---|
| 400 | `INVALID_DATE_RANGE` | `from` > `to`, or date range exceeds 90 days |
| 401 | `UNAUTHORIZED` | Missing or invalid Bearer token |
| 403 | `ANALYTICS_NOT_AVAILABLE` | Free tier: requires Pro or Enterprise |
| 404 | `AGENT_NOT_FOUND` | `agentId` filter specified but agent does not belong to tenant |
| 429 | `RATE_LIMITED` | Rate limit exceeded |

---

#### `GET /analytics/token-trends`

**Summary:** Return daily token issuance counts and success/failure breakdown for trend charts.

**Authentication:** Bearer token (tenant-scoped).

**Query Parameters:**

| Parameter | Type | Required | Default | Constraints |
|---|---|---|---|---|
| `from` | string (YYYY-MM-DD) | no | 30 days ago | Must be <= `to` |
| `to` | string (YYYY-MM-DD) | no | today | Max range: 365 days |
| `granularity` | string | no | `day` | Enum: `day`, `week` |

**Response 200** (`application/json`):

```json
{
  "tenantId": "string (UUID)",
  "period": {
    "from": "string (YYYY-MM-DD)",
    "to": "string (YYYY-MM-DD)"
  },
  "granularity": "day",
  "dataPoints": [
    {
      "date": "2026-03-01",
      "totalIssuances": 420,
      "successfulIssuances": 415,
      "failedIssuances": 5,
      "uniqueAgents": 8
    }
  ]
}
```

**Error Responses:**

| Status | Code | Description |
|---|---|---|
| 400 | `INVALID_DATE_RANGE` | `from` > `to`, or date range exceeds 365 days |
| 400 | `INVALID_GRANULARITY` | `granularity` is not `day` or `week` |
| 401 | `UNAUTHORIZED` | Missing or invalid Bearer token |
| 403 | `ANALYTICS_NOT_AVAILABLE` | Free tier: requires Pro or Enterprise |
| 429 | `RATE_LIMITED` | Rate limit exceeded |

---

### Database Schema Changes

#### Migration: `009_add_analytics_aggregates.sql`

```sql
CREATE TABLE analytics_daily_aggregates (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  tenant_id UUID NOT NULL REFERENCES tenants(id) ON DELETE CASCADE,
  agent_id UUID REFERENCES agents(id) ON DELETE SET NULL, -- NULL = tenant-wide aggregate
  date DATE NOT NULL,
  metric_type VARCHAR(64) NOT NULL, -- 'api_calls' | 'token_issuances' | 'credential_rotations' | 'token_failures'
  count BIGINT NOT NULL DEFAULT 0,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  -- Caveat: a plain UNIQUE constraint treats NULLs as distinct, so rows with
  -- agent_id IS NULL never conflict with each other. PostgreSQL 15+ supports
  -- UNIQUE NULLS NOT DISTINCT if tenant-wide rows must also be deduplicated.
  CONSTRAINT uq_daily_aggregate UNIQUE (tenant_id, agent_id, date, metric_type)
);

-- Index for analytics queries (tenant + date range)
CREATE INDEX idx_analytics_tenant_date ON analytics_daily_aggregates(tenant_id, date);
CREATE INDEX idx_analytics_agent_date ON analytics_daily_aggregates(agent_id, date) WHERE agent_id IS NOT NULL;
```

#### Nightly Aggregation Job

A `node-cron` job runs at
`00:05 UTC` daily inside the Express API process. It executes an upsert query aggregating the previous day's `usage_events` rows into `analytics_daily_aggregates`. The job is idempotent: running it twice for the same date produces no duplicates (upsert on the unique constraint).

Job logic (pseudocode):

```
1. Compute target_date = yesterday (UTC)
2. SELECT tenant_id, agent_id, metric_type, SUM(count)
   FROM usage_events
   WHERE date = target_date
   GROUP BY tenant_id, agent_id, metric_type
3. UPSERT INTO analytics_daily_aggregates
   ON CONFLICT (tenant_id, agent_id, date, metric_type)
   DO UPDATE SET count = EXCLUDED.count, updated_at = NOW()
```

### New Source Files

| File | Description |
|---|---|
| `src/services/AnalyticsService.ts` | Business logic: query aggregates, build response shapes, Redis caching |
| `src/controllers/AnalyticsController.ts` | HTTP handlers for analytics endpoints |
| `src/routes/analytics.ts` | Express router for `/analytics/` prefix |
| `src/jobs/analyticsAggregation.ts` | `node-cron` job that aggregates `usage_events` nightly |
| `src/types/analytics.ts` | TypeScript interfaces: `UsageSummary`, `AgentActivityResponse`, `TokenTrendsResponse`, `DailyAggregate` |
| `dashboard/src/pages/Analytics.tsx` | New Analytics tab in existing React dashboard |
| `dashboard/src/components/charts/AgentHeatmap.tsx` | Heatmap component using `recharts` `ResponsiveContainer` + custom cells |
| `dashboard/src/components/charts/TokenTrendsChart.tsx` | Line chart of token issuance over time using `recharts` `LineChart` |
| `dashboard/src/components/charts/RotationFrequencyTable.tsx` | Sortable table of credential rotation counts per agent |
| `dashboard/src/api/analyticsApi.ts` | Typed fetch functions for analytics endpoints |

### Modified Source Files

| File | Change |
|---|---|
| `src/app.ts` | Register `analytics` router; start nightly aggregation cron job |
| `src/infrastructure/migrations/` | Add `009_add_analytics_aggregates.sql` |
| `dashboard/src/App.tsx` | Add Analytics route and nav link |
| `package.json` (API) | Add `node-cron` dependency |
| `package.json` (dashboard) | Add `recharts`, `date-fns` dependencies |
| `docs/openapi.yaml` | Add analytics endpoints |

### Redis Caching

Analytics responses are cached in Redis under keys of the form `analytics:{tenantId}:{endpoint}:{queryHash}`. TTL: 5 minutes for agent-activity and token-trends; 60 seconds for usage-summary. There is no explicit invalidation: entries expire with their TTL, and the first request after expiry repopulates the cache.

### `AnalyticsService` Interface

```typescript
interface IAnalyticsService {
  /**
   * Return a high-level usage summary for a tenant over a date range.
   */
  getUsageSummary(tenantId: string, from: Date, to: Date): Promise<UsageSummary>;

  /**
   * Return per-agent daily activity data for heatmap rendering.
   */
  getAgentActivity(
    tenantId: string,
    from: Date,
    to: Date,
    agentId?: string
  ): Promise<AgentActivityResponse>;

  /**
   * Return daily token issuance trends with success/failure breakdown.
   */
  getTokenTrends(
    tenantId: string,
    from: Date,
    to: Date,
    granularity: 'day' | 'week'
  ): Promise<TokenTrendsResponse>;
}
```

### Prometheus Metrics

| Metric | Type | Labels | Description |
|---|---|---|---|
| `agentidp_analytics_query_duration_ms` | Histogram | `endpoint` | Analytics query latency (before cache) |
| `agentidp_analytics_cache_hits_total` | Counter | `endpoint` | Analytics Redis cache hits |
| `agentidp_analytics_cache_misses_total` | Counter | `endpoint` | Analytics Redis cache misses |
| `agentidp_analytics_aggregation_job_duration_ms` | Gauge | (none) | Nightly aggregation job runtime |
| `agentidp_analytics_aggregation_job_last_run` | Gauge | (none) | Unix timestamp of last successful aggregation job run |

### Feature Flags

| Variable | Default | Description |
|---|---|---|
| `ANALYTICS_ENABLED` | `true` | When `false`, all `/analytics/` routes return HTTP 404 |
| `ANALYTICS_FREE_TIER` | `false` | When `true`, free-tier tenants can access analytics (for beta/testing) |

### Acceptance Criteria

- `GET /analytics/usage-summary` returns correct aggregate
counts for a date range
- `GET /analytics/agent-activity` returns per-agent daily rows matching `analytics_daily_aggregates`
- `GET /analytics/token-trends` returns daily and weekly granularity correctly
- All three endpoints return HTTP 403 for free-tier tenants (when `ANALYTICS_FREE_TIER=false`)
- Date range validation rejects `from > to` with HTTP 400
- Nightly aggregation job runs idempotently: running it twice for the same date produces no duplicates
- Analytics responses are cached in Redis: a second identical request within the TTL does not hit the DB
- Dashboard Analytics tab renders heatmap, trend chart, and rotation table with mock data in Storybook
- Unit test coverage >= 80% on `AnalyticsService`
- Integration tests cover: summary, activity, trends (daily), trends (weekly), free-tier rejection, invalid date range
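
The date range rules shared by the three endpoints (`from` <= `to`, `to` <= today, a per-endpoint maximum range, and a default `from` of 30 days ago) might be centralized in one helper. The following is a minimal sketch, not the specified implementation: `resolveDateRange`, `parseUtcDate`, `invalid`, and `DateRangeResult` are hypothetical names. It also assumes that all dates are interpreted in UTC, that the range length is the difference in days between `from` and `to`, and that malformed or future dates map to `INVALID_DATE_RANGE`; the spec does not state any of these.

```typescript
// Illustrative sketch only: every name in this block is hypothetical.
type DateRangeResult =
  | { ok: true; from: string; to: string }
  | { ok: false; status: 400; code: "INVALID_DATE_RANGE"; message: string };

const DAY_MS = 24 * 60 * 60 * 1000;

// Build the 400 error shape used for every rejected range.
function invalid(message: string): DateRangeResult {
  return { ok: false, status: 400, code: "INVALID_DATE_RANGE", message };
}

// Parse a strict YYYY-MM-DD string as UTC midnight (ms since epoch), or null.
function parseUtcDate(value: string): number | null {
  if (!/^\d{4}-\d{2}-\d{2}$/.test(value)) return null;
  const ms = Date.parse(`${value}T00:00:00Z`);
  if (Number.isNaN(ms)) return null;
  // Round-trip check rejects dates the engine would normalize, e.g. 2026-02-30.
  return new Date(ms).toISOString().slice(0, 10) === value ? ms : null;
}

// Apply defaults (to = today, from = 30 days ago) and validate the range.
// maxRangeDays is 365 for usage-summary and token-trends, 90 for agent-activity.
function resolveDateRange(
  from: string | undefined,
  to: string | undefined,
  maxRangeDays: number,
  now: Date = new Date()
): DateRangeResult {
  const todayMs = Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate());
  const toMs = to === undefined ? todayMs : parseUtcDate(to);
  const fromMs = from === undefined ? todayMs - 30 * DAY_MS : parseUtcDate(from);

  if (toMs === null || fromMs === null) {
    return invalid("dates must be valid YYYY-MM-DD values");
  }
  if (toMs > todayMs) return invalid("`to` must not be after today");
  if (fromMs > toMs) return invalid("`from` must not be after `to`");
  if ((toMs - fromMs) / DAY_MS > maxRangeDays) {
    return invalid(`date range exceeds ${maxRangeDays} days`);
  }

  const fmt = (ms: number): string => new Date(ms).toISOString().slice(0, 10);
  return { ok: true, from: fmt(fromMs), to: fmt(toMs) };
}
```

A controller could call `resolveDateRange(req.query.from, req.query.to, 90)` for agent-activity and return the error object as the 400 body when `ok` is `false`. Passing `now` explicitly keeps the helper deterministic in unit tests.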