6 workstreams, 119 tasks — Scale & Ecosystem: - WS1: Rust SDK - WS2: Agent-to-Agent (A2A) Authorization - WS3: Advanced Analytics Dashboard - WS4: Public API Gateway & Rate Limiting SaaS - WS5: Developer Experience (DX) improvements - WS6: AGNTCY Compliance Certification Package Awaiting CEO approval to begin implementation. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
## WS3: Advanced Analytics Dashboard

### Purpose

Give paying tenants actionable visibility into how their agents are used. The analytics cover four dimensions: agent activity over time (heatmap), token issuance frequency and volume (trend charts), credential rotation frequency (sortable table), and per-endpoint API call patterns (breakdown). A nightly job pre-aggregates data from the existing `usage_events` table into a new `analytics_daily_aggregates` table, and the results are rendered in a new Analytics tab in the existing React web dashboard.

### New Endpoints

#### `GET /analytics/usage-summary`

**Summary:** Return a high-level usage summary for the authenticated tenant over a date range.

**Authentication:** Bearer token (tenant-scoped).

**Query Parameters:**

| Parameter | Type | Required | Default | Constraints |
|---|---|---|---|---|
| `from` | string (YYYY-MM-DD) | no | 30 days ago | Must be <= `to` |
| `to` | string (YYYY-MM-DD) | no | today | Must be <= today; max range: 365 days |

**Response 200** (`application/json`):
```json
{
  "tenantId": "string (UUID)",
  "period": {
    "from": "string (YYYY-MM-DD)",
    "to": "string (YYYY-MM-DD)"
  },
  "summary": {
    "totalApiCalls": 84320,
    "totalTokenIssuances": 12400,
    "totalCredentialRotations": 48,
    "activeAgentCount": 23,
    "averageDailyApiCalls": 2810,
    "peakDailyApiCalls": 5102,
    "peakDate": "2026-03-28"
  }
}
```
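
The API-call fields in `summary` can be derived from per-day totals. The following is a minimal sketch, not the service's actual implementation: the `DailyTotal` row shape and the function names are hypothetical, and it assumes the average is taken over every calendar day in the requested range and rounded down, which the specification does not state.

```typescript
// Hypothetical row shape: one tenant-wide API-call total per day.
interface DailyTotal {
  date: string; // YYYY-MM-DD
  apiCalls: number;
}

interface ApiCallSummary {
  totalApiCalls: number;
  averageDailyApiCalls: number;
  peakDailyApiCalls: number;
  peakDate: string | null; // null when the range has no activity
}

// Inclusive number of calendar days between two YYYY-MM-DD dates (parsed as UTC).
function daysInRange(from: string, to: string): number {
  const msPerDay = 24 * 60 * 60 * 1000;
  return Math.round((Date.parse(to) - Date.parse(from)) / msPerDay) + 1;
}

function summarizeApiCalls(rows: DailyTotal[], from: string, to: string): ApiCallSummary {
  let total = 0;
  let peak = 0;
  let peakDate: string | null = null;
  for (const row of rows) {
    total += row.apiCalls;
    if (row.apiCalls > peak) {
      peak = row.apiCalls;
      peakDate = row.date;
    }
  }
  // Assumption: average over every day in the range (not only active days), rounded down.
  const average = Math.floor(total / daysInRange(from, to));
  return {
    totalApiCalls: total,
    averageDailyApiCalls: average,
    peakDailyApiCalls: peak,
    peakDate,
  };
}
```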

**Error Responses:**

| Status | Code | Description |
|---|---|---|
| 400 | `INVALID_DATE_RANGE` | `from` > `to`, or date range exceeds 365 days |
| 401 | `UNAUTHORIZED` | Missing or invalid Bearer token |
| 403 | `ANALYTICS_NOT_AVAILABLE` | Tenant is on the free tier; analytics require Pro or Enterprise |
| 429 | `RATE_LIMITED` | Rate limit exceeded |

---

#### `GET /analytics/agent-activity`

**Summary:** Return per-agent daily activity counts for heatmap rendering.

**Authentication:** Bearer token (tenant-scoped).

**Query Parameters:**

| Parameter | Type | Required | Default | Constraints |
|---|---|---|---|---|
| `from` | string (YYYY-MM-DD) | no | 30 days ago | Must be <= `to` |
| `to` | string (YYYY-MM-DD) | no | today | Max range: 90 days |
| `agentId` | string (UUID) | no | (all agents) | Filter to a single agent |

**Response 200** (`application/json`):
```json
{
  "tenantId": "string (UUID)",
  "period": {
    "from": "string (YYYY-MM-DD)",
    "to": "string (YYYY-MM-DD)"
  },
  "agents": [
    {
      "agentId": "string (UUID)",
      "agentName": "string",
      "dailyActivity": [
        {
          "date": "2026-03-01",
          "apiCalls": 342,
          "tokenIssuances": 12,
          "credentialRotations": 0
        }
      ]
    }
  ]
}
```
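
Because `analytics_daily_aggregates` stores one row per agent, date, and metric, the service has to pivot those rows into the nested `agents[].dailyActivity[]` shape. A minimal sketch of that pivot follows; the `AggregateRow` type and the function name are hypothetical, and it assumes the agent name has already been joined onto each row.

```typescript
// Hypothetical shape of one per-agent row read from `analytics_daily_aggregates`.
interface AggregateRow {
  agentId: string;
  agentName: string;
  date: string; // YYYY-MM-DD
  metricType: 'api_calls' | 'token_issuances' | 'credential_rotations' | 'token_failures';
  count: number;
}

interface DailyActivity {
  date: string;
  apiCalls: number;
  tokenIssuances: number;
  credentialRotations: number;
}

interface AgentActivity {
  agentId: string;
  agentName: string;
  dailyActivity: DailyActivity[];
}

function pivotAgentActivity(rows: AggregateRow[]): AgentActivity[] {
  const agents = new Map<string, { agentName: string; days: Map<string, DailyActivity> }>();
  for (const row of rows) {
    let agent = agents.get(row.agentId);
    if (!agent) {
      agent = { agentName: row.agentName, days: new Map<string, DailyActivity>() };
      agents.set(row.agentId, agent);
    }
    let day = agent.days.get(row.date);
    if (!day) {
      day = { date: row.date, apiCalls: 0, tokenIssuances: 0, credentialRotations: 0 };
      agent.days.set(row.date, day);
    }
    if (row.metricType === 'api_calls') day.apiCalls += row.count;
    else if (row.metricType === 'token_issuances') day.tokenIssuances += row.count;
    else if (row.metricType === 'credential_rotations') day.credentialRotations += row.count;
    // 'token_failures' is not part of this response shape and is skipped.
  }
  // Emit agents in first-seen order, with each agent's days sorted by date.
  return Array.from(agents.entries()).map(([agentId, agent]) => ({
    agentId,
    agentName: agent.agentName,
    dailyActivity: Array.from(agent.days.values()).sort((a, b) => a.date.localeCompare(b.date)),
  }));
}
```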

**Error Responses:**

| Status | Code | Description |
|---|---|---|
| 400 | `INVALID_DATE_RANGE` | `from` > `to`, or date range exceeds 90 days |
| 401 | `UNAUTHORIZED` | Missing or invalid Bearer token |
| 403 | `ANALYTICS_NOT_AVAILABLE` | Free tier; requires Pro or Enterprise |
| 404 | `AGENT_NOT_FOUND` | `agentId` filter specified but agent does not belong to tenant |
| 429 | `RATE_LIMITED` | Rate limit exceeded |

---

#### `GET /analytics/token-trends`

**Summary:** Return daily token issuance counts and success/failure breakdown for trend charts.

**Authentication:** Bearer token (tenant-scoped).

**Query Parameters:**

| Parameter | Type | Required | Default | Constraints |
|---|---|---|---|---|
| `from` | string (YYYY-MM-DD) | no | 30 days ago | Must be <= `to` |
| `to` | string (YYYY-MM-DD) | no | today | Max range: 365 days |
| `granularity` | string | no | `day` | Enum: `day`, `week` |

**Response 200** (`application/json`):
```json
{
  "tenantId": "string (UUID)",
  "period": {
    "from": "string (YYYY-MM-DD)",
    "to": "string (YYYY-MM-DD)"
  },
  "granularity": "day",
  "dataPoints": [
    {
      "date": "2026-03-01",
      "totalIssuances": 420,
      "successfulIssuances": 415,
      "failedIssuances": 5,
      "uniqueAgents": 8
    }
  ]
}
```
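
For `granularity=week`, daily points have to be rolled up into weekly buckets. The sketch below shows one way to do that, under assumptions the specification does not confirm: weeks start on Monday (UTC), and each weekly point is labelled with that Monday's date. The type and function names are hypothetical. `uniqueAgents` is deliberately left out, because distinct counts are not additive across days and would need a separate distinct-count query.

```typescript
interface TokenDataPoint {
  date: string; // YYYY-MM-DD; for weekly points, the Monday starting the week (assumption)
  totalIssuances: number;
  successfulIssuances: number;
  failedIssuances: number;
}

// Monday (UTC) of the week containing the given YYYY-MM-DD date.
function weekStart(date: string): string {
  const d = new Date(`${date}T00:00:00Z`);
  const daysSinceMonday = (d.getUTCDay() + 6) % 7; // getUTCDay(): 0 = Sunday
  d.setUTCDate(d.getUTCDate() - daysSinceMonday);
  return d.toISOString().slice(0, 10);
}

// Sum daily counts into one data point per week, sorted by week start.
function rollUpWeekly(daily: TokenDataPoint[]): TokenDataPoint[] {
  const weeks = new Map<string, TokenDataPoint>();
  for (const point of daily) {
    const key = weekStart(point.date);
    const week = weeks.get(key) ?? {
      date: key,
      totalIssuances: 0,
      successfulIssuances: 0,
      failedIssuances: 0,
    };
    week.totalIssuances += point.totalIssuances;
    week.successfulIssuances += point.successfulIssuances;
    week.failedIssuances += point.failedIssuances;
    weeks.set(key, week);
  }
  return Array.from(weeks.values()).sort((a, b) => a.date.localeCompare(b.date));
}
```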

**Error Responses:**

| Status | Code | Description |
|---|---|---|
| 400 | `INVALID_DATE_RANGE` | `from` > `to`, or date range exceeds 365 days |
| 400 | `INVALID_GRANULARITY` | `granularity` is not `day` or `week` |
| 401 | `UNAUTHORIZED` | Missing or invalid Bearer token |
| 403 | `ANALYTICS_NOT_AVAILABLE` | Free tier; requires Pro or Enterprise |
| 429 | `RATE_LIMITED` | Rate limit exceeded |

---

### Database Schema Changes

#### Migration: `009_add_analytics_aggregates.sql`

```sql
CREATE TABLE analytics_daily_aggregates (
  id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  tenant_id   UUID NOT NULL REFERENCES tenants(id) ON DELETE CASCADE,
  agent_id    UUID REFERENCES agents(id) ON DELETE SET NULL, -- NULL = tenant-wide aggregate
  date        DATE NOT NULL,
  metric_type VARCHAR(64) NOT NULL, -- 'api_calls' | 'token_issuances' | 'credential_rotations' | 'token_failures'
  count       BIGINT NOT NULL DEFAULT 0,
  created_at  TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  updated_at  TIMESTAMPTZ NOT NULL DEFAULT NOW(),

  -- Note: in PostgreSQL, NULLs are distinct in UNIQUE constraints by default,
  -- so rows with agent_id IS NULL are not deduplicated by this constraint.
  CONSTRAINT uq_daily_aggregate UNIQUE (tenant_id, agent_id, date, metric_type)
);

-- Indexes for analytics queries (tenant + date range, agent + date range)
CREATE INDEX idx_analytics_tenant_date ON analytics_daily_aggregates(tenant_id, date);
CREATE INDEX idx_analytics_agent_date ON analytics_daily_aggregates(agent_id, date) WHERE agent_id IS NOT NULL;
```

#### Nightly Aggregation Job

A `node-cron` job runs daily at `00:05 UTC` inside the Express API process. It executes an upsert query that aggregates the previous day's `usage_events` rows into `analytics_daily_aggregates`. The job is idempotent: running it twice for the same date produces no duplicates, because the upsert targets the unique constraint.

Job logic, sketched as a single upsert (assumes `usage_events` exposes `date`, `metric_type`, and `count` columns, as the query reads them):
```sql
-- $1 = target_date: yesterday's date in UTC, computed by the job
INSERT INTO analytics_daily_aggregates (tenant_id, agent_id, date, metric_type, count)
SELECT tenant_id, agent_id, date, metric_type, SUM(count)
FROM usage_events
WHERE date = $1
GROUP BY tenant_id, agent_id, date, metric_type
ON CONFLICT (tenant_id, agent_id, date, metric_type)
DO UPDATE SET count = EXCLUDED.count, updated_at = NOW();
```
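
The job's first step, computing the target date, and the call into the database can be sketched in TypeScript as follows. This is an illustration under assumptions rather than the planned `src/jobs/analyticsAggregation.ts`: the `Queryable` interface and both function names are hypothetical, and the upsert statement is passed in as a string so the sketch stays independent of the exact SQL.

```typescript
// Hypothetical minimal database interface; for example, a `pg` Pool has a compatible `query` method.
interface Queryable {
  query(sql: string, params: unknown[]): Promise<unknown>;
}

// Yesterday's date in UTC, relative to `now`, formatted as YYYY-MM-DD.
function targetDateUtc(now: Date): string {
  // Date.UTC normalises day 0 to the last day of the previous month.
  const d = new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate() - 1));
  return d.toISOString().slice(0, 10);
}

// Run the aggregation upsert for the day before `now` and return the date processed.
async function runAggregation(
  db: Queryable,
  upsertSql: string,
  now: Date = new Date()
): Promise<string> {
  const targetDate = targetDateUtc(now);
  await db.query(upsertSql, [targetDate]);
  return targetDate;
}
```

With `node-cron`, a function like this could be registered with something along the lines of `cron.schedule('5 0 * * *', () => runAggregation(db, sql), { timezone: 'UTC' })`. Accepting `now` as a parameter also makes it possible to re-run the job for a missed day.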

### New Source Files

| File | Description |
|---|---|
| `src/services/AnalyticsService.ts` | Business logic: query aggregates, build response shapes, Redis caching |
| `src/controllers/AnalyticsController.ts` | HTTP handlers for analytics endpoints |
| `src/routes/analytics.ts` | Express router for `/analytics/` prefix |
| `src/jobs/analyticsAggregation.ts` | `node-cron` job that aggregates `usage_events` nightly |
| `src/types/analytics.ts` | TypeScript interfaces: `UsageSummary`, `AgentActivityResponse`, `TokenTrendsResponse`, `DailyAggregate` |
| `dashboard/src/pages/Analytics.tsx` | New Analytics tab in existing React dashboard |
| `dashboard/src/components/charts/AgentHeatmap.tsx` | Heatmap component using `recharts` `ResponsiveContainer` + custom cells |
| `dashboard/src/components/charts/TokenTrendsChart.tsx` | Line chart of token issuance over time using `recharts` `LineChart` |
| `dashboard/src/components/charts/RotationFrequencyTable.tsx` | Sortable table of credential rotation counts per agent |
| `dashboard/src/api/analyticsApi.ts` | Typed fetch functions for analytics endpoints |

### Modified Source Files

| File | Change |
|---|---|
| `src/app.ts` | Register `analytics` router; start nightly aggregation cron job |
| `src/infrastructure/migrations/` | Add `009_add_analytics_aggregates.sql` |
| `dashboard/src/App.tsx` | Add Analytics route and nav link |
| `package.json` (API) | Add `node-cron` dependency |
| `package.json` (dashboard) | Add `recharts`, `date-fns` dependencies |
| `docs/openapi.yaml` | Add analytics endpoints |

### Redis Caching

Analytics responses are cached in Redis under keys of the form `analytics:{tenantId}:{endpoint}:{queryHash}`. The TTL is 5 minutes for agent-activity and token-trends and 60 seconds for usage-summary. There is no explicit invalidation: an entry expires with its TTL, and the next request after expiry repopulates it.
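
The key scheme and TTLs above can be illustrated with a small cache-aside helper. This is a sketch under assumptions: the specification does not say how `queryHash` is computed, so an inline FNV-1a hash over the sorted query parameters stands in for it, and the `Cache` interface and all function names are hypothetical.

```typescript
type Endpoint = 'usage-summary' | 'agent-activity' | 'token-trends';

// TTLs in seconds, as stated in the caching description.
const TTL_SECONDS: Record<Endpoint, number> = {
  'usage-summary': 60,
  'agent-activity': 300,
  'token-trends': 300,
};

// FNV-1a (32-bit) over UTF-16 code units; chosen only to keep the sketch dependency-free.
function fnv1a(input: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return ('0000000' + hash.toString(16)).slice(-8);
}

// Sort parameters so that differently ordered but identical queries share one entry.
function cacheKey(tenantId: string, endpoint: Endpoint, query: Record<string, string>): string {
  const canonical = Object.keys(query)
    .sort()
    .map((k) => `${k}=${query[k]}`)
    .join('&');
  return `analytics:${tenantId}:${endpoint}:${fnv1a(canonical)}`;
}

// Hypothetical cache interface; a thin wrapper around a Redis client could implement it.
interface Cache {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
}

// Cache-aside: return the cached value if present, otherwise load, store with TTL, and return.
async function cached<T>(
  cache: Cache,
  tenantId: string,
  endpoint: Endpoint,
  query: Record<string, string>,
  load: () => Promise<T>
): Promise<T> {
  const key = cacheKey(tenantId, endpoint, query);
  const hit = await cache.get(key);
  if (hit !== null) return JSON.parse(hit) as T;
  const fresh = await load();
  await cache.set(key, JSON.stringify(fresh), TTL_SECONDS[endpoint]);
  return fresh;
}
```

Including `tenantId` in the key keeps one tenant's cached response from ever being served to another, even when the query parameters are identical.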

### `AnalyticsService` Interface

```typescript
interface IAnalyticsService {
  /**
   * Return a high-level usage summary for a tenant over a date range.
   */
  getUsageSummary(tenantId: string, from: Date, to: Date): Promise<UsageSummary>;

  /**
   * Return per-agent daily activity data for heatmap rendering.
   */
  getAgentActivity(
    tenantId: string,
    from: Date,
    to: Date,
    agentId?: string
  ): Promise<AgentActivityResponse>;

  /**
   * Return daily token issuance trends with success/failure breakdown.
   */
  getTokenTrends(
    tenantId: string,
    from: Date,
    to: Date,
    granularity: 'day' | 'week'
  ): Promise<TokenTrendsResponse>;
}
```
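
Before calling these methods, a controller has to turn the `from` and `to` query strings into `Date` values and enforce the per-endpoint range limits. A minimal sketch of that step follows. Several choices here are assumptions not settled by the specification: malformed dates are mapped to `INVALID_DATE_RANGE`, the `to <= today` rule (stated only for usage-summary) is applied uniformly, and the range length is measured as `to - from` in days. All names are hypothetical.

```typescript
type RangeResult =
  | { ok: true; from: Date; to: Date }
  | { ok: false; status: 400; code: 'INVALID_DATE_RANGE' };

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Parse a strict YYYY-MM-DD string as UTC midnight, or return null.
function parseDay(value: string): Date | null {
  if (!/^\d{4}-\d{2}-\d{2}$/.test(value)) return null;
  const d = new Date(`${value}T00:00:00Z`);
  if (Number.isNaN(d.getTime())) return null;
  // Reject impossible dates such as 2026-02-30 if the engine rolled them over.
  return d.toISOString().slice(0, 10) === value ? d : null;
}

function resolveRange(
  fromRaw: string | undefined,
  toRaw: string | undefined,
  maxDays: number, // 365 or 90, depending on the endpoint
  now: Date = new Date()
): RangeResult {
  const invalid: RangeResult = { ok: false, status: 400, code: 'INVALID_DATE_RANGE' };
  const today = new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate()));

  // Defaults from the parameter tables: `to` = today, `from` = 30 days ago.
  const to = toRaw === undefined ? today : parseDay(toRaw);
  if (to === null) return invalid;
  const from =
    fromRaw === undefined ? new Date(today.getTime() - 30 * MS_PER_DAY) : parseDay(fromRaw);
  if (from === null) return invalid;

  const spanDays = (to.getTime() - from.getTime()) / MS_PER_DAY;
  if (spanDays < 0 || to.getTime() > today.getTime() || spanDays > maxDays) return invalid;
  return { ok: true, from, to };
}
```

Taking `now` as a parameter keeps the function deterministic in tests, which matters for the "invalid date range" integration case.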

### Prometheus Metrics

| Metric | Type | Labels | Description |
|---|---|---|---|
| `agentidp_analytics_query_duration_ms` | Histogram | `endpoint` | Analytics query latency (before cache) |
| `agentidp_analytics_cache_hits_total` | Counter | `endpoint` | Analytics Redis cache hits |
| `agentidp_analytics_cache_misses_total` | Counter | `endpoint` | Analytics Redis cache misses |
| `agentidp_analytics_aggregation_job_duration_ms` | Gauge | (none) | Nightly aggregation job runtime |
| `agentidp_analytics_aggregation_job_last_run` | Gauge | (none) | Unix timestamp of last successful aggregation job run |

### Feature Flags

| Variable | Default | Description |
|---|---|---|
| `ANALYTICS_ENABLED` | `true` | When `false`, all `/analytics/` routes return HTTP 404 |
| `ANALYTICS_FREE_TIER` | `false` | When `true`, free-tier tenants can access analytics (for beta/testing) |
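
The two flags interact with the tier check, and the order in which they are evaluated determines whether a caller sees a 404 or a 403. The sketch below shows one plausible ordering, with the global switch checked first. That precedence, the tier identifiers, and all function names are assumptions for illustration.

```typescript
type Tier = 'free' | 'pro' | 'enterprise';

interface AnalyticsFlags {
  analyticsEnabled: boolean; // ANALYTICS_ENABLED
  analyticsFreeTier: boolean; // ANALYTICS_FREE_TIER
}

type Gate =
  | { allowed: true }
  | { allowed: false; status: 404 }
  | { allowed: false; status: 403; code: 'ANALYTICS_NOT_AVAILABLE' };

// Read the flags from environment-style variables, applying the documented defaults.
function readFlags(env: Record<string, string | undefined>): AnalyticsFlags {
  return {
    analyticsEnabled: (env['ANALYTICS_ENABLED'] ?? 'true') !== 'false',
    analyticsFreeTier: (env['ANALYTICS_FREE_TIER'] ?? 'false') === 'true',
  };
}

// Decide whether a tenant on the given tier may reach the analytics routes.
function checkAccess(flags: AnalyticsFlags, tier: Tier): Gate {
  // Assumption: the global kill switch wins, so disabled analytics always looks like 404.
  if (!flags.analyticsEnabled) return { allowed: false, status: 404 };
  if (tier === 'free' && !flags.analyticsFreeTier) {
    return { allowed: false, status: 403, code: 'ANALYTICS_NOT_AVAILABLE' };
  }
  return { allowed: true };
}
```

In an Express app, `checkAccess` could run in a middleware mounted on the `/analytics/` router, ahead of the individual handlers.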

### Acceptance Criteria

- `GET /analytics/usage-summary` returns correct aggregate counts for a date range
- `GET /analytics/agent-activity` returns per-agent daily rows matching `analytics_daily_aggregates`
- `GET /analytics/token-trends` returns daily and weekly granularity correctly
- All three endpoints return HTTP 403 for free-tier tenants (when `ANALYTICS_FREE_TIER=false`)
- Date range validation rejects `from > to` with HTTP 400
- Nightly aggregation job runs idempotently: running it twice for the same date produces no duplicates
- Analytics responses are cached in Redis: a second identical request does not hit the DB
- Dashboard Analytics tab renders heatmap, trend chart, and rotation table with mock data in Storybook
- Unit test coverage >= 80% on `AnalyticsService`
- Integration tests cover: summary, activity, trends (daily), trends (weekly), free-tier rejection, invalid date range