sentryagent-idp/openspec/changes/phase-5-scale-ecosystem/specs/analytics-dashboard/spec.md
SentryAgent.ai Developer 389a764e8d feat(openspec): propose phase-5-scale-ecosystem change
6 workstreams, 119 tasks — Scale & Ecosystem:
- WS1: Rust SDK
- WS2: Agent-to-Agent (A2A) Authorization
- WS3: Advanced Analytics Dashboard
- WS4: Public API Gateway & Rate Limiting SaaS
- WS5: Developer Experience (DX) improvements
- WS6: AGNTCY Compliance Certification Package

Awaiting CEO approval to begin implementation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-02 15:33:08 +00:00


WS3: Advanced Analytics Dashboard

Purpose

Give paying tenants actionable visibility into their agent usage patterns. Analytics surface four dimensions: agent activity over time (heatmap), token issuance frequency and volume (trends), credential rotation frequency (rotation frequency table), and per-endpoint API call patterns (call patterns breakdown). Data is pre-aggregated nightly from the existing usage_events table into a new analytics_daily_aggregates table. Analytics are rendered in a new Analytics tab in the existing React web dashboard.

New Endpoints

GET /analytics/usage-summary

Summary: Return a high-level usage summary for the authenticated tenant over a date range.

Authentication: Bearer token (tenant-scoped).

Query Parameters:

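| Parameter | Type | Required | Default | Constraints |
|-----------|------|----------|---------|-------------|
| from | string (YYYY-MM-DD) | no | 30 days ago | Must be <= the to parameter |
| to | string (YYYY-MM-DD) | no | today | Must be <= today; max range: 365 days |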

Response 200 (application/json):

{
  "tenantId": "string (UUID)",
  "period": {
    "from": "string (YYYY-MM-DD)",
    "to": "string (YYYY-MM-DD)"
  },
  "summary": {
    "totalApiCalls": 84320,
    "totalTokenIssuances": 12400,
    "totalCredentialRotations": 48,
    "activeAgentCount": 23,
    "averageDailyApiCalls": 2810,
    "peakDailyApiCalls": 5102,
    "peakDate": "2026-03-28"
  }
}
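
The API-call fields of this summary can be derived from per-day totals. The sketch below is one way to do that, not the specified implementation. `DailyTotal`, `ApiCallSummary`, and `summarizeApiCalls` are hypothetical names, and the averaging rule (divide by every calendar day in the range, then truncate) is an assumption, since the spec does not state how `averageDailyApiCalls` is rounded.

```typescript
// Hypothetical input shape: one tenant-wide api_calls total per calendar day.
interface DailyTotal {
  date: string; // YYYY-MM-DD
  apiCalls: number;
}

interface ApiCallSummary {
  totalApiCalls: number;
  averageDailyApiCalls: number;
  peakDailyApiCalls: number;
  peakDate: string | null; // null when the range contains no rows
}

// daysInRange is the inclusive number of calendar days between from and to.
// Assumption: the average covers every day in the range (including days with
// no rows) and is truncated to an integer.
function summarizeApiCalls(rows: DailyTotal[], daysInRange: number): ApiCallSummary {
  let total = 0;
  let peak = 0;
  let peakDate: string | null = null;
  for (const row of rows) {
    total += row.apiCalls;
    // The first row always becomes the initial peak; ties keep the earlier row.
    if (peakDate === null || row.apiCalls > peak) {
      peak = row.apiCalls;
      peakDate = row.date;
    }
  }
  return {
    totalApiCalls: total,
    averageDailyApiCalls: daysInRange > 0 ? Math.floor(total / daysInRange) : 0,
    peakDailyApiCalls: peak,
    peakDate,
  };
}
```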

Error Responses:

| Status | Code | Description |
|--------|------|-------------|
| 400 | INVALID_DATE_RANGE | from > to, or date range exceeds 365 days |
| 401 | UNAUTHORIZED | Missing or invalid Bearer token |
| 403 | ANALYTICS_NOT_AVAILABLE | Tenant is on free tier; analytics require Pro or Enterprise |
| 429 | RATE_LIMITED | Rate limit exceeded |
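
The defaults and date-range rules above could be enforced by a small pure function shared by all analytics handlers. This is a minimal sketch under stated assumptions: `resolveRange`, `parseDay`, `formatDay`, and `RangeResult` are hypothetical names, all dates are interpreted in UTC, the range length is measured as `to - from` in days, and malformed dates are mapped to `INVALID_DATE_RANGE` because the spec defines no separate code for them.

```typescript
type RangeResult =
  | { ok: true; from: string; to: string }
  | { ok: false; status: 400; code: "INVALID_DATE_RANGE" };

const DAY_MS = 24 * 60 * 60 * 1000;

function formatDay(ms: number): string {
  return new Date(ms).toISOString().slice(0, 10);
}

// Parse a strict YYYY-MM-DD string into a UTC-midnight timestamp, or null.
function parseDay(value: string): number | null {
  if (!/^\d{4}-\d{2}-\d{2}$/.test(value)) return null;
  const ms = Date.parse(value + "T00:00:00Z");
  if (isNaN(ms)) return null;
  // Reject impossible dates (e.g. 2026-02-30) that some engines roll forward.
  return formatDay(ms) === value ? ms : null;
}

// maxDays is 365 for usage-summary and token-trends, 90 for agent-activity.
function resolveRange(
  fromParam: string | undefined,
  toParam: string | undefined,
  maxDays: number,
  now: Date
): RangeResult {
  const invalid: RangeResult = { ok: false, status: 400, code: "INVALID_DATE_RANGE" };
  const today = Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate());
  // Defaults from the parameter table: to = today, from = 30 days ago.
  const to = toParam === undefined ? today : parseDay(toParam);
  const from = fromParam === undefined ? today - 30 * DAY_MS : parseDay(fromParam);
  if (from === null || to === null) return invalid;
  if (from > to || to > today) return invalid;
  if ((to - from) / DAY_MS > maxDays) return invalid;
  return { ok: true, from: formatDay(from), to: formatDay(to) };
}
```

Passing `now` explicitly keeps the function deterministic in unit tests.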

GET /analytics/agent-activity

Summary: Return per-agent daily activity counts for heatmap rendering.

Authentication: Bearer token (tenant-scoped).

Query Parameters:

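| Parameter | Type | Required | Default | Constraints |
|-----------|------|----------|---------|-------------|
| from | string (YYYY-MM-DD) | no | 30 days ago | Must be <= the to parameter |
| to | string (YYYY-MM-DD) | no | today | Max range: 90 days |
| agentId | string (UUID) | no | (all agents) | Filter to a single agent |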

Response 200 (application/json):

{
  "tenantId": "string (UUID)",
  "period": {
    "from": "string (YYYY-MM-DD)",
    "to": "string (YYYY-MM-DD)"
  },
  "agents": [
    {
      "agentId": "string (UUID)",
      "agentName": "string",
      "dailyActivity": [
        {
          "date": "2026-03-01",
          "apiCalls": 342,
          "tokenIssuances": 12,
          "credentialRotations": 0
        }
      ]
    }
  ]
}

Error Responses:

| Status | Code | Description |
|--------|------|-------------|
| 400 | INVALID_DATE_RANGE | from > to, or date range exceeds 90 days |
| 401 | UNAUTHORIZED | Missing or invalid Bearer token |
| 403 | ANALYTICS_NOT_AVAILABLE | Free tier; requires Pro or Enterprise |
| 404 | AGENT_NOT_FOUND | agentId filter specified but agent does not belong to tenant |
| 429 | RATE_LIMITED | Rate limit exceeded |
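
Because analytics_daily_aggregates stores one row per (agent, date, metric_type), the heatmap response needs a pivot from metric rows to one object per agent per day. The following sketch shows one possible pivot. `AggregateRow` and `buildAgentActivity` are hypothetical names, and the row is assumed to already carry the agent name (for example from a join with the agents table).

```typescript
type MetricType = "api_calls" | "token_issuances" | "credential_rotations" | "token_failures";

// Hypothetical row shape: one aggregate row joined with the agent's name.
interface AggregateRow {
  agentId: string;
  agentName: string;
  date: string; // YYYY-MM-DD
  metricType: MetricType;
  count: number;
}

interface DailyActivity {
  date: string;
  apiCalls: number;
  tokenIssuances: number;
  credentialRotations: number;
}

interface AgentActivity {
  agentId: string;
  agentName: string;
  dailyActivity: DailyActivity[];
}

function buildAgentActivity(rows: AggregateRow[]): AgentActivity[] {
  const agents = new Map<string, { agentName: string; days: Map<string, DailyActivity> }>();
  for (const row of rows) {
    let agent = agents.get(row.agentId);
    if (!agent) {
      agent = { agentName: row.agentName, days: new Map<string, DailyActivity>() };
      agents.set(row.agentId, agent);
    }
    let day = agent.days.get(row.date);
    if (!day) {
      day = { date: row.date, apiCalls: 0, tokenIssuances: 0, credentialRotations: 0 };
      agent.days.set(row.date, day);
    }
    if (row.metricType === "api_calls") day.apiCalls += row.count;
    else if (row.metricType === "token_issuances") day.tokenIssuances += row.count;
    else if (row.metricType === "credential_rotations") day.credentialRotations += row.count;
    // token_failures has no field in this response shape and is skipped.
  }
  // Agents keep first-seen order; each agent's days are sorted ascending by date.
  return Array.from(agents.entries()).map(([agentId, agent]) => ({
    agentId,
    agentName: agent.agentName,
    dailyActivity: Array.from(agent.days.values()).sort((a, b) =>
      a.date < b.date ? -1 : a.date > b.date ? 1 : 0
    ),
  }));
}
```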

GET /analytics/token-trends

Summary: Return daily token issuance counts and success/failure breakdown for trend charts.

Authentication: Bearer token (tenant-scoped).

Query Parameters:

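| Parameter | Type | Required | Default | Constraints |
|-----------|------|----------|---------|-------------|
| from | string (YYYY-MM-DD) | no | 30 days ago | Must be <= the to parameter |
| to | string (YYYY-MM-DD) | no | today | Max range: 365 days |
| granularity | string | no | day | Enum: day, week |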

Response 200 (application/json):

{
  "tenantId": "string (UUID)",
  "period": {
    "from": "string (YYYY-MM-DD)",
    "to": "string (YYYY-MM-DD)"
  },
  "granularity": "day",
  "dataPoints": [
    {
      "date": "2026-03-01",
      "totalIssuances": 420,
      "successfulIssuances": 415,
      "failedIssuances": 5,
      "uniqueAgents": 8
    }
  ]
}

Error Responses:

| Status | Code | Description |
|--------|------|-------------|
| 400 | INVALID_DATE_RANGE | from > to, or date range exceeds 365 days |
| 400 | INVALID_GRANULARITY | granularity is not day or week |
| 401 | UNAUTHORIZED | Missing or invalid Bearer token |
| 403 | ANALYTICS_NOT_AVAILABLE | Free tier; requires Pro or Enterprise |
| 429 | RATE_LIMITED | Rate limit exceeded |
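
The spec does not define how granularity=week buckets are formed. As a sketch of one option, daily data points can be rolled up into weeks keyed by the Monday (UTC) that starts each week. `weekStart` and `rollUpWeekly` are hypothetical names, and the Monday-based week is an assumption. uniqueAgents is left out on purpose: distinct counts cannot be summed across days and would have to be recomputed from per-agent rows.

```typescript
interface TrendPoint {
  date: string; // YYYY-MM-DD; for weekly buckets, the Monday that starts the week
  totalIssuances: number;
  successfulIssuances: number;
  failedIssuances: number;
}

// Return the Monday (UTC) of the week containing the given YYYY-MM-DD date.
function weekStart(date: string): string {
  const d = new Date(date + "T00:00:00Z");
  const daysSinceMonday = (d.getUTCDay() + 6) % 7; // getUTCDay(): 0 = Sunday
  d.setUTCDate(d.getUTCDate() - daysSinceMonday);
  return d.toISOString().slice(0, 10);
}

// Sum daily points into weekly buckets, sorted ascending by week start.
function rollUpWeekly(points: TrendPoint[]): TrendPoint[] {
  const buckets = new Map<string, TrendPoint>();
  for (const p of points) {
    const key = weekStart(p.date);
    let bucket = buckets.get(key);
    if (!bucket) {
      bucket = { date: key, totalIssuances: 0, successfulIssuances: 0, failedIssuances: 0 };
      buckets.set(key, bucket);
    }
    bucket.totalIssuances += p.totalIssuances;
    bucket.successfulIssuances += p.successfulIssuances;
    bucket.failedIssuances += p.failedIssuances;
  }
  return Array.from(buckets.values()).sort((a, b) =>
    a.date < b.date ? -1 : a.date > b.date ? 1 : 0
  );
}
```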

Database Schema Changes

Migration: 009_add_analytics_aggregates.sql

CREATE TABLE analytics_daily_aggregates (
    id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    tenant_id       UUID NOT NULL REFERENCES tenants(id) ON DELETE CASCADE,
    agent_id        UUID REFERENCES agents(id) ON DELETE SET NULL,   -- NULL = tenant-wide aggregate
    date            DATE NOT NULL,
    metric_type     VARCHAR(64) NOT NULL,   -- 'api_calls' | 'token_issuances' | 'credential_rotations' | 'token_failures'
    count           BIGINT NOT NULL DEFAULT 0,
    created_at      TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at      TIMESTAMPTZ NOT NULL DEFAULT NOW(),

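    -- Note: PostgreSQL treats NULLs as distinct in UNIQUE constraints by default, so this
    -- constraint does not deduplicate tenant-wide rows (agent_id IS NULL). PostgreSQL 15+
    -- supports UNIQUE NULLS NOT DISTINCT for that case.
    CONSTRAINT uq_daily_aggregate UNIQUE (tenant_id, agent_id, date, metric_type)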
);

-- Index for analytics queries (tenant + date range)
CREATE INDEX idx_analytics_tenant_date ON analytics_daily_aggregates(tenant_id, date);
CREATE INDEX idx_analytics_agent_date ON analytics_daily_aggregates(agent_id, date) WHERE agent_id IS NOT NULL;

Nightly Aggregation Job

A node-cron job runs daily at 00:05 UTC inside the Express API process. It executes an upsert query that aggregates the previous day's usage_events rows into analytics_daily_aggregates. The job is idempotent: running it twice for the same date produces no duplicates, because the write is an upsert on the unique constraint uq_daily_aggregate.

Job logic (pseudocode):

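1. Compute target_date = yesterday (UTC)
2. SELECT tenant_id, agent_id, metric_type, SUM(count) AS count
   FROM usage_events
   WHERE date = target_date
   GROUP BY tenant_id, agent_id, metric_type
3. For each row returned by step 2:
   INSERT INTO analytics_daily_aggregates (tenant_id, agent_id, date, metric_type, count)
   VALUES (row.tenant_id, row.agent_id, target_date, row.metric_type, row.count)
   ON CONFLICT (tenant_id, agent_id, date, metric_type)
   DO UPDATE SET count = EXCLUDED.count, updated_at = NOW()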
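
The job logic can be sketched in TypeScript as a single INSERT ... SELECT ... ON CONFLICT statement run for yesterday's UTC date. This is an illustration under assumptions, not the planned contents of analyticsAggregation.ts: `Queryable`, `previousUtcDate`, and `runAggregation` are hypothetical names, the database client is abstracted behind a minimal interface, and the usage_events columns (date, metric_type, count) are taken from the pseudocode above.

```typescript
// Minimal database interface so the sketch does not depend on a specific client.
interface Queryable {
  query(sql: string, params: unknown[]): Promise<unknown>;
}

// Steps 2 and 3 combined into one statement; $1 is the target date.
const UPSERT_SQL = `
  INSERT INTO analytics_daily_aggregates (tenant_id, agent_id, date, metric_type, count)
  SELECT tenant_id, agent_id, date, metric_type, SUM(count)
  FROM usage_events
  WHERE date = $1
  GROUP BY tenant_id, agent_id, date, metric_type
  ON CONFLICT (tenant_id, agent_id, date, metric_type)
  DO UPDATE SET count = EXCLUDED.count, updated_at = NOW()
`;

// Step 1: the calendar day before `now`, in UTC, formatted as YYYY-MM-DD.
function previousUtcDate(now: Date): string {
  const midnight = Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate());
  return new Date(midnight - 24 * 60 * 60 * 1000).toISOString().slice(0, 10);
}

// Run the aggregation for the day before `now` and return the date processed.
async function runAggregation(db: Queryable, now: Date = new Date()): Promise<string> {
  const targetDate = previousUtcDate(now);
  await db.query(UPSERT_SQL, [targetDate]);
  return targetDate;
}
```

With node-cron, a call such as `cron.schedule("5 0 * * *", () => runAggregation(db), { timezone: "UTC" })` would match the 00:05 UTC schedule. Setting the timezone explicitly avoids depending on the server's local time.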

New Source Files

| File | Description |
|------|-------------|
| src/services/AnalyticsService.ts | Business logic: query aggregates, build response shapes, Redis caching |
| src/controllers/AnalyticsController.ts | HTTP handlers for analytics endpoints |
| src/routes/analytics.ts | Express router for /analytics/ prefix |
| src/jobs/analyticsAggregation.ts | node-cron job that aggregates usage_events nightly |
| src/types/analytics.ts | TypeScript interfaces: UsageSummary, AgentActivityResponse, TokenTrendsResponse, DailyAggregate |
| dashboard/src/pages/Analytics.tsx | New Analytics tab in existing React dashboard |
| dashboard/src/components/charts/AgentHeatmap.tsx | Heatmap component using recharts ResponsiveContainer + custom cells |
| dashboard/src/components/charts/TokenTrendsChart.tsx | Line chart of token issuance over time using recharts LineChart |
| dashboard/src/components/charts/RotationFrequencyTable.tsx | Sortable table of credential rotation counts per agent |
| dashboard/src/api/analyticsApi.ts | Typed fetch functions for analytics endpoints |

Modified Source Files

| File | Change |
|------|--------|
| src/app.ts | Register analytics router; start nightly aggregation cron job |
| src/infrastructure/migrations/ | Add 009_add_analytics_aggregates.sql |
| dashboard/src/App.tsx | Add Analytics route and nav link |
| package.json (API) | Add node-cron dependency |
| package.json (dashboard) | Add recharts, date-fns dependencies |
| docs/openapi.yaml | Add analytics endpoints |

Redis Caching

Analytics responses are cached in Redis under keys of the form analytics:{tenantId}:{endpoint}:{queryHash}. TTL: 5 minutes for agent-activity and token-trends; 60 seconds for usage-summary. There is no explicit invalidation: entries expire by TTL, and the first request after expiry recomputes the response and repopulates the cache.
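
A minimal sketch of the key scheme and cache-aside flow follows. Several parts are assumptions: the spec does not say how queryHash is computed, so a 32-bit FNV-1a hash over the sorted query parameters stands in for whatever hash is chosen; `Cache` is a hypothetical two-method interface rather than a real Redis client API; and `cacheKey`, `fnv1a`, and `cached` are illustrative names.

```typescript
type AnalyticsEndpoint = "usage-summary" | "agent-activity" | "token-trends";

// TTLs from the spec: 60 s for usage-summary, 5 min for the other two.
const TTL_SECONDS: Record<AnalyticsEndpoint, number> = {
  "usage-summary": 60,
  "agent-activity": 300,
  "token-trends": 300,
};

// Stand-in hash (assumption): 32-bit FNV-1a over UTF-16 code units, as 8 hex digits.
function fnv1a(input: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return ("0000000" + (hash >>> 0).toString(16)).slice(-8);
}

// Sort parameter names so that equivalent queries map to the same key.
function cacheKey(
  tenantId: string,
  endpoint: AnalyticsEndpoint,
  query: Record<string, string>
): string {
  const canonical = Object.keys(query)
    .sort()
    .map((name) => name + "=" + query[name])
    .join("&");
  return "analytics:" + tenantId + ":" + endpoint + ":" + fnv1a(canonical);
}

// Hypothetical cache abstraction; a Redis-backed implementation would wrap GET and SET with expiry.
interface Cache {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
}

// Cache-aside: return the cached value if present, otherwise compute and store it.
async function cached<T>(
  cache: Cache,
  key: string,
  ttlSeconds: number,
  compute: () => Promise<T>
): Promise<T> {
  const hit = await cache.get(key);
  if (hit !== null) return JSON.parse(hit) as T;
  const value = await compute();
  await cache.set(key, JSON.stringify(value), ttlSeconds);
  return value;
}
```

Including tenantId in the key keeps cached responses tenant-scoped even when two tenants send identical queries.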

AnalyticsService Interface

interface IAnalyticsService {
    /**
     * Return a high-level usage summary for a tenant over a date range.
     */
    getUsageSummary(tenantId: string, from: Date, to: Date): Promise<UsageSummary>;

    /**
     * Return per-agent daily activity data for heatmap rendering.
     */
    getAgentActivity(
        tenantId: string,
        from: Date,
        to: Date,
        agentId?: string
    ): Promise<AgentActivityResponse>;

    /**
     * Return daily token issuance trends with success/failure breakdown.
     */
    getTokenTrends(
        tenantId: string,
        from: Date,
        to: Date,
        granularity: 'day' | 'week'
    ): Promise<TokenTrendsResponse>;
}
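
The return types named in this interface are not spelled out in the spec. The sketch below infers plausible shapes from the JSON responses and the analytics_daily_aggregates table. The field names follow those sources, but the exact definitions in src/types/analytics.ts are an assumption, including the conversion of BIGINT counts to JavaScript numbers.

```typescript
interface Period {
  from: string; // YYYY-MM-DD
  to: string; // YYYY-MM-DD
}

interface UsageSummary {
  tenantId: string;
  period: Period;
  summary: {
    totalApiCalls: number;
    totalTokenIssuances: number;
    totalCredentialRotations: number;
    activeAgentCount: number;
    averageDailyApiCalls: number;
    peakDailyApiCalls: number;
    peakDate: string;
  };
}

interface AgentActivityResponse {
  tenantId: string;
  period: Period;
  agents: Array<{
    agentId: string;
    agentName: string;
    dailyActivity: Array<{
      date: string;
      apiCalls: number;
      tokenIssuances: number;
      credentialRotations: number;
    }>;
  }>;
}

interface TokenTrendsResponse {
  tenantId: string;
  period: Period;
  granularity: "day" | "week";
  dataPoints: Array<{
    date: string;
    totalIssuances: number;
    successfulIssuances: number;
    failedIssuances: number;
    uniqueAgents: number;
  }>;
}

// Mirrors one analytics_daily_aggregates row.
interface DailyAggregate {
  id: string;
  tenantId: string;
  agentId: string | null; // null = tenant-wide aggregate
  date: string;
  metricType: "api_calls" | "token_issuances" | "credential_rotations" | "token_failures";
  count: number; // assumes the query layer converts BIGINT to a JS number
  createdAt: Date;
  updatedAt: Date;
}
```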

Prometheus Metrics

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| agentidp_analytics_query_duration_ms | Histogram | endpoint | Analytics query latency (before cache) |
| agentidp_analytics_cache_hits_total | Counter | endpoint | Analytics Redis cache hits |
| agentidp_analytics_cache_misses_total | Counter | endpoint | Analytics Redis cache misses |
| agentidp_analytics_aggregation_job_duration_ms | Gauge | (none) | Nightly aggregation job runtime |
| agentidp_analytics_aggregation_job_last_run | Gauge | (none) | Unix timestamp of last successful aggregation job run |

Feature Flags

| Variable | Default | Description |
|----------|---------|-------------|
| ANALYTICS_ENABLED | true | When false, all /analytics/ routes return HTTP 404 |
| ANALYTICS_FREE_TIER | false | When true, free tier tenants can access analytics (for beta/testing) |
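
How the two flags combine with the tenant's tier can be expressed as one pure decision function that an Express middleware could call. This is a sketch under assumptions: `readFlags`, `decideAccess`, and the tier strings "free", "pro", and "enterprise" are hypothetical, and only the literal string "true" is treated as enabling a flag.

```typescript
type Tier = "free" | "pro" | "enterprise";

interface AnalyticsFlags {
  analyticsEnabled: boolean;
  analyticsFreeTier: boolean;
}

type AccessDecision =
  | { allowed: true }
  | { allowed: false; status: 404 }
  | { allowed: false; status: 403; code: "ANALYTICS_NOT_AVAILABLE" };

// Read the flags from an environment-like map, applying the documented defaults.
function readFlags(env: Record<string, string | undefined>): AnalyticsFlags {
  const enabled = env["ANALYTICS_ENABLED"];
  const freeTier = env["ANALYTICS_FREE_TIER"];
  return {
    analyticsEnabled: enabled === undefined ? true : enabled === "true",
    analyticsFreeTier: freeTier === undefined ? false : freeTier === "true",
  };
}

// ANALYTICS_ENABLED=false hides the routes entirely (404); otherwise free-tier
// tenants get 403 unless ANALYTICS_FREE_TIER is on.
function decideAccess(flags: AnalyticsFlags, tier: Tier): AccessDecision {
  if (!flags.analyticsEnabled) return { allowed: false, status: 404 };
  if (tier === "free" && !flags.analyticsFreeTier) {
    return { allowed: false, status: 403, code: "ANALYTICS_NOT_AVAILABLE" };
  }
  return { allowed: true };
}
```

Checking ANALYTICS_ENABLED first means a disabled deployment answers 404 for every tier, which matches the flag description above.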

Acceptance Criteria

  • GET /analytics/usage-summary returns correct aggregate counts for a date range
  • GET /analytics/agent-activity returns per-agent daily rows matching analytics_daily_aggregates
  • GET /analytics/token-trends returns daily and weekly granularity correctly
  • All three endpoints return HTTP 403 for free-tier tenants (when ANALYTICS_FREE_TIER=false)
  • Date range validation rejects from > to with HTTP 400
  • Nightly aggregation job runs idempotently — running twice for same date produces no duplicates
  • Analytics responses are cached in Redis — a second identical request does not hit the DB
  • Dashboard Analytics tab renders heatmap, trend chart, and rotation table with mock data in Storybook
  • Unit test coverage >= 80% on AnalyticsService
  • Integration tests cover: summary, activity, trends (daily), trends (weekly), free-tier rejection, invalid date range