SentryAgent.ai Developer d42c653eea chore(openspec): archive engineering-docs and phase-2-production-ready changes
- engineering-docs → archive/2026-03-29-engineering-docs (63/63 tasks complete)
- phase-2-production-ready → archive/2026-03-29-phase-2-production-ready (89/89 tasks complete)
- openspec/specs/ synced with all Phase 1 + Phase 2 + engineering-docs capabilities (22 specs total)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-29 12:41:53 +00:00

# Spec: Prometheus + Grafana Monitoring
**Status**: Pending CEO approval
**Workstream**: 7 of 8
## Scope
- `prom-client` integration — expose `GET /metrics`
- Seven metrics (four counters, three histograms) across all services
- `monitoring/` directory: Prometheus config + Grafana provisioning
- `docker-compose.monitoring.yml` overlay (adds prometheus + grafana services)
- Pre-built Grafana dashboard JSON (`monitoring/grafana/dashboards/agentidp.json`)
## Metrics
| Metric | Type | Labels |
|--------|------|--------|
| `agentidp_tokens_issued_total` | Counter | `outcome` (success/failure) |
| `agentidp_agents_registered_total` | Counter | `outcome` |
| `agentidp_http_requests_total` | Counter | `method`, `path`, `status_code` |
| `agentidp_http_request_duration_seconds` | Histogram | `method`, `path` |
| `agentidp_rate_limit_rejections_total` | Counter | — |
| `agentidp_db_query_duration_seconds` | Histogram | `operation` |
| `agentidp_redis_command_duration_seconds` | Histogram | `command` |
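
One way to keep the metric names and label sets from the table in a single place is to capture them as typed data. The following is a minimal sketch under stated assumptions, not the project's implementation: `MetricSpec`, `METRIC_CATALOG`, and `checkLabels` are hypothetical names, and the code deliberately does not call `prom-client`. It only models the table so that a label mismatch can be caught before a metric is recorded.

```typescript
// Hypothetical catalog mirroring the Metrics table above.
// None of these identifiers come from prom-client or from the AgentIdP codebase.
type MetricType = "counter" | "histogram";

interface MetricSpec {
  readonly name: string;
  readonly type: MetricType;
  readonly labels: readonly string[];
}

const METRIC_CATALOG: readonly MetricSpec[] = [
  { name: "agentidp_tokens_issued_total", type: "counter", labels: ["outcome"] },
  { name: "agentidp_agents_registered_total", type: "counter", labels: ["outcome"] },
  { name: "agentidp_http_requests_total", type: "counter", labels: ["method", "path", "status_code"] },
  { name: "agentidp_http_request_duration_seconds", type: "histogram", labels: ["method", "path"] },
  { name: "agentidp_rate_limit_rejections_total", type: "counter", labels: [] },
  { name: "agentidp_db_query_duration_seconds", type: "histogram", labels: ["operation"] },
  { name: "agentidp_redis_command_duration_seconds", type: "histogram", labels: ["command"] },
];

// Returns a list of problems with the supplied label set; an empty list means
// the labels match the catalog entry exactly.
function checkLabels(metricName: string, labels: Record<string, string>): string[] {
  const spec = METRIC_CATALOG.find((m) => m.name === metricName);
  if (!spec) {
    return [`unknown metric: ${metricName}`];
  }
  const given = Object.keys(labels);
  const missing = spec.labels
    .filter((l) => !given.includes(l))
    .map((l) => `missing label: ${l}`);
  const unexpected = given
    .filter((l) => !spec.labels.includes(l))
    .map((l) => `unexpected label: ${l}`);
  return [...missing, ...unexpected];
}
```

For example, `checkLabels("agentidp_tokens_issued_total", {})` reports the missing `outcome` label, while a call with `method`, `path`, and `status_code` for `agentidp_http_requests_total` returns an empty list.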
## Acceptance Criteria
- [ ] `GET /metrics` returns Prometheus text format
- [ ] `/metrics` endpoint does NOT require Bearer auth (Prometheus scrapes it)
- [ ] All 7 metrics present and updating under load
- [ ] Grafana dashboard auto-provisions on `docker compose -f docker-compose.monitoring.yml up`
- [ ] Grafana runs on port 3001 (no conflict with AgentIdP on 3000)
- [ ] `docs/devops/operations.md` updated with monitoring section
- [ ] `prom-client` added as new dependency — CEO approval gate
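
The "all 7 metrics present" criterion could be checked mechanically against the body returned by `GET /metrics`. The sketch below is an illustration only: `EXPECTED_METRICS`, `declaredMetrics`, and `missingMetrics` are hypothetical helpers, and the check assumes the endpoint emits the standard `# TYPE <name> <type>` lines of the Prometheus text format.

```typescript
// Hypothetical smoke-check helpers; the names are illustrative, not part of the spec.
const EXPECTED_METRICS: readonly string[] = [
  "agentidp_tokens_issued_total",
  "agentidp_agents_registered_total",
  "agentidp_http_requests_total",
  "agentidp_http_request_duration_seconds",
  "agentidp_rate_limit_rejections_total",
  "agentidp_db_query_duration_seconds",
  "agentidp_redis_command_duration_seconds",
];

// Collects the metric family names declared by "# TYPE <name> <type>" lines.
function declaredMetrics(body: string): Set<string> {
  const names = new Set<string>();
  for (const line of body.split("\n")) {
    const match = /^# TYPE (\S+) (\S+)/.exec(line.trim());
    const name = match?.[1];
    if (name) {
      names.add(name);
    }
  }
  return names;
}

// Returns the expected metric names that the response body does not declare.
function missingMetrics(body: string): string[] {
  const present = declaredMetrics(body);
  return EXPECTED_METRICS.filter((name) => !present.has(name));
}
```

In a smoke test, the body might be fetched from `http://localhost:3000/metrics` with no `Authorization` header, which would also exercise the unauthenticated-scrape criterion. The host and port here are assumptions based on the AgentIdP port mentioned above.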