# Spec: Prometheus + Grafana Monitoring

**Status**: Pending CEO approval
**Workstream**: 7 of 8

## Scope
- `prom-client` integration: expose `GET /metrics`
- 7 metrics (4 counters, 3 histograms) across all services
- `monitoring/` directory: Prometheus config + Grafana provisioning
- `docker-compose.monitoring.yml` overlay (adds prometheus + grafana services)
- Pre-built Grafana dashboard JSON (`monitoring/grafana/dashboards/agentidp.json`)

## Metrics

| Metric | Type | Labels |
|--------|------|--------|
| `agentidp_tokens_issued_total` | Counter | `outcome` (success/failure) |
| `agentidp_agents_registered_total` | Counter | `outcome` |
| `agentidp_http_requests_total` | Counter | `method`, `path`, `status_code` |
| `agentidp_http_request_duration_seconds` | Histogram | `method`, `path` |
| `agentidp_rate_limit_rejections_total` | Counter | (none) |
| `agentidp_db_query_duration_seconds` | Histogram | `operation` |
| `agentidp_redis_command_duration_seconds` | Histogram | `command` |

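To make the table easier to keep consistent, it can be encoded as data and linted. The following is a minimal, dependency-free sketch, not the project's implementation: the `MetricSpec` type, the `METRICS` constant, and the `lintMetrics` helper are hypothetical names introduced here for illustration. It checks the names against the Prometheus naming rules (counters end in `_total`, durations use the `_seconds` base unit) and would sit alongside, not replace, the actual `prom-client` registration.

```typescript
// Hypothetical typed catalogue of the seven metrics listed in the table.
type MetricType = "counter" | "histogram";

interface MetricSpec {
  name: string;
  type: MetricType;
  labels: string[];
}

const METRICS: MetricSpec[] = [
  { name: "agentidp_tokens_issued_total", type: "counter", labels: ["outcome"] },
  { name: "agentidp_agents_registered_total", type: "counter", labels: ["outcome"] },
  { name: "agentidp_http_requests_total", type: "counter", labels: ["method", "path", "status_code"] },
  { name: "agentidp_http_request_duration_seconds", type: "histogram", labels: ["method", "path"] },
  { name: "agentidp_rate_limit_rejections_total", type: "counter", labels: [] },
  { name: "agentidp_db_query_duration_seconds", type: "histogram", labels: ["operation"] },
  { name: "agentidp_redis_command_duration_seconds", type: "histogram", labels: ["command"] },
];

// Prometheus data-model patterns for metric names and label names.
const METRIC_NAME = /^[a-zA-Z_:][a-zA-Z0-9_:]*$/;
const LABEL_NAME = /^[a-zA-Z_][a-zA-Z0-9_]*$/;

// Returns one message per violation; an empty array means the catalogue is consistent.
function lintMetrics(specs: MetricSpec[]): string[] {
  const problems: string[] = [];
  const seen = new Set<string>();
  for (const spec of specs) {
    if (!METRIC_NAME.test(spec.name)) problems.push(`${spec.name}: invalid metric name`);
    if (seen.has(spec.name)) problems.push(`${spec.name}: duplicate metric name`);
    seen.add(spec.name);
    if (spec.type === "counter" && !spec.name.endsWith("_total")) {
      problems.push(`${spec.name}: counters should end in _total`);
    }
    // Assumption: every histogram in this catalogue measures a duration in seconds.
    if (spec.type === "histogram" && !spec.name.endsWith("_seconds")) {
      problems.push(`${spec.name}: duration histograms should end in _seconds`);
    }
    for (const label of spec.labels) {
      // Label names starting with "__" are reserved by Prometheus for internal use.
      if (!LABEL_NAME.test(label) || label.startsWith("__")) {
        problems.push(`${spec.name}: invalid label name ${label}`);
      }
    }
  }
  return problems;
}

console.log(lintMetrics(METRICS)); // prints []
```

Keeping the catalogue in one place would also let the metric registration code and the Grafana dashboard queries be checked against the same list of names.
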
## Acceptance Criteria
- [ ] `GET /metrics` returns Prometheus text format
- [ ] `/metrics` endpoint does NOT require Bearer auth (Prometheus scrapes it)
- [ ] All 7 metrics present and updating under load
- [ ] Grafana dashboard auto-provisions on `docker compose -f docker-compose.monitoring.yml up`
- [ ] Grafana runs on port 3001 (no conflict with AgentIdP on 3000)
- [ ] `docs/devops/operations.md` updated with monitoring section
- [ ] `prom-client` added as a new dependency (CEO approval gate)
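
The "all 7 metrics present" criterion could be checked mechanically against a scrape. The sketch below is one possible approach under stated assumptions: it expects the classic Prometheus text format, in which each metric family is announced by a `# TYPE <name> <type>` line, and the names `REQUIRED_METRICS`, `declaredFamilies`, and `missingMetrics` are hypothetical helpers, not part of the spec or of `prom-client`.

```typescript
// Metric names required by the acceptance criteria (copied from the Metrics table).
const REQUIRED_METRICS: string[] = [
  "agentidp_tokens_issued_total",
  "agentidp_agents_registered_total",
  "agentidp_http_requests_total",
  "agentidp_http_request_duration_seconds",
  "agentidp_rate_limit_rejections_total",
  "agentidp_db_query_duration_seconds",
  "agentidp_redis_command_duration_seconds",
];

// Collects metric family names declared via "# TYPE <name> <type>" lines.
function declaredFamilies(exposition: string): Set<string> {
  const families = new Set<string>();
  for (const line of exposition.split("\n")) {
    const name = /^# TYPE (\S+) (\S+)\s*$/.exec(line)?.[1];
    if (name) families.add(name);
  }
  return families;
}

// Returns the required metric names that the scrape body does not declare.
function missingMetrics(exposition: string): string[] {
  const families = declaredFamilies(exposition);
  return REQUIRED_METRICS.filter((name) => !families.has(name));
}

// Example scrape body that declares only one of the seven families.
const sample = [
  "# HELP agentidp_tokens_issued_total Tokens issued",
  "# TYPE agentidp_tokens_issued_total counter",
  'agentidp_tokens_issued_total{outcome="success"} 3',
].join("\n");

console.log(missingMetrics(sample).length); // prints 6
```

In a smoke test, the body of an unauthenticated `GET /metrics` request (no `Authorization` header) would be passed to `missingMetrics`; an empty result together with a 200 response would cover the first three criteria at a basic level, though it does not show that the values are updating under load.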