feat(phase-2): workstream 7 — Prometheus + Grafana Monitoring

- Add prom-client 15; shared registry in src/metrics/registry.ts (7 metrics)
- HTTP request counter + duration histogram via metricsMiddleware
- DB query duration histogram wrapping pg Pool.query
- Redis command duration histogram via typed instrumentRedisMethod wrapper
- agentidp_tokens_issued_total in OAuth2Service
- agentidp_agents_registered_total in AgentService
- GET /metrics unauthenticated endpoint (Prometheus text format)
- docker-compose.monitoring.yml overlay (Prometheus + Grafana)
- Grafana auto-provisioned datasource + pre-built AgentIdP dashboard
- docs/devops/operations.md monitoring section added
- 36/36 unit tests passing, 100% coverage on new metrics code
- Fix pre-existing unused import in tests/integration/agents.test.ts

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
SentryAgent.ai Developer
2026-03-29 06:13:41 +00:00
parent 7d6e248a14
commit a504964e5f
21 changed files with 1053 additions and 15 deletions


@@ -247,3 +247,38 @@ docker-compose exec redis redis-cli GET "rate:<client_id>:$WINDOW"
```
**Fix:** Wait until `X-RateLimit-Reset` (Unix timestamp in the response header) before retrying. The window resets every 60 seconds.
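A client could turn the header into a wait time roughly as follows. This is a minimal sketch, not code from AgentIdP: the function name `msUntilReset` is hypothetical, and it assumes the header carries a Unix timestamp in seconds. When the header is missing or malformed, it falls back to one full 60-second window.

```typescript
// Hypothetical helper; assumes X-RateLimit-Reset is a Unix timestamp in SECONDS.
const WINDOW_MS = 60000; // the 60-second window described above

/** Returns how many milliseconds to wait before retrying a rate-limited request. */
function msUntilReset(resetHeader: string | null, nowMs: number = Date.now()): number {
  if (resetHeader === null || resetHeader.trim() === "") {
    return WINDOW_MS; // header absent: wait a full window to be safe
  }
  const resetSeconds = Number(resetHeader);
  if (!Number.isFinite(resetSeconds)) {
    return WINDOW_MS; // header unparsable: same conservative fallback
  }
  // Never return a negative delay if the reset time has already passed.
  return Math.max(0, resetSeconds * 1000 - nowMs);
}
```

With the Fetch API, the header value could be read via `response.headers.get("X-RateLimit-Reset")` and passed straight in.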
---
## Monitoring
AgentIdP exposes a Prometheus metrics endpoint at `GET /metrics` (unauthenticated, Prometheus text format).
### Metrics Exposed
| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| `agentidp_tokens_issued_total` | Counter | `scope` | OAuth 2.0 tokens issued successfully |
| `agentidp_agents_registered_total` | Counter | `deployment_env` | Agents registered successfully |
| `agentidp_http_requests_total` | Counter | `method`, `route`, `status_code` | HTTP requests received |
| `agentidp_http_request_duration_seconds` | Histogram | `method`, `route`, `status_code` | HTTP request duration |
| `agentidp_db_query_duration_seconds` | Histogram | `operation` | PostgreSQL query duration |
| `agentidp_redis_command_duration_seconds` | Histogram | `command` | Redis command duration |
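The duration histograms in the table above could be fed by a timing wrapper along these lines. This is a dependency-free sketch rather than the actual AgentIdP code: `DurationObserver`, `startTimer`, and `timed` are hypothetical names, and the observer callback stands in for whatever records a value on the real histogram.

```typescript
// Hypothetical names throughout; none of these come from the AgentIdP source.
type Labels = Record<string, string>;
type DurationObserver = (labels: Labels, seconds: number) => void;

/** Starts a timer; calling the returned function records elapsed seconds. */
function startTimer(
  observe: DurationObserver,
  labels: Labels,
  nowMs: () => number = () => Date.now(), // injectable clock for testing
): () => number {
  const startedAt = nowMs();
  return () => {
    const seconds = (nowMs() - startedAt) / 1000; // Prometheus durations are in seconds
    observe(labels, seconds);
    return seconds;
  };
}

/** Runs an async operation and observes its duration whether it resolves or rejects. */
async function timed<T>(
  observe: DurationObserver,
  labels: Labels,
  operation: () => Promise<T>,
): Promise<T> {
  const end = startTimer(observe, labels);
  try {
    return await operation();
  } finally {
    end(); // recorded on both the success and the failure path
  }
}
```

A database call might then be wrapped as `timed(observe, { operation: "select" }, () => pool.query(sql))`, where `pool` and `sql` are placeholders. Recording in `finally` means failed queries also contribute to the histogram, which may or may not match what the real wrapper does.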
### Starting the Monitoring Stack
```bash
# Start the full stack with monitoring
docker compose -f docker-compose.yml -f docker-compose.monitoring.yml up -d
# Prometheus: http://localhost:9090
# Grafana: http://localhost:3001 (admin / agentidp)
```
The Grafana dashboard auto-provisions on first start. Navigate to **Dashboards → AgentIdP → SentryAgent.ai — AgentIdP**.
### Security Note
`GET /metrics` is unauthenticated. In production, ensure this endpoint is:
- Only accessible from your internal network (firewall rule or reverse proxy restriction)
- Not exposed on a public-facing port
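If a firewall or reverse proxy rule is not available, an application-level check is one possible extra layer. The sketch below is an assumption, not part of AgentIdP: `isInternalAddress` is a hypothetical helper that accepts only loopback and RFC 1918 private IPv4 addresses (plus IPv6 loopback), and a route guard for `/metrics` could reject any caller for which it returns `false`.

```typescript
// Hypothetical helper; not taken from the AgentIdP source.
/** Returns true for loopback and RFC 1918 private addresses, false otherwise. */
function isInternalAddress(ip: string): boolean {
  if (ip === "::1") return true; // IPv6 loopback

  // Node often reports IPv4 peers as IPv4-mapped IPv6, e.g. "::ffff:10.0.0.5".
  const v4 = ip.startsWith("::ffff:") ? ip.slice("::ffff:".length) : ip;

  const parts = v4.split(".");
  if (parts.length !== 4) return false;

  const octets = parts.map((p) => (/^\d{1,3}$/.test(p) ? Number(p) : NaN));
  if (octets.some((o) => Number.isNaN(o) || o > 255)) return false;

  const [a, b] = octets;
  return (
    a === 127 ||                         // 127.0.0.0/8 loopback
    a === 10 ||                          // 10.0.0.0/8
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168)             // 192.168.0.0/16
  );
}
```

Behind a reverse proxy, the socket address seen by the application is usually the proxy's own, so a check like this only helps when it is applied to a trustworthy client address.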