Implements all P0 features per OpenSpec change phase-1-mvp-implementation:

- Agent Registry Service (CRUD): full lifecycle management
- OAuth 2.0 Token Service (Client Credentials flow)
- Credential Management (generate, rotate, revoke)
- Immutable Audit Log Service

Tech: Node.js 18+, TypeScript 5.3+ strict, Express 4.18+, PostgreSQL 14+, Redis 7+
Standards: OpenAPI 3.0 specs, DRY/SOLID, zero `any` types
Quality: 18 unit test suites, 244 tests passing, 97%+ coverage
OpenAPI: 4 complete specs (14 endpoints total)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

## Context

SentryAgent.ai AgentIdP is a greenfield Node.js/TypeScript service with no existing implementation; the codebase contains only scaffolding. Four CEO-approved OpenAPI 3.0 specs define the full API surface. This design governs the architecture for all four P0 services and their shared infrastructure.

**Constraints:**
- TypeScript 5.3+ strict mode: no `any` types, ever
- DRY and SOLID enforced on every file
- PostgreSQL 14+ for all persistent state; Redis 7+ for caching and rate limiting
- Express 4.18+ as the HTTP framework
- All secrets bcrypt-hashed (10 rounds); `clientSecret` never persisted in plain text
- Specs are the source of truth; the implementation must match them exactly

## Goals / Non-Goals

**Goals:**
- Implement all 4 P0 services (Agent Registry, OAuth2 Token, Credential Management, Audit Log) as typed Express route handlers backed by typed service classes
- Enforce free-tier limits (100 agents, 10,000 tokens/month, 100 req/min, 90-day audit retention)
- Provide a single Express app entry point with all middleware and routing wired up
- Provide PostgreSQL migrations for all 4 tables
- Provide a Docker Compose file for local development (Node.js app + Postgres + Redis)

**Non-Goals:**
- HashiCorp Vault, OPA, Web UI, Python/Go SDKs (Phase 2+)
- Multi-region deployment, SOC 2 (Phase 3+)
- Admin-scoped cross-agent credential management (stubbed with `403`; implemented in Phase 2)

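
The free-tier limits listed in the Goals could be captured as one typed constant so that every check reads from the same place. This is only a sketch: `TierLimits`, `FREE_TIER_LIMITS`, `checkAgentQuota`, and `auditRetentionCutoff` are hypothetical names, and only the numeric values come from this document.

```typescript
// Hypothetical shape: field names are illustrative, values come from the Goals list.
interface TierLimits {
  readonly maxAgents: number;
  readonly maxTokensPerMonth: number;
  readonly maxRequestsPerMinute: number;
  readonly auditRetentionDays: number;
}

const FREE_TIER_LIMITS: TierLimits = {
  maxAgents: 100,
  maxTokensPerMonth: 10000,
  maxRequestsPerMinute: 100,
  auditRetentionDays: 90,
};

type LimitCheck = { allowed: true } | { allowed: false; reason: string };

// Rejects registration once the agent count has reached the tier limit.
function checkAgentQuota(
  currentAgentCount: number,
  limits: TierLimits = FREE_TIER_LIMITS,
): LimitCheck {
  if (currentAgentCount >= limits.maxAgents) {
    return { allowed: false, reason: `Free tier allows at most ${limits.maxAgents} agents` };
  }
  return { allowed: true };
}

// Audit events older than this instant fall outside the retention window.
function auditRetentionCutoff(now: Date, limits: TierLimits = FREE_TIER_LIMITS): Date {
  const dayMs = 24 * 60 * 60 * 1000;
  return new Date(now.getTime() - limits.auditRetentionDays * dayMs);
}
```

Keeping the limits in a single object is in line with the DRY constraint; a paid tier, if one is added later, would be another `TierLimits` value rather than new branching logic.
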
## Decisions

### D1: Layered architecture (Controller → Service → Repository)
**Decision**: Each feature has a Controller (HTTP), a Service (business logic), and a Repository (DB queries). No business logic in controllers; no SQL outside repositories.
**Rationale**: SOLID Single Responsibility. Controllers handle HTTP concerns only. Services are testable in isolation (inject a mock repository). Repositories are the sole owners of SQL.
**Alternative considered**: Fat controllers, rejected (untestable, violates SRP).

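
A minimal sketch of the three layers, assuming a trimmed-down `Agent` shape and an in-memory repository in place of PostgreSQL. All class and method names below are illustrative and are not taken from the OpenAPI specs or the codebase.

```typescript
// Hypothetical, trimmed-down Agent shape; the real fields come from the OpenAPI spec.
interface Agent {
  id: string;
  name: string;
}

// Repository: the only layer that would contain SQL. An in-memory stand-in is used here.
interface AgentRepository {
  insert(agent: Agent): Promise<void>;
  findById(id: string): Promise<Agent | undefined>;
}

class InMemoryAgentRepository implements AgentRepository {
  private readonly rows = new Map<string, Agent>();

  async insert(agent: Agent): Promise<void> {
    this.rows.set(agent.id, agent);
  }

  async findById(id: string): Promise<Agent | undefined> {
    return this.rows.get(id);
  }
}

// Service: business rules only, with no HTTP and no SQL.
class AgentService {
  constructor(private readonly repo: AgentRepository) {}

  async register(id: string, name: string): Promise<Agent> {
    const trimmed = name.trim();
    if (trimmed === "") throw new Error("name must not be empty");
    if ((await this.repo.findById(id)) !== undefined) throw new Error("agent already exists");
    const agent: Agent = { id, name: trimmed };
    await this.repo.insert(agent);
    return agent;
  }
}

// Controller: translates HTTP-shaped input and output; errors propagate to the caller.
interface HttpResult {
  status: number;
  body: unknown;
}

class AgentController {
  constructor(private readonly service: AgentService) {}

  async create(body: { id: string; name: string }): Promise<HttpResult> {
    const agent = await this.service.register(body.id, body.name);
    return { status: 201, body: agent };
  }
}
```

The controller deliberately contains no `try`/`catch`: in this sketch, failures bubble up so that a single error handler can translate them, and the service can be exercised without any HTTP machinery.
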
### D2: Dependency injection via constructor injection
**Decision**: All dependencies (repositories, services, Redis client, JWT utils) are injected via constructor parameters. No `new Foo()` inside business logic.
**Rationale**: SOLID Dependency Inversion. Enables unit testing with mocks. No global singletons in services.
**Alternative considered**: Service locator / global singletons, rejected (hidden coupling, hard to test).

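
The testing benefit can be illustrated with a small class whose collaborators are plain interfaces. `SecretHasher`, `CredentialLookup`, and `CredentialVerifier` are hypothetical names chosen for this sketch; in the real service the hasher would presumably wrap bcrypt and the lookup would be a repository.

```typescript
// Hypothetical ports: names are illustrative and not taken from the codebase.
interface SecretHasher {
  verify(plain: string, hash: string): Promise<boolean>;
}

interface CredentialLookup {
  findHashByClientId(clientId: string): Promise<string | undefined>;
}

class CredentialVerifier {
  // Dependencies arrive through the constructor; nothing is constructed inside.
  constructor(
    private readonly credentials: CredentialLookup,
    private readonly hasher: SecretHasher,
  ) {}

  async isValid(clientId: string, clientSecret: string): Promise<boolean> {
    const hash = await this.credentials.findHashByClientId(clientId);
    if (hash === undefined) return false;
    return this.hasher.verify(clientSecret, hash);
  }
}

// Test doubles: plain objects that satisfy the interfaces, with no database or bcrypt.
const stubLookup: CredentialLookup = {
  findHashByClientId: async (clientId) =>
    clientId === "agent-1" ? "hash(s3cret)" : undefined,
};

const stubHasher: SecretHasher = {
  verify: async (plain, hash) => hash === `hash(${plain})`,
};

const verifier = new CredentialVerifier(stubLookup, stubHasher);
```

Because `CredentialVerifier` never names a concrete class, a unit test can swap in the stubs above, while the application wiring passes the real implementations at startup.
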
### D3: Single shared error hierarchy (`SentryAgentError`)
**Decision**: All custom errors extend `SentryAgentError` (as defined in README §6.6). A single Express error-handling middleware maps each error class to its HTTP status code and `ErrorResponse` shape.
**Rationale**: DRY. The error-to-status mapping exists in exactly one place, and every thrown error is typed and explicit.

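
One way the mapping might look is sketched below as a pure function that an Express error middleware could call. The subclasses, the `code` strings, the `ErrorResponse` fields, and the choice to carry the status code on the error itself are all assumptions; the authoritative hierarchy is the one in the README.

```typescript
// Hypothetical base class: the real definition lives in the README.
class SentryAgentError extends Error {
  constructor(
    message: string,
    readonly statusCode: number,
    readonly code: string,
  ) {
    super(message);
    this.name = new.target.name;
  }
}

// Illustrative subclasses; names and codes are assumptions.
class AgentNotFoundError extends SentryAgentError {
  constructor(agentId: string) {
    super(`Agent ${agentId} not found`, 404, "AGENT_NOT_FOUND");
  }
}

class ForbiddenError extends SentryAgentError {
  constructor(message: string) {
    super(message, 403, "FORBIDDEN");
  }
}

// Assumed response shape; the real one is defined by the OpenAPI specs.
interface ErrorResponse {
  code: string;
  message: string;
}

// The single place where thrown errors become HTTP responses.
function toHttpError(err: unknown): { status: number; body: ErrorResponse } {
  if (err instanceof SentryAgentError) {
    return { status: err.statusCode, body: { code: err.code, message: err.message } };
  }
  // Unknown errors are reported generically so internals are not leaked.
  return {
    status: 500,
    body: { code: "INTERNAL_SERVER_ERROR", message: "Internal server error" },
  };
}
```

An Express error handler would then reduce to calling `toHttpError(err)` and writing the returned status and body to the response.
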
### D4: JWT signed with RS256 (asymmetric)
**Decision**: Access tokens are signed with RS256 (RSA 2048-bit). The public key is exposed for external verification.
**Rationale**: Allows downstream services to verify tokens without calling back to AgentIdP. Industry standard for OAuth2 JWTs. Symmetric HS256 would require sharing the secret with every verifier.
**Alternative considered**: HS256, rejected (key distribution problem at scale).

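
To make the asymmetric property concrete, here is a sketch of RS256 signing and verification using only Node's built-in `node:crypto` module. `signRs256` and `verifyRs256` are hypothetical helpers, not the project's `jwt.ts`; the verifier checks only the `alg` header and the signature, not `exp` or other claims, and production code would more likely rely on a maintained JWT library.

```typescript
import { createSign, createVerify, generateKeyPairSync } from "node:crypto";

// Encodes a UTF-8 string as unpadded base64url, as used by JWT segments.
const b64url = (text: string): string => Buffer.from(text, "utf8").toString("base64url");

// Signs header.payload with the RSA private key (RS256 = RSASSA-PKCS1-v1_5 with SHA-256).
function signRs256(payload: Record<string, unknown>, privateKeyPem: string): string {
  const header = { alg: "RS256", typ: "JWT" };
  const signingInput = `${b64url(JSON.stringify(header))}.${b64url(JSON.stringify(payload))}`;
  const signature = createSign("RSA-SHA256").update(signingInput).sign(privateKeyPem, "base64url");
  return `${signingInput}.${signature}`;
}

// Verifies the alg header and the signature using only the public key.
// Claims such as exp are NOT checked in this sketch.
function verifyRs256(token: string, publicKeyPem: string): boolean {
  const parts = token.split(".");
  if (parts.length !== 3) return false;
  const [header, payload, signature] = parts as [string, string, string];
  try {
    const parsed: unknown = JSON.parse(Buffer.from(header, "base64url").toString("utf8"));
    if (typeof parsed !== "object" || parsed === null) return false;
    if ((parsed as { alg?: unknown }).alg !== "RS256") return false;
  } catch {
    return false;
  }
  return createVerify("RSA-SHA256")
    .update(`${header}.${payload}`)
    .verify(publicKeyPem, signature, "base64url");
}

// Usage: a throwaway 2048-bit key pair generated in memory for the example.
const { publicKey, privateKey } = generateKeyPairSync("rsa", {
  modulusLength: 2048,
  publicKeyEncoding: { type: "spki", format: "pem" },
  privateKeyEncoding: { type: "pkcs8", format: "pem" },
});

const token = signRs256({ sub: "agent-1", jti: "example-jti" }, privateKey);
```

A downstream service needs only `publicKey` to call `verifyRs256`, which is the property that HS256 cannot offer without sharing the signing secret.
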
### D5: Redis for token revocation and rate limiting
**Decision**: Revoked token JTIs are stored in Redis with TTL = token expiry. Rate-limit counters use a Redis sliding window. The free-tier monthly token count is kept in Redis with a monthly TTL.
**Rationale**: Redis provides O(1) token revocation checks without DB round-trips. The token introspection path must be fast (<100ms per spec).

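
The sliding-window idea and the revocation TTL can be sketched without a Redis server. The class below is an in-memory stand-in, not the Redis-backed `rateLimit.ts`; in Redis the per-key timestamps would typically live in a sorted set, which is an assumption about the eventual implementation. `SlidingWindowLimiter` and `revocationTtlSeconds` are hypothetical names.

```typescript
// In-memory stand-in for a Redis-backed sliding-window limiter.
class SlidingWindowLimiter {
  private readonly hits = new Map<string, number[]>();

  constructor(
    private readonly limit: number,
    private readonly windowMs: number,
  ) {}

  // Records a request at time nowMs and reports whether it is within the limit.
  allow(key: string, nowMs: number): boolean {
    const windowStart = nowMs - this.windowMs;
    // Drop timestamps that have slid out of the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => t > windowStart);
    if (recent.length >= this.limit) {
      this.hits.set(key, recent);
      return false;
    }
    recent.push(nowMs);
    this.hits.set(key, recent);
    return true;
  }
}

// TTL for a revoked JTI: keep the entry only until the token would have expired anyway.
function revocationTtlSeconds(expUnixSeconds: number, nowUnixSeconds: number): number {
  return Math.max(0, expUnixSeconds - nowUnixSeconds);
}

// Free-tier request limit from this document: 100 requests per minute.
const freeTierLimiter = new SlidingWindowLimiter(100, 60_000);
```

Tying the revocation TTL to the token's remaining lifetime means the revocation set cleans itself up: once a token has expired, its JTI no longer needs to be remembered.
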
### D6: `clientSecret` format (`sk_live_` prefix + 32 random bytes, hex-encoded)
**Decision**: Generated secrets follow the pattern `sk_live_<64 hex chars>`. Stored as a bcrypt hash (10 rounds).
**Rationale**: The prefixed format is recognisable in logs/config and grep-able for secret scanning. 32 random bytes encode to 64 hex chars, giving 256 bits of entropy.

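
A minimal sketch of generating a secret in this format with Node's built-in CSPRNG. `generateClientSecret` and `looksLikeClientSecret` are hypothetical names, and the bcrypt hashing step is omitted because it depends on a third-party package.

```typescript
import { randomBytes } from "node:crypto";

const SECRET_PREFIX = "sk_live_";
// Matches the documented pattern: prefix followed by exactly 64 lowercase hex characters.
const SECRET_PATTERN = /^sk_live_[0-9a-f]{64}$/;

// 32 bytes from the CSPRNG, hex-encoded to 64 characters (256 bits of entropy).
function generateClientSecret(): string {
  return SECRET_PREFIX + randomBytes(32).toString("hex");
}

// Cheap format check, usable for request validation or secret scanning.
function looksLikeClientSecret(candidate: string): boolean {
  return SECRET_PATTERN.test(candidate);
}

const exampleSecret = generateClientSecret();
```

The plain-text value would be returned to the caller once at generation time; only its bcrypt hash is persisted, per the constraint that `clientSecret` is never stored in plain text.
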
### D7: Audit log written synchronously within the request transaction
**Decision**: Audit events are inserted within the same DB transaction as the action that triggers them (where applicable). For token issuance (a Redis-only operation), the audit record is a separate asynchronous fire-and-forget insert.
**Rationale**: For state-changing DB operations (agent creation, credential rotation), atomicity guarantees that the audit record is never lost. Token issuance latency must be <100ms, and a synchronous audit insert would put that target at risk under high load.

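
Both audit paths can be sketched against a minimal client interface. `DbClient` is a hypothetical structural type (a `pg` client's `query(text, values)` method fits it), and the table names, column names, and action strings are illustrative rather than taken from the migrations.

```typescript
// Hypothetical minimal client interface; a pg client's query(text, values) fits this shape.
interface DbClient {
  query(text: string, values?: unknown[]): Promise<unknown>;
}

// Runs work inside BEGIN/COMMIT and rolls back if anything throws.
async function withTransaction<T>(
  client: DbClient,
  work: (tx: DbClient) => Promise<T>,
): Promise<T> {
  await client.query("BEGIN");
  try {
    const result = await work(client);
    await client.query("COMMIT");
    return result;
  } catch (err) {
    await client.query("ROLLBACK");
    throw err;
  }
}

// State change and audit event share one transaction: both commit or neither does.
async function rotateCredential(client: DbClient, agentId: string, newHash: string): Promise<void> {
  await withTransaction(client, async (tx) => {
    await tx.query("UPDATE credentials SET secret_hash = $1 WHERE agent_id = $2", [
      newHash,
      agentId,
    ]);
    await tx.query("INSERT INTO audit_events (agent_id, action) VALUES ($1, $2)", [
      agentId,
      "credential.rotated",
    ]);
  });
}

// Fire-and-forget variant for token issuance: the request path does not await the insert.
function recordTokenIssued(
  client: DbClient,
  agentId: string,
  onError: (err: unknown) => void,
): void {
  client
    .query("INSERT INTO audit_events (agent_id, action) VALUES ($1, $2)", [agentId, "token.issued"])
    .catch(onError);
}
```

The `onError` callback keeps a failed asynchronous insert from becoming an unhandled rejection, although the event itself is still lost in that case.
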
### D8: Project file layout
```
src/
  app.ts                  # Express app factory (no listen call, testable)
  server.ts               # Entry point (calls app.ts, calls listen)
  types/index.ts          # All shared TypeScript interfaces and types
  utils/
    crypto.ts             # Secret generation, bcrypt helpers
    jwt.ts                # JWT sign/verify
    validators.ts         # Joi schemas for all request bodies
    errors.ts             # SentryAgentError hierarchy
  middleware/
    auth.ts               # Bearer token extraction and verification
    rateLimit.ts          # Redis-backed rate limiter
    errorHandler.ts       # Global Express error handler
  db/
    pool.ts               # pg Pool singleton
    migrations/           # SQL migration files (001_create_agents.sql, etc.)
  cache/
    redis.ts              # Redis client singleton
  services/
    AgentService.ts
    OAuth2Service.ts
    CredentialService.ts
    AuditService.ts
  repositories/
    AgentRepository.ts
    CredentialRepository.ts
    AuditRepository.ts
    TokenRepository.ts
  routes/
    agents.ts
    token.ts
    credentials.ts
    audit.ts
  controllers/
    AgentController.ts
    TokenController.ts
    CredentialController.ts
    AuditController.ts
tests/
  unit/
    services/
    utils/
  integration/
    agents.test.ts
    token.test.ts
    credentials.test.ts
    audit.test.ts
```

## Risks / Trade-offs

- **[Risk] RS256 key management in Phase 1** → Keys are loaded from PEM-encoded env vars (`JWT_PRIVATE_KEY`, `JWT_PUBLIC_KEY`). Rotation is not automated until Phase 2 (Vault). Mitigation: documented in the deployment guide.
- **[Risk] Async audit insert on token issuance may drop events on crash** → Acceptable for the Phase 1 free tier. Synchronous insert + queue buffering is addressed in Phase 2.
- **[Risk] bcrypt at 10 rounds adds ~100ms to credential verification** → The token endpoint latency target is <100ms. bcrypt is called only on `POST /token` (credential verification), not on every authenticated request (JWT verification is fast). Accepted for Phase 1.
- **[Trade-off] No admin scope in Phase 1** → Agents can only manage their own credentials. Cross-agent admin operations return `403 FORBIDDEN` with a clear message. This unblocks Phase 1 shipping without scope management complexity.

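
Loading the two keys at startup might look like the sketch below. `loadJwtKeys` and `JwtKeys` are hypothetical names; only the env var names come from this document. The handling of literal `\n` sequences is an assumption about how multi-line PEM values are stored, not something the deployment guide is known to require.

```typescript
interface JwtKeys {
  privateKey: string;
  publicKey: string;
}

// Reads both PEM keys from an env-like object and fails fast if either is unusable.
function loadJwtKeys(env: Record<string, string | undefined>): JwtKeys {
  const read = (name: string): string => {
    const raw = env[name];
    if (raw === undefined || raw.trim() === "") {
      throw new Error(`Missing required env var ${name}`);
    }
    // Assumption: multi-line PEM values may be stored with literal "\n" sequences.
    const pem = raw.replace(/\\n/g, "\n").trim();
    if (!pem.startsWith("-----BEGIN ")) {
      throw new Error(`${name} does not look like a PEM block`);
    }
    return pem;
  };
  return { privateKey: read("JWT_PRIVATE_KEY"), publicKey: read("JWT_PUBLIC_KEY") };
}

// At startup this would be called once, e.g. loadJwtKeys(process.env).
```

Failing at boot on a missing or malformed key surfaces a misconfiguration before the first token request, which matters while rotation is still manual.
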
## Migration Plan

1. Run `npm install` to install all dependencies
2. Start Docker Compose (`docker-compose up -d`), which spins up Postgres + Redis
3. Run migrations: `npm run db:migrate`
4. Set required env vars (see `.env.example`)
5. Start server: `npm run dev`

**Rollback**: Drop the database, stop the containers, and revert to the previous commit. No shared state in Phase 1 (single-instance).

## Open Questions

- _None_: all decisions required for Phase 1 implementation are resolved above.