# WS1 — Service Deep Dives: Phase 3–6 Additions

**Target file:** `docs/engineering/05-services.md`

**Operation:** Append the following 8 service entries after the existing `### Prometheus/Grafana Monitoring` section (which is the last entry in the current file). Each entry follows the exact format of existing entries.

---

## Instructions to Developer

Append the following Markdown verbatim to the end of `docs/engineering/05-services.md`, starting after the final line of the `### Prometheus/Grafana Monitoring` section. Do not modify any existing content.

---

## Content to Append

```markdown
---

### AnalyticsService

**Purpose**: Records daily aggregated analytics events (token issuances, agent activity) and exposes query methods for token trends, agent activity heatmaps, and per-agent usage summaries. All query methods scope results strictly to the supplied `tenantId`. The `recordEvent` method is fire-and-forget — it catches all errors internally and never propagates them to the caller, so analytics writes never block primary request paths.

**Public methods**:

| Method | Parameters | Returns | Description |
|--------|-----------|---------|-------------|
| `recordEvent` | `tenantId: string, metricType: string` | `Promise<void>` | Upserts a daily counter row in `analytics_events` via `INSERT ... ON CONFLICT DO UPDATE SET count = count + 1`. Catches and swallows all errors; safe to call with `void` on hot paths. |
| `getTokenTrend` | `tenantId: string, days: number` | `Promise<ITokenTrendEntry[]>` | Returns daily token issuance counts for the last `days` days, with `days` clamped to `MAX_TREND_DAYS` (90). Uses `generate_series` + `LEFT JOIN` so that days with no events appear as `count: 0`. Results are sorted ascending by date. |
| `getAgentActivity` | `tenantId: string` | `Promise<IAgentActivityEntry[]>` | Returns agent activity bucketed by day-of-week (0=Sun…6=Sat) and hour-of-day for the last 30 days. Reads only rows whose `metric_type` matches the pattern `agent:<agentId>:<metricType>`. |
| `getAgentUsageSummary` | `tenantId: string` | `Promise<IAgentUsageSummaryEntry[]>` | Returns per-agent token issuance totals for the current calendar month, joined with the agent name (`owner` field). Sorted descending by `token_count`. Excludes decommissioned agents. |

**Dependencies**: PostgreSQL connection pool (`Pool` from `pg`). No Redis usage.

**Configuration**: None. `MAX_TREND_DAYS = 90` is a module-level constant.

**DB tables**:
- `analytics_events`: `organization_id` (UUID FK to `organizations`), `date` (DATE), `metric_type` (text — e.g. `'token_issued'`, `'agent:<agentId>:token_issued'`), `count` (integer). Unique constraint on `(organization_id, date, metric_type)`.
- `agents`: read in `getAgentUsageSummary` to join `owner` and filter by `organization_id`.

---

### TierService

**Purpose**: Single authority for all subscription tier business logic — fetches current tier and live usage, initiates Stripe Checkout sessions for upgrades, applies confirmed upgrades to the `organizations` table, and enforces per-tier agent count limits. Controllers and middleware delegate all tier decisions to this service; no tier logic lives elsewhere.

**Public methods**:

| Method | Parameters | Returns | Description |
|--------|-----------|---------|-------------|
| `getStatus` | `orgId: string` | `Promise<ITierStatus>` | Returns current `tier`, per-tier `limits` (from `TIER_CONFIG`), live `usage` (Redis counters + DB agent count), and `resetAt` (ISO 8601 next UTC midnight). Falls back to `0` for Redis counters when Redis is unavailable. |
| `initiateUpgrade` | `orgId: string, targetTier: TierName` | `Promise<IUpgradeInitiation>` | Validates that `targetTier` is strictly higher rank than current tier. Creates a Stripe Checkout Session with `mode: 'subscription'`, `metadata: { orgId, targetTier }`, and the price ID from `STRIPE_PRICE_ID_<TIER>` env var. Returns `{ checkoutUrl }`. |
| `applyUpgrade` | `orgId: string, tier: TierName` | `Promise<void>` | Sets `organizations.tier` and `organizations.tier_updated_at = NOW()`. Called by the Stripe webhook handler after `checkout.session.completed`. |
| `fetchTier` | `orgId: string` | `Promise<TierName>` | Queries `organizations.tier` for the given org. Returns `'free'` as a safe default when no row is found or the stored value is not a valid `TierName`. |
| `enforceAgentLimit` | `orgId: string, tier: TierName` | `Promise<void>` | Counts non-decommissioned agents for the org and throws `TierLimitError` when count is at or over `TIER_CONFIG[tier].maxAgents`. No-op for Enterprise (infinite limit). Called by `AgentService` before creating a new agent. |

**Dependencies**: PostgreSQL (`Pool`), Redis (`RedisClientType`), Stripe client (`Stripe`). Imports `TIER_CONFIG` and `TIER_RANK` from `src/config/tiers.ts`.

**Configuration**:
- `STRIPE_PRICE_ID_PRO` — Stripe price ID for the Pro tier
- `STRIPE_PRICE_ID_ENTERPRISE` — Stripe price ID for the Enterprise tier
- `STRIPE_PRICE_ID` — Fallback Stripe price ID when tier-specific vars are not set
- `STRIPE_SUCCESS_URL` — Redirect URL on successful checkout (default: `<APP_BASE_URL>/dashboard?billing=success`)
- `STRIPE_CANCEL_URL` — Redirect URL when checkout is cancelled (default: `<APP_BASE_URL>/dashboard?billing=cancel`)
- `APP_BASE_URL` — Base URL for redirect URL construction (default: `http://localhost:3000`)

**Redis keys**:
- `rate:tier:calls:<orgId>` — integer, daily API call counter; the key expires at the next UTC midnight. Read in `getStatus`.
- `rate:tier:tokens:<orgId>` — integer, daily token issuance counter; same expiry. Read in `getStatus`.

**DB tables**:
- `organizations`: `organization_id` (UUID PK), `tier` (text — `'free'|'pro'|'enterprise'`), `tier_updated_at` (timestamptz). Read in `fetchTier`; written in `applyUpgrade`.
- `agents`: read in `enforceAgentLimit` and `getStatus` to count non-decommissioned agents per org.

**Error types**:
- `ValidationError` (400) — target tier is not higher than current tier
- `TierLimitError` (429) — agent count limit reached for the current tier

---

### ComplianceService

**Purpose**: Generates AGNTCY-standard compliance reports and exports agent cards for a tenant. Reports cover two sections: `agent-identity` (DID presence and credential expiry checks) and `audit-trail` (cryptographic hash chain verification). Reports are cached in Redis for 5 minutes to avoid repeated expensive DB queries. Agent card export returns all active agents in AGNTCY-standard JSON format.

**Public methods**:

| Method | Parameters | Returns | Description |
|--------|-----------|---------|-------------|
| `generateReport` | `tenantId: string` | `Promise<IComplianceReport>` | Attempts to read `compliance:report:<tenantId>` from Redis; if found, returns it with `from_cache: true`. Otherwise builds the report by running `buildAgentIdentitySection` and `buildAuditTrailSection` in parallel, rolls up the overall status (fail > warn > pass), caches the result for 300 seconds, and returns it. |
| `exportAgentCards` | `tenantId: string` | `Promise<IAgentCard[]>` | Queries all non-decommissioned agents for the tenant and maps each to an AGNTCY agent card with `id` (DID or agent UUID), `name`, `capabilities`, `endpoint`, `created_at`, and `agntcy_schema_version: '1.0'`. |

**Dependencies**: PostgreSQL (`Pool`), Redis (`RedisClientType`). Internally instantiates `AuditVerificationService` for hash chain verification.

**Configuration**: None. `CACHE_TTL_SECONDS = 300` and `AGNTCY_SCHEMA_VERSION = '1.0'` are module-level constants.

**Redis keys**:
- `compliance:report:<tenantId>` — JSON-serialised `IComplianceReport`, TTL 300 seconds. Written by `generateReport` on a cache miss; read at the start of every `generateReport` call.

**DB tables**:
- `agents`: queried in both `buildAgentIdentitySection` (checks DID presence) and `exportAgentCards`.
- `credentials`: queried in `buildAgentIdentitySection` to check active credential expiry per agent.
- `audit_events`: read via `AuditVerificationService` in `buildAuditTrailSection` to verify hash chain integrity.

**Error types**: None thrown directly. Internal errors in section builders produce `status: 'fail'` sections rather than exceptions.

**Report structure**:
- `agent-identity` section: `fail` when any active agent is missing a DID or has expired credentials; `warn` when any credential expires within 7 days; `pass` otherwise.
- `audit-trail` section: `fail` when `AuditVerificationService.verifyChain()` returns `verified: false`; `pass` otherwise.

---

### FederationService

**Purpose**: Manages trusted federation partners and cross-IdP JWT token verification. At partner registration, the partner's JWKS endpoint is validated and the keys are cached in Redis. At token verification, the service fetches (or reuses cached) partner JWKS, verifies the JWT signature and standard claims, enforces the partner's `allowed_organizations` filter, and rejects tokens from suspended or expired partners.

**Public methods**:

| Method | Parameters | Returns | Description |
|--------|-----------|---------|-------------|
| `registerPartner` | `req: ICreatePartnerRequest` | `Promise<IFederationPartner>` | Validates that `jwks_uri` is reachable (5-second timeout) and returns a valid JWKS. Inserts the partner row into `federation_partners`. Caches the JWKS in Redis under `federation:jwks:<issuer>`. |
| `listPartners` | _(none)_ | `Promise<IFederationPartner[]>` | Updates any partners past `expires_at` to `status = 'expired'` before returning all rows ordered by `created_at DESC`. |
| `getPartner` | `id: string` | `Promise<IFederationPartner>` | Applies the same expiry update, then returns the partner row. Throws `FederationPartnerNotFoundError` (404) when not found. |
| `updatePartner` | `id: string, req: IUpdatePartnerRequest` | `Promise<IFederationPartner>` | Applies a partial update. When `jwks_uri` changes, invalidates the old issuer's JWKS cache entry (`DEL federation:jwks:<oldIssuer>`). |
| `deletePartner` | `id: string` | `Promise<void>` | Deletes the partner row and invalidates the JWKS cache. |
| `verifyFederatedToken` | `req: IFederationVerifyRequest` | `Promise<IFederationVerifyResult>` | Decodes token header/payload without verification, rejects `alg:none`, looks up partner by `iss`, checks partner status and expiry, fetches JWKS (cache-first), finds matching key by `kid`, converts JWK to PEM, verifies signature via `jsonwebtoken.verify` (RS256 or ES256), enforces `allowed_organizations` filter. Returns `{ valid, issuer, subject, organization_id, claims }`. |

**Dependencies**: PostgreSQL (`Pool`), Redis (`RedisClientType`). Uses `jsonwebtoken` for JWT decoding/verification and Node.js `crypto.createPublicKey` for JWK-to-PEM conversion.

**Configuration**:
- `FEDERATION_JWKS_CACHE_TTL_SECONDS` — TTL for cached partner JWKS in Redis (default: `3600`)

**Redis keys**:
- `federation:jwks:<issuer>` — JSON-serialised `IJWKSKey[]`, TTL from `FEDERATION_JWKS_CACHE_TTL_SECONDS`. Written on partner registration and on cache miss during token verification; deleted when a partner is updated (JWKS URI changed) or deleted.

**DB tables**:
- `federation_partners`: `id` (UUID PK), `name` (text), `issuer` (text — the IdP's issuer URL), `jwks_uri` (text), `allowed_organizations` (text[] — empty means all orgs allowed), `status` (`active|suspended|expired`), `created_at`, `updated_at`, `expires_at` (nullable timestamptz).

**Error types**:
- `FederationPartnerError` (400) — JWKS endpoint unreachable or returns invalid JWKS
- `FederationPartnerNotFoundError` (404) — partner UUID not found
- `FederationVerificationError` (401) — token malformed, alg:none, unknown issuer, partner suspended/expired, signature invalid, org not in allow list

---

### DIDService

**Purpose**: Manages W3C DID Core 1.0 document generation, EC P-256 key pair creation, and AGNTCY agent card export. Generates per-agent `did:web` identifiers, stores private keys in HashiCorp Vault (or records a `dev:no-vault` marker in dev mode), and caches DID documents in Redis. Builds both an instance-level DID document (for AgentIdP itself) and per-agent DID documents with AGNTCY extension properties.

**Public methods**:

| Method | Parameters | Returns | Description |
|--------|-----------|---------|-------------|
| `generateDIDForAgent` | `agentId: string, organizationId: string` | `Promise<{ did: string; publicKeyJwk: IPublicKeyJwk }>` | Generates an EC P-256 key pair. Stores the private key PEM in Vault KV v2 at `{mount}/data/agentidp/agents/{agentId}/did-key`. Encrypts the vault path via `EncryptionService` (when configured). Inserts a row into `agent_did_keys`. Updates `agents.did` and `agents.did_created_at`. Returns the `did:web` identifier and public key JWK. |
| `buildInstanceDIDDocument` | _(none)_ | `Promise<IDIDDocument>` | Builds the root instance DID document for AgentIdP (format: `did:web:{DID_WEB_DOMAIN}`). Cached in Redis under `did:doc:instance`. |
| `buildAgentDIDDocument` | `agentId: string` | `Promise<IAgentDIDDocumentResult>` | Builds a per-agent DID document (format: `did:web:{DID_WEB_DOMAIN}:agents:{agentId}`). Decommissioned agents get a deactivated document with an `AgentStatus: decommissioned` service entry. Cached in Redis under `did:doc:<agentId>` for active agents only. Throws `AgentNotFoundError` if the agent does not exist. |
| `buildResolutionResult` | `agentId: string` | `Promise<IDIDResolutionResult>` | Wraps `buildAgentDIDDocument` with W3C DID Resolution metadata (`didDocumentMetadata`, `didResolutionMetadata`). |
| `buildAgentCard` | `agentId: string` | `Promise<IAgentCard>` | Returns an AGNTCY-format agent card with `did`, `name` (agent email), `agentType`, `capabilities`, `owner`, `version`, `deploymentEnv`, `identityProvider`, and `issuedAt`. |

**Dependencies**: PostgreSQL (`Pool`), Redis (`RedisClientType`), optional `VaultClient`, optional `EncryptionService`. Uses `node-vault` directly for DID private key storage.

**Configuration**:
- `DID_WEB_DOMAIN` — required; the domain for `did:web` DID construction (e.g. `idp.sentryagent.ai`)
- `DID_DOCUMENT_CACHE_TTL_SECONDS` — Redis cache TTL for DID documents (default: `300`)
- `VAULT_ADDR`, `VAULT_TOKEN`, `VAULT_MOUNT` — when set, private keys are stored in Vault; otherwise `dev:no-vault` marker is used

**Redis keys**:
- `did:doc:instance` — JSON-serialised instance `IDIDDocument`, TTL from `DID_DOCUMENT_CACHE_TTL_SECONDS`
- `did:doc:<agentId>` — JSON-serialised per-agent `IDIDDocument`, same TTL. Not cached for decommissioned agents.

**DB tables**:
- `agents`: `did` (text — `did:web:...`), `did_created_at` (timestamptz). Written by `generateDIDForAgent`; read in all document-building methods.
- `agent_did_keys`: `key_id` (UUID PK), `agent_id` (UUID FK), `organization_id` (UUID FK), `public_key_jwk` (JSONB), `vault_key_path` (text — Vault KV v2 path or `dev:no-vault`), `key_type` (`'EC'`), `curve` (`'P-256'`), `created_at`. Written by `generateDIDForAgent`.

**Error types**:
- `AgentNotFoundError` (404) — agent UUID not found in `buildAgentDIDDocument`, `buildResolutionResult`, `buildAgentCard`

---

### WebhookService

**Purpose**: Manages webhook subscriptions and their delivery history for a tenant organisation. HMAC signing secrets are stored in HashiCorp Vault KV v2 (when configured) or bcrypt-hashed in PostgreSQL in local mode. The raw secret is only returned once at subscription creation time. `vault_secret_path` is encrypted at rest via `EncryptionService` (AES-256-CBC) before being written to PostgreSQL (SOC 2 CC6.1 compliance).

**Public methods**:

| Method | Parameters | Returns | Description |
|--------|-----------|---------|-------------|
| `createSubscription` | `orgId: string, req: ICreateWebhookRequest` | `Promise<IWebhookSubscription & { secret: string }>` | Generates a 32-byte random hex HMAC secret. Stores it in Vault at `secret/data/agentidp/webhooks/{orgId}/{id}/secret` (Vault mode) or bcrypt-hashes it and stores the hash in `secret_hash` (local mode). Encrypts `vault_secret_path` via `EncryptionService`. Returns the subscription including the one-time `secret`. Validates that the URL uses `https://` and that the `events` array is non-empty. |
| `listSubscriptions` | `orgId: string` | `Promise<IWebhookSubscription[]>` | Returns all subscriptions for the org, ordered by `created_at DESC`. No secret fields are included. |
| `getSubscription` | `id: string, orgId: string` | `Promise<IWebhookSubscription>` | Returns a single subscription. Verifies org ownership. |
| `updateSubscription` | `id: string, orgId: string, req: IUpdateWebhookRequest` | `Promise<IWebhookSubscription>` | Partially updates `name`, `url`, `events`, or `active` fields. Validates `https://` if URL is changing. |
| `deleteSubscription` | `id: string, orgId: string` | `Promise<void>` | Permanently deletes the subscription and all deliveries (via PostgreSQL CASCADE). |
| `getSubscriptionSecret` | `subscriptionId: string, orgId: string` | `Promise<string>` | Retrieves the raw HMAC secret from Vault (Vault mode only). Throws `WebhookValidationError` in local mode, where only the bcrypt hash is stored and the secret cannot be recovered after creation. |
| `listDeliveries` | `subscriptionId: string, orgId: string, limit: number, offset: number` | `Promise<IPaginatedDeliveriesResponse>` | Returns paginated delivery records for a subscription. Verifies org ownership before querying. |

**Dependencies**: PostgreSQL (`Pool`), optional `VaultClient`, Redis (`RedisClientType` — reserved for future caching), optional `EncryptionService`.

**Configuration**: Inherits Vault configuration from `VaultClient` (`VAULT_ADDR`, `VAULT_TOKEN`, `VAULT_MOUNT`). `EncryptionService` requires `ENCRYPTION_KEY` env var (see `EncryptionService` docs).

**DB tables**:
- `webhook_subscriptions`: `id` (UUID PK), `organization_id` (UUID FK), `name` (text), `url` (text — https only), `events` (JSONB — `WebhookEventType[]`), `secret_hash` (text — bcrypt hash in local mode, `'vault'` in Vault mode), `vault_secret_path` (text — encrypted Vault path or `'local'`), `active` (boolean), `failure_count` (integer), `created_at`, `updated_at`.
- `webhook_deliveries`: `id` (UUID PK), `subscription_id` (UUID FK), `event_type` (text), `payload` (JSONB), `status` (`pending|delivered|failed|dead_letter`), `http_status_code` (integer nullable), `attempt_count` (integer), `next_retry_at` (timestamptz nullable), `delivered_at` (timestamptz nullable), `created_at`, `updated_at`. Cascades on subscription delete.

**Error types**:
- `WebhookNotFoundError` (404) — subscription not found or belongs to another org
- `WebhookValidationError` (400) — invalid URL scheme, empty events array, or secret not recoverable in local mode

---

### BillingService

**Purpose**: Manages Stripe billing integration — creates Checkout Sessions for tenant subscriptions, processes incoming Stripe webhook events (subscription lifecycle and checkout completion), and retrieves current subscription status. When a `checkout.session.completed` event carries `{ orgId, targetTier }` in its metadata, delegates to `TierService.applyUpgrade` to update the organisation's tier.

**Public methods**:

| Method | Parameters | Returns | Description |
|--------|-----------|---------|-------------|
| `createCheckoutSession` | `tenantId: string, successUrl: string, cancelUrl: string` | `Promise<string>` | Creates a Stripe Checkout Session with `mode: 'subscription'`, `client_reference_id: tenantId`, and the price from `STRIPE_PRICE_ID`. Returns the checkout URL. Throws if Stripe does not return a URL. |
| `handleWebhookEvent` | `rawBody: Buffer, sig: string, webhookSecret: string` | `Promise<void>` | Verifies the Stripe webhook signature via `stripe.webhooks.constructEvent`. Handles `customer.subscription.created/updated/deleted` (upserts `tenant_subscriptions`) and `checkout.session.completed` (applies tier upgrade via `TierService` when metadata contains `orgId` and `targetTier`). |
| `getSubscriptionStatus` | `tenantId: string` | `Promise<ISubscriptionStatus>` | Queries `tenant_subscriptions` for the given tenant. Returns `{ tenantId, status: 'free', currentPeriodEnd: null, stripeSubscriptionId: null }` when no row exists. |

**Dependencies**: PostgreSQL (`Pool`), Stripe client (`Stripe`), optional `TierService`.

**Configuration**:
- `STRIPE_PRICE_ID` — Stripe price ID for subscription checkout sessions
- `STRIPE_WEBHOOK_SECRET` — Stripe webhook endpoint secret (`whsec_...`); passed by the webhook controller, not read directly by the service

**DB tables**:
- `tenant_subscriptions`: `tenant_id` (UUID PK or unique), `status` (text — `'free'|'active'|'past_due'|'canceled'`), `stripe_customer_id` (text), `stripe_subscription_id` (text), `current_period_end` (timestamptz nullable), `updated_at`. Upserted on subscription lifecycle events.

**Error types**: None defined in the service. Stripe signature failures raise `StripeSignatureVerificationError` (a subclass of `Error`) from `stripe.webhooks.constructEvent`; these propagate to the error handler as 400 responses.

---

### OIDCService (A2A / OIDC Provider)

**Note**: `src/services/OIDCService.ts` does not exist as a standalone file. OIDC provider functionality is handled by the `oidc-provider` npm package, configured in `src/app.ts` and related route files. The service boundary for OIDC-related business logic is the `DelegationService`, so this entry documents the OIDC integration through that service.

**Purpose**: The OIDC/A2A subsystem provides agent-to-agent (A2A) delegation using the `oidc-provider` library (v9.7.x). The provider is mounted as a sub-application at `/oidc` and issues short-lived delegation tokens scoped to a specific `delegatee_id`. The `DelegationService` (`src/services/DelegationService.ts`) manages the `delegation_chains` table for auditing.

**Key endpoints exposed by the OIDC provider**:
- `POST /oidc/token` — issues delegation tokens via the `client_credentials` grant or a custom grant
- `GET /oidc/.well-known/openid-configuration` — OIDC discovery document
- `GET /oidc/jwks` — public JWK Set for verifying delegation tokens

**DelegationService public methods** (from `src/services/DelegationService.ts`):

| Method | Parameters | Returns | Description |
|--------|-----------|---------|-------------|
| `createDelegation` | `delegatorId: string, delegateeId: string, scope: string, expiresAt?: Date` | `Promise<IDelegationChain>` | Inserts a delegation chain record into `delegation_chains`. Validates both agents exist and are active. |
| `verifyDelegation` | `token: string, delegateeId: string` | `Promise<IDelegationVerifyResult>` | Verifies the delegation token signature and checks the chain record is active and not expired. |
| `revokeDelegation` | `chainId: string, delegatorId: string` | `Promise<void>` | Sets `delegation_chains.status = 'revoked'` and `revoked_at = NOW()`. Validates the delegator owns the chain. |

**DB tables**:
- `delegation_chains`: `chain_id` (UUID PK), `delegator_id` (UUID), `delegatee_id` (UUID), `scope` (text), `status` (`active|revoked|expired`), `created_at`, `expires_at` (nullable), `revoked_at` (nullable), `token` (text — the delegation JWT).

**Configuration**:
- `A2A_ENABLED` — when set to `'false'`, A2A/delegation endpoints return 404
- `OIDC_ISSUER` — issuer URL for the OIDC provider
```
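
Several of the entries above describe small, pure decision rules: the `fail > warn > pass` roll-up in `ComplianceService.generateReport`, the strictly-higher-rank check in `TierService.initiateUpgrade`, the next-UTC-midnight `resetAt` value in `TierService.getStatus`, and the 90-day clamp in `AnalyticsService.getTokenTrend`. The sketch below shows one way those rules might look in TypeScript. It is a reading aid for the developer only and is not part of the content to append. All function names, the `ASSUMED_TIER_RANK` values, and the lower bound of 1 day in `clampTrendDays` are assumptions made for illustration; the actual logic and the real `TIER_RANK` live in the service sources and `src/config/tiers.ts`.

```typescript
// Illustrative sketch only: names and rank values are hypothetical.

type SectionStatus = "pass" | "warn" | "fail";
type TierName = "free" | "pro" | "enterprise";

// Assumed rank table; the real one is TIER_RANK in src/config/tiers.ts.
const ASSUMED_TIER_RANK: Record<TierName, number> = {
  free: 0,
  pro: 1,
  enterprise: 2,
};

// Matches the documented module-level constant in AnalyticsService.
const MAX_TREND_DAYS = 90;

/** Overall report status: any "fail" wins, then any "warn", otherwise "pass". */
function rollUpStatus(sections: SectionStatus[]): SectionStatus {
  if (sections.some((s) => s === "fail")) return "fail";
  if (sections.some((s) => s === "warn")) return "warn";
  return "pass";
}

/** An upgrade is valid only when the target ranks strictly above the current tier. */
function isValidUpgrade(current: TierName, target: TierName): boolean {
  return ASSUMED_TIER_RANK[target] > ASSUMED_TIER_RANK[current];
}

/** ISO 8601 timestamp of the first UTC midnight strictly after `now`. */
function nextUtcMidnight(now: Date): string {
  // Date.UTC normalises day overflow, so "day + 1" rolls over months and years.
  const next = Date.UTC(
    now.getUTCFullYear(),
    now.getUTCMonth(),
    now.getUTCDate() + 1
  );
  return new Date(next).toISOString();
}

/** Clamp a requested trend window to [1, MAX_TREND_DAYS]; the lower bound is an assumption. */
function clampTrendDays(days: number): number {
  return Math.min(Math.max(Math.floor(days), 1), MAX_TREND_DAYS);
}
```

Keeping rules like these as pure functions would let them be unit-tested without PostgreSQL, Redis, or Stripe, though whether the services are actually factored this way is not stated in the entries.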