docs: engineering knowledge base for new hires

Complete docs/engineering/ suite — 12 documents covering company overview, system architecture, tech stack ADRs, codebase structure, service deep dives, annotated code walkthroughs, dev setup, engineering workflow, testing strategy, deployment/ops, SDK guide, and README index. All content verified against source files. All 82 tasks in openspec/changes/engineering-docs/tasks.md marked complete. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-29 12:38:42 +00:00
parent 1f95cfe89d
commit eced5f8699
13 changed files with 3820 additions and 0 deletions
--- a/docs/engineering/06-walkthroughs.md
+++ b/docs/engineering/06-walkthroughs.md
@@ -0,0 +1,717 @@
+# 06 — Code Walkthroughs
+
+Last verified against commit: `1f95cfe89d1f45fa43b9fb7cff237f07bf9e889e`
+
+These walkthroughs trace three real production code paths from the HTTP request
+to the database and back. Every step includes a `file:line` reference and a
+"why" annotation explaining the design decision.
+
+---
+
+## Walkthrough 1 — Token Issuance
+
+**Request:** `POST /api/v1/token` with `grant_type=client_credentials`
+
+This is the most security-critical path in the codebase. An AI agent calling this
+endpoint proves its identity and receives a token that grants access to the API,
+limited to the granted scopes, for one hour.
+
+---
+
+### Step 1 — Express middleware stack
+
+**File:** `src/app.ts` lines 57–83
+
+```
+helmet()          → security headers
+cors()            → CORS headers
+morgan()          → access log line (skipped in test env)
+express.json()    → parse JSON bodies
+express.urlencoded({ extended: false }) → parse form-encoded bodies
+metricsMiddleware → start request timer, record counters on finish
+```
+
+**Why `extended: false`?** The token endpoint receives `application/x-www-form-urlencoded`
+bodies (RFC 6749 mandates this format for OAuth 2.0). The `express.urlencoded`
+middleware parses them into `req.body`. `extended: false` uses the native `querystring`
+parser, which is sufficient and avoids `qs` library complexity for flat key-value data.
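+
+To make the flat parsing concrete, the sketch below calls Node's built-in
+`querystring` module directly instead of going through Express. The request body
+string is a made-up example, not taken from the codebase.
+
+```typescript
+import { parse } from "node:querystring";
+
+// A form-encoded token request body as it might arrive on the wire (illustrative).
+const rawBody = "grant_type=client_credentials&scope=agents%3Aread+agents%3Awrite";
+
+// Flat key-value parsing: "+" decodes to a space and "%3A" to ":".
+const parsed = parse(rawBody);
+
+// Bracket syntax is kept as a literal key; it is not expanded into a nested object.
+const flat = parse("a[b]=1");
+
+console.log(parsed.grant_type, "|", parsed.scope, "|", Object.keys(flat));
+```
+
+Flat parsing is enough here because every OAuth 2.0 token parameter is a single
+string value.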
+
+---
+
+### Step 2 — Route dispatch
+
+**File:** `src/routes/token.ts` line 24
+
+```typescript
+router.post('/', asyncHandler(rateLimitMiddleware), asyncHandler(tokenController.issueToken.bind(tokenController)));
+```
+
+**Why no `authMiddleware` here?** The token endpoint is where the agent _gets_ its
+token — it cannot present a Bearer token to authenticate. Instead, credentials go
+in the request body (`client_id`, `client_secret`). `POST /token` is deliberately
+unauthenticated at the transport layer; authentication happens inside the controller.
+
+**Why `asyncHandler`?** Express (before version 5) does not catch rejected promises
+from async handlers, so a rejection would never reach `errorHandler` on its own.
+`asyncHandler` wraps the async function and calls `next(err)` if the promise rejects,
+routing the error to `errorHandler`.
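+
+A wrapper of this kind can be very small. The following is a minimal sketch, not
+the project's actual `asyncHandler`; the `Next` and `Handler` types are simplified
+stand-ins for the Express types so the example has no dependencies.
+
+```typescript
+// Simplified stand-ins for the Express types (assumptions for this sketch).
+type Next = (err?: unknown) => void;
+type Handler<Req, Res> = (req: Req, res: Res, next: Next) => Promise<void>;
+
+// Wraps an async handler so that a rejected promise is forwarded to next(err).
+function asyncHandler<Req, Res>(fn: Handler<Req, Res>) {
+  return (req: Req, res: Res, next: Next): void => {
+    fn(req, res, next).catch(next);
+  };
+}
+
+// Usage: a handler that rejects ends up in next() instead of being lost.
+const failing = asyncHandler(async () => {
+  throw new Error("boom");
+});
+
+const seen: unknown[] = [];
+failing({}, {}, (err) => seen.push(err));
+```
+
+Once the rejection has settled, `seen` holds the forwarded error, which is what
+lets a central error handler format the response.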
+
+---
+
+### Step 3 — Rate limit check
+
+**File:** `src/middleware/rateLimit.ts`
+
+The rate limiter checks a Redis sliding-window counter for the client's IP address.
+If the counter exceeds 100 requests/minute, it throws `RateLimitError` (429).
+
+**Why Redis, not in-memory?** If the server restarts or scales horizontally to multiple
+instances, an in-memory counter would reset. Redis maintains the counter across
+instances and restarts.
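+
+The sliding-window idea itself can be sketched without Redis. The class below is
+an in-memory illustration only: it has exactly the restart and multi-instance
+weakness described above, and the names `SlidingWindowLimiter` and `allow` are
+hypothetical rather than taken from `rateLimit.ts`.
+
+```typescript
+// In-memory sliding-window log: keeps the timestamps of recent requests per key.
+class SlidingWindowLimiter {
+  private readonly hits = new Map<string, number[]>();
+  private readonly limit: number;
+  private readonly windowMs: number;
+
+  constructor(limit: number, windowMs: number) {
+    this.limit = limit;
+    this.windowMs = windowMs;
+  }
+
+  // Returns true if the request is allowed, false if the limit is already used up.
+  allow(key: string, nowMs: number): boolean {
+    const cutoff = nowMs - this.windowMs;
+    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
+    const allowed = recent.length < this.limit;
+    if (allowed) recent.push(nowMs);
+    this.hits.set(key, recent);
+    return allowed;
+  }
+}
+
+// Two requests per minute for the sake of a short example.
+const limiter = new SlidingWindowLimiter(2, 60_000);
+const first = limiter.allow("203.0.113.7", 0);
+const second = limiter.allow("203.0.113.7", 1_000);
+const third = limiter.allow("203.0.113.7", 2_000);  // over the limit
+const later = limiter.allow("203.0.113.7", 61_500); // earlier hits have aged out
+```
+
+The clock is passed in as `nowMs` so the window behaviour can be checked
+deterministically.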
+
+---
+
+### Step 4 — Controller: validate grant_type
+
+**File:** `src/controllers/TokenController.ts` lines 84–103
+
+```typescript
+issueToken = async (req: Request, res: Response, _next: NextFunction): Promise<void> => {
+  const body = req.body as ITokenRequest;
+
+  if (!body.grant_type) { ... return res.status(400).json({error: 'invalid_request', ...}) }
+  if (body.grant_type !== 'client_credentials') { ... return res.status(400).json(...) }
+```
+
+**Why does this method catch errors itself instead of calling `next(err)`?** The token
+endpoint must return errors in the **OAuth 2.0 error format** (`{ error, error_description }`)
+per RFC 6749 §5.2, not the standard SentryAgent.ai format (`{ code, message }`). The
+`mapToOAuth2Error()` helper translates `AuthenticationError` and `AuthorizationError`
+into OAuth2 error codes. The `_next` parameter is intentionally unused for the error path.
+
+---
+
+### Step 5 — Controller: Joi validation and credential extraction
+
+**File:** `src/controllers/TokenController.ts` lines 106–138
+
+```typescript
+const { error, value } = tokenRequestSchema.validate(body, { abortEarly: false });
+// ...
+// Support HTTP Basic auth fallback (RFC 6749 §2.3.1)
+const authHeader = req.headers['authorization'];
+if (authHeader?.startsWith('Basic ')) {
+  const base64 = authHeader.slice(6);
+  const decoded = Buffer.from(base64, 'base64').toString('utf-8');
+  const colonIndex = decoded.indexOf(':');
+  clientId = decoded.slice(0, colonIndex);
+  clientSecret = decoded.slice(colonIndex + 1);
+}
+```
+
+**Why `abortEarly: false`?** This returns all validation errors at once, so the
+client can fix all problems in one round trip.
+
+**Why Basic auth support?** RFC 6749 §2.3.1 requires authorization servers to
+support HTTP Basic authentication for clients that were issued a client password;
+credentials in the request body are the permitted alternative. Some OAuth libraries
+default to Basic.
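+
+One edge case in header parsing is a decoded value with no colon: `indexOf`
+returns `-1`, and slicing with that index yields misleading values. The sketch
+below shows one way to guard against it. `parseBasicCredentials` is a hypothetical
+helper, not a function from the codebase, and it omits the form-url-decoding step
+that RFC 6749 §2.3.1 describes for the id and secret.
+
+```typescript
+interface ClientCredentials {
+  clientId: string;
+  clientSecret: string;
+}
+
+// Parses "Basic <base64(id:secret)>"; returns null for anything malformed.
+function parseBasicCredentials(header: string | undefined): ClientCredentials | null {
+  if (!header || !header.startsWith("Basic ")) return null;
+  const decoded = Buffer.from(header.slice(6), "base64").toString("utf-8");
+  const colonIndex = decoded.indexOf(":");
+  // Guard: with colonIndex === -1 the slices below would return wrong values.
+  if (colonIndex === -1) return null;
+  return {
+    clientId: decoded.slice(0, colonIndex),
+    // Split on the first colon only, so the secret may itself contain colons.
+    clientSecret: decoded.slice(colonIndex + 1),
+  };
+}
+
+const goodHeader = "Basic " + Buffer.from("agent-1:sk_live:abc").toString("base64");
+const parsedOk = parseBasicCredentials(goodHeader);
+const parsedBad = parseBasicCredentials("Basic " + Buffer.from("no-colon").toString("base64"));
+```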
+
+---
+
+### Step 6 — Controller: scope validation
+
+**File:** `src/controllers/TokenController.ts` lines 141–151
+
+```typescript
+const requestedScope = tokenBody.scope ?? 'agents:read';
+const validScopes = ['agents:read', 'agents:write', 'tokens:read', 'audit:read'];
+const scopeList = requestedScope.split(' ');
+const invalidScope = scopeList.find((s) => !validScopes.includes(s));
+if (invalidScope) { return res.status(400).json({error: 'invalid_scope', ...}) }
+```
+
+**Why validate scopes here?** Scope validation at the controller layer returns an
+RFC 6749-compliant `invalid_scope` error before the agent is looked up. This skips
+the database round trip for malformed requests and gives the client a clearer error.
+
+---
+
+### Step 7 — Service: agent lookup
+
+**File:** `src/services/OAuth2Service.ts` lines 83–94
+
+```typescript
+const agent = await this.agentRepository.findById(clientId);
+if (!agent) {
+  void this.auditService.logEvent(clientId, 'auth.failed', 'failure', ..., { reason: 'agent_not_found', clientId });
+  throw new AuthenticationError('Client authentication failed...');
+}
+```
+
+**Why log auth failures?** Failed authentication attempts may indicate a brute-force
+attack or a misconfigured client. Having them in the audit log enables incident
+investigation and alerting.
+
+**Why not distinguish between "agent not found" and "wrong secret" in the error message?**
+Revealing which is wrong gives an attacker information — they can enumerate valid
+`client_id` values by checking whether they get "agent not found" vs "wrong secret".
+Both cases return the same message.
+
+---
+
+### Step 8 — Service: credential verification
+
+**File:** `src/services/OAuth2Service.ts` lines 97–131
+
+```typescript
+const { credentials } = await this.credentialRepository.findByAgentId(clientId, { status: 'active', page: 1, limit: 100 });
+
+for (const cred of credentials) {
+  const credRow = await this.credentialRepository.findById(cred.credentialId);
+  if (credRow) {
+    if (credRow.expiresAt !== null && credRow.expiresAt < new Date()) { continue; }
+
+    let matches: boolean;
+    if (credRow.vaultPath !== null && this.vaultClient !== null) {
+      matches = await this.vaultClient.verifySecret(clientId, credRow.credentialId, clientSecret);
+    } else {
+      matches = await verifySecret(clientSecret, credRow.secretHash);
+    }
+    if (matches) { credentialVerified = true; break; }
+  }
+}
+```
+
+**Why iterate over multiple credentials?** An agent can have multiple active
+credentials (e.g. one per service that calls it). If only a single credential were
+accepted, rotating credential A while service X still uses it would break service X
+immediately. Checking every active credential allows overlapping rotation: a new
+credential can be rolled out before the old one is retired.
+
+**Why check expiry before hashing?** Bcrypt is intentionally slow (~100ms). Checking
+expiry first is a cheap early exit that avoids the bcrypt computation on expired
+credentials.
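+
+The effect of ordering the checks this way can be seen in a small simulation.
+Everything here is a stand-in: `slowVerify` replaces bcrypt or Vault verification
+with a plain string comparison (for illustration only, real secrets are never
+stored in plaintext), and the counter exists just to show how often the expensive
+step runs.
+
+```typescript
+interface Cred {
+  id: string;
+  expiresAt: Date | null;
+  secret: string; // illustration only
+}
+
+let expensiveChecks = 0;
+
+// Stand-in for the slow hash comparison; counts how often it is invoked.
+function slowVerify(candidate: string, cred: Cred): boolean {
+  expensiveChecks += 1;
+  return candidate === cred.secret;
+}
+
+function anyCredentialMatches(creds: Cred[], candidate: string, now: Date): boolean {
+  for (const cred of creds) {
+    // Cheap check first: expired credentials never reach the slow comparison.
+    if (cred.expiresAt !== null && cred.expiresAt < now) continue;
+    if (slowVerify(candidate, cred)) return true; // first match wins
+  }
+  return false;
+}
+
+const now = new Date("2026-03-01T00:00:00Z");
+const creds: Cred[] = [
+  { id: "old", expiresAt: new Date("2026-01-01T00:00:00Z"), secret: "s-old" },
+  { id: "new", expiresAt: null, secret: "s-new" },
+];
+const matched = anyCredentialMatches(creds, "s-new", now);
+```
+
+With one expired and one active credential, the slow comparison runs once.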
+
+---
+
+### Step 9 — Service: status and monthly limit checks
+
+**File:** `src/services/OAuth2Service.ts` lines 144–176
+
+```typescript
+if (agent.status === 'suspended') { throw new AuthorizationError(...) }
+if (agent.status === 'decommissioned') { throw new AuthorizationError(...) }
+
+const monthlyCount = await this.tokenRepository.getMonthlyCount(clientId);
+if (monthlyCount >= FREE_TIER_MAX_MONTHLY_TOKENS) { throw new FreeTierLimitError(...) }
+```
+
+**Why check status after credential verification?** We verify credentials first so
+a suspended agent with a wrong secret gets `AuthenticationError` (401) not
+`AuthorizationError` (403). This prevents leaking which agents are suspended to
+unauthenticated callers.
+
+---
+
+### Step 10 — Service: sign the JWT
+
+**File:** `src/services/OAuth2Service.ts` lines 179–190
+
+```typescript
+const jti = uuidv4();
+const payload: Omit<ITokenPayload, 'iat' | 'exp'> = { sub: clientId, client_id: clientId, scope, jti };
+const accessToken = signToken(payload, this.privateKey);
+```
+
+**File:** `src/utils/jwt.ts` lines 19–31
+
+```typescript
+export function signToken(payload: Omit<ITokenPayload, 'iat' | 'exp'>, privateKey: string): string {
+  const now = Math.floor(Date.now() / 1000);
+  const fullPayload: ITokenPayload = { ...payload, iat: now, exp: now + TOKEN_EXPIRES_IN };
+  return jwt.sign(fullPayload, privateKey, { algorithm: 'RS256' });
+}
+```
+
+**Why RS256 instead of HS256?** RS256 (RSA asymmetric) allows any consumer of the
+token to verify it using the public key without needing the private signing key.
+HS256 (HMAC symmetric) would require sharing the secret with every service that
+verifies tokens.
+
+**Why `jti` (JWT ID)?** The `jti` is a unique identifier for this specific token.
+It is used as the key in the Redis revocation list. Without `jti`, you cannot
+revoke a single token without revoking all tokens for the agent.
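+
+As an illustration of per-token revocation, the sketch below models the revocation
+list with an in-memory `Map` instead of Redis. It reuses the `revoked:<jti>` key
+shape that the auth middleware checks. Letting an entry lapse when the token itself
+expires is an assumption of this sketch, not something the walkthrough confirms.
+
+```typescript
+// In-memory stand-in for the Redis revocation list, keyed by jti.
+class RevocationList {
+  // key -> token expiry (unix seconds)
+  private readonly entries = new Map<string, number>();
+
+  // Revokes one token; the entry only needs to live until the token expires.
+  revoke(jti: string, expUnixSeconds: number): void {
+    this.entries.set(`revoked:${jti}`, expUnixSeconds);
+  }
+
+  isRevoked(jti: string, nowUnixSeconds: number): boolean {
+    const key = `revoked:${jti}`;
+    const exp = this.entries.get(key);
+    if (exp === undefined) return false;
+    if (exp <= nowUnixSeconds) {
+      // The token has expired and is rejected on expiry anyway; drop the entry.
+      this.entries.delete(key);
+      return false;
+    }
+    return true;
+  }
+}
+
+// Two tokens for the same agent: revoking one leaves the other usable.
+const revocations = new RevocationList();
+revocations.revoke("jti-a", 4_600);
+const aRevoked = revocations.isRevoked("jti-a", 1_000);
+const bRevoked = revocations.isRevoked("jti-b", 1_000);
+```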
+
+---
+
+### Step 11 — Service: fire-and-forget operations
+
+**File:** `src/services/OAuth2Service.ts` lines 193–207
+
+```typescript
+void this.tokenRepository.incrementMonthlyCount(clientId);
+void this.auditService.logEvent(clientId, 'token.issued', 'success', ..., { scope, expiresAt });
+tokensIssuedTotal.inc({ scope });
+```
+
+**Why `void` (fire-and-forget)?** The token has been signed and is ready to return.
+Waiting for the Redis increment and audit write would add ~5–10ms to every token
+request. These operations are best-effort — if they fail, the token is still valid.
+
+**Why is the Prometheus `.inc()` call synchronous?** Prometheus counters are
+in-process memory operations — they do not write to Redis or PostgreSQL. They are
+O(1) and sub-microsecond.
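+
+One caveat with a bare `void promise`: it discards the result but does not handle
+a rejection. Whether `incrementMonthlyCount` and `logEvent` catch their own errors
+is not shown in this walkthrough. If they do not, a small wrapper such as the
+hypothetical `fireAndForget` below keeps a failure from surfacing as an unhandled
+rejection, which terminates the process by default since Node.js 15.
+
+```typescript
+// Starts a task without awaiting it, but makes sure a rejection is handled.
+function fireAndForget(task: Promise<unknown>, onError: (err: unknown) => void): void {
+  task.catch(onError);
+}
+
+const failures: string[] = [];
+
+// A failing best-effort task is reported to the callback instead of crashing.
+fireAndForget(Promise.reject(new Error("audit store unavailable")), (err) => {
+  failures.push(err instanceof Error ? err.message : String(err));
+});
+
+// A successful task never triggers the callback.
+fireAndForget(Promise.resolve("ok"), () => failures.push("unexpected"));
+```
+
+The caller still returns immediately; only the error path changes.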
+
+---
+
+### Step 12 — Response
+
+**File:** `src/controllers/TokenController.ts` lines 163–167
+
+```typescript
+res.setHeader('Cache-Control', 'no-store');
+res.setHeader('Pragma', 'no-cache');
+res.status(200).json(tokenResponse);
+```
+
+**Why `Cache-Control: no-store`?** RFC 6749 §5.1 mandates that token responses
+must not be cached. Without this header, a shared proxy or CDN could cache the
+response and replay it to another client.
+
+Final response:
+```json
+{
+  "access_token": "eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9...",
+  "token_type": "Bearer",
+  "expires_in": 3600,
+  "scope": "agents:read agents:write"
+}
+```
+
+---
+
+## Walkthrough 2 — Agent Registration
+
+**Request:** `POST /api/v1/agents` with Bearer token and agent data JSON body
+
+After token issuance, registering an agent is the second most common operation.
+This walkthrough shows a request that passes through all three router-level
+middleware layers: authentication, OPA authorization, and rate limiting.
+
+---
+
+### Step 1 — Middleware stack
+
+**File:** `src/app.ts` lines 57–83 (same security and parsing middleware as Walkthrough 1)
+
+---
+
+### Step 2 — Route dispatch
+
+**File:** `src/routes/agents.ts` lines 22–27
+
+```typescript
+router.use(asyncHandler(authMiddleware));
+router.use(opaMiddleware);
+router.use(asyncHandler(rateLimitMiddleware));
+router.post('/', asyncHandler(agentController.registerAgent.bind(agentController)));
+```
+
+All three middleware run on every request to the agents router before the handler.
+
+---
+
+### Step 3 — Auth middleware: Bearer token verification
+
+**File:** `src/middleware/auth.ts` lines 28–77
+
+```typescript
+const authHeader = req.headers['authorization'];
+if (!authHeader || !authHeader.startsWith('Bearer ')) { throw new AuthenticationError(...) }
+
+const token = authHeader.slice(7).trim();
+const publicKey = process.env['JWT_PUBLIC_KEY'];
+let payload: ITokenPayload;
+try {
+  payload = verifyToken(token, publicKey);
+} catch (err) {
+  if (err instanceof TokenExpiredError) { throw new AuthenticationError('Token has expired.') }
+  if (err instanceof JsonWebTokenError) { throw new AuthenticationError('Token signature is invalid.') }
+}
+
+const redis = await getRedisClient();
+const revocationKey = `revoked:${payload.jti}`;
+const isRevoked = await redis.get(revocationKey);
+if (isRevoked !== null) { throw new AuthenticationError('Token has been revoked.') }
+
+req.user = payload;
+next();
+```
+
+**Why check Redis after signature verification?** Signature verification is a pure
+cryptographic operation (no I/O). If the token is expired or has a bad signature,
+there is no need to hit Redis. The fast path exits early; Redis is the slower
+secondary check.
+
+**Why `await getRedisClient()` instead of storing the client?** `getRedisClient()`
+returns the same singleton every time — the connection is created once and reused.
+The `await` is fast (no I/O after the first call).
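+
+A common way to get this behaviour is to memoise the connection promise. The sketch
+below assumes that pattern. `connect`, `getClient`, and `FakeClient` are
+hypothetical names, and no real Redis client is involved.
+
+```typescript
+interface FakeClient {
+  id: number;
+}
+
+let connectCount = 0;
+let clientPromise: Promise<FakeClient> | null = null;
+
+// Hypothetical stand-in for opening a Redis connection.
+async function connect(): Promise<FakeClient> {
+  connectCount += 1;
+  return { id: connectCount };
+}
+
+// Memoises the promise, so concurrent first callers share one connection attempt.
+function getClient(): Promise<FakeClient> {
+  if (clientPromise === null) {
+    clientPromise = connect();
+  }
+  return clientPromise;
+}
+
+// Both calls resolve to the same object and connect() runs only once.
+const bothClients = Promise.all([getClient(), getClient()]);
+```
+
+Caching the promise rather than the resolved client avoids a race in which two
+early callers each open their own connection.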
+
+---
+
+### Step 4 — OPA middleware: scope enforcement
+
+**File:** `src/middleware/opa.ts` lines 230–257
+
+```typescript
+const input: OpaInput = {
+  method: req.method,               // "POST"
+  path: req.baseUrl + req.path,     // "/api/v1/agents"
+  scopes: req.user.scope.split(' '), // ["agents:read", "agents:write"]
+};
+
+if (!evaluate(input)) {
+  next(new AuthorizationError());
+  return;
+}
+```
+
+For `POST /api/v1/agents`, the policy requires `["agents:write"]`. If `agents:write`
+is not in the token's scope, the request is rejected with 403 before the controller
+runs.
+
+**Why reconstruct the full path with `req.baseUrl + req.path`?** The OPA policy
+uses full paths (`/api/v1/agents/:id`). Inside a nested router, `req.path` is
+relative to the router's mount point (e.g. `/`). `req.baseUrl` is the mount prefix
+(`/api/v1/agents`). Concatenating them gives the full path the policy expects.
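+
+The shape of the decision can be sketched as a lookup table with default deny. This
+is an illustration under assumptions: the `rules` table is hypothetical (the real
+rules live in the OPA policy), and trimming a trailing slash is this sketch's own
+choice for the case where the router-relative path is `/`.
+
+```typescript
+interface OpaInput {
+  method: string;
+  path: string;
+  scopes: string[];
+}
+
+interface Rule {
+  method: string;
+  path: string;
+  required: string[];
+}
+
+// Hypothetical policy table for illustration.
+const rules: Rule[] = [
+  { method: "GET", path: "/api/v1/agents", required: ["agents:read"] },
+  { method: "POST", path: "/api/v1/agents", required: ["agents:write"] },
+];
+
+// Allows the request only if a rule matches and every required scope is present.
+function evaluate(input: OpaInput): boolean {
+  const path = input.path.length > 1 ? input.path.replace(/\/+$/, "") : input.path;
+  const rule = rules.find((r) => r.method === input.method && r.path === path);
+  if (!rule) return false; // default deny
+  return rule.required.every((s) => input.scopes.includes(s));
+}
+
+// baseUrl + path for a POST to the router root.
+const fullPath = "/api/v1/agents" + "/";
+const readOnly = evaluate({ method: "POST", path: fullPath, scopes: ["agents:read"] });
+const writer = evaluate({ method: "POST", path: fullPath, scopes: ["agents:read", "agents:write"] });
+```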
+
+---
+
+### Step 5 — Controller: validation
+
+**File:** `src/controllers/AgentController.ts` lines 37–60
+
+```typescript
+registerAgent = async (req: Request, res: Response, next: NextFunction): Promise<void> => {
+  if (!req.user) { throw new AuthorizationError() }
+
+  const { error, value } = createAgentSchema.validate(req.body, { abortEarly: false });
+  if (error) {
+    throw new ValidationError('Request validation failed.', {
+      details: error.details.map((d) => ({ field: d.path.join('.'), reason: d.message })),
+    });
+  }
+
+  const data = value as ICreateAgentRequest;
+  const ipAddress = req.ip ?? '0.0.0.0';
+  const userAgent = req.headers['user-agent'] ?? 'unknown';
+
+  const agent = await this.agentService.registerAgent(data, ipAddress, userAgent);
+  res.status(201).json(agent);
+```
+
+**Why check `req.user` in the controller when `authMiddleware` already set it?**
+TypeScript's type system marks `req.user` as `ITokenPayload | undefined`. The check
+at line 39 narrows the type so subsequent code can use `req.user` without non-null
+assertions (`req.user!`). It is a type guard, not redundant authentication.
+
+**Why pass `ipAddress` and `userAgent` to the service?** The service logs audit events.
+Audit events include the client IP and User-Agent for forensic value. These values
+come from the HTTP request, which the service has no access to — so the controller
+extracts them and passes them down.
+
+---
+
+### Step 6 — Service: free-tier limit check
+
+**File:** `src/services/AgentService.ts` lines 59–65
+
+```typescript
+const currentCount = await this.agentRepository.countActive();
+if (currentCount >= FREE_TIER_MAX_AGENTS) {
+  throw new FreeTierLimitError('Free tier limit of 100 registered agents has been reached.', ...);
+}
+```
+
+**Why count before checking email uniqueness?** If the limit is reached, the request
+fails regardless of whether the email already exists. Checking the count first
+avoids the email lookup in that case.
+
+---
+
+### Step 7 — Service: email uniqueness check
+
+**File:** `src/services/AgentService.ts` lines 68–71
+
+```typescript
+const existing = await this.agentRepository.findByEmail(data.email);
+if (existing !== null) { throw new AgentAlreadyExistsError(data.email) }
+```
+
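+**Why not rely on the database UNIQUE constraint?** We could, but catching a
+PostgreSQL `23505` error code in the repository would be less readable and would
+not produce a typed `AgentAlreadyExistsError` with a structured `details` field.
+The explicit check gives better error messages and keeps the repository layer clean.
+The check is not atomic, however: two concurrent registrations with the same email
+can both pass it, so the constraint remains the final safeguard against duplicates.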
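+
+For comparison, this is roughly what the constraint-based alternative would look
+like. It is a sketch, not code from the repository: the `AgentAlreadyExistsError`
+class here is a simplified stand-in, and `translateInsertError` is a hypothetical
+helper. The one fixed fact is that PostgreSQL reports a unique-constraint violation
+as SQLSTATE `23505`, which node-postgres exposes on the error's `code` property.
+
+```typescript
+// Simplified stand-in for the project's typed error.
+class AgentAlreadyExistsError extends Error {
+  readonly details: { email: string };
+
+  constructor(email: string) {
+    super(`Agent with email ${email} already exists.`);
+    this.name = "AgentAlreadyExistsError";
+    this.details = { email };
+  }
+}
+
+// True if the error carries the unique_violation SQLSTATE.
+function isUniqueViolation(err: unknown): boolean {
+  return typeof err === "object" && err !== null && (err as { code?: unknown }).code === "23505";
+}
+
+// Translates a driver error into the typed domain error; rethrows anything else.
+function translateInsertError(err: unknown, email: string): never {
+  if (isUniqueViolation(err)) throw new AgentAlreadyExistsError(email);
+  throw err;
+}
+
+// Simulate the driver rejecting a duplicate INSERT.
+let caught: unknown;
+try {
+  translateInsertError({ code: "23505" }, "bot@example.com");
+} catch (e) {
+  caught = e;
+}
+```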
+
+---
+
+### Step 8 — Repository: INSERT
+
+**File:** `src/repositories/AgentRepository.ts` lines 67–85
+
+```typescript
+async create(data: ICreateAgentRequest): Promise<IAgent> {
+  const agentId = uuidv4();
+  const result: QueryResult<AgentRow> = await this.pool.query(
+    `INSERT INTO agents (agent_id, email, agent_type, version, capabilities, owner, deployment_env, status, created_at, updated_at)
+     VALUES ($1, $2, $3, $4, $5, $6, $7, 'active', NOW(), NOW())
+     RETURNING *`,
+    [agentId, data.email, data.agentType, data.version, data.capabilities, data.owner, data.deploymentEnv],
+  );
+  return mapRowToAgent(result.rows[0]);
+}
+```
+
+**Why generate `agentId` in application code instead of relying on `gen_random_uuid()`?**
+Because the UUID doubles as the OAuth 2.0 `client_id`. Generating it in application
+code means the ID is known before the INSERT runs, so it can be used in the audit
+event and the response without depending on what the database returns.
+
+**Why `RETURNING *`?** PostgreSQL's `RETURNING` clause sends back the inserted row
+in the same round trip as the INSERT. This avoids a second SELECT to fetch the
+newly created record.
+
+---
+
+### Step 9 — Service: audit event
+
+**File:** `src/services/AgentService.ts` lines 76–83
+
+```typescript
+await this.auditService.logEvent(
+  agent.agentId,
+  'agent.created',
+  'success',
+  ipAddress,
+  userAgent,
+  { agentType: agent.agentType, owner: agent.owner },
+);
+```
+
+**Why `await` here but `void` for token audit events?** Agent registration is a
+database write operation that happens once. Adding ~5ms for the audit write is
+acceptable and ensures the audit event is recorded before the 201 response is sent.
+Token issuance happens far more frequently — audit is fire-and-forget there.
+
+---
+
+### Step 10 — Response
+
+**File:** `src/controllers/AgentController.ts` line 56
+
+```typescript
+res.status(201).json(agent);
+```
+
+Returns the full `IAgent` object with HTTP 201 Created.
+
+---
+
+## Walkthrough 3 — Credential Rotation
+
+**Request:** `POST /api/v1/agents/:agentId/credentials/:credentialId/rotate`
+
+Credential rotation is the process of replacing an existing client secret with a
+new one without changing the `credentialId`. This is the recommended security
+practice — rotate periodically and rotate immediately after suspected compromise.
+
+---
+
+### Step 1 — Route dispatch
+
+**File:** `src/routes/credentials.ts` line 34
+
+```typescript
+router.post('/:credentialId/rotate', asyncHandler(credentialController.rotateCredential.bind(credentialController)));
+```
+
+The credentials router is mounted at `/api/v1/agents/:agentId/credentials` in `app.ts`.
+The full path becomes `POST /api/v1/agents/:agentId/credentials/:credentialId/rotate`.
+
+---
+
+### Step 2 — Auth middleware
+
+Same as Walkthrough 2, Step 3. Bearer token is verified via RS256 and Redis revocation check.
+`req.user` is populated with the JWT payload.
+
+---
+
+### Step 3 — OPA middleware
+
+The concrete request path, with real UUIDs in place of `:agentId` and `:credentialId`,
+is normalised to `/api/v1/agents/:id/credentials/:credId/rotate`. The policy requires
+`["agents:write"]`.
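+
+Normalisation of this kind can be done with two regular-expression replacements.
+The sketch below is one possible approach, assuming IDs are UUIDs. `normalisePath`
+is a hypothetical name and the UUIDs are made up.
+
+```typescript
+const UUID = "[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}";
+
+// Replaces concrete IDs with the placeholders used by the policy patterns.
+function normalisePath(path: string): string {
+  return path
+    .replace(new RegExp(`/agents/${UUID}`), "/agents/:id")
+    .replace(new RegExp(`/credentials/${UUID}`), "/credentials/:credId");
+}
+
+const normalised = normalisePath(
+  "/api/v1/agents/a1b2c3d4-0000-4000-8000-000000000001" +
+    "/credentials/d4e5f6a7-0000-4000-8000-000000000002/rotate",
+);
+console.log(normalised); // /api/v1/agents/:id/credentials/:credId/rotate
+```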
+
+---
+
+### Step 4 — Controller: ownership check
+
+**File:** `src/controllers/CredentialController.ts` lines 127–137
+
+```typescript
+rotateCredential = async (req: Request, res: Response, next: NextFunction): Promise<void> => {
+  if (!req.user) { throw new AuthenticationError() }
+
+  const { agentId, credentialId } = req.params;
+
+  if (req.user.sub !== agentId) {
+    throw new AuthorizationError('You do not have permission to manage credentials for this agent.');
+  }
+```
+
+**Why check `req.user.sub !== agentId`?** An agent's token contains its own
+`agentId` as the `sub` claim. This check enforces that an agent can only manage
+its own credentials. Even if an agent has `agents:write` scope, it cannot rotate
+another agent's credentials. This is Phase 1 behaviour — there is no admin scope yet.
+
+---
+
+### Step 5 — Controller: request validation
+
+**File:** `src/controllers/CredentialController.ts` lines 139–157
+
+```typescript
+const { error, value } = generateCredentialSchema.validate(req.body ?? {}, { abortEarly: false });
+// generateCredentialSchema validates optional `expiresAt` field
+const data = value as IGenerateCredentialRequest;
+const result = await this.credentialService.rotateCredential(agentId, credentialId, data, ipAddress, userAgent);
+res.status(200).json(result);
+```
+
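+**Why `req.body ?? {}`?** The rotation body is optional: an agent may rotate a
+credential without an expiry date, in which case the body may be empty. Joi treats
+`undefined` differently from `{}`. An object schema that is not marked `required()`
+accepts `undefined` and returns it as the value, so later property access such as
+`data.expiresAt` would fail.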
+
+---
+
+### Step 6 — Service: existence checks
+
+**File:** `src/services/CredentialService.ts` lines 163–177
+
+```typescript
+const agent = await this.agentRepository.findById(agentId);
+if (!agent) { throw new AgentNotFoundError(agentId) }
+
+const existing = await this.credentialRepository.findById(credentialId);
+if (!existing || existing.clientId !== agentId) { throw new CredentialNotFoundError(credentialId) }
+
+if (existing.status === 'revoked') {
+  throw new CredentialAlreadyRevokedError(credentialId, existing.revokedAt?.toISOString() ?? ...);
+}
+```
+
+**Why check `existing.clientId !== agentId`?** The controller's ownership check
+(Step 4) only ties the caller to the `agentId` in the path. A malicious agent could
+still send its own `agentId` together with a `credentialId` belonging to another
+agent. This check ensures that a credential is only accessible to the agent it was
+created for.
+
+---
+
+### Step 7 — Service: generate new secret and write to Vault or bcrypt
+
+**File:** `src/services/CredentialService.ts` lines 180–192
+
+```typescript
+const expiresAt = data.expiresAt !== undefined ? new Date(data.expiresAt) : null;
+const plainSecret = generateClientSecret();  // sk_live_<64 hex chars>
+
+let updated: ICredential | null;
+
+if (this.vaultClient !== null) {
+  // Phase 2: overwrite the existing Vault secret (KV v2 creates a new version)
+  const vaultPath = await this.vaultClient.writeSecret(agentId, credentialId, plainSecret);
+  updated = await this.credentialRepository.updateVaultPath(credentialId, vaultPath, expiresAt);
+} else {
+  // Phase 1: use bcrypt
+  const newHash = await hashSecret(plainSecret);
+  updated = await this.credentialRepository.updateHash(credentialId, newHash, expiresAt);
+}
+```
+
+**Why does Vault rotation write to the same path?** Vault KV v2 is versioned: writing
+to an existing path creates a new version without overwriting previous versions.
+This preserves a version history in Vault itself, up to the mount's configured
+`max_versions`.
+
+**Why does the Vault path stay the same after rotation?** The `vault_path` column
+stores the path, not the secret. The path is deterministic:
+`{mount}/data/agentidp/agents/{agentId}/credentials/{credentialId}`. Since the
+`credentialId` does not change on rotation, the path does not change either.
+Only the Vault version at that path changes.
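+
+Because the path is a pure function of the IDs, it can be rebuilt at any time
+rather than looked up. A minimal sketch follows. `buildVaultPath` is a hypothetical
+helper, and the mount name `secret` and the IDs are example values.
+
+```typescript
+// Builds the KV v2 data path from the template quoted above.
+function buildVaultPath(mount: string, agentId: string, credentialId: string): string {
+  return `${mount}/data/agentidp/agents/${agentId}/credentials/${credentialId}`;
+}
+
+// The same IDs always produce the same path, before and after rotation.
+const beforeRotation = buildVaultPath("secret", "agent-123", "cred-456");
+const afterRotation = buildVaultPath("secret", "agent-123", "cred-456");
+console.log(beforeRotation); // secret/data/agentidp/agents/agent-123/credentials/cred-456
+```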
+
+---
+
+### Step 8 — Repository: UPDATE the credential
+
+**File:** `src/repositories/CredentialRepository.ts` lines 180–218
+
+```sql
+-- Bcrypt path (updateHash):
+UPDATE credentials
+SET secret_hash = $1, vault_path = NULL, expires_at = $2, status = 'active', revoked_at = NULL
+WHERE credential_id = $3
+RETURNING *;
+
+-- Vault path (updateVaultPath):
+UPDATE credentials
+SET vault_path = $1, secret_hash = '', expires_at = $2, status = 'active', revoked_at = NULL
+WHERE credential_id = $3
+RETURNING *;
+```
+
+**Why `status = 'active'` in the UPDATE?** The UPDATE sets `status` and `revoked_at`
+explicitly so the row is in a known state after rotation, whatever it held before.
+Note that this SQL on its own would reactivate a revoked credential. The service
+layer is what prevents that: revoked credentials throw `CredentialAlreadyRevokedError`
+before the repository is called.
+
+---
+
+### Step 9 — Service: audit event
+
+**File:** `src/services/CredentialService.ts` lines 199–206
+
+```typescript
+await this.auditService.logEvent(
+  agentId,
+  'credential.rotated',
+  'success',
+  ipAddress,
+  userAgent,
+  { credentialId },
+);
+```
+
+The audit event records which credential was rotated. Combined with the timestamp,
+this gives a complete rotation history for each credential.
+
+---
+
+### Step 10 — Response
+
+**File:** `src/controllers/CredentialController.ts` line 161
+
+```typescript
+res.status(200).json(result);
+```
+
+Returns `ICredentialWithSecret` — the updated credential including the new
+`clientSecret`. This is the only time the new secret is ever returned. The caller
+must store it securely.
+
+```json
+{
+  "credentialId": "d4e5f6a7-...",
+  "clientId": "a1b2c3d4-...",
+  "status": "active",
+  "clientSecret": "sk_live_4f8a2e9b...",
+  "createdAt": "2026-01-15T10:00:00Z",
+  "expiresAt": "2027-01-15T10:00:00Z",
+  "revokedAt": null
+}
+```