sentryagent-idp/docs/compliance/audit-log-runbook.md
SentryAgent.ai Developer fd90b2acd1 feat(phase-3): workstream 6 — SOC 2 Type II Preparation
Implements all 22 WS6 tasks completing Phase 3 Enterprise.

Column-level encryption (AES-256-CBC, Vault-backed key) via EncryptionService
applied to credentials.secret_hash, credentials.vault_path,
webhook_subscriptions.vault_secret_path, and agent_did_keys.vault_key_path.
Backward-compatible: isEncrypted() guard skips decryption for existing
plaintext rows until next read-write cycle.

Audit chain integrity (CC7.2): AuditRepository computes SHA-256 Merkle hash
on every INSERT (hash = SHA-256(eventId+timestamp+action+outcome+agentId+orgId+prevHash)).
AuditVerificationService walks the full chain verifying hash continuity.
AuditChainVerificationJob runs hourly; sets agentidp_audit_chain_integrity
Prometheus gauge to 1 (pass) or 0 (fail).

TLS enforcement (CC6.7): TLSEnforcementMiddleware registered as first
middleware in Express stack; 301 redirect on non-https X-Forwarded-Proto
in production.

SecretsRotationJob (CC9.2): hourly scan for credentials expiring within 7
days; increments agentidp_credentials_expiring_soon_total.

ComplianceController + routes: GET /audit/verify (auth+audit:read scope,
30/min rate-limit); GET /compliance/controls (public, Cache-Control 60s).
ComplianceStatusStore: module-level map updated by jobs, consumed by controller.

Prometheus: 2 new metrics (agentidp_credentials_expiring_soon_total,
agentidp_audit_chain_integrity); 6 alerting rules in alerts.yml.

Compliance docs: soc2-controls-matrix.md, encryption-runbook.md,
audit-log-runbook.md, incident-response.md, secrets-rotation.md.

Tests: 557 unit tests passing (35 suites); 26 new tests (EncryptionService,
AuditVerificationService); 19 compliance integration tests. TypeScript clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-31 00:41:53 +00:00


Audit Log Chain Verification Runbook — SentryAgent.ai AgentIdP

Control: SOC 2 CC7.2, Audit Log Integrity
Service: src/services/AuditVerificationService.ts
Job: src/jobs/AuditChainVerificationJob.ts
Endpoint: GET /api/v1/audit/verify


Overview

Every audit event in the audit_events PostgreSQL table is linked to the previous one via a SHA-256 hash chain. Each event stores:

  • hash — SHA-256 of (eventId + timestamp.toISOString() + action + outcome + agentId + organizationId + previousHash)
  • previous_hash — the hash of the immediately preceding event (ordered by timestamp ASC, event_id ASC)

The first event in the chain uses previous_hash = '' (empty string sentinel).
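The hashing and linking rules above can be sketched in a few lines of Node.js. This is a minimal illustration, not the code in AuditVerificationService: the camelCase field names, the appendEvent helper, and the way checkedCount is counted are assumptions made for the example, and it only covers a full-chain walk that starts from the empty-string sentinel.

```javascript
const crypto = require('crypto');

// Sentinel stored in previous_hash for the first event, per the runbook.
const GENESIS_PREVIOUS_HASH = '';

// Recompute an event's hash from the fields listed above.
// Property names on `event` are illustrative assumptions.
function computeEventHash(event, previousHash) {
  const payload =
    event.eventId +
    new Date(event.timestamp).toISOString() +
    event.action +
    event.outcome +
    event.agentId +
    event.organizationId +
    previousHash;
  return crypto.createHash('sha256').update(payload).digest('hex');
}

// Hypothetical helper: link a new event to the tail of an in-memory chain.
function appendEvent(chain, fields) {
  const previousHash = chain.length
    ? chain[chain.length - 1].hash
    : GENESIS_PREVIOUS_HASH;
  const event = { ...fields, previousHash };
  event.hash = computeEventHash(event, previousHash);
  return [...chain, event];
}

// Walk events (already sorted by timestamp ASC, event_id ASC) and report
// the first event whose stored values do not match the recomputed chain.
function verifyChain(events) {
  let previousHash = GENESIS_PREVIOUS_HASH;
  let checkedCount = 0;
  for (const event of events) {
    checkedCount += 1;
    const linked = event.previousHash === previousHash;
    const intact = event.hash === computeEventHash(event, event.previousHash);
    if (!linked || !intact) {
      return { verified: false, checkedCount, brokenAtEventId: event.eventId };
    }
    previousHash = event.hash;
  }
  return { verified: true, checkedCount, brokenAtEventId: null };
}
```

Because each hash includes the previous hash, changing any field of an earlier event invalidates that event's own hash, which is why the walk can stop at the first mismatch.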

A PostgreSQL trigger (trg_audit_events_immutable) rejects UPDATE and DELETE operations on audit_events, so the log is tamper-resistant at the database level, while the hash chain makes any tampering that bypasses the trigger detectable.


Running GET /audit/verify

Full chain verification (no date range)

# Requires Bearer token with audit:read scope
curl -s -H "Authorization: Bearer <token>" \
  "https://api.sentryagent.ai/v1/audit/verify"

Response (chain intact):

{
  "verified": true,
  "checkedCount": 18504,
  "brokenAtEventId": null
}

Response (chain break detected):

{
  "verified": false,
  "checkedCount": 1203,
  "brokenAtEventId": "c4d5e6f7-a8b9-0123-cdef-456789012345"
}

Date-ranged verification

curl -s -H "Authorization: Bearer <token>" \
  "https://api.sentryagent.ai/v1/audit/verify?fromDate=2026-03-01T00:00:00.000Z&toDate=2026-03-31T23:59:59.999Z"

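When the date range is generated by a script rather than typed by hand, the query string can be assembled with the standard URL API. The sketch below mirrors the host and path from the curl examples; the function name buildVerifyUrl and its validation rules are assumptions for illustration, not part of the service.

```javascript
// Build a date-ranged verification URL. The path and parameter names mirror
// the curl examples above; the function itself is a hypothetical helper.
function buildVerifyUrl(baseUrl, fromDate, toDate) {
  const from = new Date(fromDate);
  const to = new Date(toDate);
  if (Number.isNaN(from.getTime()) || Number.isNaN(to.getTime())) {
    throw new RangeError('fromDate and toDate must be valid dates');
  }
  if (from > to) {
    throw new RangeError('fromDate must not be later than toDate');
  }
  const url = new URL('/v1/audit/verify', baseUrl);
  // toISOString() yields the same UTC millisecond format used in the example.
  url.searchParams.set('fromDate', from.toISOString());
  url.searchParams.set('toDate', to.toISOString());
  return url.toString();
}
```

The colons in the timestamps come out percent-encoded (%3A) in the resulting string; that is the standard form encoding and decodes back to the same values on the server.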
Interpreting the response

| Field | Meaning |
| --- | --- |
| verified: true | All events in the checked range maintain valid hash chain linkage |
| verified: false | At least one chain break detected; see brokenAtEventId |
| checkedCount | Number of events examined (0 = no events in range) |
| brokenAtEventId | UUID of the first event where the chain fails (null if verified) |
| fromDate / toDate | Echo of the date range parameters (only present if supplied) |
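For scripted checks, the fields above can be mapped to a short operator-facing summary. This is a sketch only: the response fields come from this runbook, while the function name, the level values, and the message wording are assumptions.

```javascript
// Turn a /audit/verify response body into a summary for operators.
// `level` values and messages are illustrative, not defined by the API.
function summarizeVerification(response) {
  if (response.verified === true) {
    if (response.checkedCount === 0) {
      // An empty range proves nothing about the chain.
      return { level: 'info', message: 'No events in range; nothing was verified.' };
    }
    return {
      level: 'ok',
      message: `Chain intact across ${response.checkedCount} events.`,
    };
  }
  return {
    level: 'critical',
    message: `Chain break at event ${response.brokenAtEventId}; follow the break procedure in this runbook.`,
  };
}
```

Treating an empty range as informational rather than passing avoids reporting a clean result when a mistyped date range matched no events.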

AuditChainVerificationJob

The AuditChainVerificationJob runs automatically in the background every hour (default). Configure the interval via AUDIT_CHAIN_VERIFICATION_INTERVAL_MS (milliseconds).

On each tick it calls verifyChain(). If verification passes, it:

  • Sets Prometheus gauge agentidp_audit_chain_integrity to 1 (passing)
  • Updates ComplianceStatusStore with CC7.2 = passing

If verification fails:

  • Sets gauge to 0
  • Updates ComplianceStatusStore with CC7.2 = failing
  • Prometheus alert AuditChainIntegrityFailed fires immediately (severity: critical)
  • Application logs: [AuditChainVerificationJob] Chain BROKEN at event <uuid>
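The pass and fail branches described above amount to a small piece of control flow. The sketch below shows one way a tick could be structured; verifyChain, gauge, statusStore, and logger are injected stand-ins chosen for the example, and resolveIntervalMs is a hypothetical helper, so none of this should be read as the actual AuditChainVerificationJob code.

```javascript
// One tick of a verification job. All four dependencies are stand-ins:
// `gauge` needs set(number), `statusStore` needs set(key, value),
// `logger` needs error(message).
async function runVerificationTick({ verifyChain, gauge, statusStore, logger }) {
  const result = await verifyChain();
  if (result.verified) {
    gauge.set(1);
    statusStore.set('CC7.2', 'passing');
  } else {
    gauge.set(0);
    statusStore.set('CC7.2', 'failing');
    logger.error(
      `[AuditChainVerificationJob] Chain BROKEN at event ${result.brokenAtEventId}`
    );
  }
  return result;
}

// Read the interval from the environment, falling back to one hour
// when the variable is unset or not a positive integer.
function resolveIntervalMs(env) {
  const parsed = Number.parseInt(env.AUDIT_CHAIN_VERIFICATION_INTERVAL_MS ?? '', 10);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : 60 * 60 * 1000;
}
```

A caller could schedule it with setInterval(() => runVerificationTick(deps), resolveIntervalMs(process.env)), where deps holds the four dependencies.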

What to Do When brokenAtEventId is Returned

Step 1: Preserve Evidence

Immediately capture the full state of the audit log for forensic analysis:

-- Export all events from one hour before the break point onward
SELECT event_id, timestamp, action, outcome, agent_id, organization_id, hash, previous_hash
FROM audit_events
WHERE timestamp >= (
  SELECT timestamp - INTERVAL '1 hour'
  FROM audit_events WHERE event_id = '<brokenAtEventId>'
)
ORDER BY timestamp ASC, event_id ASC;

Save the output to a secure, immutable location (e.g. S3 with object locking).

Step 2: Identify the Break Type

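Check whether the broken event's previous_hash equals the stored hash of the immediately preceding event in the Step 1 export, then recompute the event's hash and compare it with the stored value: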

# Using Node.js (replace each <placeholder> with the value from the database row)
node -e "
const crypto = require('crypto');
const eventId = '<event_id>';
const timestamp = '<timestamp_from_db>';
const action = '<action>';
const outcome = '<outcome>';
const agentId = '<agent_id>';
const orgId = '<organization_id>';
const prevHash = '<previous_hash_from_db>';
const storedHash = '<hash_from_db>';
// Concatenate in the same field order as the hash definition in the Overview
const expected = crypto.createHash('sha256')
  .update(eventId + new Date(timestamp).toISOString() + action + outcome + agentId + orgId + prevHash)
  .digest('hex');
console.log('Expected hash:', expected);
console.log('Stored hash:  ', storedHash);
console.log('Match:', expected === storedHash);
"

Possible break types:

  • Hash mismatch only — event data was modified after insertion
  • previous_hash mismatch — an event was inserted/deleted before this event in the chain
  • Both mismatched — multiple modifications or an injection attack
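The three break types can be told apart mechanically once the broken row and its predecessor are loaded from the Step 1 export. The following is a minimal sketch under assumptions: the camelCase property names and the returned labels are invented for the example, and predecessor is null when the broken event is the first in the chain.

```javascript
const crypto = require('crypto');

// Recompute the hash from the row's own stored fields, in the field order
// given in the Overview. Property names are illustrative assumptions.
function recomputeHash(event) {
  return crypto
    .createHash('sha256')
    .update(
      event.eventId +
        new Date(event.timestamp).toISOString() +
        event.action +
        event.outcome +
        event.agentId +
        event.organizationId +
        event.previousHash
    )
    .digest('hex');
}

// Classify a break. `event` is the row at brokenAtEventId; `predecessor`
// is the row immediately before it, or null for the first event.
function classifyBreak(event, predecessor) {
  const hashMismatch = recomputeHash(event) !== event.hash;
  const expectedPrevious = predecessor ? predecessor.hash : '';
  const previousHashMismatch = event.previousHash !== expectedPrevious;
  if (hashMismatch && previousHashMismatch) return 'both';
  if (hashMismatch) return 'hash-mismatch';
  if (previousHashMismatch) return 'previous-hash-mismatch';
  return 'none';
}
```

Since previous_hash is itself part of the hash input, editing only the previous_hash column of a row also shows up as a hash mismatch; a pure previous_hash mismatch points at a change to the neighbouring rows instead.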

Step 3: Escalate

A chain break is a critical security incident. Immediately:

  1. Notify the security team and CISO
  2. Engage incident response procedure (docs/compliance/incident-response.md — Audit Chain Integrity Failure section)
  3. Do NOT attempt to "fix" the hash — preserve the broken state as evidence
  4. Consider temporarily suspending API access pending investigation
  5. Notify affected customers per data breach notification obligations

Step 4: Forensic Investigation

Using PostgreSQL audit logs, Vault audit logs, and application logs:

  • Identify which application process or database connection modified the row
  • Correlate with access logs and authentication events
  • Determine the extent of the compromise (single row vs. systematic)

Verification Rate Limiting

GET /audit/verify is rate-limited to 30 requests/minute per client_id. For continuous monitoring, use AuditChainVerificationJob (background job, no rate limit) and poll GET /compliance/controls instead.


SOC 2 Evidence Package

For auditors, provide:

  1. GET /audit/verify response (full chain, no date filter) — save as JSON
  2. Prometheus metric export: agentidp_audit_chain_integrity time series (30/60/90 days)
  3. PostgreSQL trigger definition: \d+ audit_events in psql
  4. src/db/migrations/020_add_audit_chain_columns.sql — shows immutability trigger DDL
  5. docs/openapi/compliance.yaml — endpoint specification