Files

SentryAgent.ai Developer fd90b2acd1 feat(phase-3): workstream 6 — SOC 2 Type II Preparation

Implements all 22 WS6 tasks completing Phase 3 Enterprise.

Column-level encryption (AES-256-CBC, Vault-backed key) via EncryptionService
applied to credentials.secret_hash, credentials.vault_path,
webhook_subscriptions.vault_secret_path, and agent_did_keys.vault_key_path.
Backward-compatible: isEncrypted() guard skips decryption for existing
plaintext rows until next read-write cycle.

Audit chain integrity (CC7.2): AuditRepository computes SHA-256 Merkle hash
on every INSERT (hash = SHA-256(eventId+timestamp+action+outcome+agentId+orgId+prevHash)).
AuditVerificationService walks the full chain verifying hash continuity.
AuditChainVerificationJob runs hourly; sets agentidp_audit_chain_integrity
Prometheus gauge to 1 (pass) or 0 (fail).

TLS enforcement (CC6.7): TLSEnforcementMiddleware registered as first
middleware in Express stack; 301 redirect on non-https X-Forwarded-Proto
in production.

SecretsRotationJob (CC9.2): hourly scan for credentials expiring within 7
days; increments agentidp_credentials_expiring_soon_total.

ComplianceController + routes: GET /audit/verify (auth+audit:read scope,
30/min rate-limit); GET /compliance/controls (public, Cache-Control 60s).
ComplianceStatusStore: module-level map updated by jobs, consumed by controller.

Prometheus: 2 new metrics (agentidp_credentials_expiring_soon_total,
agentidp_audit_chain_integrity); 6 alerting rules in alerts.yml.

Compliance docs: soc2-controls-matrix.md, encryption-runbook.md,
audit-log-runbook.md, incident-response.md, secrets-rotation.md.

Tests: 557 unit tests passing (35 suites); 26 new tests (EncryptionService,
AuditVerificationService); 19 compliance integration tests. TypeScript clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-03-31 00:41:53 +00:00

5.6 KiB

Raw Blame History

Audit Log Chain Verification Runbook — SentryAgent.ai AgentIdP

Control: SOC 2 CC7.2 — Audit Log Integrity Service: src/services/AuditVerificationService.ts Job: src/jobs/AuditChainVerificationJob.ts Endpoint: GET /api/v1/audit/verify

Overview

Every audit event in the audit_events PostgreSQL table is linked to the previous one via a SHA-256 hash chain. Each event stores:

hash — SHA-256 of (eventId + timestamp.toISOString() + action + outcome + agentId + organizationId + previousHash)
previous_hash — the hash of the immediately preceding event (ordered by timestamp ASC, event_id ASC)

The first event in the chain uses previous_hash = '' (empty string sentinel).

A PostgreSQL trigger (trg_audit_events_immutable) prevents UPDATE and DELETE operations on audit_events, making the log tamper-evident at the database level.

Running GET /audit/verify

Full chain verification (no date range)

# Requires Bearer token with audit:read scope
curl -s -H "Authorization: Bearer <token>" \
  "https://api.sentryagent.ai/v1/audit/verify"

Response (chain intact):

{
  "verified": true,
  "checkedCount": 18504,
  "brokenAtEventId": null
}

Response (chain break detected):

{
  "verified": false,
  "checkedCount": 1203,
  "brokenAtEventId": "c4d5e6f7-a8b9-0123-cdef-456789012345"
}

Date-ranged verification

curl -s -H "Authorization: Bearer <token>" \
  "https://api.sentryagent.ai/v1/audit/verify?fromDate=2026-03-01T00:00:00.000Z&toDate=2026-03-31T23:59:59.999Z"

Interpreting the response

Field	Meaning
`verified: true`	All events in the checked range maintain valid hash chain linkage
`verified: false`	At least one chain break detected — see `brokenAtEventId`
`checkedCount`	Number of events examined (0 = no events in range)
`brokenAtEventId`	UUID of the first event where the chain fails (`null` if verified)
`fromDate` / `toDate`	Echo of the date range parameters (only present if supplied)

AuditChainVerificationJob

The AuditChainVerificationJob runs automatically in the background every hour (default). Configure the interval via AUDIT_CHAIN_VERIFICATION_INTERVAL_MS (milliseconds).

On each tick it calls verifyChain() and:

Sets Prometheus gauge agentidp_audit_chain_integrity to 1 (passing)
Updates ComplianceStatusStore with CC7.2 = passing

If verification fails:

Sets gauge to 0
Updates ComplianceStatusStore with CC7.2 = failing
Prometheus alert AuditChainIntegrityFailed fires immediately (severity: critical)
Application logs: [AuditChainVerificationJob] Chain BROKEN at event <uuid>

What to Do When `brokenAtEventId` is Returned

Step 1: Preserve Evidence

Immediately capture the full state of the audit log for forensic analysis:

-- Export all events around the break point
SELECT event_id, timestamp, action, outcome, agent_id, organization_id, hash, previous_hash
FROM audit_events
WHERE timestamp >= (
  SELECT timestamp - INTERVAL '1 hour'
  FROM audit_events WHERE event_id = '<brokenAtEventId>'
)
ORDER BY timestamp ASC, event_id ASC;

Save the output to a secure, immutable location (e.g. S3 with object locking).

Step 2: Identify the Break Type

Compare the recomputed hash for the broken event with its stored hash:

# Using Node.js
node -e "
const crypto = require('crypto');
const eventId = '<event_id>';
const timestamp = '<timestamp_from_db>';
const action = '<action>';
const outcome = '<outcome>';
const agentId = '<agent_id>';
const orgId = '<organization_id>';
const prevHash = '<previous_hash_from_db>';
const expected = crypto.createHash('sha256')
  .update(eventId + new Date(timestamp).toISOString() + action + outcome + agentId + orgId + prevHash)
  .digest('hex');
console.log('Expected hash:', expected);
console.log('Stored hash: <hash_from_db>');
console.log('Match:', expected === '<hash_from_db>');
"

Possible break types:

Hash mismatch only — event data was modified after insertion
previous_hash mismatch — an event was inserted/deleted before this event in the chain
Both mismatched — multiple modifications or an injection attack

Step 3: Escalate

A chain break is a critical security incident. Immediately:

Notify the security team and CISO
Engage incident response procedure (docs/compliance/incident-response.md — Audit Chain Integrity Failure section)
Do NOT attempt to "fix" the hash — preserve the broken state as evidence
Consider temporarily suspending API access pending investigation
Notify affected customers per data breach notification obligations

Step 4: Forensic Investigation

Using PostgreSQL audit logs, Vault audit logs, and application logs:

Identify which application process or database connection modified the row
Correlate with access logs and authentication events
Determine the extent of the compromise (single row vs. systematic)

Verification Rate Limiting

GET /audit/verify is rate-limited to 30 requests/minute per client_id. For continuous monitoring, use AuditChainVerificationJob (background job, no rate limit) and poll GET /compliance/controls instead.

SOC 2 Evidence Package

For auditors, provide:

GET /audit/verify response (full chain, no date filter) — save as JSON
Prometheus metric export: agentidp_audit_chain_integrity time series (30/60/90 days)
PostgreSQL trigger definition: \d+ audit_events in psql
src/db/migrations/020_add_audit_chain_columns.sql — shows immutability trigger DDL
docs/openapi/compliance.yaml — endpoint specification

5.6 KiB Raw Blame History