feat(phase-3): workstream 6 — SOC 2 Type II Preparation

Implements all 22 WS6 tasks completing Phase 3 Enterprise.

Column-level encryption (AES-256-CBC, Vault-backed key) via EncryptionService applied to credentials.secret_hash, credentials.vault_path, webhook_subscriptions.vault_secret_path, and agent_did_keys.vault_key_path. Backward-compatible: isEncrypted() guard skips decryption for existing plaintext rows until next read-write cycle.

Audit chain integrity (CC7.2): AuditRepository computes a chained SHA-256 hash on every INSERT (hash = SHA-256(eventId+timestamp+action+outcome+agentId+orgId+prevHash)). AuditVerificationService walks the full chain verifying hash continuity. AuditChainVerificationJob runs hourly; sets agentidp_audit_chain_integrity Prometheus gauge to 1 (pass) or 0 (fail).

TLS enforcement (CC6.7): TLSEnforcementMiddleware registered as first middleware in Express stack; 301 redirect on non-https X-Forwarded-Proto in production.

SecretsRotationJob (CC9.2): hourly scan for credentials expiring within 7 days; increments agentidp_credentials_expiring_soon_total.

ComplianceController + routes: GET /audit/verify (auth+audit:read scope, 30/min rate-limit); GET /compliance/controls (public, Cache-Control 60s).

ComplianceStatusStore: module-level map updated by jobs, consumed by controller.

Prometheus: 2 new metrics (agentidp_credentials_expiring_soon_total, agentidp_audit_chain_integrity); 6 alerting rules in alerts.yml.

Compliance docs: soc2-controls-matrix.md, encryption-runbook.md, audit-log-runbook.md, incident-response.md, secrets-rotation.md.

Tests: 557 unit tests passing (35 suites); 26 new tests (EncryptionService, AuditVerificationService); 19 compliance integration tests. TypeScript clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
172
docs/compliance/audit-log-runbook.md
Normal file
@@ -0,0 +1,172 @@
# Audit Log Chain Verification Runbook — SentryAgent.ai AgentIdP

**Control:** SOC 2 CC7.2 — Audit Log Integrity
**Service:** `src/services/AuditVerificationService.ts`
**Job:** `src/jobs/AuditChainVerificationJob.ts`
**Endpoint:** `GET /api/v1/audit/verify`

---

## Overview

Every audit event in the `audit_events` PostgreSQL table is linked to the previous one
via a SHA-256 hash chain. Each event stores:

- `hash` — SHA-256 of `(eventId + timestamp.toISOString() + action + outcome + agentId + organizationId + previousHash)`
- `previous_hash` — the `hash` of the immediately preceding event (ordered by `timestamp ASC, event_id ASC`)

The first event in the chain uses `previous_hash = ''` (empty string sentinel).

A PostgreSQL trigger (`trg_audit_events_immutable`) prevents UPDATE and DELETE operations
on `audit_events`, making the log tamper-evident at the database level.

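The linking rule described in the Overview can be sketched in a few lines. This is an illustration only, not the `AuditRepository` code: the `AuditEvent` interface and the `computeEventHash` and `appendEvent` helpers are hypothetical names, and the chain is held in an in-memory array rather than in PostgreSQL.

```typescript
import { createHash } from "node:crypto";

// Hypothetical in-memory shape of an audit_events row; field names are assumed.
interface AuditEvent {
  eventId: string;
  timestamp: Date;
  action: string;
  outcome: string;
  agentId: string;
  organizationId: string;
  previousHash: string;
  hash: string;
}

type UnhashedEvent = Omit<AuditEvent, "hash" | "previousHash">;

// SHA-256 over the concatenated fields, in the order given in the Overview.
function computeEventHash(e: UnhashedEvent, previousHash: string): string {
  const payload =
    e.eventId + e.timestamp.toISOString() + e.action + e.outcome +
    e.agentId + e.organizationId + previousHash;
  return createHash("sha256").update(payload).digest("hex");
}

// Link a new event to the tail of the chain ('' sentinel for the first event).
function appendEvent(chain: AuditEvent[], e: UnhashedEvent): AuditEvent {
  const last = chain[chain.length - 1];
  const previousHash = last ? last.hash : "";
  const linked: AuditEvent = { ...e, previousHash, hash: computeEventHash(e, previousHash) };
  chain.push(linked);
  return linked;
}
```

Because each hash covers the previous hash, changing any stored field of an earlier event changes every hash that would be recomputed after it, which is what the verification walk relies on.
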
---

## Running GET /audit/verify

### Full chain verification (no date range)

```bash
# Requires Bearer token with audit:read scope
curl -s -H "Authorization: Bearer <token>" \
  "https://api.sentryagent.ai/v1/audit/verify"
```

**Response (chain intact):**
```json
{
  "verified": true,
  "checkedCount": 18504,
  "brokenAtEventId": null
}
```

**Response (chain break detected):**
```json
{
  "verified": false,
  "checkedCount": 1203,
  "brokenAtEventId": "c4d5e6f7-a8b9-0123-cdef-456789012345"
}
```

### Date-ranged verification

```bash
curl -s -H "Authorization: Bearer <token>" \
  "https://api.sentryagent.ai/v1/audit/verify?fromDate=2026-03-01T00:00:00.000Z&toDate=2026-03-31T23:59:59.999Z"
```

### Interpreting the response

| Field | Meaning |
|---|---|
| `verified: true` | All events in the checked range maintain valid hash chain linkage |
| `verified: false` | At least one chain break detected — see `brokenAtEventId` |
| `checkedCount` | Number of events examined (0 = no events in range) |
| `brokenAtEventId` | UUID of the first event where the chain fails (`null` if verified) |
| `fromDate` / `toDate` | Echo of the date range parameters (only present if supplied) |

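For scripted checks, the request and response above can be modelled as follows. This is a sketch under assumptions: `VerifyResponse`, `buildVerifyUrl` and `describeVerification` are hypothetical names, and the `/v1/audit/verify` path is taken from the curl examples rather than from the OpenAPI spec.

```typescript
// Response shape taken from the examples above; the optional fields echo the query range.
interface VerifyResponse {
  verified: boolean;
  checkedCount: number;
  brokenAtEventId: string | null;
  fromDate?: string;
  toDate?: string;
}

// Build the verification URL; omitting both dates requests a full-chain check.
function buildVerifyUrl(baseUrl: string, from?: Date, to?: Date): string {
  const url = new URL("/v1/audit/verify", baseUrl);
  if (from) url.searchParams.set("fromDate", from.toISOString());
  if (to) url.searchParams.set("toDate", to.toISOString());
  return url.toString();
}

// Turn a response into a one-line summary for logs or tickets.
function describeVerification(r: VerifyResponse): string {
  if (r.checkedCount === 0) return "no events in range";
  return r.verified
    ? `chain intact (${r.checkedCount} events checked)`
    : `chain BROKEN at ${r.brokenAtEventId} (${r.checkedCount} events checked)`;
}
```

`URLSearchParams` percent-encodes the colons in the ISO timestamps, which the unencoded curl example leaves to the shell and server to tolerate.
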
---

## AuditChainVerificationJob

The `AuditChainVerificationJob` runs automatically in the background every hour (default).
Configure the interval via `AUDIT_CHAIN_VERIFICATION_INTERVAL_MS` (milliseconds).

On each tick it calls `verifyChain()`. If verification passes, it:
- Sets Prometheus gauge `agentidp_audit_chain_integrity` to **1** (passing)
- Updates `ComplianceStatusStore` with `CC7.2 = passing`

If verification fails:
- Sets gauge to **0**
- Updates `ComplianceStatusStore` with `CC7.2 = failing`
- Prometheus alert `AuditChainIntegrityFailed` fires immediately (severity: critical)
- Application logs: `[AuditChainVerificationJob] Chain BROKEN at event <uuid>`

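One tick of the job, as described above, might look like the following. The `Gauge` interface, the `Map`-based status store and both function names are stand-ins invented for this sketch; the real `AuditChainVerificationJob` and `ComplianceStatusStore` are not shown in this runbook.

```typescript
// Minimal stand-ins for the Prometheus gauge and the verification result (assumed shapes).
interface Gauge { set(value: number): void; }
type ControlStatus = "passing" | "failing";
interface VerificationResult { verified: boolean; brokenAtEventId: string | null; }

// One tick: verify, publish the gauge value, record the control status, log a break.
async function runVerificationTick(
  verifyChain: () => Promise<VerificationResult>,
  gauge: Gauge,
  statusStore: Map<string, ControlStatus>,
  log: (msg: string) => void = console.error,
): Promise<void> {
  const result = await verifyChain();
  gauge.set(result.verified ? 1 : 0);
  statusStore.set("CC7.2", result.verified ? "passing" : "failing");
  if (!result.verified) {
    log(`[AuditChainVerificationJob] Chain BROKEN at event ${result.brokenAtEventId}`);
  }
}

// Interval falls back to one hour when the variable is unset or not a positive number.
function verificationIntervalMs(env: Record<string, string | undefined>): number {
  const parsed = Number(env.AUDIT_CHAIN_VERIFICATION_INTERVAL_MS);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : 60 * 60 * 1000;
}
```

A scheduler would call `runVerificationTick` every `verificationIntervalMs(process.env)` milliseconds.
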
---

## What to Do When `brokenAtEventId` is Returned

### Step 1: Preserve Evidence

Immediately capture the full state of the audit log for forensic analysis:

```sql
-- Export all events around the break point
SELECT event_id, timestamp, action, outcome, agent_id, organization_id, hash, previous_hash
FROM audit_events
WHERE timestamp >= (
  SELECT timestamp - INTERVAL '1 hour'
  FROM audit_events WHERE event_id = '<brokenAtEventId>'
)
ORDER BY timestamp ASC, event_id ASC;
```

Save the output to a secure, immutable location (e.g. S3 with object locking).

### Step 2: Identify the Break Type

Compare the recomputed hash for the broken event with its stored hash:

```bash
# Using Node.js
node -e "
  const crypto = require('crypto');
  const eventId = '<event_id>';
  const timestamp = '<timestamp_from_db>';
  const action = '<action>';
  const outcome = '<outcome>';
  const agentId = '<agent_id>';
  const orgId = '<organization_id>';
  const prevHash = '<previous_hash_from_db>';
  const expected = crypto.createHash('sha256')
    .update(eventId + new Date(timestamp).toISOString() + action + outcome + agentId + orgId + prevHash)
    .digest('hex');
  console.log('Expected hash:', expected);
  console.log('Stored hash: <hash_from_db>');
  console.log('Match:', expected === '<hash_from_db>');
"
```

Possible break types:
- **Hash mismatch only** — event data was modified after insertion
- **previous_hash mismatch** — an event was inserted/deleted before this event in the chain
- **Both mismatched** — multiple modifications or an injection attack

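The three break types can be told apart mechanically once the broken row and its predecessor have been exported. The sketch below is one reading of the list above, not the logic of `AuditVerificationService`: "hash mismatch" is taken to mean the recomputed hash differs from the stored `hash`, and "previous_hash mismatch" to mean the stored `previous_hash` differs from the predecessor's stored `hash`. `AuditRow`, `recomputeHash` and `classifyBreak` are hypothetical names.

```typescript
import { createHash } from "node:crypto";

// Hypothetical row shape mirroring the columns selected in Step 1.
interface AuditRow {
  event_id: string;
  timestamp: string;
  action: string;
  outcome: string;
  agent_id: string;
  organization_id: string;
  hash: string;
  previous_hash: string;
}

type BreakType = "none" | "hash-mismatch" | "previous-hash-mismatch" | "both";

// Same concatenation order as the Node.js one-liner in Step 2.
function recomputeHash(row: AuditRow): string {
  return createHash("sha256")
    .update(
      row.event_id + new Date(row.timestamp).toISOString() + row.action +
      row.outcome + row.agent_id + row.organization_id + row.previous_hash,
    )
    .digest("hex");
}

// `predecessor` is the row immediately before `row` in chain order, or null for the first event.
function classifyBreak(row: AuditRow, predecessor: AuditRow | null): BreakType {
  const hashOk = recomputeHash(row) === row.hash;
  const linkOk = row.previous_hash === (predecessor ? predecessor.hash : "");
  if (hashOk && linkOk) return "none";
  if (!hashOk && !linkOk) return "both";
  return hashOk ? "previous-hash-mismatch" : "hash-mismatch";
}
```

Run this against a copy of the exported evidence, never against the live table.
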
### Step 3: Escalate

A chain break is a **critical security incident**. Immediately:

1. Notify the security team and CISO
2. Engage incident response procedure (`docs/compliance/incident-response.md` — Audit Chain Integrity Failure section)
3. Do NOT attempt to "fix" the hash — preserve the broken state as evidence
4. Consider temporarily suspending API access pending investigation
5. Notify affected customers per data breach notification obligations

### Step 4: Forensic Investigation

Using PostgreSQL audit logs, Vault audit logs, and application logs:
- Identify which application process or database connection modified the row
- Correlate with access logs and authentication events
- Determine the extent of the compromise (single row vs. systematic)

---

## Verification Rate Limiting

`GET /audit/verify` is rate-limited to **30 requests/minute** per `client_id`.
For continuous monitoring, use `AuditChainVerificationJob` (background job, no rate limit)
and poll `GET /compliance/controls` instead.

---

## SOC 2 Evidence Package

For auditors, provide:

1. `GET /audit/verify` response (full chain, no date filter) — save as JSON
2. Prometheus metric export: `agentidp_audit_chain_integrity` time series (30/60/90 days)
3. PostgreSQL trigger definition: `\d+ audit_events` in psql
4. `src/db/migrations/020_add_audit_chain_columns.sql` — shows immutability trigger DDL
5. `docs/openapi/compliance.yaml` — endpoint specification
159
docs/compliance/encryption-runbook.md
Normal file
@@ -0,0 +1,159 @@
# Encryption Key Rotation Runbook — SentryAgent.ai AgentIdP

**Control:** SOC 2 CC6.1 — Encryption at Rest
**Service:** `src/services/EncryptionService.ts`
**Vault path:** Configured via `ENCRYPTION_KEY_VAULT_PATH` env var (default: `secret/data/agentidp/encryption-key`)

---

## Overview

AgentIdP uses AES-256-CBC column-level encryption for sensitive PostgreSQL columns.
The encryption key is a 64-character hex string (32 bytes) stored in HashiCorp Vault.
The `EncryptionService` fetches the key once and caches it in process memory.

Encrypted format: `base64(IV):base64(ciphertext)` where IV is 16 random bytes per encryption call.

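The encrypted format above can be reproduced with Node's built-in `crypto` module. This is a sketch of the format only and should not be read as the `EncryptionService` implementation: the function names are chosen for illustration, and the `isEncrypted` check shown here is an assumed heuristic (a 16-byte IV always encodes to 22 base64 characters followed by `==`).

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Encrypt to `base64(IV):base64(ciphertext)` with a fresh 16-byte IV per call.
function encrypt(plaintext: string, keyHex: string): string {
  const iv = randomBytes(16);
  const cipher = createCipheriv("aes-256-cbc", Buffer.from(keyHex, "hex"), iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return `${iv.toString("base64")}:${ciphertext.toString("base64")}`;
}

// Split on the first ':' (base64 never contains one) and reverse the steps.
function decrypt(value: string, keyHex: string): string {
  const sep = value.indexOf(":");
  const iv = Buffer.from(value.slice(0, sep), "base64");
  const data = Buffer.from(value.slice(sep + 1), "base64");
  const decipher = createDecipheriv("aes-256-cbc", Buffer.from(keyHex, "hex"), iv);
  return Buffer.concat([decipher.update(data), decipher.final()]).toString("utf8");
}

// Assumed heuristic: 24-char base64 IV, ':', then base64 ciphertext.
function isEncrypted(value: string): boolean {
  return /^[A-Za-z0-9+/]{22}==:[A-Za-z0-9+/]+={0,2}$/.test(value);
}
```

A guard of this kind is what lets plaintext rows written before the migration coexist with encrypted ones: values that do not match the format are passed through unchanged.
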
---

## Key Rotation Procedure

### Prerequisites

- Access to HashiCorp Vault with write permissions to the encryption key path
- Access to the production application environment (to trigger restart)
- At least one backup of the current key stored securely offline

### Step 1: Generate a New Key

Generate a cryptographically strong 32-byte (64-character hex) key:

```bash
openssl rand -hex 32
# Example output: a1b2c3d4e5f6... (64 hex chars)
```

Record the new key securely.

### Step 2: Backup the Current Key

Before overwriting, read and securely store the current key:

```bash
vault kv get -field=encryptionKey secret/agentidp/encryption-key > /secure/backup/encryption-key-$(date +%Y%m%d).txt
```

Store in a hardware security module (HSM) or offline key store.

### Step 3: Write the New Key to Vault

```bash
vault kv put secret/agentidp/encryption-key encryptionKey="<new-64-char-hex-key>"
```

Verify the write:

```bash
vault kv get secret/agentidp/encryption-key
```

Confirm the `encryptionKey` field contains exactly 64 hex characters.

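The 64-hex-character check in Step 3 is easy to get wrong by eye, so it can help to script it. The helpers below are hypothetical and the error wording is only modelled on the Troubleshooting entry; the trim handles a trailing newline that a file copy of the key may contain.

```typescript
// Accepts exactly 64 hex characters, i.e. 32 bytes for AES-256.
function isValidEncryptionKey(candidate: string): boolean {
  return /^[0-9a-fA-F]{64}$/.test(candidate);
}

// Decode a validated key into its 32 raw bytes, or throw with a descriptive message.
function parseEncryptionKey(candidate: string): Buffer {
  const trimmed = candidate.trim();
  if (!isValidEncryptionKey(trimmed)) {
    throw new Error(
      `Invalid encryption key: expected a 64-character hex string, got ${trimmed.length} characters`,
    );
  }
  return Buffer.from(trimmed, "hex");
}
```
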
### Step 4: Restart the Application

The `EncryptionService` caches the key in process memory. A restart forces a re-fetch from Vault:

```bash
# Kubernetes rolling restart
kubectl rollout restart deployment/agentidp

# Docker Compose
docker-compose restart agentidp

# PM2
pm2 restart agentidp
```

### Step 5: Verify Key Pick-Up

Check the application logs for:

```
[AgentIdP] EncryptionService enabled — sensitive columns encrypted at rest (SOC 2 CC6.1)
```

Call the compliance controls endpoint to confirm the control is passing:

```bash
curl -s https://api.sentryagent.ai/v1/compliance/controls | jq '.controls[] | select(.id == "CC6.1")'
```

Expected output:
```json
{ "id": "CC6.1", "name": "Encryption at Rest", "status": "passing", "lastChecked": "..." }
```

### Step 6: Re-encryption of Existing Rows

Rows encrypted with the old key cannot be decrypted with the new key.
Re-encryption can happen lazily: the next time each row is read and re-written (e.g. credential rotation,
webhook update), the application decrypts with the old key and re-encrypts with the new one.
This requires the old key to still be available to the application; with only the new key loaded,
rows that have not yet been re-encrypted fail with the decryption error listed under Troubleshooting.

For immediate full re-encryption, use the re-encryption script:

```bash
# Run the re-encryption migration script (reads old key from backup, encrypts with new key)
# Note: This script requires both old and new keys to be available
ts-node scripts/reencrypt-columns.ts --old-key-file /secure/backup/encryption-key-<date>.txt
```

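Per value, re-encryption amounts to a decrypt with the old key followed by an encrypt with the new one. The sketch below shows that single step under the `base64(IV):base64(ciphertext)` format from the Overview; it is not the content of `scripts/reencrypt-columns.ts`, which is not reproduced in this runbook, and all three function names are hypothetical.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Decrypt a `base64(IV):base64(ciphertext)` value with the given 64-char hex key.
function decryptWith(keyHex: string, value: string): string {
  const sep = value.indexOf(":");
  const iv = Buffer.from(value.slice(0, sep), "base64");
  const data = Buffer.from(value.slice(sep + 1), "base64");
  const decipher = createDecipheriv("aes-256-cbc", Buffer.from(keyHex, "hex"), iv);
  return Buffer.concat([decipher.update(data), decipher.final()]).toString("utf8");
}

// Encrypt with a fresh random IV so the new value shares nothing with the old one.
function encryptWith(keyHex: string, plaintext: string): string {
  const iv = randomBytes(16);
  const cipher = createCipheriv("aes-256-cbc", Buffer.from(keyHex, "hex"), iv);
  const data = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return `${iv.toString("base64")}:${data.toString("base64")}`;
}

// Old key in, new key out; throws if the value was not encrypted with `oldKeyHex`
// in the common case where the padding check fails.
function reencryptValue(value: string, oldKeyHex: string, newKeyHex: string): string {
  return encryptWith(newKeyHex, decryptWith(oldKeyHex, value));
}
```

A batch script built on this would wrap each table update in a transaction and record progress, so that an interrupted run can resume without re-encrypting a row twice.
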
---

## Emergency Rollback

If the new key causes issues (e.g. test failures, decryption errors), roll back:

### Step 1: Restore Old Key to Vault

```bash
vault kv put secret/agentidp/encryption-key encryptionKey="<old-64-char-hex-key-from-backup>"
```

### Step 2: Restart the Application

```bash
kubectl rollout restart deployment/agentidp
```

### Step 3: Verify Recovery

```bash
curl -s https://api.sentryagent.ai/v1/compliance/controls | jq '.controls[] | select(.id == "CC6.1")'
```

### Step 4: Investigate Root Cause

Review application logs for `AES-256-CBC decryption failed` errors and audit the cause before
reattempting rotation.

---

## Troubleshooting

| Symptom | Likely Cause | Resolution |
|---|---|---|
| `Invalid encryption key ... expected a 64-character hex string` | Key in Vault is wrong length or encoding | Re-write correct key to Vault, restart |
| `AES-256-CBC decryption failed — possible key mismatch` | Key rotated but rows still encrypted with old key | Rollback to old key, then migrate properly |
| `CC6.1` status shows `unknown` | Vault unreachable, key fetch failed | Check Vault connectivity, `VAULT_ADDR`, `VAULT_TOKEN` |

---

## Audit Evidence

After rotation, record the following for SOC 2 evidence:

- Date of rotation
- Who performed the rotation (approver + executor)
- Vault audit log entry confirming the key write
- Application log confirming EncryptionService initialised with new key
- `GET /compliance/controls` response showing CC6.1 = passing
229
docs/compliance/incident-response.md
Normal file
@@ -0,0 +1,229 @@
# Incident Response Runbook — SentryAgent.ai AgentIdP

**Owner:** Security Engineering
**Last updated:** 2026-03-31
**Applies to:** Production AgentIdP deployments

This runbook covers the four incident types most relevant to SOC 2 Type II compliance monitoring.

---

## 1. Auth Failure Spike

### Detection

**Prometheus alert:** `AuthFailureSpike`
```yaml
expr: rate(agentidp_http_requests_total{status_code="401"}[5m]) > 0.5
for: 2m
severity: warning
```

Triggers when the rate of HTTP 401 responses exceeds 0.5 per second sustained over 2 minutes.

### Immediate Actions

1. Acknowledge the alert in PagerDuty / alerting system
2. Check whether the spike correlates with a scheduled process (e.g. batch agent key rotation, deployment)
3. Check Prometheus dashboard for the geographic distribution of the failing requests

### Investigation Steps

1. **Identify source agents:**
   ```bash
   # Query audit log for recent auth failures
   curl -s -H "Authorization: Bearer <admin-token>" \
     "https://api.sentryagent.ai/v1/audit?action=auth.failed&limit=100"
   ```

2. **Check for brute-force patterns:**
   Look for repeated failures from the same `client_id` or IP address.

3. **Check if an agent's credentials expired:**
   ```bash
   # Look for expired credentials
   psql "$DATABASE_URL" -c "
     SELECT credential_id, client_id, expires_at
     FROM credentials
     WHERE status = 'active' AND expires_at < NOW()
     ORDER BY expires_at DESC LIMIT 20;"
   ```

4. **Check for key compromise signals:**
   - Multiple agents failing simultaneously → possible key store issue
   - Single agent with high failure rate → possible credential stuffing or misconfiguration

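The brute-force check in step 2 is a group-and-count over the `auth.failed` events returned in step 1. A minimal sketch, assuming each event exposes a client ID and source IP (the `AuthFailure` shape and the threshold are illustrative, not the audit API's schema):

```typescript
// Hypothetical shape of an auth.failed audit entry; only the fields used here are modelled.
interface AuthFailure {
  clientId: string;
  ipAddress: string;
}

// Count failures per key and return those at or above `threshold`, highest first.
function repeatedFailures(
  failures: AuthFailure[],
  keyOf: (f: AuthFailure) => string,
  threshold: number,
): Array<{ key: string; count: number }> {
  const counts = new Map<string, number>();
  for (const f of failures) {
    const key = keyOf(f);
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return Array.from(counts.entries())
    .filter(([, count]) => count >= threshold)
    .map(([key, count]) => ({ key, count }))
    .sort((a, b) => b.count - a.count);
}
```

Calling it once with `f => f.clientId` and once with `f => f.ipAddress` separates a single misconfigured agent from one address trying many client IDs.
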
### Escalation Path

- **Warning (< 2 req/s):** Engineering on-call investigates within 1 hour
- **Critical (> 2 req/s sustained):** CISO notified, potential account compromise investigation
- **If credential compromise confirmed:** Revoke affected credentials immediately via `POST /agents/:id/credentials/:credId/revoke`

---

## 2. Anomalous Token Issuance

### Detection

**Prometheus alert:** `AnomalousTokenIssuance`
```yaml
expr: rate(agentidp_tokens_issued_total[5m]) > 10
for: 5m
severity: warning
```

Triggers when token issuance rate exceeds 10 per second for 5 continuous minutes.

### Immediate Actions

1. Acknowledge the alert
2. Determine if a legitimate mass-scale operation is underway (e.g. new customer onboarding, load test)
3. Check the `scope` label breakdown on `agentidp_tokens_issued_total` to identify what scopes are being requested

### Investigation Steps

1. **Identify top issuing agents:**
   ```bash
   # Query audit log for recent token issuances
   curl -s -H "Authorization: Bearer <admin-token>" \
     "https://api.sentryagent.ai/v1/audit?action=token.issued&limit=100"
   ```

2. **Check monthly token budget:**
   Each agent is limited to 10,000 tokens/month (free tier). A single agent hitting the limit may indicate automation abuse.

3. **Check for abnormal scope combinations:**
   If tokens are being issued with `admin:orgs` or `audit:read` at high volume, this warrants immediate investigation.

4. **Check for valid business reason:**
   Contact the organization owner for the top-issuing agents.

### Escalation Path

- **Warning:** Engineering on-call investigates within 4 hours
- **If compromise suspected:** Revoke affected agent tokens via Redis revocation list, rotate credentials
- **If systematic abuse confirmed:** Suspend the issuing agent(s) via `PATCH /agents/:id` with `status: suspended`

---

## 3. Audit Chain Integrity Failure

### Detection

**Prometheus alert:** `AuditChainIntegrityFailed`
```yaml
expr: agentidp_audit_chain_integrity == 0
for: 0m
severity: critical
```

Fires immediately when `AuditChainVerificationJob` detects a break in the audit event hash chain.
This is a **CRITICAL** security event — possible evidence of log tampering.

### Immediate Actions

1. **Do NOT attempt to repair the broken chain** — preserve all evidence
2. Notify CISO and security team immediately
3. Page the on-call security engineer with P0 priority
4. Capture the current state:
   ```bash
   curl -s -H "Authorization: Bearer <audit-token>" \
     "https://api.sentryagent.ai/v1/audit/verify" | tee /secure/incident-$(date +%Y%m%d-%H%M).json
   ```

### Investigation Steps

1. **Determine the broken event:**
   The `brokenAtEventId` field in the `/audit/verify` response identifies the first broken event.

2. **Forensic analysis:**
   Follow the steps in `docs/compliance/audit-log-runbook.md` — "What to Do When brokenAtEventId is Returned".

3. **Check database access logs:**
   Review PostgreSQL `pg_stat_activity` and connection logs for unauthorized direct DB access.

4. **Check application logs:**
   Look for any errors from the immutability trigger (`audit_events_immutable`).

5. **Check Vault audit logs:**
   Review whether any encryption key access was abnormal.

### Escalation Path

- **Immediate:** CISO + Legal + Security Engineering
- **Within 1 hour:** Begin forensic preservation per incident response plan
- **Within 24 hours:** Determine scope of compromise and notification obligations
- **Customer notification:** Per contractual and regulatory obligations (GDPR, SOC 2 requirements)

---

## 4. Webhook Dead-Letter Accumulation

### Detection

**Prometheus alert:** `WebhookDeadLetterAccumulating`
```yaml
expr: increase(agentidp_webhook_dead_letters_total[1h]) > 10
for: 0m
severity: critical
```

Fires when more than 10 webhook deliveries reach dead-letter status within an hour.

### Immediate Actions

1. Acknowledge the alert
2. Check which `organization_id` labels are accumulating dead-letters:
   ```bash
   # Prometheus query: top organizations by dead-letter count over the last hour
   # topk(10, sum by (organization_id) (increase(agentidp_webhook_dead_letters_total[1h])))
   ```

3. Check if the destination endpoints are reachable:
   ```bash
   curl -I https://<webhook-destination-url>/
   ```

### Investigation Steps

1. **List affected webhook subscriptions:**
   ```bash
   # Query delivery records for dead-letter status
   psql "$DATABASE_URL" -c "
     SELECT s.id, s.organization_id, s.url, COUNT(d.id) AS dead_letters
     FROM webhook_subscriptions s
     JOIN webhook_deliveries d ON d.subscription_id = s.id
     WHERE d.status = 'dead_letter'
       AND d.updated_at > NOW() - INTERVAL '2 hours'
     GROUP BY s.id
     ORDER BY dead_letters DESC
     LIMIT 20;"
   ```

2. **Check delivery failure reasons:**
   ```bash
   psql "$DATABASE_URL" -c "
     SELECT http_status_code, COUNT(*) as count
     FROM webhook_deliveries
     WHERE status = 'dead_letter'
       AND updated_at > NOW() - INTERVAL '2 hours'
     GROUP BY http_status_code;"
   ```

3. **Common causes and resolutions:**

   | HTTP Status | Likely Cause | Resolution |
   |---|---|---|
   | 0 / null | Network unreachable / DNS failure | Check recipient endpoint availability |
   | 401 / 403 | HMAC signature validation failing | Customer to verify HMAC secret |
   | 404 | Endpoint URL changed | Customer to update webhook URL |
   | 5xx | Recipient server error | Customer to investigate their endpoint |
   | Timeout | Slow recipient endpoint | Customer to optimize endpoint response time |

4. **Notify affected customers:**
   Contact the organization owner for high-volume dead-letter subscriptions.

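The causes table in step 3 can be applied directly to the `http_status_code` values returned by the query in step 2. The function below is a hypothetical triage helper, and the `timedOut` flag is an assumption, since the runbook does not say how timeouts are recorded on a delivery row.

```typescript
interface Diagnosis {
  cause: string;
  resolution: string;
}

// `statusCode` is null or 0 when no HTTP response was received.
function diagnoseDeadLetter(statusCode: number | null, timedOut = false): Diagnosis {
  if (timedOut) {
    return { cause: "Slow recipient endpoint", resolution: "Customer to optimize endpoint response time" };
  }
  if (statusCode === null || statusCode === 0) {
    return { cause: "Network unreachable / DNS failure", resolution: "Check recipient endpoint availability" };
  }
  if (statusCode === 401 || statusCode === 403) {
    return { cause: "HMAC signature validation failing", resolution: "Customer to verify HMAC secret" };
  }
  if (statusCode === 404) {
    return { cause: "Endpoint URL changed", resolution: "Customer to update webhook URL" };
  }
  if (statusCode >= 500 && statusCode <= 599) {
    return { cause: "Recipient server error", resolution: "Customer to investigate their endpoint" };
  }
  // Anything else is outside the table and needs a manual look.
  return { cause: "Not covered by the table", resolution: "Inspect the delivery record manually" };
}
```
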
### Escalation Path

- **Warning (10-50/hr):** Engineering notifies affected customers, investigates endpoint health
- **Critical (> 50/hr):** Engineering on-call + Platform reliability team engaged
- **If systemic delivery infrastructure failure:** Activate incident bridge, escalate to VP Engineering
142
docs/compliance/secrets-rotation.md
Normal file
@@ -0,0 +1,142 @@
# Secrets Rotation Runbook — SentryAgent.ai AgentIdP

**Control:** SOC 2 CC9.2 — Secrets Rotation
**Last updated:** 2026-03-31

---

## Overview

AgentIdP manages three categories of secrets that require periodic rotation:

1. **Agent client secrets** — Per-credential client secrets used for OAuth 2.0 token issuance
2. **OIDC signing keys** — RSA/EC keys used to sign ID tokens
3. **AES-256-CBC encryption key** — Column-level database encryption key (see `encryption-runbook.md`)

---

## 1. Agent Credential (Client Secret) Rotation

### API endpoint

```
POST /api/v1/agents/:agentId/credentials/:credentialId/rotate
```

Requires Bearer token with `agents:write` scope.

### Procedure

```bash
# 1. List active credentials for the agent
curl -s -H "Authorization: Bearer <token>" \
  "https://api.sentryagent.ai/v1/agents/<agentId>/credentials?status=active"

# 2. Rotate the credential (generate new secret)
curl -s -X POST \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{"expiresAt": "2027-03-31T00:00:00.000Z"}' \
  "https://api.sentryagent.ai/v1/agents/<agentId>/credentials/<credentialId>/rotate"

# Response includes the new clientSecret — store it immediately; it is never shown again
```

### Key points

- The new `clientSecret` is returned **once only** — store it securely before the response is discarded
- The agent's previous secret is immediately invalidated (Vault KV v2 version overwritten)
- An audit event `credential.rotated` is logged to the immutable audit chain
- A `credential.rotated` webhook event is dispatched to all active subscriptions

### Recommended rotation schedule

| Credential type | Recommended rotation interval |
|---|---|
| Production agent credentials | 90 days |
| Staging / development credentials | 180 days |
| Service account credentials | 365 days (annual) |
| Credentials involved in a security incident | Immediately |

### Automated expiry detection

`SecretsRotationJob` runs hourly and queries credentials expiring within 7 days.
Prometheus alert `CredentialExpiryApproaching` fires immediately when any are detected.
Respond to this alert by rotating the flagged credential(s) before the expiry date.

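The 7-day window used by the expiry scan can be expressed as a simple filter. This sketch works on an in-memory list with a hypothetical `Credential` shape; the real job queries the `credentials` table. Whether already-expired credentials are counted is not stated in this runbook, so the sketch assumes they are excluded.

```typescript
// Hypothetical projection of a credentials row.
interface Credential {
  credentialId: string;
  agentId: string;
  status: string;
  expiresAt: Date;
}

const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000;

// Active credentials whose expiry falls after `now` and no later than `now + windowMs`.
function expiringSoon(creds: Credential[], now: Date, windowMs: number = SEVEN_DAYS_MS): Credential[] {
  const start = now.getTime();
  const cutoff = start + windowMs;
  return creds.filter((c) => {
    const expiry = c.expiresAt.getTime();
    return c.status === "active" && expiry > start && expiry <= cutoff;
  });
}
```

Each credential returned would correspond to one increment of `agentidp_credentials_expiring_soon_total` for its `agentId`.
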
---

## 2. OIDC Signing Key Rotation

### Overview

OIDC signing keys are managed by `OIDCKeyService` (`src/services/OIDCKeyService.ts`).
Keys are stored in the `oidc_keys` PostgreSQL table. The current active key is used to
sign all new ID tokens; public keys are exposed via `GET /.well-known/jwks.json`.

### When to rotate

- Key compromise or suspected exposure
- Scheduled rotation (recommended every 90 days for production)
- Algorithm upgrade (e.g. RS256 → ES256)

### Rotation procedure

OIDC key rotation is handled automatically by `OIDCKeyService.ensureCurrentKey()`:

```bash
# Force generation of a new signing key by calling the internal rotate endpoint
# (or trigger by redeploying with OIDC_FORCE_KEY_ROTATION=true)

# 1. Mark current key as inactive (if manual rotation is required)
psql "$DATABASE_URL" -c "
  UPDATE oidc_keys
  SET active = false
  WHERE active = true;"

# 2. Restart the application — ensureCurrentKey() will generate a new key on startup
kubectl rollout restart deployment/agentidp
```

### JWKS update behavior

- Old public keys remain in `GET /.well-known/jwks.json` for **24 hours** after rotation
  (grace period for in-flight tokens)
- After the grace period, old keys are removed from the JWKS endpoint
- Redis JWKS cache TTL is configured by `JWKS_CACHE_TTL_SECONDS` (default: 3600)

### Impact on existing tokens

Existing valid tokens signed with the old key **continue to work** until they expire,
as long as the old public key remains in JWKS. After the grace period, old tokens
will fail verification.

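The grace-period rule for the JWKS endpoint reduces to a time comparison per key. The sketch below assumes each key records when it was deactivated; the `deactivatedAt` field and `SigningKey` shape are hypothetical, since the runbook only shows an `active` column on `oidc_keys`.

```typescript
// Hypothetical projection of an oidc_keys row; `deactivatedAt` is an assumed column.
interface SigningKey {
  kid: string;
  active: boolean;
  deactivatedAt: Date | null;
}

const JWKS_GRACE_PERIOD_MS = 24 * 60 * 60 * 1000;

// Keys to expose in the JWKS document: the active key plus any key retired within the last 24 hours.
function keysToPublish(keys: SigningKey[], now: Date): SigningKey[] {
  return keys.filter((k) => {
    if (k.active) return true;
    if (k.deactivatedAt === null) return false;
    return now.getTime() - k.deactivatedAt.getTime() < JWKS_GRACE_PERIOD_MS;
  });
}
```

Clients that cache the JWKS document may keep serving a removed key until their own cache expires, so the effective cut-off can trail the grace period by up to `JWKS_CACHE_TTL_SECONDS`.
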
---

## 3. Encryption Key Rotation

See `docs/compliance/encryption-runbook.md` for the full AES-256-CBC encryption key rotation procedure.

**Summary:** Generate new 32-byte hex key → write to Vault at `ENCRYPTION_KEY_VAULT_PATH` → restart app → existing rows re-encrypted lazily on next read-write cycle.

---

## Schedule Recommendations

| Secret Type | Production Interval | Staging Interval | Trigger for Immediate Rotation |
|---|---|---|---|
| Agent client secrets | 90 days | 180 days | Credential suspected compromised |
| OIDC signing keys | 90 days | 180 days | Key file exposed, algorithm upgrade |
| AES-256-CBC encryption key | 365 days (annual) | On demand | Key exposed, Vault breach, compliance audit requirement |
| Webhook HMAC secrets | Per customer policy | N/A | Webhook endpoint compromised |

---

## Compliance Evidence

For SOC 2 CC9.2 evidence collection:

- Prometheus metric history: `agentidp_credentials_expiring_soon_total`
- Audit log entries with `action: credential.rotated` — query via `GET /audit?action=credential.rotated`
- Key rotation records from Vault audit log
- This runbook + sign-off from Security Engineering
42
docs/compliance/soc2-controls-matrix.md
Normal file
@@ -0,0 +1,42 @@
|
||||
# SOC 2 Type II Controls Matrix — SentryAgent.ai AgentIdP
|
||||
|
||||
This document maps the five in-scope SOC 2 Trust Services Criteria (TSC) controls to their
|
||||
corresponding implementation artefacts, mechanisms, and automated verification methods.
|
||||
|
||||
---
|
||||
|
||||
## Controls Matrix
|
||||
|
||||
| Control ID | TSC Criterion Name | Implementation File | Mechanism | Automated Check |
|---|---|---|---|---|
| **CC6.1** | Encryption at Rest | `src/services/EncryptionService.ts` | AES-256-CBC column-level encryption on `credentials.secret_hash`, `credentials.vault_path`, `webhook_subscriptions.vault_secret_path`, `agent_did_keys.vault_key_path`. Key is stored in HashiCorp Vault KV v2 at path configured by `ENCRYPTION_KEY_VAULT_PATH`. IV is randomised per encryption call. Backward-compat: `isEncrypted()` gate allows plaintext rows to coexist during migration. | `GET /api/v1/compliance/controls` returns `CC6.1` status. Status is set to `passing` on service startup when `EncryptionService` initialises. |
| **CC6.7** | TLS Enforcement | `src/middleware/TLSEnforcementMiddleware.ts` | Express middleware registered as the **first** middleware in the app stack (before all routes and body parsers). In `NODE_ENV=production`, checks `X-Forwarded-Proto` header set by the upstream load balancer/reverse proxy. Any non-HTTPS request receives a `301 Moved Permanently` redirect to `https://`. | `GET /api/v1/compliance/controls` returns `CC6.7` status. TLS enforcement is a static configuration control; status is set to `passing` on application startup. |
| **CC7.2** | Audit Log Integrity | `src/services/AuditVerificationService.ts`, `src/repositories/AuditRepository.ts`, `src/jobs/AuditChainVerificationJob.ts` | Each audit event (`audit_events` table) stores a `hash` (SHA-256 of `eventId + timestamp + action + outcome + agentId + organizationId + previousHash`) and `previous_hash` linking it to the prior event. An immutability trigger prevents UPDATE/DELETE on `audit_events`. `AuditChainVerificationJob` re-walks the entire chain every hour. | Prometheus gauge `agentidp_audit_chain_integrity` (1 = passing, 0 = failing). Prometheus alert `AuditChainIntegrityFailed` fires when gauge = 0. `GET /api/v1/audit/verify` triggers an on-demand verification. `GET /api/v1/compliance/controls` returns `CC7.2` status. |
| **CC9.2** | Secrets Rotation | `src/jobs/SecretsRotationJob.ts` | `SecretsRotationJob` runs every hour (configurable via `SECRETS_ROTATION_CHECK_INTERVAL_MS`) and queries `credentials` for `active` credentials expiring within 7 days. For each, it increments the `agentidp_credentials_expiring_soon_total` Prometheus counter with the owning `agent_id`. Operators are expected to act on the alert within the 7-day window. | Prometheus counter `agentidp_credentials_expiring_soon_total` per `agent_id`. Prometheus alert `CredentialExpiryApproaching` fires when any increase is detected. `GET /api/v1/compliance/controls` returns `CC9.2` status. |
| **CC7.1** | Webhook Dead-Letter Monitoring | `src/workers/WebhookDeliveryWorker.ts` | `WebhookDeliveryWorker` processes webhook deliveries from a Redis queue. After exhausting all retry attempts (configurable `WEBHOOK_MAX_RETRIES`), the delivery is moved to dead-letter status and `agentidp_webhook_dead_letters_total` is incremented. | Prometheus counter `agentidp_webhook_dead_letters_total` per `organization_id`. Prometheus alert `WebhookDeadLetterAccumulating` fires when > 10 dead-letters accumulate in 1 hour. `GET /api/v1/compliance/controls` returns `CC7.1` status. |

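The CC7.2 hash linkage can be illustrated with a minimal TypeScript sketch. It assumes plain string concatenation in the field order listed in the table, hex-encoded SHA-256, and an empty string as the `previousHash` of the first event. The `AuditEvent` shape and the names `computeEventHash`, `verifyChain`, and `appendEvent` are hypothetical and are not taken from `AuditRepository` or `AuditVerificationService`. When only a window of events is checked, pass the hash of the event just before the window as `initialPrev`.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape of one `audit_events` row; field names are assumptions.
interface AuditEvent {
  eventId: string;
  timestamp: string; // ISO 8601 string, exactly as stored
  action: string;
  outcome: string;
  agentId: string;
  organizationId: string;
  previousHash: string; // assumed to be "" for the first event in the chain
  hash: string;
}

type HashInput = Omit<AuditEvent, "hash">;

interface ChainResult {
  verified: boolean;
  checkedCount: number;
  brokenAtEventId: string | null;
}

// SHA-256 over the concatenation listed in the CC7.2 row (hex output is an assumption).
function computeEventHash(e: HashInput): string {
  const payload =
    e.eventId + e.timestamp + e.action + e.outcome + e.agentId + e.organizationId + e.previousHash;
  return createHash("sha256").update(payload, "utf8").digest("hex");
}

// Walks events in insertion order and stops at the first break.
function verifyChain(events: AuditEvent[], initialPrev = ""): ChainResult {
  let expectedPrev = initialPrev;
  let checkedCount = 0;
  for (const e of events) {
    checkedCount += 1;
    const linked = e.previousHash === expectedPrev; // points at the prior event
    const intact = computeEventHash(e) === e.hash; // own fields were not altered
    if (!linked || !intact) {
      return { verified: false, checkedCount, brokenAtEventId: e.eventId };
    }
    expectedPrev = e.hash;
  }
  return { verified: true, checkedCount, brokenAtEventId: null };
}

// Appends an event, deriving `previousHash` from the current tail of the chain.
function appendEvent(chain: AuditEvent[], fields: Omit<HashInput, "previousHash">): AuditEvent[] {
  const last = chain[chain.length - 1];
  const input: HashInput = { ...fields, previousHash: last ? last.hash : "" };
  return [...chain, { ...input, hash: computeEventHash(input) }];
}
```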
---

## Evidence Collection

For a SOC 2 Type II audit, the following evidence should be collected:

| Evidence Type | Collection Method |
|---|---|
| Encryption at rest configuration | Export Vault KV v2 policy + `_encryption_migration_log` table contents |
| TLS certificate and enforcement logs | Load balancer access logs + `X-Forwarded-Proto` middleware responses |
| Audit chain integrity report | `GET /api/v1/audit/verify` with full date range |
| Secrets rotation compliance | Prometheus metric history for `agentidp_credentials_expiring_soon_total` |
| Webhook dead-letter rate | Prometheus metric history for `agentidp_webhook_dead_letters_total` |
| Immutable audit log dump | Direct PostgreSQL export of `audit_events` table with hash verification |

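As an illustration of the audit chain integrity report row, the sketch below builds a `GET /api/v1/audit/verify` request for a date range. `buildAuditVerifyRequest` is a hypothetical helper, the base URL and bearer token are supplied by the caller, and the `fromDate`/`toDate` parameter names follow `docs/openapi/compliance.yaml`. The token is assumed to carry the `audit:read` scope.

```typescript
// Hypothetical request description; pass `url` and `headers` to any HTTP client.
interface AuditVerifyRequest {
  url: string;
  headers: Record<string, string>;
}

function buildAuditVerifyRequest(
  baseUrl: string, // e.g. "http://localhost:3000/api/v1"
  token: string,
  fromDate?: Date,
  toDate?: Date,
): AuditVerifyRequest {
  // Reject an inverted range locally instead of waiting for a 400 response.
  if (fromDate && toDate && fromDate.getTime() > toDate.getTime()) {
    throw new RangeError("fromDate must be before or equal to toDate");
  }
  // A trailing slash keeps the last path segment of the base when resolving.
  const base = baseUrl.endsWith("/") ? baseUrl : `${baseUrl}/`;
  const url = new URL("audit/verify", base);
  if (fromDate) url.searchParams.set("fromDate", fromDate.toISOString());
  if (toDate) url.searchParams.set("toDate", toDate.toISOString());
  return { url: url.toString(), headers: { Authorization: `Bearer ${token}` } };
}
```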
---

## References

- SOC 2 Trust Services Criteria: [AICPA TSC 2017](https://www.aicpa.org/resources/article/trust-services-criteria)
- OpenAPI spec: `docs/openapi/compliance.yaml`
- Encryption runbook: `docs/compliance/encryption-runbook.md`
- Audit log runbook: `docs/compliance/audit-log-runbook.md`
- Incident response: `docs/compliance/incident-response.md`
- Secrets rotation: `docs/compliance/secrets-rotation.md`
548
docs/openapi/compliance.yaml
Normal file
@@ -0,0 +1,548 @@
openapi: 3.0.3

info:
  title: SentryAgent.ai — Compliance & SOC 2 Type II Service
  version: 1.0.0
  description: |
    The Compliance Service exposes endpoints supporting SentryAgent.ai's
    **SOC 2 Type II** audit readiness programme.

    Two categories of control are surfaced:

    **Audit chain verification** (`GET /audit/verify`) — Confirms cryptographic
    integrity of the immutable audit log chain across an optional date range.
    This endpoint provides auditors and compliance tooling with a single call to
    assert that no audit events have been tampered with, deleted, or reordered
    after initial capture.

    **SOC 2 control status** (`GET /compliance/controls`) — Returns a live status
    snapshot for each of the five in-scope SOC 2 Trust Services Criteria controls
    monitored by the platform. Designed as a lightweight, public health-style
    endpoint so that monitoring infrastructure can poll without bearer credentials.

    **In-scope SOC 2 controls:**
    | Control ID | Name | Description |
    |------------|------|-------------|
    | `CC6.1` | Encryption at Rest | Verifies database and secrets store encryption is active |
    | `CC6.7` | TLS Enforcement | Confirms TLS 1.2+ is enforced on all inbound connections |
    | `CC7.2` | Audit Log Integrity | Validates audit chain hash continuity |
    | `CC9.2` | Secrets Rotation | Checks that all managed secrets are within rotation policy |
    | `CC7.1` | Webhook Dead-Letter Monitoring | Asserts dead-letter queue depth is within threshold |

    **Required scope (audit chain verify only):** `audit:read`

servers:
  - url: http://localhost:3000/api/v1
    description: Local development server
  - url: https://api.sentryagent.ai/v1
    description: Production server

tags:
  - name: Audit Chain
    description: Cryptographic integrity verification of the immutable audit event chain
  - name: Compliance Controls
    description: SOC 2 Type II control status — public health-style monitoring endpoint

components:
  securitySchemes:
    BearerAuth:
      type: http
      scheme: bearer
      bearerFormat: JWT
      description: |
        JWT access token with `audit:read` scope, obtained via `POST /token`.
        Include as: `Authorization: Bearer <token>`

  schemas:
    ChainVerificationResult:
      type: object
      description: |
        Result of an audit event chain integrity verification run.

        The audit log is structured as a hash-linked chain. Each event stores a
        reference to the hash of the preceding event. `verified: true` means every
        event in the requested window was checked and no breaks in the chain were
        detected.

        When `verified` is `false`, `brokenAtEventId` identifies the first event
        where the chain integrity check failed, enabling targeted forensic investigation.
      required:
        - verified
        - checkedCount
        - brokenAtEventId
      properties:
        verified:
          type: boolean
          description: >
            `true` if every audit event in the checked range maintains an unbroken
            cryptographic hash chain; `false` if at least one chain break was detected.
          example: true
        checkedCount:
          type: integer
          description: Total number of audit events examined during this verification run.
          minimum: 0
          example: 2847
        brokenAtEventId:
          type: string
          format: uuid
          nullable: true
          description: >
            UUID of the first audit event where chain continuity failed, or `null`
            when `verified` is `true`. Only the first detected break is reported;
            subsequent events are not checked after a break is found.
          example: null
        fromDate:
          type: string
          format: date-time
          description: >
            The ISO 8601 lower bound of the date range that was verified.
            Present only when a `fromDate` query parameter was supplied.
          example: "2026-03-01T00:00:00.000Z"
        toDate:
          type: string
          format: date-time
          description: >
            The ISO 8601 upper bound of the date range that was verified.
            Present only when a `toDate` query parameter was supplied.
          example: "2026-03-31T23:59:59.999Z"

    ControlStatus:
      type: string
      description: Operational status of a SOC 2 control at the time of the last check.
      enum:
        - passing
        - failing
        - unknown
      example: passing

    ComplianceControl:
      type: object
      description: Status record for a single SOC 2 Trust Services Criteria control.
      required:
        - id
        - name
        - status
        - lastChecked
      properties:
        id:
          type: string
          description: SOC 2 Trust Services Criteria control identifier.
          enum:
            - CC6.1
            - CC6.7
            - CC7.2
            - CC9.2
            - CC7.1
          example: "CC6.1"
        name:
          type: string
          description: Human-readable name of the control.
          example: "Encryption at Rest"
        status:
          $ref: '#/components/schemas/ControlStatus'
        lastChecked:
          type: string
          format: date-time
          description: ISO 8601 timestamp of the most recent automated check for this control.
          example: "2026-03-31T06:00:00.000Z"

    ComplianceControlsResponse:
      type: object
      description: SOC 2 compliance control status summary for all in-scope controls.
      required:
        - controls
      properties:
        controls:
          type: array
          description: Status record for each of the five in-scope SOC 2 controls.
          minItems: 5
          maxItems: 5
          items:
            $ref: '#/components/schemas/ComplianceControl'
          example:
            - id: "CC6.1"
              name: "Encryption at Rest"
              status: "passing"
              lastChecked: "2026-03-31T06:00:00.000Z"
            - id: "CC6.7"
              name: "TLS Enforcement"
              status: "passing"
              lastChecked: "2026-03-31T06:00:00.000Z"
            - id: "CC7.2"
              name: "Audit Log Integrity"
              status: "passing"
              lastChecked: "2026-03-31T06:00:00.000Z"
            - id: "CC9.2"
              name: "Secrets Rotation"
              status: "passing"
              lastChecked: "2026-03-31T06:00:00.000Z"
            - id: "CC7.1"
              name: "Webhook Dead-Letter Monitoring"
              status: "passing"
              lastChecked: "2026-03-31T06:00:00.000Z"

    ErrorResponse:
      type: object
      description: Standard error response envelope used across all SentryAgent.ai APIs.
      required:
        - code
        - message
      properties:
        code:
          type: string
          description: Machine-readable error code.
          example: "UNAUTHORIZED"
        message:
          type: string
          description: Human-readable description of the error.
          example: "A valid Bearer token is required."
        details:
          type: object
          description: Optional structured details providing additional context.
          additionalProperties: true
          example: {}

  responses:
    Unauthorized:
      description: Missing or invalid Bearer token.
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/ErrorResponse'
          example:
            code: "UNAUTHORIZED"
            message: "A valid Bearer token is required to access this resource."

    Forbidden:
      description: Valid token but insufficient permissions. Requires `audit:read` scope.
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/ErrorResponse'
          example:
            code: "INSUFFICIENT_SCOPE"
            message: "The 'audit:read' scope is required to verify the audit chain."

    TooManyRequests:
      description: |
        Rate limit exceeded. Retry after the reset time indicated in `X-RateLimit-Reset`.
      headers:
        X-RateLimit-Limit:
          schema:
            type: integer
          description: Maximum requests allowed per minute.
          example: 30
        X-RateLimit-Remaining:
          schema:
            type: integer
          description: Requests remaining in the current window.
          example: 0
        X-RateLimit-Reset:
          schema:
            type: integer
          description: Unix timestamp when the rate limit window resets.
          example: 1743155400
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/ErrorResponse'
          example:
            code: "RATE_LIMIT_EXCEEDED"
            message: "Too many requests. Please retry after the rate limit window resets."

    InternalServerError:
      description: Unexpected server error.
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/ErrorResponse'
          example:
            code: "INTERNAL_SERVER_ERROR"
            message: "An unexpected error occurred. Please try again later."

paths:
  /audit/verify:
    get:
      operationId: verifyAuditChain
      tags:
        - Audit Chain
      summary: Verify audit log chain integrity
      description: |
        Triggers a full integrity verification pass over the immutable audit event
        chain. Each event in the log contains a cryptographic hash of the previous
        event; this endpoint traverses the chain and confirms no breaks exist.

        **Use cases:**
        - Auditor evidence collection for SOC 2 Type II assessment
        - Continuous compliance monitoring (cron-driven)
        - Incident response — confirm audit log has not been tampered with

        **Requires:** Bearer token with `audit:read` scope.

        **Rate limit:** 30 requests/minute per `client_id`. Audit chain verification
        is a computationally intensive operation and is rate-limited more aggressively
        than standard read endpoints. For continuous monitoring, poll no more than
        once per minute.

        **Date range filtering:** Supply `fromDate` and/or `toDate` to restrict
        verification to a specific window. When omitted, the entire retained audit
        log is verified. `fromDate` must be before or equal to `toDate` when both
        are provided.

        **Result interpretation:**
        - `verified: true` — chain is intact across all checked events
        - `verified: false` — at least one chain break detected; `brokenAtEventId`
          identifies the first affected event
      security:
        - BearerAuth: []
      parameters:
        - name: fromDate
          in: query
          description: |
            ISO 8601 date-time lower bound for the verification window (inclusive).
            When omitted, verification starts from the earliest available audit event.
            Must be before or equal to `toDate` when both are supplied.
          required: false
          schema:
            type: string
            format: date-time
          example: "2026-03-01T00:00:00.000Z"
        - name: toDate
          in: query
          description: |
            ISO 8601 date-time upper bound for the verification window (inclusive).
            When omitted, verification runs up to and including the most recent
            audit event. Must be after or equal to `fromDate` when both are supplied.
          required: false
          schema:
            type: string
            format: date-time
          example: "2026-03-31T23:59:59.999Z"
      responses:
        '200':
          description: |
            Audit chain verification completed. Inspect `verified` to determine
            whether chain integrity is intact. A `200` is returned regardless of
            whether verification passed or failed — check the response body.
          headers:
            X-RateLimit-Limit:
              schema:
                type: integer
              description: Maximum requests allowed per minute for this endpoint.
              example: 30
            X-RateLimit-Remaining:
              schema:
                type: integer
              description: Requests remaining in the current rate limit window.
              example: 29
            X-RateLimit-Reset:
              schema:
                type: integer
              description: Unix timestamp when the rate limit window resets.
              example: 1743155400
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ChainVerificationResult'
              examples:
                chainIntact:
                  summary: Verification passed — chain is intact
                  value:
                    verified: true
                    checkedCount: 2847
                    brokenAtEventId: null
                    fromDate: "2026-03-01T00:00:00.000Z"
                    toDate: "2026-03-31T23:59:59.999Z"
                chainBroken:
                  summary: Verification failed — chain break detected
                  value:
                    verified: false
                    checkedCount: 1203
                    brokenAtEventId: "c4d5e6f7-a8b9-0123-cdef-456789012345"
                    fromDate: "2026-03-01T00:00:00.000Z"
                    toDate: "2026-03-31T23:59:59.999Z"
                noDateRange:
                  summary: Full log verified (no date range supplied)
                  value:
                    verified: true
                    checkedCount: 18504
                    brokenAtEventId: null
        '400':
          description: Invalid query parameter value or date range.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
              examples:
                invalidFromDate:
                  summary: fromDate is not a valid ISO 8601 date-time
                  value:
                    code: "VALIDATION_ERROR"
                    message: "Invalid query parameter value."
                    details:
                      field: "fromDate"
                      reason: "Must be a valid ISO 8601 date-time string (e.g. 2026-03-01T00:00:00.000Z)."
                invalidToDate:
                  summary: toDate is not a valid ISO 8601 date-time
                  value:
                    code: "VALIDATION_ERROR"
                    message: "Invalid query parameter value."
                    details:
                      field: "toDate"
                      reason: "Must be a valid ISO 8601 date-time string (e.g. 2026-03-31T23:59:59.999Z)."
                invalidDateRange:
                  summary: fromDate is after toDate
                  value:
                    code: "VALIDATION_ERROR"
                    message: "Invalid date range."
                    details:
                      reason: "fromDate must be before or equal to toDate."
        '401':
          $ref: '#/components/responses/Unauthorized'
        '403':
          $ref: '#/components/responses/Forbidden'
        '429':
          $ref: '#/components/responses/TooManyRequests'
        '500':
          $ref: '#/components/responses/InternalServerError'

  /compliance/controls:
    get:
      operationId: getComplianceControls
      tags:
        - Compliance Controls
      summary: Get SOC 2 control status summary
      description: |
        Returns a live status snapshot for each of the five in-scope SOC 2 Type II
        Trust Services Criteria controls monitored by the SentryAgent.ai platform.

        **No authentication required.** This endpoint is intentionally public
        (analogous to a health check) so that external monitoring infrastructure,
        status pages, and audit tooling can poll it without bearer credentials.

        **Controls monitored:**
        | Control ID | Name | What is checked |
        |------------|------|-----------------|
        | `CC6.1` | Encryption at Rest | Database and secrets store encryption is active and configured |
        | `CC6.7` | TLS Enforcement | TLS 1.2+ is enforced on all platform inbound connections |
        | `CC7.2` | Audit Log Integrity | Audit chain hash continuity (result of the latest scheduled run of the `/audit/verify` check) |
        | `CC9.2` | Secrets Rotation | All managed secrets are within the rotation policy window |
        | `CC7.1` | Webhook Dead-Letter Monitoring | Dead-letter queue depth is within the acceptable threshold |

        **Status values:**
        - `passing` — control is operating within policy
        - `failing` — control has breached policy; immediate attention required
        - `unknown` — automated check could not complete (e.g. dependency unavailable)

        **Caching note:** Responses may be cached for up to 60 seconds by
        intermediate proxies. The `lastChecked` field on each control indicates
        the timestamp of the most recent automated evaluation.

        **Rate limit:** 120 requests/minute per IP address.
      security: []
      responses:
        '200':
          description: SOC 2 control status summary returned successfully.
          headers:
            Cache-Control:
              schema:
                type: string
              description: >
                Downstream caches may serve this response for up to 60 seconds.
              example: "public, max-age=60"
            X-RateLimit-Limit:
              schema:
                type: integer
              description: Maximum requests allowed per minute for this endpoint.
              example: 120
            X-RateLimit-Remaining:
              schema:
                type: integer
              description: Requests remaining in the current rate limit window.
              example: 119
            X-RateLimit-Reset:
              schema:
                type: integer
              description: Unix timestamp when the rate limit window resets.
              example: 1743155400
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ComplianceControlsResponse'
              examples:
                allPassing:
                  summary: All controls passing
                  value:
                    controls:
                      - id: "CC6.1"
                        name: "Encryption at Rest"
                        status: "passing"
                        lastChecked: "2026-03-31T06:00:00.000Z"
                      - id: "CC6.7"
                        name: "TLS Enforcement"
                        status: "passing"
                        lastChecked: "2026-03-31T06:00:00.000Z"
                      - id: "CC7.2"
                        name: "Audit Log Integrity"
                        status: "passing"
                        lastChecked: "2026-03-31T06:00:00.000Z"
                      - id: "CC9.2"
                        name: "Secrets Rotation"
                        status: "passing"
                        lastChecked: "2026-03-31T06:00:00.000Z"
                      - id: "CC7.1"
                        name: "Webhook Dead-Letter Monitoring"
                        status: "passing"
                        lastChecked: "2026-03-31T06:00:00.000Z"
                oneControlFailing:
                  summary: One control failing (secrets rotation overdue)
                  value:
                    controls:
                      - id: "CC6.1"
                        name: "Encryption at Rest"
                        status: "passing"
                        lastChecked: "2026-03-31T06:00:00.000Z"
                      - id: "CC6.7"
                        name: "TLS Enforcement"
                        status: "passing"
                        lastChecked: "2026-03-31T06:00:00.000Z"
                      - id: "CC7.2"
                        name: "Audit Log Integrity"
                        status: "passing"
                        lastChecked: "2026-03-31T06:00:00.000Z"
                      - id: "CC9.2"
                        name: "Secrets Rotation"
                        status: "failing"
                        lastChecked: "2026-03-31T06:00:00.000Z"
                      - id: "CC7.1"
                        name: "Webhook Dead-Letter Monitoring"
                        status: "passing"
                        lastChecked: "2026-03-31T06:00:00.000Z"
                unknownControl:
                  summary: One control in unknown state (dependency unavailable)
                  value:
                    controls:
                      - id: "CC6.1"
                        name: "Encryption at Rest"
                        status: "passing"
                        lastChecked: "2026-03-31T06:00:00.000Z"
                      - id: "CC6.7"
                        name: "TLS Enforcement"
                        status: "passing"
                        lastChecked: "2026-03-31T06:00:00.000Z"
                      - id: "CC7.2"
                        name: "Audit Log Integrity"
                        status: "unknown"
                        lastChecked: "2026-03-31T05:00:00.000Z"
                      - id: "CC9.2"
                        name: "Secrets Rotation"
                        status: "passing"
                        lastChecked: "2026-03-31T06:00:00.000Z"
                      - id: "CC7.1"
                        name: "Webhook Dead-Letter Monitoring"
                        status: "passing"
                        lastChecked: "2026-03-31T06:00:00.000Z"
        '429':
          $ref: '#/components/responses/TooManyRequests'
        '500':
          $ref: '#/components/responses/InternalServerError'