SentryAgent.ai AgentIdP — In-House Field Trial Guide

This guide is the execution playbook for in-house Docker Compose field trials of SentryAgent.ai AgentIdP. Follow each phase in order. All commands are exact — copy and paste them directly.

Estimated time to complete all phases: 45–60 minutes.

Prerequisites must be satisfied before Section 0.

Prerequisites

Docker 24+ and Docker Compose 2.20+

docker --version
# Expected: Docker version 24.x.x or higher

docker compose version
# Expected: Docker Compose version v2.20.x or higher
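If you want the version floor enforced by a script rather than by eye, the minor version can be parsed and compared. A small sketch; it assumes the `v2.x.y` token appears in the `docker compose version` output:

```shell
# Assert the Compose plugin meets the 2.20 floor.
# Parses the first "v2.<minor>" token from `docker compose version`.
VER=$(docker compose version 2>/dev/null | grep -oE 'v2\.[0-9]+' | head -1)
MINOR=${VER#v2.}
if [ -n "$MINOR" ] && [ "$MINOR" -ge 20 ]; then
  echo "Compose OK ($VER)"
else
  echo "Compose too old or missing"
fi
```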

Node.js 18+ via nvm

export NVM_DIR="$HOME/.nvm" && source "$NVM_DIR/nvm.sh"
node --version
# Expected: v18.x.x or higher

openssl

openssl version
# Expected: a version string; OpenSSL 1.1.x or newer is sufficient

Git repo cloned

git clone https://git.sentryagent.ai/vijay_admin/sentryagent-idp.git
cd sentryagent-idp

Ports free

The following ports must be free on the machine before starting:

Port Service
3000 AgentIdP backend
3001 Next.js portal
5432 PostgreSQL
6379 Redis

Check all ports:

lsof -i :3000 -i :3001 -i :5432 -i :6379
# Expected: no output (all ports free)

If any port is in use, kill the occupying process:

lsof -ti:<port> | xargs kill

Section 0 — Environment Setup

This section guides the engineer through creating a valid .env file for field trial use.

Step 0.1 — Copy .env.example

cp .env.example .env

Step 0.2 — Generate RSA-2048 keypair

Generate the JWT signing keys:

openssl genrsa -out private.pem 2048
openssl rsa -in private.pem -pubout -out public.pem

Verify the keys are valid:

openssl rsa -in private.pem -check -noout
# Expected: RSA key ok

openssl rsa -in public.pem -pubin -noout -text 2>&1 | head -3
# Expected: Public-Key: (2048 bit)

Step 0.3 — Write keys into .env

Write the private key as a single-line PEM with \n separators:

PRIVATE_KEY_LINE=$(awk 'NF {sub(/\r/, ""); printf "%s\\n",$0;}' private.pem)
sed -i "s|JWT_PRIVATE_KEY=.*|JWT_PRIVATE_KEY=\"${PRIVATE_KEY_LINE}\"|" .env

Write the public key:

PUBLIC_KEY_LINE=$(awk 'NF {sub(/\r/, ""); printf "%s\\n",$0;}' public.pem)
sed -i "s|JWT_PUBLIC_KEY=.*|JWT_PUBLIC_KEY=\"${PUBLIC_KEY_LINE}\"|" .env

Verify both keys are present and non-empty:

grep -c "BEGIN RSA PRIVATE KEY" .env
# Expected: 1

grep -c "BEGIN PUBLIC KEY" .env
# Expected: 1
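For a deeper check than grepping for the BEGIN markers, the single-line value can be expanded back into a real PEM and validated by openssl. A sketch, assuming the `JWT_PRIVATE_KEY="..."` quoting written by the sed command above:

```shell
# Expand the literal \n sequences back into newlines (%b interprets
# backslash escapes) and let openssl validate the reconstructed key.
KEY=$(grep '^JWT_PRIVATE_KEY=' .env | sed -e 's/^JWT_PRIVATE_KEY="//' -e 's/"$//')
printf '%b\n' "$KEY" | openssl rsa -check -noout || echo "key failed to validate; re-run Steps 0.2-0.3"
# Expected: RSA key ok
```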

Step 0.4 — Configure field trial values

Set the following values in .env. These are the correct values for an in-house field trial (no real Stripe, no Kafka, no Vault):

# Disable real Stripe billing for field trial
sed -i "s|BILLING_ENABLED=.*|BILLING_ENABLED=false|" .env
sed -i "s|STRIPE_SECRET_KEY=.*|STRIPE_SECRET_KEY=sk_test_placeholder|" .env
sed -i "s|STRIPE_WEBHOOK_SECRET=.*|STRIPE_WEBHOOK_SECRET=whsec_placeholder|" .env
sed -i "s|STRIPE_PRICE_ID=.*|STRIPE_PRICE_ID=price_placeholder|" .env

# Keep feature flags at defaults
sed -i "s|ANALYTICS_ENABLED=.*|ANALYTICS_ENABLED=true|" .env
sed -i "s|TIER_ENFORCEMENT=.*|TIER_ENFORCEMENT=true|" .env
sed -i "s|COMPLIANCE_ENABLED=.*|COMPLIANCE_ENABLED=true|" .env

# Allow portal CORS
sed -i "s|CORS_ORIGIN=.*|CORS_ORIGIN=http://localhost:3001|" .env

Step 0.5 — Verify final .env

grep -E "^(POSTGRES_USER|POSTGRES_PASSWORD|POSTGRES_DB|DATABASE_URL|REDIS_URL|JWT_PRIVATE_KEY|JWT_PUBLIC_KEY|BILLING_ENABLED|ANALYTICS_ENABLED|TIER_ENFORCEMENT|COMPLIANCE_ENABLED|CORS_ORIGIN)=" .env

Expected output (values abbreviated):

POSTGRES_USER=sentryagent
POSTGRES_PASSWORD=sentryagent
POSTGRES_DB=sentryagent_idp
DATABASE_URL=postgresql://sentryagent:sentryagent@localhost:5432/sentryagent_idp
REDIS_URL=redis://localhost:6379
JWT_PRIVATE_KEY="-----BEGIN RSA PRIVATE KEY-----\n...
JWT_PUBLIC_KEY="-----BEGIN PUBLIC KEY-----\n...
BILLING_ENABLED=false
ANALYTICS_ENABLED=true
TIER_ENFORCEMENT=true
COMPLIANCE_ENABLED=true
CORS_ORIGIN=http://localhost:3001
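To fail fast instead of eyeballing the grep output, the same variable list can be checked in a loop. A small sketch; the list mirrors the expected output above:

```shell
# Print a line for every required variable that is missing or has no value.
# No output means the .env file is complete.
for VAR in POSTGRES_USER POSTGRES_PASSWORD POSTGRES_DB DATABASE_URL REDIS_URL \
           JWT_PRIVATE_KEY JWT_PUBLIC_KEY BILLING_ENABLED ANALYTICS_ENABLED \
           TIER_ENFORCEMENT COMPLIANCE_ENABLED CORS_ORIGIN; do
  grep -q "^${VAR}=." .env || echo "MISSING OR EMPTY: $VAR"
done
```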

Phase A — Stack Startup

Step A.1 — Build and start the full stack

docker compose up --build -d

This builds the app container image and starts all three services. The app service waits for postgres and redis to pass their health checks before starting.

Step A.2 — Verify all services are healthy

docker compose ps

Expected output — all three services must show healthy:

NAME                          IMAGE                        STATUS
sentryagent-idp-app-1         sentryagent-idp-app          running (healthy)
sentryagent-idp-postgres-1    postgres:14.12-alpine3.19    running (healthy)
sentryagent-idp-redis-1       redis:7.2-alpine3.19         running (healthy)

If any service shows starting or unhealthy, wait 15 seconds and run docker compose ps again. If a service remains unhealthy after 60 seconds, see Troubleshooting.

Step A.3 — Run database migrations

docker compose exec app npm run db:migrate

Expected output:

Running database migrations...
  ✓ Applied: 001_create_agents.sql
  ✓ Applied: 002_create_credentials.sql
  ...
  ✓ Applied: 025_add_analytics_events.sql
  ✓ Applied: 026_add_tenant_tiers.sql

Migrations complete. 26 migration(s) applied.

All 26 migrations must apply without error before proceeding.

Step A.4 — Verify application health

curl -s http://localhost:3000/health | jq .

Expected response:

{"status":"ok"}

Step A.5 — Verify Prometheus metrics

curl -s http://localhost:3000/metrics | head -20

Expected: Prometheus text output beginning with # HELP lines. Verify these specific metrics are present:

curl -s http://localhost:3000/metrics | grep -E "^# HELP agentidp_"

Expected: at least 19 lines matching # HELP agentidp_*.


Phase B — Core Product Journeys

This phase tests the end-to-end agent identity lifecycle. Run each step in order. Each step depends on the output of the previous step.

Note on tokens: The steps below use shell variables to pass values between commands. Run all commands in the same terminal session.

Step B.1 — Create an organisation

ORG_RESPONSE=$(curl -s -X POST http://localhost:3000/api/v1/organizations \
  -H "Content-Type: application/json" \
  -d '{"name":"Field Trial Org","slug":"field-trial"}')

echo $ORG_RESPONSE | jq .
ORG_ID=$(echo $ORG_RESPONSE | jq -r '.org_id')
echo "ORG_ID: $ORG_ID"

Expected: HTTP 201 response body containing an org_id UUID. ORG_ID must be a non-empty UUID.
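Because every later step interpolates $ORG_ID into request bodies, it is worth guarding against an error response here. A defensive sketch; the regex matches the standard 8-4-4-4-12 hex UUID form:

```shell
# Fail fast if ORG_ID is not a UUID (for example, if the API returned an error body).
if ! echo "$ORG_ID" | grep -Eq '^[0-9a-fA-F]{8}-([0-9a-fA-F]{4}-){3}[0-9a-fA-F]{12}$'; then
  echo "ORG_ID is not a UUID; inspect ORG_RESPONSE before continuing"
fi
```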

Step B.2 — Register an agent

AGENT_RESPONSE=$(curl -s -X POST http://localhost:3000/api/v1/agents \
  -H "Content-Type: application/json" \
  -d "{
    \"email\": \"trial-agent@field-trial.sentryagent.ai\",
    \"agent_type\": \"classifier\",
    \"version\": \"1.0.0\",
    \"capabilities\": [\"documents:read\", \"documents:classify\"],
    \"owner\": \"field-trial-team\",
    \"deployment_env\": \"development\",
    \"organization_id\": \"$ORG_ID\"
  }")

echo $AGENT_RESPONSE | jq .
AGENT_ID=$(echo $AGENT_RESPONSE | jq -r '.agent_id')
echo "AGENT_ID: $AGENT_ID"

Expected: HTTP 201 response body containing an agent_id UUID.

Step B.3 — Generate credentials

CRED_RESPONSE=$(curl -s -X POST http://localhost:3000/api/v1/credentials \
  -H "Content-Type: application/json" \
  -d "{\"agent_id\": \"$AGENT_ID\"}")

echo $CRED_RESPONSE | jq .
CLIENT_ID=$(echo $CRED_RESPONSE | jq -r '.client_id')
CLIENT_SECRET=$(echo $CRED_RESPONSE | jq -r '.client_secret')
echo "CLIENT_ID: $CLIENT_ID"
echo "CLIENT_SECRET: $CLIENT_SECRET"

Expected: HTTP 201 response body containing client_id and client_secret. The client_secret is only returned once — save it now.

Step B.4 — Issue an OAuth 2.0 access token

TOKEN_RESPONSE=$(curl -s -X POST http://localhost:3000/api/v1/token \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "grant_type=client_credentials&client_id=$CLIENT_ID&client_secret=$CLIENT_SECRET&scope=read")

echo $TOKEN_RESPONSE | jq .
ACCESS_TOKEN=$(echo $TOKEN_RESPONSE | jq -r '.access_token')
echo "ACCESS_TOKEN obtained: ${ACCESS_TOKEN:0:30}..."

Expected: HTTP 200 response body with access_token, token_type: "Bearer", expires_in: 3600, scope: "read".

Step B.5 — Use the token on a protected endpoint

curl -s -H "Authorization: Bearer $ACCESS_TOKEN" \
  http://localhost:3000/api/v1/agents | jq .

Expected: HTTP 200 with a JSON array of agents including the agent registered in Step B.2.
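The "includes the agent from Step B.2" part can be checked by id rather than by eye. A sketch assuming the endpoint returns a plain JSON array of agent objects, each with an agent_id field:

```shell
# Count how many returned agents match the id saved in Step B.2.
curl -s -H "Authorization: Bearer $ACCESS_TOKEN" http://localhost:3000/api/v1/agents \
  | jq --arg id "$AGENT_ID" 'map(select(.agent_id == $id)) | length'
# Expected: 1
```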

Step B.6 — Inspect JWT claims

Decode and inspect the access token structure (without verifying signature):

echo $ACCESS_TOKEN | cut -d. -f2 | base64 -d 2>/dev/null | jq .

Expected claims:

{
  "sub": "<client_id>",
  "iss": "https://sentryagent.ai",
  "aud": "sentryagent-api",
  "scope": "read",
  "agent_id": "<agent_id>",
  "organization_id": "<org_id>",
  "iat": "<issued-at-timestamp>",
  "exp": "<expiry-timestamp>",
  "jti": "<unique-jwt-id>"
}

Verify exp - iat = 3600 (1 hour TTL).
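Some base64 implementations reject the unpadded base64url payload outright. This helper (the jwt_ttl name is ours, not part of the product) restores the padding and computes the TTL directly:

```shell
# Compute exp - iat from a JWT without verifying the signature.
# Converts base64url to base64 and re-adds the stripped '=' padding.
jwt_ttl() {
  payload=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  pad=$(( (4 - ${#payload} % 4) % 4 ))
  printf '%s%s' "$payload" "$(printf '%*s' "$pad" '' | tr ' ' '=')" \
    | base64 -d | jq '.exp - .iat'
}

jwt_ttl "$ACCESS_TOKEN"
# Prints the TTL in seconds; this guide expects 3600.
```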

Step B.7 — Rotate credentials and verify old token is rejected

Rotate the credentials (generates a new client_secret, revokes the old one):

ROTATE_RESPONSE=$(curl -s -X POST http://localhost:3000/api/v1/credentials \
  -H "Content-Type: application/json" \
  -d "{\"agent_id\": \"$AGENT_ID\"}")

NEW_CLIENT_ID=$(echo $ROTATE_RESPONSE | jq -r '.client_id')
NEW_CLIENT_SECRET=$(echo $ROTATE_RESPONSE | jq -r '.client_secret')
echo "New credential: $NEW_CLIENT_ID"

Attempt to use the old token (must be rejected):

curl -s -o /dev/null -w "%{http_code}" \
  -H "Authorization: Bearer $ACCESS_TOKEN" \
  http://localhost:3000/api/v1/agents
# Expected: 401

Issue a new token with the new credentials:

NEW_TOKEN_RESPONSE=$(curl -s -X POST http://localhost:3000/api/v1/token \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "grant_type=client_credentials&client_id=$NEW_CLIENT_ID&client_secret=$NEW_CLIENT_SECRET&scope=read")

NEW_ACCESS_TOKEN=$(echo $NEW_TOKEN_RESPONSE | jq -r '.access_token')
echo "New token obtained."

Verify the new token works:

curl -s -o /dev/null -w "%{http_code}" \
  -H "Authorization: Bearer $NEW_ACCESS_TOKEN" \
  http://localhost:3000/api/v1/agents
# Expected: 200

Step B.8 — Check audit log

curl -s -H "Authorization: Bearer $NEW_ACCESS_TOKEN" \
  "http://localhost:3000/api/v1/audit?limit=10" | jq .

Expected: JSON array of audit events. Verify these action types are present from Steps B.1–B.7: agent.created, credential.generated, token.issued, credential.rotated, token.revoked.
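Rather than scanning by eye, the distinct action types can be pulled with jq. A sketch assuming each event exposes an action field (the field name is an assumption, not confirmed by this guide):

```shell
# List the unique action types seen in the last 100 audit events.
curl -s -H "Authorization: Bearer $NEW_ACCESS_TOKEN" \
  "http://localhost:3000/api/v1/audit?limit=100" \
  | jq '[.[].action] | unique'
```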


Phase C — Guardrails

This phase tests security boundaries. Each test case must be run with the exact command shown and must produce the specified HTTP status code.

Setup: Ensure $NEW_ACCESS_TOKEN is still set from Phase B. Use export NEW_ACCESS_TOKEN if switching terminals.

Test C.1 — No Authorization header → 401

curl -s -o /dev/null -w "%{http_code}" \
  http://localhost:3000/api/v1/agents

Expected HTTP status: 401

Test C.2 — Malformed JWT → 401

curl -s -o /dev/null -w "%{http_code}" \
  -H "Authorization: Bearer notavalidjwt" \
  http://localhost:3000/api/v1/agents

Expected HTTP status: 401

Test C.3 — Expired JWT → 401

Use a known-expired token. Generate one with a 1-second TTL (requires a test helper or manually craft an expired JWT). For field trial purposes, use this pre-constructed expired token (signed with a different key — will fail signature verification and return 401):

EXPIRED_TOKEN="eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiJ0ZXN0IiwiZXhwIjoxfQ.invalid"

curl -s -o /dev/null -w "%{http_code}" \
  -H "Authorization: Bearer $EXPIRED_TOKEN" \
  http://localhost:3000/api/v1/agents

Expected HTTP status: 401

Test C.4 — Valid JWT, wrong scope → 403

Issue a token with scope read, then attempt to access an endpoint requiring scope write:

# The NEW_ACCESS_TOKEN has scope "read"
# Attempt an action requiring "write" scope (create agent)
curl -s -o /dev/null -w "%{http_code}" \
  -H "Authorization: Bearer $NEW_ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -X POST http://localhost:3000/api/v1/agents \
  -d '{"email":"scope-test@example.com","agent_type":"custom","version":"1.0.0","capabilities":[],"owner":"test","deployment_env":"development"}'

Expected HTTP status: 403

Test C.5 — Rate limit: 101 requests → 429 on the 101st

Send 101 requests in rapid succession. The 101st must return 429.

for i in $(seq 1 101); do
  STATUS=$(curl -s -o /dev/null -w "%{http_code}" \
    -H "Authorization: Bearer $NEW_ACCESS_TOKEN" \
    http://localhost:3000/api/v1/agents)
  if [ "$STATUS" = "429" ]; then
    echo "Request $i returned 429 (PASS)"
    break
  fi
done

Expected: Output shows Request 101 returned 429 (PASS) (or earlier if previous requests in the session have already counted toward the window).

After this test, wait 60 seconds for the rate limit window to reset, or use a fresh client_id for subsequent tests.

Test C.6 — Tier limit: exceed free-tier API call limit → 429 with tier_limit_exceeded

The free tier allows 1,000 API calls per day. For field trial, manually set the counter to the limit value to trigger the guard without making 1,000 real requests:

# Get the org_id from the token
ORG_ID=$(echo $NEW_ACCESS_TOKEN | cut -d. -f2 | base64 -d 2>/dev/null | jq -r '.organization_id')

# Force the counter to the limit via Redis CLI
docker compose exec redis redis-cli SET "rate:tier:calls:$ORG_ID" 1001 EX 86400

# The next API call must be rejected
TIER_RESPONSE=$(curl -s -w "\n%{http_code}" \
  -H "Authorization: Bearer $NEW_ACCESS_TOKEN" \
  http://localhost:3000/api/v1/agents)

echo "$TIER_RESPONSE"

Expected: HTTP status 429. Response body must contain "code":"tier_limit_exceeded".

Reset the counter after this test:

docker compose exec redis redis-cli DEL "rate:tier:calls:$ORG_ID"

Test C.7 — Tenant isolation: Org A token cannot access Org B agents → 403

Create a second organisation and agent:

ORG_B_RESPONSE=$(curl -s -X POST http://localhost:3000/api/v1/organizations \
  -H "Content-Type: application/json" \
  -d '{"name":"Org B","slug":"org-b"}')

ORG_B_ID=$(echo $ORG_B_RESPONSE | jq -r '.org_id')
echo "ORG_B_ID: $ORG_B_ID"

AGENT_B_RESPONSE=$(curl -s -X POST http://localhost:3000/api/v1/agents \
  -H "Content-Type: application/json" \
  -d "{
    \"email\": \"org-b-agent@org-b.sentryagent.ai\",
    \"agent_type\": \"monitor\",
    \"version\": \"1.0.0\",
    \"capabilities\": [],
    \"owner\": \"org-b\",
    \"deployment_env\": \"development\",
    \"organization_id\": \"$ORG_B_ID\"
  }")

AGENT_B_ID=$(echo $AGENT_B_RESPONSE | jq -r '.agent_id')
echo "AGENT_B_ID: $AGENT_B_ID"

Attempt to access Org B's agent using Org A's token:

curl -s -o /dev/null -w "%{http_code}" \
  -H "Authorization: Bearer $NEW_ACCESS_TOKEN" \
  http://localhost:3000/api/v1/agents/$AGENT_B_ID

Expected HTTP status: 403


Phase D — Portal

Step D.1 — Install portal dependencies

cd portal && npm install && cd ..

Step D.2 — Start the portal development server

cd portal && npm run dev &

Wait 5 seconds for Next.js to compile, then verify it is listening:

curl -s -o /dev/null -w "%{http_code}" http://localhost:3001
# Expected: 200 or 307 (redirect to /login)

Step D.3 — Verify each portal route loads

Open a browser and navigate to each of the following URLs. Each must load without a JavaScript error in the browser console:

URL Expected
http://localhost:3001/login Login page renders
http://localhost:3001/agents Agent list renders (may be empty or show auth redirect)
http://localhost:3001/credentials Credentials page renders
http://localhost:3001/audit Audit log page renders
http://localhost:3001/analytics Analytics dashboard renders
http://localhost:3001/settings/tier Tier status page renders
http://localhost:3001/compliance Compliance report page renders
http://localhost:3001/webhooks Webhooks page renders
http://localhost:3001/marketplace Marketplace page renders

All 9 routes must load without a blank page or unhandled error.
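The browser walkthrough above is the authoritative check (it catches console errors), but the HTTP side can be smoke-tested from the shell first. A convenience sketch; 200 or 307 counts as loading, and 000 means the dev server is not up:

```shell
# Print the HTTP status for each of the 9 portal routes.
for ROUTE in login agents credentials audit analytics settings/tier \
             compliance webhooks marketplace; do
  printf '%s: ' "$ROUTE"
  curl -s -o /dev/null -w '%{http_code}\n' "http://localhost:3001/$ROUTE" || true
done
```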

Step D.4 — Verify analytics charts render

Navigate to http://localhost:3001/analytics.

Verify the recharts chart markup is present in the server-rendered page:

curl -s http://localhost:3001/analytics | grep -c "recharts"
# Expected: 1 or more (recharts is used for TokenTrendChart and AgentHeatmap)

Step D.5 — Verify tier status page

Navigate to http://localhost:3001/settings/tier.

The page must display the current tier (expected: free for a new organisation).

Step D.6 — Stop the portal

kill $(lsof -ti:3001)

Phase E — AGNTCY Conformance

Step E.1 — Activate nvm

export NVM_DIR="$HOME/.nvm" && source "$NVM_DIR/nvm.sh"

Step E.2 — Run the AGNTCY conformance suite

npm run test:agntcy-conformance

Step E.3 — Expected output

AGNTCY Conformance Suite
  Agent Card Export
    ✓ exports valid AGNTCY agent card format
    ✓ agent card contains required identity fields
  Compliance Report
    ✓ generates SOC2-aligned compliance report
    ✓ compliance report includes all required control domains
  
4 passing (Xs)

All 4 tests must pass. A failure indicates a regression in AGNTCY conformance.

What each test validates:

Test What it validates
exports valid AGNTCY agent card format The /api/v1/compliance/agent-cards endpoint returns an array where each card has id, name, version, capabilities, did fields in AGNTCY format
agent card contains required identity fields Each agent card's identity block includes agent_id, organization_id, did, and deployment_env
generates SOC2-aligned compliance report The /api/v1/compliance/report endpoint returns a report with generated_at, controls, summary top-level keys
compliance report includes all required control domains The controls array in the report includes entries for access_control, audit_logging, credential_management, and tenant_isolation

Phase F — Performance Baseline

Prerequisite: Apache Bench (ab) must be installed. On Ubuntu: sudo apt install apache2-utils. Verify: ab -V

Step F.1 — Create a token payload file

cat > /tmp/token_payload.txt << EOF
grant_type=client_credentials&client_id=${NEW_CLIENT_ID}&client_secret=${NEW_CLIENT_SECRET}&scope=read
EOF

The heredoc expands ${NEW_CLIENT_ID} and ${NEW_CLIENT_SECRET} at write time, so both must still be set from Phase B (Step B.7).

Step F.2 — Benchmark token endpoint

ab -n 100 -c 10 \
  -p /tmp/token_payload.txt \
  -T "application/x-www-form-urlencoded" \
  http://localhost:3000/api/v1/token

Pass criteria for token endpoint:

  • Requests per second > 10
  • Time per request (mean) < 100 ms
  • p95 (95th percentile, shown as 95% in the Percentage of requests table) < 100 ms
  • Zero non-2xx responses

Step F.3 — Benchmark agent list endpoint

Ensure $NEW_ACCESS_TOKEN is still set and valid. Issue a fresh token if needed:

NEW_ACCESS_TOKEN=$(curl -s -X POST http://localhost:3000/api/v1/token \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "grant_type=client_credentials&client_id=${NEW_CLIENT_ID}&client_secret=${NEW_CLIENT_SECRET}&scope=read" \
  | jq -r '.access_token')

Run the benchmark:

ab -n 100 -c 10 \
  -H "Authorization: Bearer $NEW_ACCESS_TOKEN" \
  http://localhost:3000/api/v1/agents

Pass criteria for agent list endpoint:

  • Time per request (mean) < 200 ms
  • p95 (95% row in the Percentage of requests table) < 200 ms
  • Zero non-2xx responses

Step F.4 — Record results

Record the following values from each ab output for the field trial report:

Endpoint Metric Value
/api/v1/token Requests per second
/api/v1/token Mean time per request (ms)
/api/v1/token p95 (ms)
/api/v1/agents Requests per second
/api/v1/agents Mean time per request (ms)
/api/v1/agents p95 (ms)

A field trial passes Phase F if all p95 values are within the pass criteria above.
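The three report metrics can be pulled out of an ab run with awk instead of copying them by hand. A sketch; /tmp/ab_token.txt is our scratch file name, not part of the original commands:

```shell
# Re-run the token benchmark, keep a copy of the output, then extract
# the requests/sec, mean latency, and p95 lines for the report table.
ab -n 100 -c 10 -p /tmp/token_payload.txt \
  -T "application/x-www-form-urlencoded" \
  http://localhost:3000/api/v1/token | tee /tmp/ab_token.txt

awk '/^Requests per second:/          {print "rps:", $4}
     /^Time per request:.*\(mean\)$/  {print "mean_ms:", $4}
     /^ *95%/                         {print "p95_ms:", $2}' /tmp/ab_token.txt
```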


Troubleshooting

Each entry follows the pattern Symptom → Cause → Fix, with exact commands.


Port already in use

Symptom:

Error response from daemon: driver failed programming external connectivity on endpoint
sentryagent-idp-app-1: Bind for 0.0.0.0:3000 failed: port is already allocated

Fix: Kill the process occupying the port, then restart:

lsof -ti:3000 | xargs kill
lsof -ti:5432 | xargs kill
lsof -ti:6379 | xargs kill
docker compose up --build -d

Container shows unhealthy

Symptom: docker compose ps shows unhealthy for a service.

Fix: Check logs for the unhealthy service:

docker compose logs postgres
docker compose logs redis
docker compose logs app

Common causes:

Service Cause Fix
postgres Wrong database credentials Verify POSTGRES_USER, POSTGRES_PASSWORD, POSTGRES_DB in .env match values in compose.yaml
redis Port conflict Check lsof -ti:6379 and kill occupying process
app Missing env var Check docker compose logs app for Failed to start server message

Migration fails — connection refused

Symptom:

Migration failed: Error: connect ECONNREFUSED 127.0.0.1:5432

Cause: Running npm run db:migrate directly on the host (not inside the container) while PostgreSQL is running inside Docker.

Fix: Always run migrations inside the container during a field trial:

docker compose exec app npm run db:migrate

Migration fails — relation already exists

Symptom:

Migration failed: Error: relation "agents" already exists

Cause: A previous partial migration run left the database in an inconsistent state.

Fix: Check which migrations have been applied:

docker compose exec postgres psql -U sentryagent -d sentryagent_idp \
  -c "SELECT name FROM schema_migrations ORDER BY name;"

If the database state cannot be repaired, reset it:

docker compose down -v
docker compose up --build -d
docker compose exec app npm run db:migrate

docker compose down -v destroys all data. Use only when a clean slate is acceptable.


JWT error — invalid signature or key format

Symptom:

Failed to start server: Error: JWT_PRIVATE_KEY and JWT_PUBLIC_KEY environment variables are required

Or: All tokens return 401 Token signature is invalid.

Cause: JWT keys in .env have incorrect PEM format — literal newlines instead of \n sequences, or trailing whitespace.

Fix: Regenerate the keys and re-write them using the exact commands from Step 0.2 and 0.3.

Verify the key format in .env:

grep "JWT_PRIVATE_KEY" .env | head -c 100
# Expected: JWT_PRIVATE_KEY="-----BEGIN RSA PRIVATE KEY-----\nMII...
# NOT:      JWT_PRIVATE_KEY="-----BEGIN RSA PRIVATE KEY-----
#           MII...

The entire key must be on a single line with \n as literal backslash-n characters, not actual newlines.
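A quick probe distinguishes the two cases: a correctly written key keeps its BEGIN and END markers on the same physical line of .env.

```shell
# Count END markers on the JWT_PRIVATE_KEY line itself.
# The pattern also tolerates the "END PRIVATE KEY" (PKCS#8) form.
grep '^JWT_PRIVATE_KEY=' .env | grep -c 'END .*PRIVATE KEY' || true
# Expected: 1 (0 means the key was written with real newlines)
```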


Portal CORS error

Symptom: Browser console shows:

Access to XMLHttpRequest at 'http://localhost:3000/api/v1/...' from origin 'http://localhost:3001'
has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present

Cause: CORS_ORIGIN in .env does not include http://localhost:3001, or is set to a different value.

Fix:

sed -i "s|CORS_ORIGIN=.*|CORS_ORIGIN=http://localhost:3001|" .env
docker compose up --build -d

Wait for the app container to become healthy before retrying.


Tier counter not resetting

Symptom: All API calls return 429 tier_limit_exceeded even after waiting.

Cause: The Redis tier counter was manually set in Test C.6 and not deleted.

Fix:

# Get your org_id from the token
ORG_ID=$(echo $NEW_ACCESS_TOKEN | cut -d. -f2 | base64 -d 2>/dev/null | jq -r '.organization_id')

docker compose exec redis redis-cli DEL "rate:tier:calls:$ORG_ID"
docker compose exec redis redis-cli DEL "rate:tier:tokens:$ORG_ID"

ab not found

Symptom: ab: command not found

Fix:

sudo apt-get update && sudo apt-get install -y apache2-utils
# or on macOS:
brew install httpd

AGNTCY conformance test fails

Symptom: One or more tests in npm run test:agntcy-conformance fail.

Diagnosis steps:

  1. Ensure the backend is running and healthy: curl -s http://localhost:3000/health
  2. Ensure COMPLIANCE_ENABLED=true in .env (check with grep COMPLIANCE_ENABLED .env)
  3. Ensure at least one agent has been registered (Phase B must have been completed)
  4. Check the test output for the specific assertion that failed
  5. Check docker compose logs app for errors around compliance report generation

If the issue is a Redis cache hit returning stale data:

docker compose exec redis redis-cli KEYS "compliance:*" | xargs -r docker compose exec redis redis-cli DEL

Then re-run the conformance suite.