sentryagent-idp/docs/engineering/09-testing.md

# 09 — Testing Strategy

---

## 10.1 Test Types and Purposes

This codebase uses two types of tests. Understanding when to use each prevents
you from writing integration tests for things that should be unit tests (slow)
and unit tests for things that need a real database (misleading).

### Unit Tests

**Location:** `tests/unit/`

**What they test:** A single class or function in complete isolation. All
dependencies (repositories, services, external clients) are replaced with Jest mocks.

**When to use:**
- Testing service business logic (free-tier limits, status transitions, error cases)
- Testing utility functions (crypto, jwt, validators)
- Testing error hierarchy behaviour
- Any code that has conditional logic you want to test exhaustively

**What they do NOT test:**
- Whether the SQL queries are correct
- Whether the HTTP routing works
- Whether middleware chains execute in the right order

**Speed:** Milliseconds. Hundreds of unit tests should complete in under 10 seconds.

### Integration Tests

**Location:** `tests/integration/`

**What they test:** A full HTTP request through the Express application against
a real PostgreSQL database and real Redis instance.

**When to use:**
- Testing that a route is correctly wired to the right controller method
- Testing authentication and authorisation middleware in combination
- Testing database operations end-to-end (INSERT → read back → verify)
- Testing response shapes match the OpenAPI spec exactly

**What they require:**
- Running PostgreSQL (at `TEST_DATABASE_URL` or default)
- Running Redis (at `TEST_REDIS_URL` or default)
- No manual schema setup: each suite creates its own tables and deletes its rows after every test case

**Speed:** Seconds. Expect 2–5 seconds per integration test file.

---

## 10.2 Test Framework Stack

| Tool | Role |
|------|------|
| **Jest 29.7** | Test runner. `describe`, `it`, `expect`, `beforeEach`, `afterAll`. Also provides mocking via `jest.mock()`, `jest.fn()`, `jest.spyOn()`. |
| **ts-jest** | Transforms TypeScript test files for Jest without a separate compilation step. Configured in `jest.config.ts`. |
| **Supertest 6.3** | HTTP testing library. Used in integration tests to send real HTTP requests to the Express app without starting a server yourself. You pass the `Application` object directly, and Supertest binds it to an ephemeral port for the duration of each request. |

**Jest configuration** (`jest.config.ts`):
```typescript
export default {
  preset: 'ts-jest',
  testEnvironment: 'node',
  roots: ['<rootDir>/tests'],
  testMatch: ['<rootDir>/tests/unit/**/*.test.ts', '<rootDir>/tests/integration/**/*.test.ts'],
  collectCoverageFrom: ['src/**/*.ts', '!src/server.ts'],
};
```

---

## 10.3 Coverage Gates

All four coverage metrics must be at least 80% before a feature is considered complete:

| Metric | Gate | What it measures |
|--------|------|------------------|
| Statements | ≥80% | Share of statements executed at least once |
| Branches | ≥80% | Share of `if`/`else`/`switch` branches taken at least once |
| Functions | ≥80% | Share of functions called at least once |
| Lines | ≥80% | Share of lines executed at least once |

**Enforcement:**

Coverage is checked in the PR process:
```bash
npm run test:unit -- --coverage
# Fails if any metric is below 80%
```

Coverage reports are output to `coverage/lcov-report/index.html` for visual inspection.

The coverage threshold configuration is in `jest.config.ts`:
```typescript
coverageThreshold: {
  global: {
    statements: 80,
    branches: 80,
    functions: 80,
    lines: 80,
  },
},
```
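
The gate can be read as a simple comparison per metric. The sketch below is not Jest's implementation; `CoverageSummary` and `failingMetrics` are hypothetical names used only to show that a metric sitting exactly at 80 passes, because Jest treats a positive threshold as a minimum percentage.

```typescript
// Hypothetical shapes for illustration; Jest's real coverage summary has more fields.
type Metric = 'statements' | 'branches' | 'functions' | 'lines';
type CoverageSummary = Record<Metric, number>; // percentages, 0-100

const THRESHOLD: CoverageSummary = { statements: 80, branches: 80, functions: 80, lines: 80 };

// Returns the metrics that fall below their threshold.
// A metric exactly at the threshold passes (the threshold is a minimum).
function failingMetrics(
  actual: CoverageSummary,
  threshold: CoverageSummary = THRESHOLD,
): Metric[] {
  const metrics: Metric[] = ['statements', 'branches', 'functions', 'lines'];
  return metrics.filter((m) => actual[m] < threshold[m]);
}
```

With this reading, a run reporting 79.9% branch coverage fails the gate even if the other three metrics are at 100%.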

---

## 10.4 How to Run the Test Suite

```bash
# Run all tests (unit + integration)
npm test

# Run only unit tests
npm run test:unit

# Run only integration tests
npm run test:integration

# Run unit tests with coverage report
npm run test:unit -- --coverage
# HTML report: coverage/lcov-report/index.html

# Run a single test file
npx jest tests/unit/services/AgentService.test.ts

# Run tests matching a name pattern
npx jest --testNamePattern="should throw FreeTierLimitError"

# Run tests in watch mode (re-runs on file changes)
npx jest --watch

# Run with verbose output (shows each test name)
npx jest --verbose
```

**Integration test environment variables:**
```bash
export TEST_DATABASE_URL=postgresql://sentryagent:sentryagent@localhost:5432/sentryagent_idp_test
export TEST_REDIS_URL=redis://localhost:6379/1
npm run test:integration
```

Using database index `/1` for Redis in tests prevents test runs from polluting
the main database (index `0`) used for local development.
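
One way to make this separation harder to break by accident is a guard that refuses to run against index `0`. This is a minimal sketch, not something the codebase is known to contain; `redisDbIndex` and `assertTestRedisUrl` are hypothetical helpers.

```typescript
// Extracts the logical database index from a redis:// URL path ("/1" -> 1).
// A missing or empty path means the default database, index 0.
function redisDbIndex(url: string): number {
  const match = /^rediss?:\/\/[^/]+(?:\/(\d+))?\/?(?:\?.*)?$/.exec(url);
  if (!match) throw new Error(`Not a recognisable Redis URL: ${url}`);
  return match[1] === undefined ? 0 : Number(match[1]);
}

// Hypothetical guard: refuse to run destructive test setup against index 0.
function assertTestRedisUrl(url: string): void {
  if (redisDbIndex(url) === 0) {
    throw new Error('Refusing to run tests against Redis database 0 (local development data)');
  }
}
```

A test setup file could call `assertTestRedisUrl(process.env['REDIS_URL'] ?? '')` before connecting, so a misconfigured URL fails fast instead of flushing development data.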

---

## 10.5 Unit Test Writing Conventions

Unit tests follow a strict pattern. Study this example carefully — it shows every
convention in use.

**Real example from `tests/unit/services/AgentService.test.ts`:**

```typescript
/**
 * Unit tests for src/services/AgentService.ts
 */

import { AgentService } from '../../../src/services/AgentService';
import { AgentRepository } from '../../../src/repositories/AgentRepository';
import { CredentialRepository } from '../../../src/repositories/CredentialRepository';
import { AuditService } from '../../../src/services/AuditService';
import {
  AgentAlreadyExistsError,
  FreeTierLimitError,
} from '../../../src/utils/errors';
import { IAgent, ICreateAgentRequest } from '../../../src/types/index';

// Mock all dependencies — none of them execute real code
jest.mock('../../../src/repositories/AgentRepository');
jest.mock('../../../src/repositories/CredentialRepository');
jest.mock('../../../src/services/AuditService');

// Get typed mock constructors so we can call .mockResolvedValue() on them
const MockAgentRepository = AgentRepository as jest.MockedClass<typeof AgentRepository>;
const MockCredentialRepository = CredentialRepository as jest.MockedClass<typeof CredentialRepository>;
const MockAuditService = AuditService as jest.MockedClass<typeof AuditService>;

// Define a complete test fixture — reuse this instead of duplicating object literals
const MOCK_AGENT: IAgent = {
  agentId: 'a1b2c3d4-e5f6-7890-abcd-ef1234567890',
  email: 'agent@sentryagent.ai',
  agentType: 'screener',
  version: '1.0.0',
  capabilities: ['resume:read'],
  owner: 'team-a',
  deploymentEnv: 'production',
  status: 'active',
  createdAt: new Date('2026-03-28T09:00:00Z'),
  updatedAt: new Date('2026-03-28T09:00:00Z'),
};

describe('AgentService', () => {
  let agentService: AgentService;
  let agentRepo: jest.Mocked<AgentRepository>;
  let credentialRepo: jest.Mocked<CredentialRepository>;
  let auditService: jest.Mocked<AuditService>;

  beforeEach(() => {
    // Clear all mocks before each test — prevents state leakage
    jest.clearAllMocks();
    // Create fresh mock instances for each test
    agentRepo = new MockAgentRepository({} as never) as jest.Mocked<AgentRepository>;
    credentialRepo = new MockCredentialRepository({} as never) as jest.Mocked<CredentialRepository>;
    auditService = new MockAuditService({} as never) as jest.Mocked<AuditService>;
    // Inject mocks into the system under test
    agentService = new AgentService(agentRepo, credentialRepo, auditService);
  });

  describe('registerAgent()', () => {
    const createData: ICreateAgentRequest = {
      email: 'agent@sentryagent.ai',
      agentType: 'screener',
      version: '1.0.0',
      capabilities: ['resume:read'],
      owner: 'team-a',
      deploymentEnv: 'production',
    };

    it('should create and return a new agent', async () => {
      // Arrange — set up mock return values
      agentRepo.countActive.mockResolvedValue(0);
      agentRepo.findByEmail.mockResolvedValue(null);
      agentRepo.create.mockResolvedValue(MOCK_AGENT);
      auditService.logEvent.mockResolvedValue({} as never);

      // Act — call the method under test
      const result = await agentService.registerAgent(createData, '127.0.0.1', 'test/1.0');

      // Assert — verify the result
      expect(result).toEqual(MOCK_AGENT);
      // Also verify the mock was called with the right arguments
      expect(agentRepo.create).toHaveBeenCalledWith(createData);
    });

    it('should throw FreeTierLimitError when 100 agents already registered', async () => {
      // Arrange — simulate limit reached
      agentRepo.countActive.mockResolvedValue(100);

      // Assert error — rejects.toThrow checks the error type
      await expect(agentService.registerAgent(createData, '127.0.0.1', 'test/1.0'))
        .rejects.toThrow(FreeTierLimitError);
    });

    it('should throw AgentAlreadyExistsError if email is already registered', async () => {
      agentRepo.countActive.mockResolvedValue(0);
      agentRepo.findByEmail.mockResolvedValue(MOCK_AGENT); // Simulate existing agent

      await expect(agentService.registerAgent(createData, '127.0.0.1', 'test/1.0'))
        .rejects.toThrow(AgentAlreadyExistsError);
    });
  });
});
```

### Conventions explained:

1. **One test file per source file.** `AgentService.test.ts` tests `AgentService.ts`.
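2. **`jest.mock()` at the top level of the test file.** ts-jest hoists `jest.mock()` calls above the `import` statements, so the modules are already mocked when the imports run, even though the calls appear after the imports in the source.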
3. **`jest.clearAllMocks()` in `beforeEach`.** Prevents mock call counts from leaking between tests.
4. **AAA pattern (Arrange, Act, Assert).** Every `it` block follows this order.
5. **Test both the happy path and every error case.** A service with 3 error conditions
   needs at least 4 tests (1 success + 3 failures).
6. **Verify mock calls for side effects.** Use `.toHaveBeenCalledWith()` to verify that
   `auditService.logEvent` was called with the right arguments, not just that it was called.
7. **Use typed error assertions.** `.rejects.toThrow(FreeTierLimitError)` verifies the
   error type, not just a message string.
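
Conventions 4 to 7 do not depend on Jest itself. The sketch below shows the same shape with hand-written stubs so that it runs standalone; `MiniAgentService`, `LimitError`, and `makeStubs` are hypothetical stand-ins, not the real `AgentService` or its error classes.

```typescript
// Hypothetical, simplified stand-ins; the real AgentService has more dependencies.
class LimitError extends Error {
  constructor(message: string) {
    super(message);
    Object.setPrototypeOf(this, new.target.prototype); // keeps instanceof working on ES5 targets
  }
}

interface AgentRepo {
  countActive(): Promise<number>;
  create(email: string): Promise<{ email: string }>;
}
interface Audit {
  logEvent(action: string, email: string): Promise<void>;
}

class MiniAgentService {
  private readonly repo: AgentRepo;
  private readonly audit: Audit;
  private readonly maxAgents: number;

  constructor(repo: AgentRepo, audit: Audit, maxAgents = 100) {
    this.repo = repo;
    this.audit = audit;
    this.maxAgents = maxAgents;
  }

  async register(email: string): Promise<{ email: string }> {
    if ((await this.repo.countActive()) >= this.maxAgents) throw new LimitError('limit reached');
    const agent = await this.repo.create(email);
    await this.audit.logEvent('agent.created', email); // side effect worth asserting
    return agent;
  }
}

// Hand-written stubs that record calls, playing the role of jest.fn().
function makeStubs(activeCount: number) {
  const auditCalls: Array<[string, string]> = [];
  const repo: AgentRepo = {
    countActive: async () => activeCount,
    create: async (email) => ({ email }),
  };
  const audit: Audit = {
    logEvent: async (action, email) => { auditCalls.push([action, email]); },
  };
  return { repo, audit, auditCalls };
}

// Happy path: Arrange, Act, Assert, including the side-effect arguments.
async function happyPath(): Promise<boolean> {
  const { repo, audit, auditCalls } = makeStubs(0);      // Arrange
  const svc = new MiniAgentService(repo, audit);
  const agent = await svc.register('a@example.com');     // Act
  return agent.email === 'a@example.com'                  // Assert
    && auditCalls.length === 1
    && auditCalls[0]?.[0] === 'agent.created'
    && auditCalls[0]?.[1] === 'a@example.com';
}

// Error path: assert on the error class, not on its message.
async function limitPath(): Promise<boolean> {
  const { repo, audit, auditCalls } = makeStubs(100);
  const svc = new MiniAgentService(repo, audit);
  try {
    await svc.register('a@example.com');
    return false; // registration should have been rejected
  } catch (err) {
    return err instanceof LimitError && auditCalls.length === 0;
  }
}
```

In a Jest test the recording array is replaced by `expect(auditService.logEvent).toHaveBeenCalledWith(...)` and the `try`/`catch` by `.rejects.toThrow(...)`, but the structure of each case is the same.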

---

## 10.6 Integration Test Writing Conventions

Integration tests use Supertest to make real HTTP requests against a live Express app.

**Real example from `tests/integration/agents.test.ts`:**

```typescript
/**
 * Integration tests for Agent Registry endpoints.
 */

import crypto from 'crypto';
import request from 'supertest';
import { Application } from 'express';
import { v4 as uuidv4 } from 'uuid';
import { Pool } from 'pg';

// Generate RSA keys for test tokens — done once per test module
const { privateKey, publicKey } = crypto.generateKeyPairSync('rsa', {
  modulusLength: 2048,
  publicKeyEncoding: { type: 'spki', format: 'pem' },
  privateKeyEncoding: { type: 'pkcs8', format: 'pem' },
});

// Set environment variables BEFORE importing the app
process.env['DATABASE_URL'] = process.env['TEST_DATABASE_URL'] ?? 'postgresql://sentryagent:sentryagent@localhost:5432/sentryagent_idp_test';
process.env['REDIS_URL'] = process.env['TEST_REDIS_URL'] ?? 'redis://localhost:6379/1';
process.env['JWT_PRIVATE_KEY'] = privateKey;
process.env['JWT_PUBLIC_KEY'] = publicKey;
process.env['NODE_ENV'] = 'test';

import { createApp } from '../../src/app';
import { signToken } from '../../src/utils/jwt';
import { closePool } from '../../src/db/pool';
import { closeRedisClient } from '../../src/cache/redis';

// Helper: mint a valid test token
function makeToken(sub: string = uuidv4(), scope: string = 'agents:read agents:write'): string {
  return signToken({ sub, client_id: sub, scope, jti: uuidv4() }, privateKey);
}

describe('Agent Registry Integration Tests', () => {
  let app: Application;
  let pool: Pool;

  beforeAll(async () => {
    // Boot the real Express app
    app = await createApp();
    pool = new Pool({ connectionString: process.env['DATABASE_URL'] });

    // Create test tables (idempotent)
    await pool.query(`CREATE TABLE IF NOT EXISTS agents (...)`);
  });

  afterEach(async () => {
    // Clean up after each test — order matters (foreign key constraints)
    await pool.query('DELETE FROM audit_events');
    await pool.query('DELETE FROM credentials');
    await pool.query('DELETE FROM agents');
  });

  afterAll(async () => {
    // Close all connections — prevents Jest from hanging
    await pool.end();
    await closePool();
    await closeRedisClient();
  });

  describe('POST /api/v1/agents', () => {
    it('should register a new agent and return 201', async () => {
      const token = makeToken();

      const res = await request(app)
        .post('/api/v1/agents')
        .set('Authorization', `Bearer ${token}`)
        .send({
          email: 'test-agent@sentryagent.ai',
          agentType: 'screener',
          version: '1.0.0',
          capabilities: ['resume:read'],
          owner: 'test-team',
          deploymentEnv: 'development',
        });

      expect(res.status).toBe(201);
      expect(res.body.agentId).toBeDefined();
      expect(res.body.email).toBe('test-agent@sentryagent.ai');
      expect(res.body.status).toBe('active');
    });

    it('should return 401 without a token', async () => {
      const res = await request(app)
        .post('/api/v1/agents')
        .send({ email: 'test@sentryagent.ai' });

      expect(res.status).toBe(401);
    });

    it('should return 409 for duplicate email', async () => {
      const token = makeToken();
      const body = { email: 'dup@sentryagent.ai', agentType: 'screener', version: '1.0', capabilities: [], owner: 'team', deploymentEnv: 'development' };

      await request(app).post('/api/v1/agents').set('Authorization', `Bearer ${token}`).send(body);
      const res = await request(app).post('/api/v1/agents').set('Authorization', `Bearer ${token}`).send(body);

      expect(res.status).toBe(409);
      expect(res.body.code).toBe('AGENT_ALREADY_EXISTS');
    });
  });
});
```

### Conventions explained:

1. **Set `process.env` before importing the app.** The app reads env vars at import
   time (`getPool()`, JWT keys). Setting them after import does nothing.
2. **`afterEach` cleanup.** Delete all rows after each test so tests are independent.
   Always delete in child-to-parent order (audit_events → credentials → agents)
   to respect foreign key constraints.
3. **`afterAll` close connections.** Always close the pool and Redis client at the end
   of the suite. Jest will hang if connections remain open.
4. **Test both success and failure status codes.** Every endpoint test must include
   an unauthenticated request (401) and an invalid request (400).
5. **Verify response body shape.** Check `res.body.code` for error responses to
   verify the correct error type, not just the status code.
6. **Use `makeToken()` for test tokens.** A helper function keeps token creation
   consistent across all integration test files.
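
The child-to-parent rule in convention 2 can be derived mechanically from the foreign key graph. The sketch below assumes that `credentials` and `audit_events` each reference `agents`; those edges and the `deleteOrder` helper are illustrative and should be checked against the real schema.

```typescript
// Maps each table to the tables it references via foreign keys.
// The edges below are an assumption for illustration; check the real schema.
const FK_PARENTS: Record<string, string[]> = {
  agents: [],
  credentials: ['agents'],
  audit_events: ['agents'],
};

// Returns tables ordered so that every child comes before the tables it references,
// which is a safe order for DELETE statements.
function deleteOrder(parents: Record<string, string[]>): string[] {
  const visited = new Set<string>();
  const parentFirst: string[] = [];

  const visit = (table: string, path: string[]): void => {
    if (path.includes(table)) throw new Error(`FK cycle: ${[...path, table].join(' -> ')}`);
    if (visited.has(table)) return;
    visited.add(table);
    for (const parent of parents[table] ?? []) visit(parent, [...path, table]);
    parentFirst.push(table); // pushed only after all of its parents
  };

  for (const table of Object.keys(parents)) visit(table, []);
  return parentFirst.reverse(); // children first
}
```

For the map above, `deleteOrder(FK_PARENTS)` yields `audit_events`, `credentials`, `agents`, which matches the order used in the `afterEach` hook.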

---

## 10.7 OWASP Top 10 Security Testing Reference

These are the security concerns most relevant to an identity provider. For each,
here is what AgentIdP does to mitigate the risk and how to test it.

| OWASP Category | Relevant risk | Mitigation | Test approach |
|---------------|--------------|-----------|---------------|
| **A01 Broken Access Control** | Agent A accesses agent B's credentials | `req.user.sub !== agentId` check in all credential endpoints | Test: send credential request with a token for agent A but agentId for agent B in the path — expect 403 |
| **A02 Cryptographic Failures** | Weak credential secrets or JWT algorithm | `sk_live_<64 hex>` = 256-bit entropy; RS256 signing; bcrypt 10 rounds | Test: verify generated secrets are 72 chars; verify JWT header shows `alg: RS256` |
| **A03 Injection** | SQL injection via input fields | Parameterised queries (`$1, $2, ...`) in all repositories | Test: send `'; DROP TABLE agents; --` as `owner` field — expect 400 from Joi validation |
| **A05 Security Misconfiguration** | Server leaking stack traces | `errorHandler` returns generic 500 for unknown errors | Test: trigger an unexpected error (mock a repository to throw `new Error()`) — verify response body does not contain stack trace |
| **A06 Vulnerable Components** | Outdated dependencies with CVEs | Regular `npm audit` | Run: `npm audit` in CI; fail on high/critical findings |
| **A07 Auth Failures** | Timing attack on credential verification | `crypto.timingSafeEqual` in `VaultClient.verifySecret()`; bcrypt inherently timing-safe | Test: time many failed verification attempts using wrong secrets that share prefixes of varying length with the real secret; the mean verification time should not grow with the length of the shared prefix |
| **A08 Integrity Failures** | Forged JWT tokens | RS256 verification rejects tokens signed with wrong key | Test: create a token signed with a different private key — expect 401 |
| **A09 Logging Failures** | Auth failures not logged | `auth.failed` audit events written for every authentication failure | Test: attempt token issuance with a wrong secret, then verify the `audit_events` table contains an `auth.failed` row |
| **A10 SSRF** | Not applicable to current API surface | No outbound HTTP from user-supplied URLs | N/A — no URL-accepting fields in current API |
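
The A02 format check can be expressed as a single pattern. This is a sketch of what such an assertion might look like; `isWellFormedSecret` is a hypothetical helper, and restricting the hex digits to lowercase is an assumption about the generator.

```typescript
// Format stated in the table above: "sk_live_" followed by 64 hex characters.
// Lowercase-only hex is an assumption; widen the class if the generator emits uppercase.
const SECRET_PATTERN = /^sk_live_[0-9a-f]{64}$/;

function isWellFormedSecret(secret: string): boolean {
  return SECRET_PATTERN.test(secret);
}

// 64 hex characters encode 32 bytes (256 bits); the prefix adds 8 characters.
const EXPECTED_SECRET_LENGTH = 'sk_live_'.length + 64; // 72
```

Checking the pattern is stricter than checking the length alone, because a 72-character string with the wrong prefix or non-hex characters would still pass a length check.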

**JWT algorithm confusion (bonus):**
Test that the server rejects tokens with `alg: none` or `alg: HS256`. The
`verifyToken()` function specifies `algorithms: ['RS256']`, which causes jsonwebtoken
to reject any token with a different algorithm header.
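
The forged inputs for this test can be built with Node's standard library alone. The helpers below are a sketch with hypothetical names (`forgeAlgNoneToken`, `forgeHs256Token`, `readAlg`); they only construct tokens, and the assertion that matters is that the server answers 401 when they are sent.

```typescript
import { createHmac } from 'node:crypto';

// Encodes a JSON value as an unpadded base64url segment, as used in JWTs.
function b64urlJson(value: unknown): string {
  return Buffer.from(JSON.stringify(value), 'utf8').toString('base64url');
}

// Builds an unsigned token claiming alg "none". The trailing dot is the empty signature.
function forgeAlgNoneToken(payload: Record<string, unknown>): string {
  return `${b64urlJson({ alg: 'none', typ: 'JWT' })}.${b64urlJson(payload)}.`;
}

// Builds an HS256 token that uses the server's public key PEM as the HMAC secret,
// which is the classic algorithm-confusion attempt against RS256 verifiers.
function forgeHs256Token(payload: Record<string, unknown>, publicKeyPem: string): string {
  const signingInput = `${b64urlJson({ alg: 'HS256', typ: 'JWT' })}.${b64urlJson(payload)}`;
  const signature = createHmac('sha256', publicKeyPem).update(signingInput).digest('base64url');
  return `${signingInput}.${signature}`;
}

// Reads the alg claim from a token header without verifying anything.
function readAlg(token: string): string {
  const header = token.split('.')[0] ?? '';
  const parsed = JSON.parse(Buffer.from(header, 'base64url').toString('utf8'));
  return String(parsed.alg);
}
```

In an integration test, each forged token would be sent as a `Bearer` token to a protected endpoint, with `publicKey` from the test module passed as `publicKeyPem`.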

---

## 10.8 AGNTCY Conformance Test Suite

**Location:** `tests/agntcy-conformance/conformance.test.ts`

**Purpose:** Verifies that the AgentIdP platform conforms to the AGNTCY agent identity specification. These tests exercise live HTTP requests through the Express application against real PostgreSQL and Redis instances, exactly like integration tests — but they validate AGNTCY-specific protocol guarantees rather than individual endpoint correctness.

**How to run:**

```bash
# Run the conformance suite (separate Jest config)
npm run test:agntcy-conformance

# Equivalent long form
npx jest --config tests/agntcy-conformance/jest.config.cjs

# Run with TEST_DATABASE_URL and TEST_REDIS_URL overrides
TEST_DATABASE_URL=postgresql://sentryagent:sentryagent@localhost:5432/sentryagent_idp_test \
TEST_REDIS_URL=redis://localhost:6379/1 \
npm run test:agntcy-conformance

# Enable A2A delegation conformance tests (gated by env var)
A2A_ENABLED=true npm run test:agntcy-conformance
```

The conformance suite uses its own `jest.config.cjs` (located in `tests/agntcy-conformance/`) so it does not run with `npm test` by default. This is intentional — the suite requires `COMPLIANCE_ENABLED=true` and optionally `A2A_ENABLED=true`, which should not be required for the standard unit/integration test run.

**What each test validates:**

| Conformance Test | What it validates | AGNTCY Domain |
|-----------------|-------------------|---------------|
| **Conformance 1 — Agent registration creates DID:WEB identifier** | `POST /api/v1/agents` returns a `did` field matching `did:web:*` pattern when `DID_WEB_DOMAIN` is set. The `did` field is optional in the response (test is conditional on presence) — but when present, it must conform to the `did:web:` scheme. | Non-Human Identity |
| **Conformance 2 — Token issuance via `client_credentials` grant** | Registers an agent, generates credentials via API, then exercises the full OAuth 2.0 Client Credentials flow. Validates that `POST /api/v1/token` returns a 200 response with `access_token` (string), `token_type: 'Bearer'`, and a JWT with 3 dot-separated parts. | Authentication |
| **Conformance 3 — A2A delegation chain create + verify** | _(Gated by `A2A_ENABLED=true`.)_ Creates a delegation chain between two agents via `POST /api/v1/oauth2/token/delegate`. If a token is returned, verifies it via `POST /api/v1/oauth2/token/verify-delegation`. Accepts 200 or 201 on creation and 200 or 204 on verification. | Agent-to-Agent Trust |
| **Conformance 4 — Compliance report returns valid AGNTCY structure** | Calls `GET /api/v1/compliance/report` and validates all required AGNTCY fields: `generated_at` (valid ISO 8601), `tenant_id` (string), `agntcy_schema_version: '1.0'`, `sections` (array with `name`, `status`, `details` per entry), `overall_status` (one of `pass/fail/warn`). Also verifies the `agent-identity` and `audit-trail` section names are present. A second request verifies the Redis cache (`X-Cache: HIT` header and `from_cache: true` body field). | Audit, Compliance |
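
The Conformance 4 field checks can be collected into one validator. The sketch below is not the suite's actual code; `reportProblems` is a hypothetical helper, and it uses `Date.parse`, which is more lenient than a strict ISO 8601 check.

```typescript
const STATUSES = ['pass', 'fail', 'warn'];
const REQUIRED_SECTIONS = ['agent-identity', 'audit-trail'];

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === 'object' && v !== null && !Array.isArray(v);
}

// Returns a list of problems; an empty list means the body has the expected shape.
function reportProblems(body: unknown): string[] {
  if (!isRecord(body)) return ['body is not an object'];
  const problems: string[] = [];

  const generatedAt = body['generated_at'];
  if (typeof generatedAt !== 'string' || Number.isNaN(Date.parse(generatedAt))) {
    problems.push('generated_at is not a parseable timestamp');
  }
  if (typeof body['tenant_id'] !== 'string') problems.push('tenant_id is not a string');
  if (body['agntcy_schema_version'] !== '1.0') problems.push('agntcy_schema_version is not "1.0"');

  const overall = body['overall_status'];
  if (typeof overall !== 'string' || !STATUSES.includes(overall)) {
    problems.push('overall_status is not pass, fail, or warn');
  }

  const sections = body['sections'];
  if (!Array.isArray(sections)) {
    problems.push('sections is not an array');
    return problems;
  }
  const names: string[] = [];
  for (const section of sections) {
    const name = isRecord(section) ? section['name'] : undefined;
    if (!isRecord(section) || typeof name !== 'string'
        || !('status' in section) || !('details' in section)) {
      problems.push('section entry is missing name, status, or details');
    } else {
      names.push(name);
    }
  }
  for (const required of REQUIRED_SECTIONS) {
    if (!names.includes(required)) problems.push(`missing section "${required}"`);
  }
  return problems;
}
```

Returning a list of problems rather than a boolean makes a failing conformance run easier to diagnose, since every violated field is reported at once.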

**Schema tables created by conformance suite:** The suite creates its own tables using `CREATE TABLE IF NOT EXISTS` before tests run. The tables match the production schema and include: `organizations`, `agents`, `credentials`, `audit_events`, `token_revocations`, `agent_did_keys`, `delegation_chains`. These are cleaned up via `DELETE` in `afterEach` (child-to-parent order respecting FK constraints) and dropped implicitly when the test database is reset.

**Environment variables used:**

| Variable | Required | Purpose |
|---|---|---|
| `TEST_DATABASE_URL` | Yes (or default) | PostgreSQL connection string for the test database |
| `TEST_REDIS_URL` | Yes (or default) | Redis connection string (index 1 recommended) |
| `COMPLIANCE_ENABLED` | Yes (`'true'`) | Enables the compliance report endpoint |
| `A2A_ENABLED` | No (default `'true'`) | Set to `'false'` to skip Conformance 3 (A2A delegation) |
| `DID_WEB_DOMAIN` | No | When set, Conformance 1 validates the `did:web:` format |
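
The shape checks behind Conformance 1 and 2 are small enough to sketch directly. The helper names below are hypothetical, and `isDidWeb` checks the `did:web:` prefix only; it is not a full DID syntax validator.

```typescript
// did:web identifiers start with "did:web:" followed by a non-empty method-specific id.
function isDidWeb(value: unknown): boolean {
  return typeof value === 'string' && /^did:web:\S+$/.test(value);
}

// Conformance 1 is conditional: an absent `did` passes, a present one must be did:web.
function didFieldConforms(body: { did?: unknown }): boolean {
  return body.did === undefined || isDidWeb(body.did);
}

// A compact JWT has three dot-separated, non-empty base64url segments.
function hasJwtShape(token: unknown): boolean {
  if (typeof token !== 'string') return false;
  const parts = token.split('.');
  return parts.length === 3 && parts.every((p) => /^[A-Za-z0-9_-]+$/.test(p));
}
```

`hasJwtShape` says nothing about whether the signature is valid; it mirrors the structural check only.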

---

## 10.9 Tier Enforcement Tests

**Location:** `tests/unit/services/TierService.test.ts` and `tests/integration/`

**All of the `TierService` test cases below must pass.**

### Unit tests (`tests/unit/services/TierService.test.ts`)

The unit tests mock PostgreSQL (`Pool`), Redis (`RedisClientType`), and Stripe. Key scenarios:

| Test | Description |
|------|-------------|
| `getStatus() — returns correct tier and limits` | Mocks `SELECT tier FROM organizations` returning `'pro'`; mocks Redis GET calls for `rate:tier:calls` and `rate:tier:tokens`; verifies `ITierStatus.limits` matches `TIER_CONFIG['pro']`. |
| `getStatus() — falls back to 0 when Redis unavailable` | Redis GET throws; verifies `usage.callsToday = 0` and `usage.tokensToday = 0` with no error thrown. |
| `getStatus() — returns 'free' when org not found` | `SELECT` returns 0 rows; verifies `tier === 'free'`. |
| `initiateUpgrade() — throws ValidationError on downgrade attempt` | `targetTier = 'free'` when current is `'pro'`; verifies `ValidationError` is thrown with `TIER_RANK` comparison failure message. |
| `initiateUpgrade() — calls Stripe with correct metadata` | Verifies `stripe.checkout.sessions.create` is called with `metadata: { orgId, targetTier }` and `mode: 'subscription'`. |
| `applyUpgrade() — executes UPDATE organizations SET tier` | Verifies parameterized SQL is called with `[targetTier, orgId]`. |
| `enforceAgentLimit() — throws TierLimitError when limit reached` | Mock agent count equals `TIER_CONFIG[tier].maxAgents`; verifies `TierLimitError` with `limit` and `current` details. |
| `enforceAgentLimit() — no-op for Enterprise tier` | `TIER_CONFIG['enterprise'].maxAgents = Infinity`; verifies no SQL query for agent count and no error. |
| `fetchTier() — returns 'free' for unknown tier string in DB` | DB returns unrecognised string; verifies `isTierName` guard returns `'free'`. |
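
Several rows above hinge on three small pieces of logic: the `isTierName` fallback, the `Infinity` short-circuit, and the `TIER_RANK` comparison. The sketch below illustrates them in isolation. The names come from the table, but the limit values, the `normaliseTier` and `isUpgrade` helpers, and the signature of `enforceAgentLimit` (which here takes the count as an argument instead of querying it) are assumptions.

```typescript
type TierName = 'free' | 'pro' | 'enterprise';

// Limit values are illustrative assumptions, except that Enterprise is unlimited.
const TIER_CONFIG: Record<TierName, { maxAgents: number }> = {
  free: { maxAgents: 100 },
  pro: { maxAgents: 1000 },
  enterprise: { maxAgents: Infinity },
};
const TIER_RANK: Record<TierName, number> = { free: 0, pro: 1, enterprise: 2 };

function isTierName(value: unknown): value is TierName {
  return value === 'free' || value === 'pro' || value === 'enterprise';
}

// Unknown or missing tier values fall back to 'free'.
function normaliseTier(raw: unknown): TierName {
  return isTierName(raw) ? raw : 'free';
}

class TierLimitError extends Error {
  readonly limit: number;
  readonly current: number;
  constructor(limit: number, current: number) {
    super(`Agent limit reached: ${current}/${limit}`);
    Object.setPrototypeOf(this, new.target.prototype); // keeps instanceof working on ES5 targets
    this.limit = limit;
    this.current = current;
  }
}

// Throws when the current count has reached the tier's limit; unlimited tiers return early.
function enforceAgentLimit(tier: TierName, currentAgents: number): void {
  const { maxAgents } = TIER_CONFIG[tier];
  if (maxAgents === Infinity) return; // nothing to count for an unlimited tier
  if (currentAgents >= maxAgents) throw new TierLimitError(maxAgents, currentAgents);
}

// A change is an upgrade only if the target ranks strictly higher than the current tier.
function isUpgrade(current: TierName, target: TierName): boolean {
  return TIER_RANK[target] > TIER_RANK[current];
}
```

Each function maps to one or two rows of the table, which is why the unit tests can cover them exhaustively without a database.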

### Integration (middleware) tests

When writing integration tests for the tier enforcement middleware (`src/middleware/tier.ts`), the following scenarios must be covered:

| Scenario | Expected behaviour |
|----------|-------------------|
| Request with org on `free` tier, under daily call limit | Request proceeds normally (2xx from downstream handler) |
| Request that would exceed `maxCallsPerDay` for the org's tier | `429 TierLimitError` — body contains `code: 'TIER_LIMIT_EXCEEDED'` |
| Request to `/health` or `/metrics` (unprotected routes) | Tier middleware not applied — always 200 |
| Org not found in `organizations` table | Defaults to `free` tier limits |
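
The four scenarios reduce to one decision function, sketched below under stated assumptions: the `maxCallsPerDay` values are placeholders, `decide` is a hypothetical name, and the real middleware in `src/middleware/tier.ts` may be structured differently.

```typescript
type Tier = 'free' | 'pro' | 'enterprise';
type Decision =
  | { allowed: true }
  | { allowed: false; status: 429; code: 'TIER_LIMIT_EXCEEDED' };

// Daily call limits are placeholder values, not the project's real TIER_CONFIG.
const MAX_CALLS_PER_DAY: Record<Tier, number> = { free: 1000, pro: 100000, enterprise: Infinity };
const UNPROTECTED_PATHS = ['/health', '/metrics'];

// `tier` is undefined when the org is not found, which falls back to 'free'.
function decide(path: string, tier: Tier | undefined, callsToday: number): Decision {
  if (UNPROTECTED_PATHS.includes(path)) return { allowed: true };
  const limit = MAX_CALLS_PER_DAY[tier ?? 'free'];
  // Reject the request that would take the org past its daily limit.
  if (callsToday + 1 > limit) return { allowed: false, status: 429, code: 'TIER_LIMIT_EXCEEDED' };
  return { allowed: true };
}
```

An integration test exercises the same branches end to end: seed the call counter in Redis, send the request, and assert on the status code and `res.body.code`.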

---

## 10.10 Analytics Service Tests

**Location:** `tests/unit/services/AnalyticsService.test.ts`

The AnalyticsService unit tests mock the PostgreSQL `Pool`. Key scenarios that must be covered:

| Test | Description |
|------|-------------|
| `recordEvent() — executes UPSERT without throwing` | Verifies `pool.query` is called with the `INSERT ... ON CONFLICT DO UPDATE` SQL pattern and the correct `[tenantId, metricType]` parameters. |
| `recordEvent() — catches and swallows pool errors` | Pool `query` throws; verifies `recordEvent` resolves (not rejects) and the error does not propagate. This is the fire-and-forget contract. |
| `getTokenTrend() — clamps days to 90` | Calls with `days = 200`; verifies `pool.query` receives `clampedDays = 90` as the first parameter. |
| `getTokenTrend() — maps rows to ITokenTrendEntry[]` | Mock returns rows with `date: '2026-03-01', count: '42'`; verifies the result is `[{ date: '2026-03-01', count: 42 }]` (count coerced to number). |
| `getAgentActivity() — maps rows to IAgentActivityEntry[]` | Mock returns rows with string-typed `dow`, `hour`, `count`; verifies all are coerced to numbers in the result. |
| `getAgentUsageSummary() — maps rows to IAgentUsageSummaryEntry[]` | Mock returns rows with `token_count: '150'`; verifies `token_count: 150` (number) in the result. |
| `getAgentUsageSummary() — joins with agents table on organization_id` | Verifies the SQL query joins `agents` with `LEFT JOIN analytics_events` and filters `a.organization_id = $1`. |
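
The clamping, coercion, and fire-and-forget rows describe logic that can be sketched without a database. The helpers below are hypothetical, not the service's real methods; the lower bound of 1 in `clampDays` is an assumption, since the table only specifies the upper bound of 90.

```typescript
interface ITokenTrendEntry {
  date: string;
  count: number;
}

const MAX_TREND_DAYS = 90;

// Clamps the requested window to [1, 90]. The lower bound is an assumption.
function clampDays(days: number): number {
  return Math.min(Math.max(Math.trunc(days), 1), MAX_TREND_DAYS);
}

// node-postgres returns COUNT(*) (a bigint) as a string, so counts are coerced to numbers.
function mapTrendRows(rows: Array<{ date: string; count: string }>): ITokenTrendEntry[] {
  return rows.map((row) => ({ date: row.date, count: Number(row.count) }));
}

// Fire-and-forget wrapper: the returned promise always resolves, even if `op` rejects.
async function fireAndForget(op: () => Promise<unknown>): Promise<void> {
  try {
    await op();
  } catch {
    // Swallowed by design: analytics failures must not break the caller.
  }
}
```

A unit test for `recordEvent()` asserts the same contract as `fireAndForget`: the mocked `pool.query` rejects, and the call under test still resolves.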

**Coverage gate:** `AnalyticsService` must maintain at least 80% statement, branch, function, and line coverage. Run:

```bash
npm run test:unit -- --coverage --testPathPattern=AnalyticsService
```

---

## 10.11 Running the Complete Phase 6 Test Matrix

All of the following must pass before any Phase 6 feature is considered complete:

```bash
# 1. Unit tests (all services including Phase 3–6)
npm run test:unit -- --coverage
# Must exit 0 with all 4 coverage metrics ≥ 80%

# 2. Integration tests (requires PostgreSQL + Redis running)
npm run test:integration

# 3. AGNTCY conformance suite
COMPLIANCE_ENABLED=true \
A2A_ENABLED=true \
npm run test:agntcy-conformance

# 4. Dependency security audit
npm audit --audit-level=high
# Must exit 0 — no high or critical vulnerabilities

# 5. TypeScript compilation
npx tsc --noEmit
# Must exit 0 — zero type errors
```

**Current test file inventory** (as of Phase 6 completion):

Unit test files in `tests/unit/services/`:

| File | Service tested |
|------|---------------|
| `AgentService.test.ts` | `AgentService` |
| `AnalyticsService.test.ts` | `AnalyticsService` |
| `AuditService.test.ts` | `AuditService` |
| `AuditVerificationService.test.ts` | `AuditVerificationService` |
| `BillingService.test.ts` | `BillingService` |
| `ComplianceService.test.ts` | `ComplianceService` |
| `CredentialService.test.ts` | `CredentialService` |
| `DIDService.test.ts` | `DIDService` |
| `DelegationService.test.ts` | `DelegationService` |
| `EncryptionService.test.ts` | `EncryptionService` |
| `FederationService.test.ts` | `FederationService` |
| `IDTokenService.test.ts` | `IDTokenService` |
| `OAuth2Service.test.ts` | `OAuth2Service` |
| `OIDCKeyService.test.ts` | `OIDCKeyService` |
| `OrgService.test.ts` | `OrgService` |
| `ScaffoldService.test.ts` | `ScaffoldService` |
| `ScaffoldService.errors.test.ts` | `ScaffoldService` error cases |
| `TierService.test.ts` | `TierService` |
| `WebhookService.test.ts` | `WebhookService` |