sentryagent-idp/docs/engineering/09-testing.md
SentryAgent.ai Developer eced5f8699 docs: engineering knowledge base for new hires
Complete docs/engineering/ suite — 12 documents covering company overview,
system architecture, tech stack ADRs, codebase structure, service deep dives,
annotated code walkthroughs, dev setup, engineering workflow, testing strategy,
deployment/ops, SDK guide, and README index. All content verified against
source files. All 82 tasks in openspec/changes/engineering-docs/tasks.md
marked complete.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-29 12:38:42 +00:00


09 — Testing Strategy


10.1 Test Types and Purposes

This codebase uses two types of tests. Understanding when to use each prevents you from writing integration tests for things that should be unit tests (slow) and unit tests for things that need a real database (misleading).

Unit Tests

Location: tests/unit/

What they test: A single class or function in complete isolation. All dependencies (repositories, services, external clients) are replaced with Jest mocks.

When to use:

  • Testing service business logic (free-tier limits, status transitions, error cases)
  • Testing utility functions (crypto, jwt, validators)
  • Testing error hierarchy behaviour
  • Any code that has conditional logic you want to test exhaustively

What they do NOT test:

  • Whether the SQL queries are correct
  • Whether the HTTP routing works
  • Whether middleware chains execute in the right order

Speed: Milliseconds. Hundreds of unit tests should complete in under 10 seconds.

Integration Tests

Location: tests/integration/

What they test: A full HTTP request through the Express application against a real PostgreSQL database and real Redis instance.

When to use:

  • Testing that a route is correctly wired to the right controller method
  • Testing authentication and authorisation middleware in combination
  • Testing database operations end-to-end (INSERT → read back → verify)
  • Testing response shapes match the OpenAPI spec exactly

What they require:

  • Running PostgreSQL (at TEST_DATABASE_URL or default)
  • Running Redis (at TEST_REDIS_URL or default)
  • The test creates its own tables and cleans up after every test case

Speed: Seconds. Expect 25 seconds per integration test file.


10.2 Test Framework Stack

| Tool | Role |
| --- | --- |
| Jest 29.7 | Test runner. describe, it, expect, beforeEach, afterAll. Also provides mocking via jest.mock(), jest.fn(), jest.spyOn(). |
| ts-jest | Transforms TypeScript test files for Jest without a separate compilation step. Configured in jest.config.ts. |
| Supertest 6.3 | HTTP testing library. Used in integration tests to make real HTTP requests against the Express app without opening a network port. Works by passing the Application object directly. |

Jest configuration (jest.config.ts):

export default {
  preset: 'ts-jest',
  testEnvironment: 'node',
  roots: ['<rootDir>/tests'],
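  // testPathPattern is a CLI flag, not a config key; testMatch is the config equivalent
  testMatch: ['<rootDir>/tests/unit/**/*.test.ts', '<rootDir>/tests/integration/**/*.test.ts'],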
  collectCoverageFrom: ['src/**/*.ts', '!src/server.ts'],
};

10.3 Coverage Gates

All four coverage metrics must be at least 80% before a feature is considered complete:

| Metric | Gate | What it measures |
| --- | --- | --- |
| Statements | ≥80% | Percentage of statements executed at least once |
| Branches | ≥80% | Percentage of if/else/switch branches taken at least once |
| Functions | ≥80% | Percentage of functions called at least once |
| Lines | ≥80% | Percentage of lines executed at least once |

Enforcement:

Coverage is checked in the PR process:

npm run test:unit -- --coverage
# Fails if any metric is below 80%

Coverage reports are output to coverage/lcov-report/index.html for visual inspection.

The coverage threshold configuration is in jest.config.ts:

coverageThreshold: {
  global: {
    statements: 80,
    branches: 80,
    functions: 80,
    lines: 80,
  },
},
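
To make the gate concrete, the comparison Jest applies to the global totals can be sketched as below. This is only an illustration of the rule, not Jest's internal code; the `CoverageTotals` type and the `failingMetrics` name are hypothetical.

```typescript
// Hypothetical sketch of the threshold rule: a metric fails when its
// percentage is strictly below the configured threshold.
type Metric = 'statements' | 'branches' | 'functions' | 'lines';
type CoverageTotals = Record<Metric, number>; // percentages, 0 to 100

const THRESHOLD: CoverageTotals = { statements: 80, branches: 80, functions: 80, lines: 80 };

function failingMetrics(totals: CoverageTotals, threshold: CoverageTotals = THRESHOLD): Metric[] {
  const metrics: Metric[] = ['statements', 'branches', 'functions', 'lines'];
  return metrics.filter((metric) => totals[metric] < threshold[metric]);
}

console.log(failingMetrics({ statements: 91.2, branches: 79.9, functions: 80, lines: 88 }));
// → [ 'branches' ]
```

Exactly 80% passes, because only values below the threshold fail the run.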

10.4 How to Run the Test Suite

# Run all tests (unit + integration)
npm test

# Run only unit tests
npm run test:unit

# Run only integration tests
npm run test:integration

# Run unit tests with coverage report
npm run test:unit -- --coverage
# HTML report: coverage/lcov-report/index.html

# Run a single test file
npx jest tests/unit/services/AgentService.test.ts

# Run tests matching a name pattern
npx jest --testNamePattern="should throw FreeTierLimitError"

# Run tests in watch mode (re-runs on file changes)
npx jest --watch

# Run with verbose output (shows each test name)
npx jest --verbose

Integration test environment variables:

export TEST_DATABASE_URL=postgresql://sentryagent:sentryagent@localhost:5432/sentryagent_idp_test
export TEST_REDIS_URL=redis://localhost:6379/1
npm run test:integration

The /1 suffix on the Redis URL selects logical database 1, so test runs cannot pollute database 0, which local development uses.
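
As a rough sketch of how a test setup file could enforce this separation, the helper below reads the database index out of a Redis URL and refuses index 0. Both function names are hypothetical and are not part of the codebase.

```typescript
// Extracts the logical database index from a Redis URL (redis://host:port/<index>).
// A URL with no path segment selects database 0, the Redis default.
function redisDbIndex(redisUrl: string): number {
  const path = new URL(redisUrl).pathname.replace(/^\//, '');
  if (path === '') return 0;
  const index = Number(path);
  if (!Number.isInteger(index) || index < 0) {
    throw new Error(`Invalid Redis database index in URL: ${redisUrl}`);
  }
  return index;
}

// Hypothetical guard: fail fast if tests would run against database 0.
function assertNotDevDatabase(redisUrl: string): void {
  if (redisDbIndex(redisUrl) === 0) {
    throw new Error('Tests must not use Redis database 0 (local development data)');
  }
}

assertNotDevDatabase('redis://localhost:6379/1'); // passes silently
```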


10.5 Unit Test Writing Conventions

Unit tests follow a strict pattern. Study this example carefully — it shows every convention in use.

Real example from tests/unit/services/AgentService.test.ts:

/**
 * Unit tests for src/services/AgentService.ts
 */

import { AgentService } from '../../../src/services/AgentService';
import { AgentRepository } from '../../../src/repositories/AgentRepository';
import { CredentialRepository } from '../../../src/repositories/CredentialRepository';
import { AuditService } from '../../../src/services/AuditService';
import {
  AgentAlreadyExistsError,
  FreeTierLimitError,
} from '../../../src/utils/errors';
import { IAgent, ICreateAgentRequest } from '../../../src/types/index';

// Mock all dependencies — none of them execute real code
jest.mock('../../../src/repositories/AgentRepository');
jest.mock('../../../src/repositories/CredentialRepository');
jest.mock('../../../src/services/AuditService');

// Get typed mock constructors so we can call .mockResolvedValue() on them
const MockAgentRepository = AgentRepository as jest.MockedClass<typeof AgentRepository>;
const MockCredentialRepository = CredentialRepository as jest.MockedClass<typeof CredentialRepository>;
const MockAuditService = AuditService as jest.MockedClass<typeof AuditService>;

// Define a complete test fixture — reuse this instead of duplicating object literals
const MOCK_AGENT: IAgent = {
  agentId: 'a1b2c3d4-e5f6-7890-abcd-ef1234567890',
  email: 'agent@sentryagent.ai',
  agentType: 'screener',
  version: '1.0.0',
  capabilities: ['resume:read'],
  owner: 'team-a',
  deploymentEnv: 'production',
  status: 'active',
  createdAt: new Date('2026-03-28T09:00:00Z'),
  updatedAt: new Date('2026-03-28T09:00:00Z'),
};

describe('AgentService', () => {
  let agentService: AgentService;
  let agentRepo: jest.Mocked<AgentRepository>;
  let credentialRepo: jest.Mocked<CredentialRepository>;
  let auditService: jest.Mocked<AuditService>;

  beforeEach(() => {
    // Clear all mocks before each test — prevents state leakage
    jest.clearAllMocks();
    // Create fresh mock instances for each test
    agentRepo = new MockAgentRepository({} as never) as jest.Mocked<AgentRepository>;
    credentialRepo = new MockCredentialRepository({} as never) as jest.Mocked<CredentialRepository>;
    auditService = new MockAuditService({} as never) as jest.Mocked<AuditService>;
    // Inject mocks into the system under test
    agentService = new AgentService(agentRepo, credentialRepo, auditService);
  });

  describe('registerAgent()', () => {
    const createData: ICreateAgentRequest = {
      email: 'agent@sentryagent.ai',
      agentType: 'screener',
      version: '1.0.0',
      capabilities: ['resume:read'],
      owner: 'team-a',
      deploymentEnv: 'production',
    };

    it('should create and return a new agent', async () => {
      // Arrange — set up mock return values
      agentRepo.countActive.mockResolvedValue(0);
      agentRepo.findByEmail.mockResolvedValue(null);
      agentRepo.create.mockResolvedValue(MOCK_AGENT);
      auditService.logEvent.mockResolvedValue({} as never);

      // Act — call the method under test
      const result = await agentService.registerAgent(createData, '127.0.0.1', 'test/1.0');

      // Assert — verify the result
      expect(result).toEqual(MOCK_AGENT);
      // Also verify the mock was called with the right arguments
      expect(agentRepo.create).toHaveBeenCalledWith(createData);
    });

    it('should throw FreeTierLimitError when 100 agents already registered', async () => {
      // Arrange — simulate limit reached
      agentRepo.countActive.mockResolvedValue(100);

      // Assert error — rejects.toThrow checks the error type
      await expect(agentService.registerAgent(createData, '127.0.0.1', 'test/1.0'))
        .rejects.toThrow(FreeTierLimitError);
    });

    it('should throw AgentAlreadyExistsError if email is already registered', async () => {
      agentRepo.countActive.mockResolvedValue(0);
      agentRepo.findByEmail.mockResolvedValue(MOCK_AGENT); // Simulate existing agent

      await expect(agentService.registerAgent(createData, '127.0.0.1', 'test/1.0'))
        .rejects.toThrow(AgentAlreadyExistsError);
    });
  });
});

Conventions explained:

  1. One test file per source file. AgentService.test.ts tests AgentService.ts.
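  2. jest.mock() at the top level of the test file. ts-jest hoists jest.mock() calls above the import statements, so the mocks are registered before the modules load, even though the calls appear after the imports in the source (as in the example above).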
  3. jest.clearAllMocks() in beforeEach. Prevents mock call counts from leaking between tests.
  4. AAA pattern (Arrange, Act, Assert). Every it block follows this order.
  5. Test both the happy path and every error case. A service with 3 error conditions needs at least 4 tests (1 success + 3 failures).
  6. Verify mock calls for side effects. Use .toHaveBeenCalledWith() to verify that auditService.logEvent was called with the right arguments, not just that it was called.
  7. Use typed error assertions. .rejects.toThrow(FreeTierLimitError) verifies the error type, not just a message string.
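
Convention 7 matters most when two errors carry similar messages. The sketch below uses local stand-ins for the error classes (the real ones live in src/utils/errors and may have different constructors) and a simplified, hypothetical `checkRegistration` function. It shows that a message comparison cannot tell the two failures apart, while a class check can.

```typescript
// Local stand-ins for the project's error classes; constructors are assumptions.
class AppError extends Error {
  constructor(message: string) {
    super(message);
    // Keeps instanceof working even when compiled to an ES5 target.
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = new.target.name;
  }
}
class FreeTierLimitError extends AppError {}
class AgentAlreadyExistsError extends AppError {}

// Hypothetical, simplified version of the checks inside registerAgent().
function checkRegistration(activeCount: number, emailTaken: boolean): void {
  if (activeCount >= 100) throw new FreeTierLimitError('Agent registration failed');
  if (emailTaken) throw new AgentAlreadyExistsError('Agent registration failed');
}

// Returns whatever fn throws, or undefined if it does not throw.
function thrownBy(fn: () => void): unknown {
  try {
    fn();
  } catch (err) {
    return err;
  }
  return undefined;
}

const limitErr = thrownBy(() => checkRegistration(100, false));
const dupErr = thrownBy(() => checkRegistration(0, true));

// Identical messages: a string-based assertion would accept either error.
console.log((limitErr as Error).message === (dupErr as Error).message); // true
// The class check distinguishes them, which is what .rejects.toThrow(FreeTierLimitError) relies on.
console.log(limitErr instanceof FreeTierLimitError); // true
console.log(dupErr instanceof FreeTierLimitError); // false
```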

10.6 Integration Test Writing Conventions

Integration tests use Supertest to make real HTTP requests against a live Express app.

Real example from tests/integration/agents.test.ts:

/**
 * Integration tests for Agent Registry endpoints.
 */

import crypto from 'crypto';
import request from 'supertest';
import { Application } from 'express';
import { v4 as uuidv4 } from 'uuid';
import { Pool } from 'pg';

// Generate RSA keys for test tokens — done once per test module
const { privateKey, publicKey } = crypto.generateKeyPairSync('rsa', {
  modulusLength: 2048,
  publicKeyEncoding: { type: 'spki', format: 'pem' },
  privateKeyEncoding: { type: 'pkcs8', format: 'pem' },
});

// Set environment variables BEFORE importing the app
process.env['DATABASE_URL'] = process.env['TEST_DATABASE_URL'] ?? 'postgresql://sentryagent:sentryagent@localhost:5432/sentryagent_idp_test';
process.env['REDIS_URL'] = process.env['TEST_REDIS_URL'] ?? 'redis://localhost:6379/1';
process.env['JWT_PRIVATE_KEY'] = privateKey;
process.env['JWT_PUBLIC_KEY'] = publicKey;
process.env['NODE_ENV'] = 'test';

import { createApp } from '../../src/app';
import { signToken } from '../../src/utils/jwt';
import { closePool } from '../../src/db/pool';
import { closeRedisClient } from '../../src/cache/redis';

// Helper: mint a valid test token
function makeToken(sub: string = uuidv4(), scope: string = 'agents:read agents:write'): string {
  return signToken({ sub, client_id: sub, scope, jti: uuidv4() }, privateKey);
}

describe('Agent Registry Integration Tests', () => {
  let app: Application;
  let pool: Pool;

  beforeAll(async () => {
    // Boot the real Express app
    app = await createApp();
    pool = new Pool({ connectionString: process.env['DATABASE_URL'] });

    // Create test tables (idempotent)
    await pool.query(`CREATE TABLE IF NOT EXISTS agents (...)`);
  });

  afterEach(async () => {
    // Clean up after each test — order matters (foreign key constraints)
    await pool.query('DELETE FROM audit_events');
    await pool.query('DELETE FROM credentials');
    await pool.query('DELETE FROM agents');
  });

  afterAll(async () => {
    // Close all connections — prevents Jest from hanging
    await pool.end();
    await closePool();
    await closeRedisClient();
  });

  describe('POST /api/v1/agents', () => {
    it('should register a new agent and return 201', async () => {
      const token = makeToken();

      const res = await request(app)
        .post('/api/v1/agents')
        .set('Authorization', `Bearer ${token}`)
        .send({
          email: 'test-agent@sentryagent.ai',
          agentType: 'screener',
          version: '1.0.0',
          capabilities: ['resume:read'],
          owner: 'test-team',
          deploymentEnv: 'development',
        });

      expect(res.status).toBe(201);
      expect(res.body.agentId).toBeDefined();
      expect(res.body.email).toBe('test-agent@sentryagent.ai');
      expect(res.body.status).toBe('active');
    });

    it('should return 401 without a token', async () => {
      const res = await request(app)
        .post('/api/v1/agents')
        .send({ email: 'test@sentryagent.ai' });

      expect(res.status).toBe(401);
    });

    it('should return 409 for duplicate email', async () => {
      const token = makeToken();
      const body = { email: 'dup@sentryagent.ai', agentType: 'screener', version: '1.0', capabilities: [], owner: 'team', deploymentEnv: 'development' };

      await request(app).post('/api/v1/agents').set('Authorization', `Bearer ${token}`).send(body);
      const res = await request(app).post('/api/v1/agents').set('Authorization', `Bearer ${token}`).send(body);

      expect(res.status).toBe(409);
      expect(res.body.code).toBe('AGENT_ALREADY_EXISTS');
    });
  });
});

Conventions explained:

  1. Set process.env before importing the app. The app reads env vars at import time (getPool(), JWT keys). Setting them after import does nothing.
  2. afterEach cleanup. Delete all rows after each test so tests are independent. Always delete in child-to-parent order (audit_events → credentials → agents) to respect foreign key constraints.
  3. afterAll close connections. Always close the pool and Redis client at the end of the suite. Jest will hang if connections remain open.
  4. Test both success and failure status codes. Every endpoint test must include an unauthenticated request (401) and an invalid request (400).
  5. Verify response body shape. Check res.body.code for error responses to verify the correct error type, not just the status code.
  6. Use makeToken() for test tokens. A helper function keeps token creation consistent across all integration test files.
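
The child-to-parent rule in convention 2 can be derived mechanically once the foreign keys are known. The sketch below assumes that audit_events and credentials each reference agents; the source only states the deletion order, so that mapping is an assumption, and `deletionOrder` is a hypothetical helper rather than project code.

```typescript
// Assumed foreign-key map: table -> parent tables it references.
const foreignKeys: Record<string, string[]> = {
  audit_events: ['agents'],
  credentials: ['agents'],
  agents: [],
};

// Returns tables in an order that is safe for DELETE: a table is emptied
// only once no remaining table still references it.
function deletionOrder(fks: Record<string, string[]>): string[] {
  let remaining = Object.keys(fks);
  const order: string[] = [];
  while (remaining.length > 0) {
    const safe = remaining.filter((table) =>
      remaining.every((other) => other === table || !fks[other].includes(table)),
    );
    if (safe.length === 0) throw new Error('Circular foreign key dependency');
    order.push(...safe);
    remaining = remaining.filter((table) => !safe.includes(table));
  }
  return order;
}

console.log(deletionOrder(foreignKeys));
// → [ 'audit_events', 'credentials', 'agents' ]
```

With only three tables the hard-coded order in afterEach is simpler; a helper like this becomes worthwhile only if the schema grows.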

10.7 OWASP Top 10 Security Testing Reference

These are the security concerns most relevant to an identity provider. For each, here is what AgentIdP does to mitigate the risk and how to test it.

| OWASP Category | Relevant risk | Mitigation | Test approach |
| --- | --- | --- | --- |
| A01 Broken Access Control | Agent A accesses agent B's credentials | req.user.sub !== agentId check in all credential endpoints | Send a credential request with a token for agent A but agent B's agentId in the path; expect 403 |
| A02 Cryptographic Failures | Weak credential secrets or JWT algorithm | sk_live_<64 hex> = 256-bit entropy; RS256 signing; bcrypt 10 rounds | Verify generated secrets are 72 chars; verify JWT header shows alg: RS256 |
| A03 Injection | SQL injection via input fields | Parameterised queries ($1, $2, ...) in all repositories | Send '; DROP TABLE agents; -- as the owner field; expect 400 from Joi validation |
| A05 Security Misconfiguration | Server leaking stack traces | errorHandler returns generic 500 for unknown errors | Trigger an unexpected error (mock a repository to throw new Error()); verify the response body does not contain a stack trace |
| A06 Vulnerable Components | Outdated dependencies with CVEs | Regular npm audit | Run npm audit in CI; fail on high/critical findings |
| A07 Auth Failures | Timing attack on credential verification | crypto.timingSafeEqual in VaultClient.verifySecret(); bcrypt inherently timing-safe | Measure multiple failed verification attempts with wrong secrets of varying lengths; timing should not increase linearly with shared prefix length |
| A08 Integrity Failures | Forged JWT tokens | RS256 verification rejects tokens signed with wrong key | Create a token signed with a different private key; expect 401 |
| A09 Logging Failures | Auth failures not logged | auth.failed audit events written for every authentication failure | Attempt token issuance with a wrong secret; verify the audit_events table contains an auth.failed row |
| A10 SSRF | Not applicable to current API surface | No outbound HTTP from user-supplied URLs | N/A: no URL-accepting fields in current API |
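
For the A02 and A07 rows, the following sketch shows one way the documented secret format and a constant-time comparison fit together. It is not the implementation of VaultClient.verifySecret(), which this document does not show; `generateSecret` and `secretsMatch` are hypothetical names. Only the sk_live_ prefix, the 64 hex characters, and the use of crypto.timingSafeEqual come from the table.

```typescript
import { createHash, randomBytes, timingSafeEqual } from 'node:crypto';

// Hypothetical generator for the documented format: sk_live_ + 64 hex chars.
// 32 random bytes = 256 bits of entropy = 64 hex characters.
function generateSecret(): string {
  return `sk_live_${randomBytes(32).toString('hex')}`;
}

// Constant-time comparison. timingSafeEqual throws when the buffers differ
// in length, so both inputs are first hashed to fixed 32-byte digests.
function secretsMatch(candidate: string, expected: string): boolean {
  const candidateDigest = createHash('sha256').update(candidate).digest();
  const expectedDigest = createHash('sha256').update(expected).digest();
  return timingSafeEqual(candidateDigest, expectedDigest);
}

const secret = generateSecret();
console.log(secret.length); // 72 (8-char prefix + 64 hex chars)
console.log(secretsMatch(secret, secret)); // true
console.log(secretsMatch('sk_live_wrong', secret)); // false
```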

JWT algorithm confusion (bonus): Test that the server rejects tokens with alg: none or alg: HS256. The verifyToken() function specifies algorithms: ['RS256'], which causes jsonwebtoken to reject any token with a different algorithm header.
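
The forged tokens for the A08 and algorithm-confusion tests can be built with Node's crypto module alone. The sketch below is an assumption-laden stand-in: `verifyRs256Only` is a hypothetical reimplementation of the allow-list idea, not the project's verifyToken() (which uses jsonwebtoken), and it skips claim checks such as expiry.

```typescript
import { createHmac, createSign, createVerify, generateKeyPairSync } from 'node:crypto';

const encode = (value: object): string =>
  Buffer.from(JSON.stringify(value), 'utf8').toString('base64url');

function newKeyPair() {
  return generateKeyPairSync('rsa', {
    modulusLength: 2048,
    publicKeyEncoding: { type: 'spki', format: 'pem' },
    privateKeyEncoding: { type: 'pkcs8', format: 'pem' },
  });
}

// Signs header.payload with RS256 (RSASSA-PKCS1-v1_5 with SHA-256).
function signRs256(payload: object, privateKeyPem: string): string {
  const signingInput = `${encode({ alg: 'RS256', typ: 'JWT' })}.${encode(payload)}`;
  const signature = createSign('RSA-SHA256').update(signingInput).sign(privateKeyPem, 'base64url');
  return `${signingInput}.${signature}`;
}

// Hypothetical verifier: the accepted algorithm is fixed by the server,
// never taken from the token header.
function verifyRs256Only(token: string, publicKeyPem: string): boolean {
  const parts = token.split('.');
  if (parts.length !== 3) return false;
  const [headerPart, payloadPart, signaturePart] = parts;
  let alg: unknown;
  try {
    alg = JSON.parse(Buffer.from(headerPart, 'base64url').toString('utf8'))?.alg;
  } catch {
    return false;
  }
  if (alg !== 'RS256') return false;
  return createVerify('RSA-SHA256')
    .update(`${headerPart}.${payloadPart}`)
    .verify(publicKeyPem, signaturePart, 'base64url');
}

const server = newKeyPair();
const attacker = newKeyPair();
const claims = { sub: 'agent-a', scope: 'agents:read' };

const valid = signRs256(claims, server.privateKey);
const wrongKey = signRs256(claims, attacker.privateKey);
const unsigned = `${encode({ alg: 'none', typ: 'JWT' })}.${encode(claims)}.`;
// Classic confusion attack: HMAC keyed with the server's public key.
const hsInput = `${encode({ alg: 'HS256', typ: 'JWT' })}.${encode(claims)}`;
const hs256 = `${hsInput}.${createHmac('sha256', server.publicKey).update(hsInput).digest('base64url')}`;

console.log(verifyRs256Only(valid, server.publicKey)); // true
console.log(verifyRs256Only(wrongKey, server.publicKey)); // false
console.log(verifyRs256Only(unsigned, server.publicKey)); // false
console.log(verifyRs256Only(hs256, server.publicKey)); // false
```

In an integration test, each forged token would be sent in the Authorization header and the assertion would be a 401 response.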