sentryagent-idp/openspec/changes/archive/engineering-docs-phase6-update/specs/ws4-testing/spec.md
SentryAgent.ai Developer 8cabc0191c docs: commit all Phase 6 documentation updates and OpenSpec archives
- devops docs: 8 files updated for Phase 6 state; field-trial.md added (946-line runbook)
- developer docs: api-reference (50+ endpoints), quick-start, 5 existing guides updated, 5 new guides added
- engineering docs: all 12 files updated (services, architecture, SDK guide, testing, overview)
- OpenSpec archives: phase-7-devops-field-trial, developer-docs-phase6-update, engineering-docs-phase6-update
- VALIDATOR.md + scripts/start-validator.sh: V&V Architect tooling added
- .gitignore: exclude session artifacts, build artifacts, and agent workspaces

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-07 02:24:24 +00:00


WS4 — Testing Documentation Updates

**Target file:** `docs/engineering/09-testing.md`

**Operation:** Append four new subsections to the end of `docs/engineering/09-testing.md`. Do not modify any existing content.


Instructions to Developer

Append the following Markdown verbatim to the end of `docs/engineering/09-testing.md`, after the final line of `## 10.7 OWASP Top 10 Security Testing Reference`.


Content to Append

---

## 10.8 AGNTCY Conformance Test Suite

**Location:** `tests/agntcy-conformance/conformance.test.ts`

**Purpose:** Verifies that the AgentIdP platform conforms to the AGNTCY agent identity specification. These tests exercise live HTTP requests through the Express application against real PostgreSQL and Redis instances, exactly like integration tests — but they validate AGNTCY-specific protocol guarantees rather than individual endpoint correctness.

**How to run:**

```bash
# Run the conformance suite (separate Jest config)
npm run test:agntcy-conformance

# Equivalent long form
npx jest --config tests/agntcy-conformance/jest.config.cjs

# Run with TEST_DATABASE_URL and TEST_REDIS_URL overrides
TEST_DATABASE_URL=postgresql://sentryagent:sentryagent@localhost:5432/sentryagent_idp_test \
TEST_REDIS_URL=redis://localhost:6379/1 \
npm run test:agntcy-conformance

# Enable A2A delegation conformance tests (gated by env var)
A2A_ENABLED=true npm run test:agntcy-conformance
```

The conformance suite uses its own `jest.config.cjs` (located in `tests/agntcy-conformance/`), so it does not run with `npm test` by default. This is intentional: the suite requires `COMPLIANCE_ENABLED=true` and optionally `A2A_ENABLED=true`, neither of which should be required for the standard unit/integration test run.

**What each test validates:**

| Conformance Test | What it validates | AGNTCY Domain |
|---|---|---|
| Conformance 1: Agent registration creates DID:WEB identifier | `POST /api/v1/agents` returns a `did` field matching the `did:web:*` pattern when `DID_WEB_DOMAIN` is set. The `did` field is optional in the response (the test is conditional on its presence), but when present it must conform to the `did:web:` scheme. | Non-Human Identity |
| Conformance 2: Token issuance via client_credentials grant | Registers an agent, generates credentials via the API, then exercises the full OAuth 2.0 Client Credentials flow. Validates that `POST /api/v1/token` returns a 200 response with `access_token` (string), `token_type: 'Bearer'`, and a JWT with 3 dot-separated parts. | Authentication |
| Conformance 3: A2A delegation chain create + verify | (Gated by `A2A_ENABLED=true`.) Creates a delegation chain between two agents via `POST /api/v1/oauth2/token/delegate`. If a token is returned, verifies it via `POST /api/v1/oauth2/token/verify-delegation`. Accepts 200 or 201 on creation and 200 or 204 on verification. | Agent-to-Agent Trust |
| Conformance 4: Compliance report returns valid AGNTCY structure | Calls `GET /api/v1/compliance/report` and validates all required AGNTCY fields: `generated_at` (valid ISO 8601), `tenant_id` (string), `agntcy_schema_version: '1.0'`, `sections` (array with `name`, `status`, `details` per entry), `overall_status` (one of `pass`/`fail`/`warn`). Also verifies that the `agent-identity` and `audit-trail` section names are present. A second request verifies the Redis cache (`X-Cache: HIT` header and `from_cache: true` body field). | Audit, Compliance |
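The response-shape checks in the table can be expressed as small predicates. The following is a minimal sketch, not the suite's actual code: the helper names (`isDidWeb`, `hasJwtShape`, `validateComplianceReport`) are hypothetical, and only the field names and allowed values are taken from the table.

```typescript
// Hypothetical helpers; only field names and allowed values come from the table.

/** Conformance 1: the identifier must use the did:web scheme. */
function isDidWeb(did: string): boolean {
  return /^did:web:.+/.test(did);
}

/** Conformance 2: a JWT has exactly three dot-separated parts. */
function hasJwtShape(token: string): boolean {
  return token.split(".").length === 3;
}

const OVERALL_STATUSES = ["pass", "fail", "warn"];
const REQUIRED_SECTIONS = ["agent-identity", "audit-trail"];

/** Conformance 4: returns a list of problems; an empty list means the shape is valid. */
function validateComplianceReport(report: Record<string, unknown>): string[] {
  const problems: string[] = [];

  const generatedAt = report["generated_at"];
  // Date.parse is a loose stand-in for a strict ISO 8601 check.
  if (typeof generatedAt !== "string" || isNaN(Date.parse(generatedAt))) {
    problems.push("generated_at is not a parseable timestamp");
  }
  if (typeof report["tenant_id"] !== "string") {
    problems.push("tenant_id is not a string");
  }
  if (report["agntcy_schema_version"] !== "1.0") {
    problems.push("agntcy_schema_version is not '1.0'");
  }
  const overall = report["overall_status"];
  if (typeof overall !== "string" || OVERALL_STATUSES.indexOf(overall) === -1) {
    problems.push("overall_status is not pass, fail, or warn");
  }

  const sections = report["sections"];
  if (!Array.isArray(sections)) {
    problems.push("sections is not an array");
  } else {
    const names: string[] = [];
    for (const entry of sections) {
      const e = (entry ?? {}) as Record<string, unknown>;
      const name = e["name"];
      if (typeof name !== "string" || e["status"] === undefined || e["details"] === undefined) {
        problems.push("a section entry is missing name, status, or details");
      } else {
        names.push(name);
      }
    }
    for (const required of REQUIRED_SECTIONS) {
      if (names.indexOf(required) === -1) {
        problems.push("missing section: " + required);
      }
    }
  }
  return problems;
}
```

Returning a list of problems rather than a boolean keeps a failing assertion message specific about which field broke conformance.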

**Schema tables created by conformance suite:** The suite creates its own tables using `CREATE TABLE IF NOT EXISTS` before tests run. The tables match the production schema and include: `organizations`, `agents`, `credentials`, `audit_events`, `token_revocations`, `agent_did_keys`, `delegation_chains`. These are cleaned up via `DELETE` in `afterEach` (child-to-parent order, respecting FK constraints) and dropped implicitly when the test database is reset.
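The child-to-parent cleanup order can be derived by reversing a parent-first table list. This is a sketch only: `buildCleanupStatements` is a hypothetical helper, and treating the order in which the tables are listed above as parent-first creation order is an assumption.

```typescript
// Tables in the order listed above; assumed (not confirmed) to be parent-first.
const TABLES_PARENT_FIRST: string[] = [
  "organizations",
  "agents",
  "credentials",
  "audit_events",
  "token_revocations",
  "agent_did_keys",
  "delegation_chains",
];

/** Builds DELETE statements child-first by reversing a parent-first list. */
function buildCleanupStatements(tablesParentFirst: string[]): string[] {
  // slice() copies the array so the caller's list is not mutated by reverse().
  return tablesParentFirst
    .slice()
    .reverse()
    .map((table) => "DELETE FROM " + table);
}
```

In an `afterEach` hook, each statement would then be executed in sequence against the test database, so that rows referencing a parent are removed before the parent rows themselves.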

**Environment variables used:**

| Variable | Required | Purpose |
|---|---|---|
| `TEST_DATABASE_URL` | Yes (or default) | PostgreSQL connection string for the test database |
| `TEST_REDIS_URL` | Yes (or default) | Redis connection string (index 1 recommended) |
| `COMPLIANCE_ENABLED` | Yes (`'true'`) | Enables the compliance report endpoint |
| `A2A_ENABLED` | No (default `'true'`) | Set to `'false'` to skip Conformance 3 (A2A delegation) |
| `DID_WEB_DOMAIN` | No | When set, Conformance 1 validates the `did:web:` format |
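One way to read these variables is to resolve them into a single flags object before the tests start. The sketch below is illustrative: `resolveConformanceFlags` and `ConformanceFlags` are hypothetical names, the connection-string defaults are supplied by the caller because this document does not state them, and `A2A_ENABLED` follows the table's semantics (enabled unless set to `'false'`).

```typescript
type Env = Record<string, string | undefined>;

interface ConformanceFlags {
  databaseUrl: string;
  redisUrl: string;
  complianceEnabled: boolean;
  a2aEnabled: boolean;
  didWebDomain: string | undefined;
}

/** Resolves the suite's environment variables; defaults are caller-supplied. */
function resolveConformanceFlags(
  env: Env,
  defaults: { databaseUrl: string; redisUrl: string }
): ConformanceFlags {
  return {
    databaseUrl: env["TEST_DATABASE_URL"] ?? defaults.databaseUrl,
    redisUrl: env["TEST_REDIS_URL"] ?? defaults.redisUrl,
    // The compliance endpoint is only enabled by the exact string 'true'.
    complianceEnabled: env["COMPLIANCE_ENABLED"] === "true",
    // Per the table: on by default, skipped only when set to 'false'.
    a2aEnabled: env["A2A_ENABLED"] !== "false",
    didWebDomain: env["DID_WEB_DOMAIN"],
  };
}
```

In a Jest file, `a2aEnabled` could then select between `it` and `it.skip` for Conformance 3, and `didWebDomain` could decide whether the `did:web:` format assertion runs.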

## 10.9 Tier Enforcement Tests

**Location:** `tests/unit/services/TierService.test.ts` and `tests/integration/`

`TierService` has the following test cases, all of which must pass:

### Unit tests (`tests/unit/services/TierService.test.ts`)

The unit tests mock PostgreSQL (`Pool`), Redis (`RedisClientType`), and Stripe. Key scenarios:

| Test | Description |
|---|---|
| `getStatus()`: returns correct tier and limits | Mocks `SELECT tier FROM organizations` returning `'pro'`; mocks Redis `GET` calls for `rate:tier:calls` and `rate:tier:tokens`; verifies `ITierStatus.limits` matches `TIER_CONFIG['pro']`. |
| `getStatus()`: falls back to 0 when Redis unavailable | Redis `GET` throws; verifies `usage.callsToday = 0` and `usage.tokensToday = 0` with no error thrown. |
| `getStatus()`: returns `'free'` when org not found | `SELECT` returns 0 rows; verifies `tier === 'free'`. |
| `initiateUpgrade()`: throws `ValidationError` on downgrade attempt | `targetTier = 'free'` when current is `'pro'`; verifies `ValidationError` is thrown with the `TIER_RANK` comparison failure message. |
| `initiateUpgrade()`: calls Stripe with correct metadata | Verifies `stripe.checkout.sessions.create` is called with `metadata: { orgId, targetTier }` and `mode: 'subscription'`. |
| `applyUpgrade()`: executes `UPDATE organizations SET tier` | Verifies parameterized SQL is called with `[targetTier, orgId]`. |
| `enforceAgentLimit()`: throws `TierLimitError` when limit reached | Mock agent count equals `TIER_CONFIG[tier].maxAgents`; verifies `TierLimitError` with `limit` and `current` details. |
| `enforceAgentLimit()`: no-op for Enterprise tier | `TIER_CONFIG['enterprise'].maxAgents = Infinity`; verifies no SQL query for agent count and no error. |
| `fetchTier()`: returns `'free'` for unknown tier string in DB | DB returns an unrecognised string; verifies the `isTierName` guard returns `'free'`. |
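The behaviours these tests pin down (the `isTierName` fallback, the `TIER_RANK` downgrade check, and the `Infinity` short-circuit) can be sketched in isolation. This is not the real `TierService`: the limit values are invented for illustration, `normalizeTier` and `assertNotDowngrade` are hypothetical names, only the three tiers mentioned above are modelled, and the agent count is supplied by a synchronous callback in place of a SQL query.

```typescript
type TierName = "free" | "pro" | "enterprise";

// Hypothetical limits; the real TIER_CONFIG values are not given in this document.
const TIER_CONFIG: Record<TierName, { maxAgents: number }> = {
  free: { maxAgents: 5 },
  pro: { maxAgents: 50 },
  enterprise: { maxAgents: Infinity },
};
const TIER_RANK: Record<TierName, number> = { free: 0, pro: 1, enterprise: 2 };

class ValidationError extends Error {}

class TierLimitError extends Error {
  constructor(public readonly limit: number, public readonly current: number) {
    super("Agent limit reached (" + current + "/" + limit + ")");
  }
}

/** Type guard for tier strings read from the database. */
function isTierName(value: string): value is TierName {
  return value === "free" || value === "pro" || value === "enterprise";
}

/** Unknown tier strings fall back to 'free'. */
function normalizeTier(raw: string): TierName {
  return isTierName(raw) ? raw : "free";
}

/** Rejects downgrades; same-tier handling is unspecified, so it is allowed here. */
function assertNotDowngrade(current: TierName, target: TierName): void {
  if (TIER_RANK[target] < TIER_RANK[current]) {
    throw new ValidationError("Cannot downgrade from " + current + " to " + target);
  }
}

/** Throws when the agent count has reached the tier's limit. */
function enforceAgentLimit(tier: TierName, countAgents: () => number): void {
  const limit = TIER_CONFIG[tier].maxAgents;
  if (limit === Infinity) {
    return; // Unlimited tier: skip the count query entirely.
  }
  const current = countAgents();
  if (current >= limit) {
    throw new TierLimitError(limit, current);
  }
}
```

Passing the count as a callback mirrors what the unit test asserts for the Enterprise tier: when the limit is `Infinity`, the callback (standing in for the SQL query) is never invoked.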

### Integration (middleware) tests

When writing integration tests for the tier enforcement middleware (`src/middleware/tier.ts`), the following scenarios must be covered:

| Scenario | Expected behaviour |
|---|---|
| Request with org on free tier, under daily call limit | Request proceeds normally (2xx from downstream handler) |
| Request that would exceed `maxCallsPerDay` for the org's tier | 429 `TierLimitError`; body contains `code: 'TIER_LIMIT_EXCEEDED'` |
| Request to `/health` or `/metrics` (unprotected routes) | Tier middleware not applied; always 200 |
| Org not found in `organizations` table | Defaults to free tier limits |
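The four scenarios reduce to one decision that can be modelled as a pure function, which may help when deciding what each integration test should assert. This is a sketch under stated assumptions: `decideTierAccess` is a hypothetical name, the daily limits are invented, and "would exceed" is interpreted as `callsToday + 1 > maxCallsPerDay`.

```typescript
type TierName = "free" | "pro" | "enterprise";

// Hypothetical daily limits; the real maxCallsPerDay values are not given here.
const MAX_CALLS_PER_DAY: Record<TierName, number> = {
  free: 1000,
  pro: 100000,
  enterprise: Infinity,
};

const UNPROTECTED_PATHS = ["/health", "/metrics"];

type TierDecision =
  | { allowed: true }
  | { allowed: false; status: 429; code: "TIER_LIMIT_EXCEEDED" };

/**
 * Decides whether a request may proceed.
 * `tier` is undefined when the org is not found, which falls back to 'free'.
 */
function decideTierAccess(
  path: string,
  tier: TierName | undefined,
  callsToday: number
): TierDecision {
  // Unprotected routes bypass tier enforcement entirely.
  if (UNPROTECTED_PATHS.indexOf(path) !== -1) {
    return { allowed: true };
  }
  const effectiveTier: TierName = tier ?? "free";
  // Block the request if serving it would push the org past its daily limit.
  if (callsToday + 1 > MAX_CALLS_PER_DAY[effectiveTier]) {
    return { allowed: false, status: 429, code: "TIER_LIMIT_EXCEEDED" };
  }
  return { allowed: true };
}
```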

## 10.10 Analytics Service Tests

**Location:** `tests/unit/services/AnalyticsService.test.ts`

The `AnalyticsService` unit tests mock the PostgreSQL `Pool`. Key scenarios that must be covered:

| Test | Description |
|---|---|
| `recordEvent()`: executes UPSERT without throwing | Verifies `pool.query` is called with the `INSERT ... ON CONFLICT DO UPDATE` SQL pattern and the correct `[tenantId, metricType]` parameters. |
| `recordEvent()`: catches and swallows pool errors | Pool query throws; verifies `recordEvent` resolves (not rejects) and the error does not propagate. This is the fire-and-forget contract. |
| `getTokenTrend()`: clamps days to 90 | Calls with `days = 200`; verifies `pool.query` receives `clampedDays = 90` as the first parameter. |
| `getTokenTrend()`: maps rows to `ITokenTrendEntry[]` | Mock returns rows with `date: '2026-03-01'`, `count: '42'`; verifies the result is `[{ date: '2026-03-01', count: 42 }]` (count coerced to number). |
| `getAgentActivity()`: maps rows to `IAgentActivityEntry[]` | Mock returns rows with string-typed `dow`, `hour`, `count`; verifies all are coerced to numbers in the result. |
| `getAgentUsageSummary()`: maps rows to `IAgentUsageSummaryEntry[]` | Mock returns rows with `token_count: '150'`; verifies `token_count: 150` (number) in the result. |
| `getAgentUsageSummary()`: joins with agents table on organization_id | Verifies the SQL query joins `agents` with `LEFT JOIN analytics_events` and filters `a.organization_id = $1`. |
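Three of the contracts in this table (clamping, string-to-number coercion, and fire-and-forget error handling) are small enough to sketch on their own. The helper names `clampDays`, `mapTokenTrendRows`, and `recordEventSafely` are hypothetical and the real `AnalyticsService` may structure this differently; only the 90-day cap and the `ITokenTrendEntry` field names come from the table.

```typescript
interface ITokenTrendEntry {
  date: string;
  count: number;
}

const MAX_TREND_DAYS = 90;

/** Caps the requested window at 90 days. */
function clampDays(days: number): number {
  return Math.min(days, MAX_TREND_DAYS);
}

/** Aggregate counts arrive from the database as strings; coerce them to numbers. */
function mapTokenTrendRows(
  rows: Array<{ date: string; count: string }>
): ITokenTrendEntry[] {
  return rows.map((row) => ({ date: row.date, count: Number(row.count) }));
}

/**
 * Fire-and-forget wrapper: the returned promise always resolves,
 * even when the underlying write rejects.
 */
async function recordEventSafely(write: () => Promise<void>): Promise<void> {
  try {
    await write();
  } catch {
    // Swallowed on purpose: analytics failures must not affect the caller.
  }
}
```

The coercion step is what the `count: '42'` to `count: 42` assertion in the table is checking.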

**Coverage gate:** `AnalyticsService` must maintain >80% statement, branch, function, and line coverage. Run:

```bash
npm run test:unit -- --coverage --testPathPattern=AnalyticsService
```

## 10.11 Running the Complete Phase 6 Test Matrix

All of the following must pass before any Phase 6 feature is considered complete:

```bash
# 1. Unit tests (all services including Phase 6)
npm run test:unit -- --coverage
# Must exit 0 with all 4 coverage metrics ≥ 80%

# 2. Integration tests (requires PostgreSQL + Redis running)
npm run test:integration

# 3. AGNTCY conformance suite
COMPLIANCE_ENABLED=true \
A2A_ENABLED=true \
npm run test:agntcy-conformance

# 4. Dependency security audit
npm audit --audit-level=high
# Must exit 0: no high or critical vulnerabilities

# 5. TypeScript compilation
npx tsc --noEmit
# Must exit 0: zero type errors
```

**Current test file inventory (as of Phase 6 completion):**

Unit test files in `tests/unit/services/`:

| File | Service tested |
|---|---|
| `AgentService.test.ts` | AgentService |
| `AnalyticsService.test.ts` | AnalyticsService |
| `AuditService.test.ts` | AuditService |
| `AuditVerificationService.test.ts` | AuditVerificationService |
| `BillingService.test.ts` | BillingService |
| `ComplianceService.test.ts` | ComplianceService |
| `CredentialService.test.ts` | CredentialService |
| `DIDService.test.ts` | DIDService |
| `DelegationService.test.ts` | DelegationService |
| `EncryptionService.test.ts` | EncryptionService |
| `FederationService.test.ts` | FederationService |
| `IDTokenService.test.ts` | IDTokenService |
| `OAuth2Service.test.ts` | OAuth2Service |
| `OIDCKeyService.test.ts` | OIDCKeyService |
| `OrgService.test.ts` | OrgService |
| `ScaffoldService.test.ts` | ScaffoldService |
| `ScaffoldService.errors.test.ts` | ScaffoldService error cases |
| `TierService.test.ts` | TierService |
| `WebhookService.test.ts` | WebhookService |