AGENTS.md — cc-ci
Working notes for agents (and humans) modifying the cc-ci server. See README.md for what the server
does and machine-docs/ for the build's living state (DECISIONS.md, DEFERRED.md, STATUS-*.md).
Testing cadence
Two kinds of tests live here — run them on different cadences:
- Per-recipe lifecycle tests (tests/<recipe>/, triggered by !testme on a recipe PR): these test the recipes. Run them whenever a recipe changes; that's their normal per-PR trigger.
- Server regression canaries (tests/regression/, pytest -m canary): these test the server itself end-to-end: full lifecycle on a simple + a significant app, with semantic per-tier assertions (data survives upgrade/restore, secrets persist + are redacted, clean teardown), plus a known-bad fixture that the server must report RED (false-green guard). They are slow and resource-heavy (live Swarm, minutes per app).

  Do NOT run the canaries on every commit/PR. Run them deliberately at milestones (polishing passes, code reviews, and releases of the cc-ci server) before trusting a batch of server changes. They are opt-in behind the @pytest.mark.canary marker; if ever wired to !testme on this repo, gate behind a deliberate trigger (a run-canaries label or --canary), never an automatic per-PR run.

  Spec: plan-server-regression-canaries.md (orchestrator cc-ci-plan/).
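The gating rule above can be sketched as a small decision function. This is an illustration only, not the server's actual trigger code: the helper names `canaries_requested` and `pytest_args` are hypothetical, and the sketch assumes the opt-in arrives as either a `run-canaries` label on the PR or a `--canary` token following `!testme` in the comment.

```python
from __future__ import annotations

TRIGGER = "!testme"             # comment trigger named in the text above
CANARY_LABEL = "run-canaries"   # opt-in label named in the text above
CANARY_FLAG = "--canary"        # opt-in flag named in the text above


def canaries_requested(comment: str, labels: set[str]) -> bool:
    """Return True only when the canaries were explicitly asked for.

    Hypothetical helper: a bare "!testme" never selects the canaries,
    and the label alone does nothing without the trigger comment.
    """
    tokens = comment.split()
    if not tokens or tokens[0] != TRIGGER:
        return False
    return CANARY_LABEL in labels or CANARY_FLAG in tokens[1:]


def pytest_args(comment: str, labels: set[str]) -> list[str]:
    """Build pytest arguments for a triggered run (assumed invocation shape)."""
    if canaries_requested(comment, labels):
        # Deliberate opt-in: select only tests carrying the canary marker.
        return ["-m", "canary", "tests/regression/"]
    # Default: deselect anything marked as a canary.
    return ["-m", "not canary"]
```

With this shape, a plain `!testme` always resolves to `-m "not canary"`, so an automatic per-PR run cannot start the canaries by accident.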
Don't weaken tests to pass
A red test is information. Never skip, delete, or relax a test to make a run green — fix the root
cause or record it in machine-docs/DEFERRED.md. (This is a standing build guardrail.)