# REVIEW — server regression canaries phase (Adversary ledger)

**Phase:** server regression canaries (codified E2E self-tests)
**SSOT:** `/srv/cc-ci/cc-ci-plan/plan-server-regression-canaries.md`
**Adversary loop started:** 2026-06-02T01:15Z
**Repo:** git.autonomic.zone/recipe-maintainers/cc-ci
**Adversary clone:** /srv/cc-ci/cc-ci-adv

---

## D-gate verdicts

### D-final: PASS @2026-06-02T03:36Z — all 7 canaries cold-verified; PR#5 open; all DoD items met

**Cold verification result: PASS**

All DoD items independently verified (cold shell, Adversary clone, no cached state):

**DoD#1 — tests/regression/ committed:**
- `cc-ci-run -m pytest tests/regression/ --collect-only -q` on cc-ci from PR branch: 7 tests collected ✓
- Files present on `regression-canaries` branch: `conftest.py`, `test_canaries.py`, `README.md`, plus `tests/custom-html-bkp-bad/` and `tests/custom-html-rst-bad/` ✓

**DoD#2 — both good canaries GREEN with semantic assertion teeth:**
- `good-simple` (regression-good-simple-1, SHA `435df8fc`): `install=pass, upgrade=pass`, `test_serving` PASS in install stage ✓
  - Teeth: if `test_serving` removed → `stage_has_passing_test("install","test_serving")` → False → assert fires ✓
- `good-significant` (regression-good-significant-2, SHA `290a8ad7`): `install=pass, upgrade=pass, backup=pass, restore=pass, custom=pass`, `clean_teardown=true`, `no_secret_leak=true` ✓
  - `test_serving_and_frontend` PASS in install stage ✓
  - Teeth: if `test_serving_and_frontend` removed → `stage_has_passing_test("install","test_serving_and_frontend")` → False → assert fires ✓
  - Run 1 had upgrade=fail (convergence race, transient); run 2 fully GREEN. Known plan risk; no action needed unless persistent.

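The DoD#2 teeth check can be illustrated with a minimal sketch. Only the helper name `stage_has_passing_test` comes from this ledger; the explicit `results` argument and the `stages` / `name` / `outcome` layout below are assumptions made for illustration, not the harness's actual `results.json` schema.

```python
def stage_has_passing_test(results: dict, stage: str, test_name: str) -> bool:
    """Return True only if `test_name` ran in `stage` and passed.

    ASSUMPTION: results["stages"][stage] is a list of
    {"name": ..., "outcome": ...} dicts (hypothetical layout).
    """
    tests = results.get("stages", {}).get(stage, [])
    return any(
        t.get("name") == test_name and t.get("outcome") == "pass"
        for t in tests
    )


# Hypothetical run where the semantic test is present and passing.
with_test = {"stages": {"install": [{"name": "test_serving", "outcome": "pass"}]}}

# Hypothetical run where the test was removed: the stage can still be
# "pass" overall, but the semantic check no longer holds.
without_test = {"stages": {"install": []}}

present = stage_has_passing_test(with_test, "install", "test_serving")
removed = stage_has_passing_test(without_test, "install", "test_serving")
```

Under these assumptions, a canary that asserts on `present` keeps passing, while removing `test_serving` from the recipe makes the same assertion fire, which is the "teeth" property verified above.
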
**DoD#3 — bad-false-green catches false-green:**
- `bad-false-green` (regression-bad-canary-1, SHA `71e7326a`): `custom=fail`, `test_content_type_html_and_txt: FAIL` (Content-Type='application/octet-stream') ✓
- Teeth: if harness returns rc=0 → `assert rc != 0` fires → false-green caught ✓

**DoD#4 — 4 per-tier RED canaries (cold-verified from artifacts):**
- `bad-install` (regression-bad-install-v2, SHA `4ae8866`): `install=fail, upgrade=na` ✓ — failing_tier=install, passing_before=[] ✓
- `bad-upgrade` (regression-bad-upgrade-v2, SHA `4ae8866`): `install=pass, upgrade=fail` ✓ — prior tier PASS verified ✓
- `bad-backup` (regression-bad-backup-5, SHA `b6fe99de`, recipe `custom-html-bkp-bad`): `install=pass, backup=fail` ✓ — `test_backup_captures_state` FAIL ✓
- `bad-restore` (regression-bad-restore-3, SHA `9a73a184`, recipe `custom-html-rst-bad`): `install=pass, backup=pass, restore=fail` ✓ — `test_restore_returns_state` FAIL ✓
- All 4: if harness wrongly returned rc=0 → `assert rc != 0` fires ✓; if wrong tier failed → tier check assertion fires ✓

**DoD#5 — README.md:**
- `tests/regression/README.md` present on regression-canaries branch ✓
- Contains: cadence policy ("Do NOT run on every commit"), canary table, per-tier teeth explanation, how to add a canary ✓

**DoD#6 — NOT merged, PR opened for operator review:**
- PR#5: `https://git.autonomic.zone/recipe-maintainers/cc-ci/pulls/5` — state=open, merged=False ✓
- Branch: `regression-canaries` → `main`. 10 files, 704 insertions ✓
- PR body says "Do not merge — loops never merge" ✓

**Observations (non-blocking, not DoD blockers):**
- good-significant run 1's upgrade=fail was a convergence race; transient (run 2 passed without retry). No test weakening, no retry added — consistent with plan policy.
- Semantic stage_pass_checks only explicitly guard install tier for good-significant. Upgrade/backup/restore tooth coverage is via `_assert_green`'s "no tier failed" check. Limitation noted; acceptable per plan DoD requirements.
- A-reg-2 comment in test_canaries.py says "test_backup_artifact fails" for bad-backup; actual behavior is test_backup_artifact passes and test_backup_captures_state fails. Misleading comment, non-blocking.

**Verdict: D-final PASS.** All 7 canaries verified. All 6 DoD items met. Phase is complete pending operator review of PR#5. No vetoes.

---

### D-initial update @2026-06-02T01:46Z — A-reg-1 CLOSED; A-reg-2 still open

**A-reg-1 RESOLVED.** Cold-verify after fix:
```
ssh cc-ci && cd /root/builder-clone && git pull --rebase
cc-ci-run -m pytest tests/regression/ --collect-only
```
Output: `collected 3 items` — `test_canary[good-simple]`, `test_canary[good-significant]`, `test_canary[bad-false-green]`. No errors.

**Canary artifacts cold-verified from cc-ci artifact dirs:**

`good-simple (custom-html-tiny)` — `/var/lib/cc-ci-runs/regression-good-simple-1/results.json`:
- `results: install=pass, upgrade=pass, backup=skip, restore=skip, custom=skip` ✓
- `flags: clean_teardown=true, no_secret_leak=true` ✓
- `install/test_serving`: PASS ✓ (stage_has_passing_test confirms teeth present)

`bad-false-green (custom-html v5-stale-docroot)` — `/var/lib/cc-ci-runs/regression-bad-canary-1/results.json`:
- `results: install=pass, upgrade=pass, backup=pass, restore=pass, custom=FAIL` ✓
- `flags: clean_teardown=true, no_secret_leak=true` ✓
- `custom/test_content_type_html_and_txt`: FAIL with `Content-Type='application/octet-stream'` ✓
- `rc` would be non-zero (any(v=="fail")) ✓ → regression test `assert rc != 0` PASSES

`good-significant (lasuite-docs)` — upgrade FAILED in Builder's run:
- `results: install=PASS, upgrade=FAIL` — `test_upgrade_reconverges` → convergence race
- This is the known WOPI/upgrade convergence risk from the plan (§ Risks). Builder is re-running.
- OBSERVATION (non-blocking now): if consistently flaky, add bounded retries to readiness probe per plan policy ("bounded retries on readiness only, never on correctness assertion"). Will watch.

**A-reg-2 partially addressed** — 4 per-tier RED canary tests added to suite, 7 tests collect. But bad-backup and bad-restore FIXTURES are broken (see A-reg-3). A-reg-2 cannot close until all 4 canaries actually produce the expected results.

---

### D-initial-2 update @2026-06-02T02:00Z — A-reg-3 filed; bad-backup/bad-restore fixtures broken

4 per-tier RED canary tests now in suite (7 tests collect via cold --collect-only). SHAs verified:
- `4ae8866100563204` (custom-html-tiny, bad image) ✓ — bad-install + bad-upgrade fixture
- `e1e3c5fc5e2bd414` (custom-html, bad-backup) — SHA exists BUT compose.yml is empty (A-reg-3)
- `5a481cc1f6b2a462` (custom-html, bad-restore) — SHA exists BUT compose.yml is empty (A-reg-3)

**Cold-verified canary run results:**
- bad-install (regression-bad-install-v2): `install=fail, upgrade=na` ✓ — install tier fails as intended
- bad-upgrade (regression-bad-upgrade-v2): `install=pass, upgrade=fail, custom=skip` ✓ — upgrade tier fails as intended
- bad-backup (regression-bad-backup-1): `install=pass, upgrade=fail, backup=skip` ✗ — WRONG TIER

Root cause A-reg-3: `regression-bad-backup` branch has empty compose.yml (whole file deleted, not just backup path changed). Empty compose → chaos upgrade deploy fails → upgrade=fail, backup never runs. Same issue for `regression-bad-restore` (same empty compose.yml diff).

**`_assert_red_at_tier` for bad-backup would FAIL** with `expected 'backup'='fail', got 'skip'` — proving the fixture is broken, not the test.

**What still needs fixing before final gate:**
1. ~~A-reg-3~~ CLOSED — fixtures fixed and cold-verified ✓
2. ~~A-reg-2~~ CLOSED — all 4 per-tier RED canaries present and verified ✓
3. **good-significant**: still needs successful re-run (upgrade flakiness unresolved)
4. **Open PR** (DoD#6): not yet opened

---

### Comprehensive canary verification @2026-06-02T02:20Z

6 of 7 canaries cold-verified from cc-ci artifact dirs (fresh SSH shell, no cached state); `good-significant` is pending a re-run:

**GREEN canaries:**
- `good-simple` (regression-good-simple-1, SHA `435df8fc`): `install=pass, upgrade=pass, backup/restore/custom=skip`, `clean_teardown=true`, `no_secret_leak=true`, `test_serving: pass` ✓
- `good-significant` (regression-good-significant-1, SHA `290a8ad7`): PENDING — upgrade FAIL (convergence race). Needs re-run to confirm transient.

**Custom-assertion RED canary:**
- `bad-false-green` (regression-bad-canary-1, SHA `71e7326a`): `install/upgrade/backup/restore=pass, custom=fail`, `test_content_type_html_and_txt: FAIL` (Content-Type='application/octet-stream') ✓

**Per-tier RED canaries (all cold-verified from artifact dirs):**
- `bad-install` (regression-bad-install-v2, SHA `4ae8866`): `install=fail, upgrade=na` ✓ — failing_tier=install, no prior tier checked
- `bad-upgrade` (regression-bad-upgrade-v2, SHA `4ae8866`): `install=pass, upgrade=fail` ✓ — install=pass before failing
- `bad-backup` (regression-bad-backup-5, SHA `b6fe99de`, recipe `custom-html-bkp-bad`): `install=pass, backup=fail` ✓ — test_backup_captures_state FAIL
- `bad-restore` (regression-bad-restore-3, SHA `9a73a184`, recipe `custom-html-rst-bad`): `install=pass, backup=pass, restore=fail` ✓ — test_restore_returns_state FAIL

**Teeth verification:**
- good-simple: if test_serving removed → stage_has_passing_test("install","test_serving") returns False → regression test FAILS ✓
- bad-false-green: if harness returns rc=0 → assert rc!=0 FAILS → false-green caught ✓
- bad-install: if harness returns rc=0 for bad image → assert rc!=0 FAILS ✓
- bad-upgrade: if upgrade wrongly passes → tier_results["upgrade"]="pass"≠"fail" → assert FAILS ✓
- bad-backup: if backup wrongly passes → rc=0 → assert rc!=0 FAILS ✓
- bad-restore: if restore wrongly passes → tier_results["restore"]!="fail" → assert FAILS ✓; if backup wrongly fails → tier_results["backup"]!="pass" → assert FAILS ✓

**DoD status:**
- DoD#1 (tests/regression/ committed): ✓
- DoD#2 (good canaries GREEN with semantic assertions): good-simple ✓; good-significant PENDING re-run
- DoD#3 (bad-false-green catches false-green): ✓ verified
- DoD#4 (4 per-tier RED canaries): ✓ all 4 verified
- DoD#5 (README.md): ✓ present with cadence, canaries, how to add
- DoD#6 (PR open for operator review): NOT YET

**Remaining blockers before final PASS:**
1. good-significant must pass (or flakiness addressed with bounded retries on readiness)
2. PR must be opened (DoD#6)

---

### D-initial: FAIL @2026-06-02T01:38Z — suite won't collect (A-reg-1); plan gap (A-reg-2)

Builder claimed: test suite written, initial gate; canaries in-flight.

**Cold verification result: FAIL — two blocking issues.**

**A-reg-1 (CRITICAL): Relative import fails, 0 tests collected.**
```
ssh cc-ci && cd /root/builder-clone
cc-ci-run -m pytest tests/regression/ --collect-only
```
Output (cold, fresh shell):
```
collected 0 items / 1 error
ImportError: attempted relative import with no known parent package
tests/regression/test_canaries.py:18: from .conftest import run_recipe_ci, ...
!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!!
```
Root cause: `tests/regression/__init__.py` and `tests/__init__.py` missing. Fix: add them or use absolute imports (as other test files in this repo do).

**A-reg-2 (HIGH): Plan updated (commit 7bdeb74) — 4 per-tier RED canaries now mandatory (DoD#4).**
Updated plan requires RED canaries for install/upgrade/backup/restore tiers on custom-html-tiny, each asserting RED at the intended tier with prior tiers PASS. Current suite: 3 canaries only (2 good + 1 bad-custom-assertion). All four are MISSING. Cannot claim DONE without them.

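The per-tier requirement in A-reg-2 (RED at the intended tier, prior tiers PASS) can be sketched as follows. This is a hypothetical stand-in for `_assert_red_at_tier`, whose real implementation is not reproduced in this ledger; the function signature and the plain `tier_results` dict are assumptions, and the fixture values are taken from the runs recorded in this ledger.

```python
def assert_red_at_tier(rc, tier_results, failing_tier, passing_before):
    """Hypothetical sketch: the run must be RED, and RED at the intended tier.

    ASSUMPTION: tier_results maps tier name -> "pass" / "fail" / "skip" / "na".
    """
    # A known-bad fixture must never exit 0 (false green).
    assert rc != 0, "false green: known-bad fixture returned rc=0"
    # Every tier listed as a prerequisite must have passed.
    for tier in passing_before:
        got = tier_results.get(tier)
        assert got == "pass", f"expected {tier!r}='pass', got {got!r}"
    # The intended tier itself must be the one that failed.
    got = tier_results.get(failing_tier)
    assert got == "fail", f"expected {failing_tier!r}='fail', got {got!r}"


# Fixed fixture (regression-bad-backup-5): fails at the backup tier.
fixed_fixture = {"install": "pass", "backup": "fail"}
assert_red_at_tier(1, fixed_fixture, "backup", ["install"])

# Broken fixture (regression-bad-backup-1): fails one tier too early.
broken_fixture = {"install": "pass", "upgrade": "fail", "backup": "skip"}
try:
    assert_red_at_tier(1, broken_fixture, "backup", ["install"])
    message = None
except AssertionError as exc:
    message = str(exc)
```

In this sketch the broken fixture is rejected even though the run is RED, because the failure landed on the wrong tier. That is the distinction between "the harness reported a failure" and "the harness reported the intended failure".
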
**Other code quality observations (not blocking):**
- Canary SHAs all verified present on Gitea ✓
  - custom-html-tiny: `435df8fc98ef7598` ✓ (main 2026-06-02 merge commit)
  - lasuite-docs: `290a8ad72d06232f` ✓ (v0.3.3+v5.1.0 merge)
  - custom-html v5-stale-docroot: `71e7326a99bbb690` ✓ (confirmed RED via build #81)
- `CCCI_RUN_ID` and `CCCI_RUNS_DIR` correctly picked up by `results.py` ✓
- `_assert_red` / `_assert_green` logic sound ✓
- README cadence policy complete ✓

**Verdict: FAIL. Standing issues: A-reg-1 (critical), A-reg-2 (high). Builder must fix both before re-claiming this gate.**

---

## Adversary findings

*(See BACKLOG-regression.md § Adversary findings: A-reg-1, A-reg-2)*

---

## Break-it probes log

*(Break-it probes will be recorded here as they are run)*

---

## Pre-orientation findings @01:17Z

**Known-bad fixture confirmed present and working:**
- Branch: `recipe-maintainers/custom-html:v5-stale-docroot` (SHA `71e7326a99bb`)
- Build #81 (run 3h ago): confirmed RED — `custom` stage FAIL; specifically:
  - `test_content_type_html_and_txt`: FAIL — `ccci-e0d6e804.txt Content-Type='application/octet-stream'`, expected `text/plain`
  - All other tiers (install/upgrade/backup/restore): PASS
  - `clean_teardown=true`, `no_secret_leak=true`
- **Implication for regression suite DoD#3**: the known-bad canary correctly produces RED; the regression test must assert this outcome AND must be shown to fail if the server returns green for it (false-green detection).

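A minimal sketch of the false-green detection described above, assuming the exit code is derived from the tier results as "non-zero if any tier failed". The helper names `exit_code` and `assert_known_bad_is_red` are hypothetical and are not taken from the harness.

```python
def exit_code(tier_results: dict) -> int:
    """ASSUMPTION: rc is non-zero exactly when some tier is 'fail'."""
    return 1 if any(v == "fail" for v in tier_results.values()) else 0


def assert_known_bad_is_red(rc: int, tier_results: dict) -> None:
    """Hypothetical check for the known-bad custom-assertion canary."""
    assert rc != 0, "false green: known-bad fixture returned rc=0"
    got = tier_results.get("custom")
    assert got == "fail", f"expected 'custom'='fail', got {got!r}"


# Outcome observed for build #81: only the custom tier fails.
observed = {
    "install": "pass", "upgrade": "pass",
    "backup": "pass", "restore": "pass", "custom": "fail",
}
assert_known_bad_is_red(exit_code(observed), observed)

# Simulated regression: the server wrongly reports green for the same fixture.
false_green = dict(observed, custom="pass")
try:
    assert_known_bad_is_red(exit_code(false_green), false_green)
    caught = None
except AssertionError as exc:
    caught = str(exc)
```

The second half is the part the implication asks for: the check is only meaningful if a simulated green result on the known-bad fixture is shown to trip it.
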
**Good canaries:**
- `custom-html-tiny`: build #45 GREEN (SHA `4bd8416a209f`, 21h ago) — simple, fast
- `lasuite-docs`: multi-service stack with DEPS=["keycloak"], DEPLOY_TIMEOUT=900s — test exists at tests/lasuite-docs/

**Infrastructure state:**
- Bridge (`ccci-bridge_app`): running, polling 20 repos every 30s ✓
- Drone exec runner: running ✓
- Dashboard: serving at ci.commoninternet.net ✓
- Builder hasn't started regression phase: no STATUS-regression.md yet

**Notes:**
- Mirror phase (plan-mirror-enroll-all-recipes.md) completed DONE at 2026-06-02T01:16Z.
- This phase starts fresh: no STATUS-regression.md or tests/regression/ yet.
- Watching for Builder to create STATUS-regression.md and begin work.