From 49a56e873ecf33638352edbd822f789726f11ddc Mon Sep 17 00:00:00 2001
From: autonomic-bot
Date: Tue, 2 Jun 2026 02:18:34 +0000
Subject: [PATCH] review(regression): A-reg-2+A-reg-3 CLOSED; 6/7 canaries cold-verified; good-significant+PR still pending

---
 machine-docs/BACKLOG-regression.md | 29 ++++++++++++++++++
 machine-docs/REVIEW-regression.md  | 47 +++++++++++++++++++++++++++---
 2 files changed, 72 insertions(+), 4 deletions(-)

diff --git a/machine-docs/BACKLOG-regression.md b/machine-docs/BACKLOG-regression.md
index 45e7776..a160de2 100644
--- a/machine-docs/BACKLOG-regression.md
+++ b/machine-docs/BACKLOG-regression.md
@@ -44,6 +44,18 @@ cc-ci-run -m pytest tests/regression/ --collect-only
 
 ---
 
+### A-reg-3 [adversary] CLOSED @2026-06-02T02:20Z — fixtures fixed; cold-verified correct tier failures
+
+**Resolved:** Builder created separate recipes (`custom-html-bkp-bad`, `custom-html-rst-bad`) with
+correct fixture structure. Cold-verified from cc-ci artifact dirs (no harness re-run needed).
+
+**Evidence:**
+- bad-backup-5 (`b6fe99de`, custom-html-bkp-bad): `install=pass, backup=fail` ✓
+  - `test_backup_artifact: pass` (snapshot IS produced)
+  - `test_backup_captures_state: fail` ("MISSING" not "original") ✓ — backup=RED
+- bad-restore-3 (`9a73a184e739`, custom-html-rst-bad): `install=pass, backup=pass, restore=fail` ✓
+  - `test_restore_returns_state: fail` ("mutated" not "original") ✓ — restore=RED
+
 ### A-reg-3 [adversary] OPEN — CRITICAL: bad-backup and bad-restore fixtures broken (empty compose.yml)
 **Filed:** 2026-06-02T01:58Z
 **Severity:** CRITICAL — both fixtures fail at upgrade instead of their intended tier
@@ -80,6 +92,23 @@ The compose.yml should be identical to main EXCEPT for the single label/config c
 
 ---
 
+### A-reg-2 [adversary] CLOSED @2026-06-02T02:20Z — 4 per-tier RED canaries cold-verified
+
+**Resolved:** All 4 per-tier RED canaries added, artifacts cold-verified on cc-ci.
+
+| Canary | Run artifact | failing_tier | passing_before | verdict |
+|--------|-------------|-------------|---------------|---------|
+| bad-install | regression-bad-install-v2 | install=fail ✓ | [] | CORRECT ✓ |
+| bad-upgrade | regression-bad-upgrade-v2 | upgrade=fail ✓ | install=pass ✓ | CORRECT ✓ |
+| bad-backup | regression-bad-backup-5 | backup=fail ✓ | install=pass ✓ | CORRECT ✓ |
+| bad-restore | regression-bad-restore-3 | restore=fail ✓ | install=pass, backup=pass ✓ | CORRECT ✓ |
+
+`@pytest.mark.canary_fast` marker added ✓. 7 tests collect ✓.
+
+**Note:** bad-backup comment in test_canaries.py says "test_backup_artifact fails" but actual
+behavior is test_backup_artifact PASSES and test_backup_captures_state FAILS. Functional result
+(backup=fail) is correct; comment is misleading but non-blocking.
+
 ### A-reg-2 [adversary] OPEN — Plan gap: 4 per-tier RED canaries required by updated DoD
 **Filed:** 2026-06-02T01:37Z
 **Severity:** HIGH — DoD#4 unmet; Builder cannot claim DONE without these
diff --git a/machine-docs/REVIEW-regression.md b/machine-docs/REVIEW-regression.md
index 677d788..1f533b3 100644
--- a/machine-docs/REVIEW-regression.md
+++ b/machine-docs/REVIEW-regression.md
@@ -65,10 +65,49 @@ runs. Same issue for `regression-bad-restore` (same empty compose.yml diff).
 proving the fixture is broken, not the test.
 
 **What still needs fixing before final gate:**
-1. A-reg-3: Recreate bad-backup and bad-restore fixtures with correct compose.yml (only targeted change)
-2. Run bad-backup and bad-restore to confirm correct tier failures
-3. Re-run good-significant (lasuite-docs) to confirm upgrade race is transient
-4. Open PR
+1. ~~A-reg-3~~ CLOSED — fixtures fixed and cold-verified ✓
+2. ~~A-reg-2~~ CLOSED — all 4 per-tier RED canaries present and verified ✓
+3. **good-significant**: still needs successful re-run (upgrade flakiness unresolved)
+4. **Open PR** (DoD#6): not yet opened
+
+---
+
+### Comprehensive canary verification @2026-06-02T02:20Z
+
+All 6 of 7 canaries cold-verified from cc-ci artifact dirs (fresh SSH shell, no cached state):
+
+**GREEN canaries:**
+- `good-simple` (regression-good-simple-1, SHA `435df8fc`): `install=pass, upgrade=pass, backup/restore/custom=skip`, `clean_teardown=true`, `no_secret_leak=true`, `test_serving: pass` ✓
+- `good-significant` (regression-good-significant-1, SHA `290a8ad7`): PENDING — upgrade FAIL (convergence race). Needs re-run to confirm transient.
+
+**Custom-assertion RED canary:**
+- `bad-false-green` (regression-bad-canary-1, SHA `71e7326a`): `install/upgrade/backup/restore=pass, custom=fail`, `test_content_type_html_and_txt: FAIL` (Content-Type='application/octet-stream') ✓
+
+**Per-tier RED canaries (all cold-verified from artifact dirs):**
+- `bad-install` (regression-bad-install-v2, SHA `4ae8866`): `install=fail, upgrade=na` ✓ — failing_tier=install, no prior tier checked
+- `bad-upgrade` (regression-bad-upgrade-v2, SHA `4ae8866`): `install=pass, upgrade=fail` ✓ — install=pass before failing
+- `bad-backup` (regression-bad-backup-5, SHA `b6fe99de`, recipe `custom-html-bkp-bad`): `install=pass, backup=fail` ✓ — test_backup_captures_state FAIL
+- `bad-restore` (regression-bad-restore-3, SHA `9a73a184`, recipe `custom-html-rst-bad`): `install=pass, backup=pass, restore=fail` ✓ — test_restore_returns_state FAIL
+
+**Teeth verification:**
+- good-simple: if test_serving removed → stage_has_passing_test("install","test_serving") returns False → regression test FAILS ✓
+- bad-false-green: if harness returns rc=0 → assert rc!=0 FAILS → false-green caught ✓
+- bad-install: if harness returns rc=0 for bad image → assert rc!=0 FAILS ✓
+- bad-upgrade: if upgrade wrongly passes → tier_results["upgrade"]="pass"≠"fail" → assert FAILS ✓
+- bad-backup: if backup wrongly passes → rc=0 → assert rc!=0 FAILS ✓
+- bad-restore: if restore wrongly passes → tier_results["restore"]!="fail" → assert FAILS ✓; if backup wrongly fails → tier_results["backup"]!="pass" → assert FAILS ✓
+
+**DoD status:**
+- DoD#1 (tests/regression/ committed): ✓
+- DoD#2 (good canaries GREEN with semantic assertions): good-simple ✓; good-significant PENDING re-run
+- DoD#3 (bad-false-green catches false-green): ✓ verified
+- DoD#4 (4 per-tier RED canaries): ✓ all 4 verified
+- DoD#5 (README.md): ✓ present with cadence, canaries, how to add
+- DoD#6 (PR open for operator review): NOT YET
+
+**Remaining blockers before final PASS:**
+1. good-significant must pass (or flakiness addressed with bounded retries on readiness)
+2. PR must be opened (DoD#6)
 
 ---
 
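
Reviewer note (illustrative only, not part of the patch): the per-tier "teeth" checks listed under Teeth verification could be expressed roughly as below. This is a minimal sketch, not the contents of `test_canaries.py`. The helper name `check_red_canary` and its signature are hypothetical. Only `rc`, `tier_results`, `failing_tier`, and `passing_before` are taken from the review text, and the sketch assumes `tier_results` maps a tier name to `"pass"`, `"fail"`, `"skip"`, or `"na"`.

```python
from typing import Dict, List


def check_red_canary(
    rc: int,
    tier_results: Dict[str, str],
    failing_tier: str,
    passing_before: List[str],
) -> None:
    """Hypothetical helper: raise AssertionError unless the canary failed
    at exactly the expected tier, with all listed earlier tiers passing."""
    # A RED canary must never produce a green harness exit code.
    if rc == 0:
        raise AssertionError("harness returned rc=0 for a canary that must fail")
    # The intended tier must be the one reported as failing.
    actual = tier_results.get(failing_tier)
    if actual != "fail":
        raise AssertionError(f"{failing_tier}={actual!r}, expected 'fail'")
    # Earlier tiers must pass, otherwise the fixture broke too early.
    for tier in passing_before:
        actual = tier_results.get(tier)
        if actual != "pass":
            raise AssertionError(f"{tier}={actual!r}, expected 'pass'")


# Usage, with the tier results recorded for bad-restore above:
check_red_canary(
    rc=1,
    tier_results={"install": "pass", "backup": "pass", "restore": "fail"},
    failing_tier="restore",
    passing_before=["install", "backup"],
)
```

Checking `passing_before` explicitly is what would have caught the A-reg-3 regression, where both fixtures failed at upgrade instead of their intended tier.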