cc-ci/machine-docs/REVIEW-regression.md
autonomic-bot 0dea3410ee
review(regression): D-final PASS — all 7 canaries cold-verified; PR#5 open; DoD complete
Cold-verified from cc-ci artifact dirs + PR branch collect:
- DoD#1: 7 tests collect from regression-canaries branch ✓
- DoD#2: good-simple (install/upgrade=pass, test_serving) ✓; good-significant run-2 (all tiers pass, test_serving_and_frontend) ✓
- DoD#3: bad-false-green RED, rc!=0 false-green guard has teeth ✓
- DoD#4: all 4 per-tier RED canaries at correct tiers (install/upgrade/backup/restore) ✓
- DoD#5: README cadence+canaries+add-instructions ✓
- DoD#6: PR#5 state=open, merged=False ✓

Inbox consumed; no vetoes; phase DONE pending operator PR review.
2026-06-02 03:37:18 +00:00


REVIEW — server regression canaries phase (Adversary ledger)

Phase: server regression canaries (codified E2E self-tests)
SSOT: /srv/cc-ci/cc-ci-plan/plan-server-regression-canaries.md
Adversary loop started: 2026-06-02T01:15Z
Repo: git.autonomic.zone/recipe-maintainers/cc-ci
Adversary clone: /srv/cc-ci/cc-ci-adv


D-gate verdicts

D-final: PASS @2026-06-02T03:36Z — all 7 canaries cold-verified; PR#5 open; all DoD items met

Cold verification result: PASS

All DoD items independently verified (cold shell, Adversary clone, no cached state):

DoD#1 — tests/regression/ committed:

  • cc-ci-run -m pytest tests/regression/ --collect-only -q on cc-ci from PR branch: 7 tests collected ✓
  • Files present on regression-canaries branch: conftest.py, test_canaries.py, README.md, plus tests/custom-html-bkp-bad/ and tests/custom-html-rst-bad/

DoD#2 — both good canaries GREEN with semantic assertion teeth:

  • good-simple (regression-good-simple-1, SHA 435df8fc): install=pass, upgrade=pass, test_serving PASS in install stage ✓
    • Teeth: if test_serving removed → stage_has_passing_test("install","test_serving") → False → assert fires ✓
  • good-significant (regression-good-significant-2, SHA 290a8ad7): install=pass, upgrade=pass, backup=pass, restore=pass, custom=pass, clean_teardown=true, no_secret_leak=true
    • test_serving_and_frontend PASS in install stage ✓
    • Teeth: if test_serving_and_frontend removed → stage_has_passing_test("install","test_serving_and_frontend") → False → assert fires ✓
    • Run 1 had upgrade=fail (convergence race, transient); run 2 fully GREEN. Known plan risk; no action needed unless persistent.
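
The teeth check described for the good canaries can be pictured with a minimal sketch. Only the helper name stage_has_passing_test comes from this ledger; the shape of the results data (a `stages` mapping to lists of `name`/`outcome` records) is an assumption made for illustration and may not match the real results.json.

```python
from typing import Any


def stage_has_passing_test(results: dict[str, Any], stage: str, test_name: str) -> bool:
    """Return True only if `test_name` ran in `stage` and passed.

    ASSUMPTION: `results["stages"][stage]` is a list of
    {"name": ..., "outcome": ...} records. The real schema may differ.
    """
    tests = results.get("stages", {}).get(stage, [])
    return any(t.get("name") == test_name and t.get("outcome") == "pass" for t in tests)


# A run where test_serving passed in the install stage.
good = {"stages": {"install": [{"name": "test_serving", "outcome": "pass"}]}}
# The same run with test_serving removed from the recipe's tests.
stripped = {"stages": {"install": []}}

assert stage_has_passing_test(good, "install", "test_serving")
assert not stage_has_passing_test(stripped, "install", "test_serving")
```

The point of the second assertion is that a run whose tiers all report pass is still rejected when the named semantic test never executed.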

DoD#3 — bad-false-green catches false-green:

  • bad-false-green (regression-bad-canary-1, SHA 71e7326a): custom=fail, test_content_type_html_and_txt: FAIL (Content-Type='application/octet-stream') ✓
  • Teeth: if harness returns rc=0 → assert rc != 0 fires → false-green caught ✓

DoD#4 — 4 per-tier RED canaries (cold-verified from artifacts):

  • bad-install (regression-bad-install-v2, SHA 4ae8866): install=fail, upgrade=na ✓ — failing_tier=install, passing_before=[] ✓
  • bad-upgrade (regression-bad-upgrade-v2, SHA 4ae8866): install=pass, upgrade=fail ✓ — prior tier PASS verified ✓
  • bad-backup (regression-bad-backup-5, SHA b6fe99de, recipe custom-html-bkp-bad): install=pass, backup=fail ✓ — test_backup_captures_state FAIL ✓
  • bad-restore (regression-bad-restore-3, SHA 9a73a184, recipe custom-html-rst-bad): install=pass, backup=pass, restore=fail ✓ — test_restore_returns_state FAIL ✓
  • All 4: if harness wrongly returned rc=0 → assert rc != 0 fires ✓; if wrong tier failed → tier check assertion fires ✓
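
The two guards listed for the per-tier RED canaries (non-zero rc, and failure at the intended tier with prior tiers passing) could look roughly like the sketch below. It is not the suite's _assert_red_at_tier: the function name, signature, and the `fires` helper are hypothetical, and the tier dictionaries are copied from the results recorded in this ledger.

```python
def assert_red_at_tier(rc, tier_results, failing_tier, passing_before):
    """Hypothetical stand-in for the suite's per-tier RED check."""
    # Guard 1: a known-bad canary must never yield a green exit code.
    assert rc != 0, "harness returned rc=0 for a known-bad canary (false green)"
    # Guard 2: the failure must land on the intended tier.
    got = tier_results.get(failing_tier)
    assert got == "fail", f"expected {failing_tier!r}='fail', got {got!r}"
    # Guard 3: every tier listed as a precondition must have passed.
    for tier in passing_before:
        got = tier_results.get(tier)
        assert got == "pass", f"expected prior tier {tier!r}='pass', got {got!r}"


def fires(*args) -> bool:
    """Return True if the check raises AssertionError for these inputs."""
    try:
        assert_red_at_tier(*args)
    except AssertionError:
        return True
    return False


bad_restore = {"install": "pass", "backup": "pass", "restore": "fail"}
broken_backup_fixture = {"install": "pass", "upgrade": "fail", "backup": "skip"}

assert not fires(1, bad_restore, "restore", ["install", "backup"])
assert fires(0, bad_restore, "restore", ["install", "backup"])  # false green
assert fires(1, broken_backup_fixture, "backup", ["install"])   # wrong tier
```

The last case mirrors the early bad-backup fixture, where the upgrade tier failed and backup never ran: the check rejects it even though rc is non-zero.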

DoD#5 — README.md:

  • tests/regression/README.md present on regression-canaries branch ✓
  • Contains: cadence policy ("Do NOT run on every commit"), canary table, per-tier teeth explanation, how to add a canary ✓

DoD#6 — NOT merged, PR opened for operator review:

  • PR#5: https://git.autonomic.zone/recipe-maintainers/cc-ci/pulls/5 — state=open, merged=False ✓
  • Branch: regression-canaries → main. 10 files, 704 insertions ✓
  • PR body says "Do not merge — loops never merge" ✓

Observations (non-blocking, not DoD blockers):

  • good-significant run 1's upgrade=fail was a convergence race; transient (run 2 passed without retry). No test weakening, no retry added — consistent with plan policy.
  • Semantic stage_pass_checks explicitly guard only the install tier for good-significant. Teeth for the upgrade/backup/restore tiers come from _assert_green's "no tier failed" check. Limitation noted; acceptable per plan DoD requirements.
  • A-reg-2 comment in test_canaries.py says "test_backup_artifact fails" for bad-backup; actual behavior is test_backup_artifact passes and test_backup_captures_state fails. Misleading comment, non-blocking.

Verdict: D-final PASS. All 7 canaries verified. All 6 DoD items met. Phase is complete pending operator review of PR#5. No vetoes.


D-initial update @2026-06-02T01:46Z — A-reg-1 CLOSED; A-reg-2 still open

A-reg-1 RESOLVED. Cold-verify after fix:

ssh cc-ci && cd /root/builder-clone && git pull --rebase
cc-ci-run -m pytest tests/regression/ --collect-only

Output: collected 3 items: test_canary[good-simple], test_canary[good-significant], test_canary[bad-false-green]. No errors.

Canary artifacts cold-verified from cc-ci artifact dirs:

good-simple (custom-html-tiny), /var/lib/cc-ci-runs/regression-good-simple-1/results.json:

  • results: install=pass, upgrade=pass, backup=skip, restore=skip, custom=skip
  • flags: clean_teardown=true, no_secret_leak=true
  • install/test_serving: PASS ✓ (stage_has_passing_test confirms teeth present)

bad-false-green (custom-html v5-stale-docroot), /var/lib/cc-ci-runs/regression-bad-canary-1/results.json:

  • results: install=pass, upgrade=pass, backup=pass, restore=pass, custom=FAIL
  • flags: clean_teardown=true, no_secret_leak=true
  • custom/test_content_type_html_and_txt: FAIL with Content-Type='application/octet-stream'
  • rc would be non-zero (any(v=="fail")) ✓ → regression test assert rc != 0 PASSES
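
The rc inference above rests on the any(v=="fail") rule. A minimal sketch of that rule follows, using the tier results recorded for the two artifacts. The function name harness_rc and the concrete exit value 1 are assumptions; the ledger only states that rc is non-zero when any tier failed.

```python
def harness_rc(tier_results: dict[str, str]) -> int:
    """Non-zero when any tier failed, following the any(v == "fail") rule.

    ASSUMPTION: the real harness may use a different non-zero value.
    """
    return 1 if any(v == "fail" for v in tier_results.values()) else 0


# Tier results as recorded in the two results.json artifacts above.
bad_false_green = {"install": "pass", "upgrade": "pass", "backup": "pass",
                   "restore": "pass", "custom": "fail"}
good_simple = {"install": "pass", "upgrade": "pass", "backup": "skip",
               "restore": "skip", "custom": "skip"}

assert harness_rc(bad_false_green) != 0  # the regression test's rc != 0 guard holds
assert harness_rc(good_simple) == 0      # skipped tiers do not count as failures
```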

good-significant (lasuite-docs) — upgrade FAILED in Builder's run:

  • results: install=PASS, upgrade=FAIL (test_upgrade_reconverges → convergence race)
  • This is the known WOPI/upgrade convergence risk from the plan (§ Risks). Builder is re-running.
  • OBSERVATION (non-blocking now): if consistently flaky, add bounded retries to readiness probe per plan policy ("bounded retries on readiness only, never on correctness assertion"). Will watch.
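
If the retry option is ever taken up, the plan policy quoted above ("bounded retries on readiness only, never on correctness assertion") might be applied along these lines. This is a sketch under assumptions: wait_until_ready, its parameters, and the fake probe are all hypothetical and do not come from the cc-ci code.

```python
import time
from typing import Callable


def wait_until_ready(
    probe: Callable[[], bool],
    attempts: int = 5,
    delay_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `probe` at most `attempts` times; report whether it became ready."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay_s)  # no sleep after the final attempt
    return False


# Fake service that converges on the third poll (illustration only).
calls = {"n": 0}


def fake_probe() -> bool:
    calls["n"] += 1
    return calls["n"] >= 3


ready = wait_until_ready(fake_probe, attempts=5, sleep=lambda _s: None)
assert ready  # readiness is retried, but only a bounded number of times
# The correctness assertion would follow here and be evaluated exactly once.
```

Keeping the retry loop in the probe and out of the assertion means a genuine regression still fails on the first evaluation instead of being retried into a pass.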

A-reg-2 partially addressed — 4 per-tier RED canary tests added to suite, 7 tests collect. But bad-backup and bad-restore FIXTURES are broken (see A-reg-3). A-reg-2 cannot close until all 4 canaries actually produce the expected results.


D-initial-2 update @2026-06-02T02:00Z — A-reg-3 filed; bad-backup/bad-restore fixtures broken

4 per-tier RED canary tests now in suite (7 tests collect via cold --collect-only). SHAs verified:

  • 4ae8866100563204 (custom-html-tiny, bad image) ✓ — bad-install + bad-upgrade fixture
  • e1e3c5fc5e2bd414 (custom-html, bad-backup) — SHA exists BUT compose.yml is empty (A-reg-3)
  • 5a481cc1f6b2a462 (custom-html, bad-restore) — SHA exists BUT compose.yml is empty (A-reg-3)

Cold-verified canary run results:

  • bad-install (regression-bad-install-v2): install=fail, upgrade=na ✓ (install tier fails as intended)
  • bad-upgrade (regression-bad-upgrade-v2): install=pass, upgrade=fail, custom=skip ✓ (upgrade tier fails as intended)
  • bad-backup (regression-bad-backup-1): install=pass, upgrade=fail, backup=skip ✗ (WRONG TIER)

Root cause A-reg-3: regression-bad-backup branch has empty compose.yml (whole file deleted, not just backup path changed). Empty compose → chaos upgrade deploy fails → upgrade=fail, backup never runs. Same issue for regression-bad-restore (same empty compose.yml diff).

_assert_red_at_tier for bad-backup would FAIL with expected 'backup'='fail', got 'skip' — proving the fixture is broken, not the test.

Status of items required before final gate:

  1. A-reg-3 CLOSED — fixtures fixed and cold-verified ✓
  2. A-reg-2 CLOSED — all 4 per-tier RED canaries present and verified ✓
  3. good-significant: still needs successful re-run (upgrade flakiness unresolved)
  4. Open PR (DoD#6): not yet opened

Comprehensive canary verification @2026-06-02T02:20Z

6 of 7 canaries cold-verified from cc-ci artifact dirs (fresh SSH shell, no cached state); good-significant is pending a re-run:

GREEN canaries:

  • good-simple (regression-good-simple-1, SHA 435df8fc): install=pass, upgrade=pass, backup/restore/custom=skip, clean_teardown=true, no_secret_leak=true, test_serving: pass
  • good-significant (regression-good-significant-1, SHA 290a8ad7): PENDING — upgrade FAIL (convergence race). Needs re-run to confirm transient.

Custom-assertion RED canary:

  • bad-false-green (regression-bad-canary-1, SHA 71e7326a): install/upgrade/backup/restore=pass, custom=fail, test_content_type_html_and_txt: FAIL (Content-Type='application/octet-stream') ✓

Per-tier RED canaries (all cold-verified from artifact dirs):

  • bad-install (regression-bad-install-v2, SHA 4ae8866): install=fail, upgrade=na ✓ — failing_tier=install, no prior tier checked
  • bad-upgrade (regression-bad-upgrade-v2, SHA 4ae8866): install=pass, upgrade=fail ✓ — install=pass before failing
  • bad-backup (regression-bad-backup-5, SHA b6fe99de, recipe custom-html-bkp-bad): install=pass, backup=fail ✓ — test_backup_captures_state FAIL
  • bad-restore (regression-bad-restore-3, SHA 9a73a184, recipe custom-html-rst-bad): install=pass, backup=pass, restore=fail ✓ — test_restore_returns_state FAIL

Teeth verification:

  • good-simple: if test_serving removed → stage_has_passing_test("install","test_serving") returns False → regression test FAILS ✓
  • bad-false-green: if harness returns rc=0 → assert rc!=0 FAILS → false-green caught ✓
  • bad-install: if harness returns rc=0 for bad image → assert rc!=0 FAILS ✓
  • bad-upgrade: if upgrade wrongly passes → tier_results["upgrade"]="pass"≠"fail" → assert FAILS ✓
  • bad-backup: if backup wrongly passes → rc=0 → assert rc!=0 FAILS ✓
  • bad-restore: if restore wrongly passes → tier_results["restore"]!="fail" → assert FAILS ✓; if backup wrongly fails → tier_results["backup"]!="pass" → assert FAILS ✓

DoD status:

  • DoD#1 (tests/regression/ committed): ✓
  • DoD#2 (good canaries GREEN with semantic assertions): good-simple ✓; good-significant PENDING re-run
  • DoD#3 (bad-false-green catches false-green): ✓ verified
  • DoD#4 (4 per-tier RED canaries): ✓ all 4 verified
  • DoD#5 (README.md): ✓ present with cadence, canaries, how to add
  • DoD#6 (PR open for operator review): NOT YET

Remaining blockers before final PASS:

  1. good-significant must pass (or flakiness addressed with bounded retries on readiness)
  2. PR must be opened (DoD#6)

D-initial: FAIL @2026-06-02T01:38Z — suite won't collect (A-reg-1); plan gap (A-reg-2)

Builder claimed: test suite written, initial gate; canaries in-flight.

Cold verification result: FAIL — two blocking issues.

A-reg-1 (CRITICAL): Relative import fails, 0 tests collected.

ssh cc-ci && cd /root/builder-clone
cc-ci-run -m pytest tests/regression/ --collect-only

Output (cold, fresh shell):

collected 0 items / 1 error
ImportError: attempted relative import with no known parent package
tests/regression/test_canaries.py:18: from .conftest import run_recipe_ci, ...
!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!!

Root cause: tests/regression/__init__.py and tests/__init__.py missing. Fix: add them or use absolute imports (as other test files in this repo do).
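
The failure mode can be reproduced outside pytest. The sketch below loads a file as a top-level module (no parent package), which is roughly what happens when a test file is imported from a directory that has no `__init__.py`. The file names and the load_as_top_level helper are hypothetical and stand in for test_canaries.py and its sibling module.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_as_top_level(path: Path):
    """Load `path` as a top-level module, i.e. with no parent package."""
    spec = importlib.util.spec_from_file_location(path.stem, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "helpers.py").write_text("VALUE = 42\n")
    # A relative import only resolves when the module has a parent package.
    (root / "test_rel.py").write_text("from .helpers import VALUE\n")
    try:
        load_as_top_level(root / "test_rel.py")
        error = None
    except ImportError as exc:
        error = str(exc)

print(error)
```

Either fix named above removes the condition: adding the `__init__.py` files gives the module a parent package, and an absolute import does not need one.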

A-reg-2 (HIGH): Plan updated (commit 7bdeb74) — 4 per-tier RED canaries now mandatory (DoD#4). Updated plan requires RED canaries for install/upgrade/backup/restore tiers on custom-html-tiny, each asserting RED at the intended tier with prior tiers PASS. Current suite: 3 canaries only (2 good + 1 bad-custom-assertion). All four are MISSING. Cannot claim DONE without them.

Other code quality observations (not blocking):

  • Canary SHAs all verified present on Gitea ✓
    • custom-html-tiny: 435df8fc98ef7598 ✓ (main 2026-06-02 merge commit)
    • lasuite-docs: 290a8ad72d06232f ✓ (v0.3.3+v5.1.0 merge)
    • custom-html v5-stale-docroot: 71e7326a99bbb690 ✓ (confirmed RED via build #81)
  • CCCI_RUN_ID and CCCI_RUNS_DIR correctly picked up by results.py
  • _assert_red / _assert_green logic sound ✓
  • README cadence policy complete ✓

Verdict: FAIL. Standing issues: A-reg-1 (critical), A-reg-2 (high). Builder must fix both before re-claiming this gate.


Adversary findings

(See BACKLOG-regression.md § Adversary findings: A-reg-1, A-reg-2)


Break-it probes log

(Break-it probes will be recorded here as they are run)


Pre-orientation findings @01:17Z

Known-bad fixture confirmed present and working:

  • Branch: recipe-maintainers/custom-html:v5-stale-docroot (SHA 71e7326a99bb)
  • Build #81 (run 3h ago): confirmed RED — custom stage FAIL; specifically:
    • test_content_type_html_and_txt: FAIL — ccci-e0d6e804.txt Content-Type='application/octet-stream', expected text/plain
    • All other tiers (install/upgrade/backup/restore): PASS
    • clean_teardown=true, no_secret_leak=true
  • Implication for regression suite DoD#3: the known-bad canary correctly produces RED; the regression test must assert this outcome AND must be shown to fail if the server returns green for it (false-green detection).

Good canaries:

  • custom-html-tiny: build #45 GREEN (SHA 4bd8416a209f, 21h ago) — simple, fast
  • lasuite-docs: multi-service stack with DEPS=["keycloak"], DEPLOY_TIMEOUT=900s — test exists at tests/lasuite-docs/

Infrastructure state:

  • Bridge (ccci-bridge_app): running, polling 20 repos every 30s ✓
  • Drone exec runner: running ✓
  • Dashboard: serving at ci.commoninternet.net ✓
  • Builder hasn't started regression phase: no STATUS-regression.md yet

Notes:

  • Mirror phase (plan-mirror-enroll-all-recipes.md) completed DONE at 2026-06-02T01:16Z.
  • This phase starts fresh: no STATUS-regression.md or tests/regression/ yet.
  • Watching for Builder to create STATUS-regression.md and begin work.