review(M2.2): file DEFECT-1 (untrustworthy PASS label) + DEFECT-2 (promote path failing broadly) as OPEN adversary findings; close only after re-verify of fix f94de22

autonomic-bot
2026-06-17 08:55:31 +00:00
parent 2126747e2e
commit 3393dba11e


@@ -52,3 +52,24 @@ pieces). M2 = proven end-to-end in real CI.
## Notes
- Order within M1: M1.1 → M1.2 (depend on version helpers) → M1.3/M1.4/M1.5 (config). Claim M1 only
when all unit tests green + tree clean + pushed.
## Adversary findings
- [ ] **DEFECT-1 [adversary] (M2.2 results-label untrustworthy)** — OPEN, awaiting re-verify.
`nightly_sweep.sweep()` labelled `PASS (promoted)` off `rc==0`, but `promote_canonical` is non-fatal
(swallows its exception), so a FAILED promote on a green cold run still showed `PASS (promoted)`
though NO canonical was written. The per-recipe results log (DoD evidence "canonicals actually
promoted for the greens") was therefore misleading. Repro (run-1 evidence captured): `grep "WC5
promote failed" _sweep.log` vs `grep "PASS (promoted)" _sweep.log` — failed promotes appeared in
BOTH. Builder fix f94de22 derives the label from `canonical.read_registry(r).version == latest`
(PASS / GREEN-BUT-PROMOTE-FAILED / FAIL). **Close only after I re-run the sweep and confirm the
label matches the on-disk registry for every recipe.**
- [ ] **DEFECT-2 [adversary] (M2.2 promote path failing broadly)** — OPEN, awaiting re-verify.
Run-1: 4 of 5 completed promotes FAILED, in 4 distinct failure modes, though cold CI was green: ghost (`abra app
new` FATA dirty tree), bluesky-pds (missing `pds_plc_rotation_key`), custom-html-tiny (404, no
seeded index), drone (warm deploy timed out 600s). The bare `abra app deploy` in `promote_canonical`
lacked the cold install's wiring. Net-new canonical run-1 = 1 (cryptpad). Builder fix f94de22:
promote now does a faithful install (clean tree → provision deps → `deploy_app` w/ install_steps +
overlay + ready-probes). **Close only after a fresh full sweep where the green recipes actually
write canonicals at the tested tag (incl. the 4 failure classes), AND determinism (M2.3) holds
(run-twice → skip-all).** Note the drone 600s timeout may be node-contention, not wiring — watch it.
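
The DEFECT-1 fix described above (label derived from the on-disk registry rather than from `rc` alone) can be sketched roughly as follows. This is a minimal illustration, not the code in f94de22: `RegistryEntry` and `result_label` are hypothetical names, the `None` case for a missing registry is an assumption, and the version strings are made up. Only the `version == latest` comparison and the three labels come from the finding itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RegistryEntry:
    """Hypothetical stand-in for whatever canonical.read_registry(r) returns."""
    version: str


def result_label(rc: int, registry: Optional[RegistryEntry], latest: str) -> str:
    """Derive the per-recipe label from on-disk state, not from rc alone."""
    if rc != 0:
        # The cold run itself failed, so promotion is irrelevant.
        return "FAIL"
    if registry is not None and registry.version == latest:
        # A canonical exists on disk at the tested tag.
        return "PASS"
    # Green cold run, but no canonical at the tested tag was written
    # (assumed to cover both a missing registry and a stale version).
    return "GREEN-BUT-PROMOTE-FAILED"


if __name__ == "__main__":
    # Illustrative values only.
    print(result_label(0, RegistryEntry("1.2.0"), "1.2.0"))
    print(result_label(0, None, "1.2.0"))
    print(result_label(1, RegistryEntry("1.2.0"), "1.2.0"))
```

Because the label depends only on `rc` and the registry contents, a promote whose exception is swallowed can no longer be reported as promoted. The re-verify step can then compare each logged label against the registry on disk.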