diff --git a/machine-docs/REVIEW-2.md b/machine-docs/REVIEW-2.md
index 8419971..e0ab8c8 100644
--- a/machine-docs/REVIEW-2.md
+++ b/machine-docs/REVIEW-2.md
@@ -879,3 +879,27 @@ TIMEOUT/`DEPLOY_TIMEOUT` covers the python subprocess, but abra's own per-servic
 what emitted `FATA deploy failed`), and/or a post-redeploy collabora-health wait before asserting
 reconverge. Anti-anchoring honored: verdict formed from the plan + code + my own run's observable log;
 I did NOT read JOURNAL-2 before writing this.
+
+## @2026-05-29 — Pre-claim recon: F2-12 fix e1147b5 (NOT re-claimed yet — no verdict)
+Builder ACKed F2-12 and pushed fix `e1147b5` ("own convergence wait via abra `-c` + collabora
+READY_PROBE"), status `cc4af49` = validating multi-run before RE-CLAIM. Read the fix ahead of the
+re-claim. **The adversarial crux: the upgrade redeploy now passes `abra … -c` (`--no-converge-checks`),
+which skips abra's own convergence monitor.** Skipping a convergence check is exactly the shape of a
+P7 weakening — so I scrutinized whether the replacement is genuinely stronger or a green-washing.
+- **Plausibly NOT a weakening (pending cold proof):** `-c` only skips abra's *post-deploy monitor*;
+  `docker stack deploy` (the real spec apply) still runs. The harness then owns the verification in
+  `generic.perform_upgrade`: `lifecycle.wait_healthy` (= `_wait_services_converged` "every swarm
+  service shows running == configured replicas" + HEALTH_PATH) **then** `lifecycle.wait_ready_probes`
+  (collabora `/hosting/discovery` → 200), bounded by the generous recipe DEPLOY_TIMEOUT. The READY_PROBE
+  loop **raises TimeoutError** if discovery never hits 200 (while/else) → upgrade op fails → tier fails,
+  so it's non-vacuous by construction. HC1 (chaos-version label == PR-head) preserved; chaos_redeploy
+  still bypasses deploy_app so deploy-count stays 1.
+- **MUST cold-verify at re-claim (cannot fully settle by reading):**
+  1. **Upgrade tier GREEN on MY own cold run** — the F2-12 close condition (repeat-green, not one-off;
+     Builder admits it was 3×green/1×fail before this fix).
+  2. **P7 negative:** confirm `_wait_services_converged` truly fails on a stuck `0/1` service (i.e. `-c` +
+     owned-wait catches a genuinely broken converge, not just a slow one). I started reading its
+     parser (lifecycle.py ~286–328) — finish that read + ideally observe a broken-upgrade-still-RED.
+  3. deploy-count == 1; clean teardown.
+F2-12 stays OPEN (Adversary-owned). NO verdict until Q3.2 is re-claimed. Anti-anchoring: not reading
+JOURNAL before the verdict.
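
For the P7-negative check, the two properties this entry relies on can be sketched in isolation. This is a minimal illustration, not the harness code: `replicas_converged`, `services_converged` and `wait_ready_probe` are hypothetical names, the input is assumed to be REPLICAS strings such as `0/1` as printed by `docker service ls`, and the probe, clock and sleep are injected so nothing touches the network. The real `_wait_services_converged` parser and `wait_ready_probes` loop in lifecycle.py may differ and still need to be read directly.

```python
import time
from typing import Callable, Iterable


def replicas_converged(replicas: str) -> bool:
    """True only when running == configured, e.g. "1/1"; "0/1" is False."""
    running_s, _, rest = replicas.strip().partition("/")
    # Keep only the first token after the slash, dropping any trailing annotation.
    parts = rest.split()
    desired_s = parts[0] if parts else ""
    if not (running_s.isdigit() and desired_s.isdigit()):
        return False  # unparseable output is treated as "not converged"
    return int(running_s) == int(desired_s)


def services_converged(replica_rows: Iterable[str]) -> bool:
    """Every service must be converged; an empty listing never passes."""
    rows = list(replica_rows)
    # Assumption of this sketch: no rows means "nothing verified", not success.
    return bool(rows) and all(replicas_converged(r) for r in rows)


def wait_ready_probe(
    probe: Callable[[], int],
    timeout_s: float,
    interval_s: float = 2.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Poll `probe` until it returns 200; raise TimeoutError otherwise."""
    deadline = clock() + timeout_s
    while clock() < deadline:
        if probe() == 200:
            break  # ready: skips the else branch below
        sleep(interval_s)
    else:
        # Reached only when the deadline passed without a 200.
        raise TimeoutError(f"probe never returned 200 within {timeout_s}s")
```

The point of the `while`/`else` shape is that the only exit without an exception is the `break` on a 200, so a service stuck at `0/1` or a discovery endpoint that never answers cannot produce a green result by falling out of the loop.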