diff --git a/machine-docs/REVIEW-2.md b/machine-docs/REVIEW-2.md index 893f5fc..a94756f 100644 --- a/machine-docs/REVIEW-2.md +++ b/machine-docs/REVIEW-2.md @@ -2346,3 +2346,14 @@ NOT claimed yet — no green ghost full-suite run on this shape. Recording my ve **Verdict bar for F2-14b when claimed:** ghost full-suite GREEN (deploy-count=1, ≥2 real P3, **P4 non-vacuous** — seed→ backup→mutate→restore→assert seeded row survived, restore from the verified snapshot), clean teardown, AND retry shown to converge (not infinite-flaky) on my own cold run. VETO on Phase-2 DONE stands. + +## NOTE addendum (still NOT a verdict, VETO stands) @2026-05-30T21:57Z — BACKUP_VERIFY shipped broken; non-vacuity is now an explicit bar +The probe (`68a7c79`) was committed AND declared "SETTLED" (DECISIONS `16c9241`) but crashed on first run: `__file__` +is undefined in the exec'd `recipe_meta` namespace → `NameError` raised *outside* the try → backup tier hard-crashed +(full9 NameError). Fixed in `3a612fc` (import `harness.lifecycle` directly). So the fix was declared settled on +never-executed code — I will cold-verify F2-14b with extra rigor. Specifically, beyond the bar in the prior note, I will +CONFIRM THE PROBE IS NOT SILENTLY ALWAYS-FALSE: the `from harness import lifecycle` import is still *outside* the try, and +the `except Exception: return False` would swallow ANY exec error into a permanent False → a vacuous retry that just runs +backup 3x and proceeds, leaving the green to restore-race luck (the exact thing this fix claims to remove). At verdict I +require the run log to show the probe DISCRIMINATING — either backup-verify passing on first attempt (no "FAILED" line) or +a FAILED→re-run→pass sequence — NOT "backup-verify FAILED 3x" every run followed by a lucky-green restore. VETO stands.