review(lvl5): M1 PASS @3d8d286 — cold clone HEAD-match, 246 unit tests green + repo lint PASS on CI venv; de-capped compute_level correct on all 4 mission worked examples (L1 fail-blocks, L5 skip-climbs, L2 unver-blocks, L4 lint-unver); derive_rungs N/A classification matches DECISIONS table incl subtle upgrade structural-skip vs abort-unver split; §2.3 mirror handled by scratch-clone CONTEXT not exemptions — NO rule filtered, proven by real-abra probe (hedgedoc pass + injected lightweight tag → R014 fail, classifier has teeth); verdict-neutral by inspection (single call site, double-wrapped, default unver, consumed only in best-effort results block) + 2 targeted tests; cap/cap_reason/capped removed everywhere (only absence-assertions + history-compat remain); lint never 'skip' (no N/A escape hatch). No VETO — Builder cleared to merge + proceed to M2.
All checks were successful
continuous-integration/drone/push Build is passing
All checks were successful
continuous-integration/drone/push Build is passing
This commit is contained in:
69
REVIEW-lvl5.md
Normal file
69
REVIEW-lvl5.md
Normal file
@ -0,0 +1,69 @@
|
|||||||
|
# REVIEW — Phase lvl5 (L5 lint rung + de-cap) — Adversary verdicts
|
||||||
|
|
||||||
|
Cold-verification ledger (append-only). Each verdict formed from the plan (SSOT), the code/git
|
||||||
|
history, the verification info in STATUS-lvl5.md, and my own cold re-run — NOT from JOURNAL
|
||||||
|
(anti-anchoring, §6.1). JOURNAL not consulted before this verdict.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## M1 — Implementation complete (pre-merge): **PASS** @ 2026-06-11T07:54Z
|
||||||
|
|
||||||
|
Branch `phase-lvl5` @ `3d8d286cf3f2df7d164bf458f07bbb916cc18f2b` (claim 24baac5). Implementation
|
||||||
|
deliberately NOT on main (reverts 589943f/cd62743 hold it pre-merge) — confirmed; only the
|
||||||
|
DECISIONS entry (392f7df) is on main. Verified from a **fresh cold clone** on the cc-ci host
|
||||||
|
(`/tmp/adv-lvl5`, cloned from origin, checked out phase-lvl5; HEAD matched 3d8d286).
|
||||||
|
|
||||||
|
**Acceptance per plan §4 M1 — all satisfied:**
|
||||||
|
|
||||||
|
1. **Cold clone + HEAD** — `git rev-parse HEAD` = 3d8d286 ✓ (matches claim).
|
||||||
|
2. **Unit suite (CI host venv)** — `cc-ci-run -m pytest tests/unit/ -q` → **246 passed** in 5.32s
|
||||||
|
✓ (matches claimed count).
|
||||||
|
3. **Repo lint** — `nix develop .#lint --command bash scripts/lint.sh` → **lint: PASS** ✓.
|
||||||
|
4. **De-capped `compute_level` correct on ALL 4 mission worked examples** (hand-traced against
|
||||||
|
`level.py` + verified by the rewritten test_level.py):
|
||||||
|
- install✔ upgrade✘ backup✔ functional✔ lint✔ → **L1** (fail blocks) ✓
|
||||||
|
- install✔ upgrade✔ backup skip functional✔ lint✔ → **L5** (intentional skip climbs — the
|
||||||
|
de-cap; was L2 under old rule) ✓
|
||||||
|
- install✔ upgrade✔ backup **unver** functional✔ lint✔ → **L2** (unver blocks) ✓
|
||||||
|
- all four ✔, lint unver → **L4** (unverified top rung not earned) ✓
|
||||||
|
Formula `level = max i: rung_i==pass ∧ all j<i ∈ {pass,skip}` implemented exactly
|
||||||
|
(pass→advance, skip→continue, fail/unver→break). 0 if none.
|
||||||
|
5. **N/A classification table matches code.** `derive_rungs` (results.py) implements the
|
||||||
|
DECISIONS table verbatim, incl. the subtle upgrade split: `skip ∧ ¬has_upgrade_target` →
|
||||||
|
`skip` (structural, climbs); a prior-stage abort (`skip`/None WITH a target, undeclared) →
|
||||||
|
`unver` (blocks). install never skips; backup_restore skip iff not-capable or EXPECTED_NA;
|
||||||
|
functional skip iff EXPECTED_NA else unver; **lint pass/fail-or-unver, NEVER skip** (no N/A
|
||||||
|
escape hatch, §2 item 5; EXPECTED_NA["lint"] ignored). Default-unclassifiable = unver. ✓
|
||||||
|
6. **§2.3 mirror-context decision reviewed — NO rule filtered.** Executor (`lint.py`) lints a
|
||||||
|
pristine scratch clone of the per-run tree at the tested sha; origin→local path makes abra's
|
||||||
|
tag force-fetch work offline (no auth, no go-git "reference not found"), and the run's real
|
||||||
|
tags ride along so R014 evaluates real content. The plumbing pollution is solved by context,
|
||||||
|
not exemptions. Confirmed by **real-abra behavioral probe** (not just synthetic fixtures):
|
||||||
|
- `run_lint("hedgedoc", …)` clean → `{'status':'pass',...}` ✓ (proves scratch-clone makes
|
||||||
|
abra lint actually run — no FATA).
|
||||||
|
- inject lightweight tag → `{'status':'fail','detail':'error rule(s) unsatisfied: R014',
|
||||||
|
'rules_failed':['R014']}` ✓ (proves the classifier has teeth; R014 is NOT suppressed).
|
||||||
|
Classifier correctly recognizes `rc=0`-with-critical-errors (parses table + "critical errors
|
||||||
|
present" sentinel, fails closed on disagreement); only content-FATA ("unable to validate
|
||||||
|
recipe") → fail, all other non-zero → unver.
|
||||||
|
7. **Verdict-neutrality — code inspection + targeted tests.** `run_lint` invoked once
|
||||||
|
(run_recipe_ci.py:942), defaults to `unver`, double-wrapped in try/except (crash → stays
|
||||||
|
unver, non-fatal print), runs BEFORE the tiers at `head_ref` (the exact tested ref). Its
|
||||||
|
result is consumed ONLY at build_results (line 1278, "non-fatal, verdict unaffected"); NO
|
||||||
|
verdict computation reads it. 60s hard budget, never raises. Targeted tests pass:
|
||||||
|
`test_run_lint_missing_recipe_is_unver_not_raise`,
|
||||||
|
`test_build_results_no_lint_given_is_unverified_never_pass`. ✓
|
||||||
|
8. **cap/cap_reason/capped fully removed** from active code/schema/card/dashboard/docs. grep over
|
||||||
|
runner/dashboard/docs/tests finds the words only in (a) the unrelated screenshot timeout-cap,
|
||||||
|
(b) "capable"/max-users, (c) explicit test/doc assertions that the fields are ABSENT in
|
||||||
|
schema 2 and that old schema-1 artifacts (which carry level_cap_reason) still render with no
|
||||||
|
relabeling — history-compat covered by test_card/test_dashboard (green). ✓
|
||||||
|
|
||||||
|
No verdict regression, no run-verdict coupling, no rule suppression, no silent pass. **M1 PASS.**
|
||||||
|
Builder cleared to merge phase-lvl5 → main and proceed to P3/P4 (M2). No VETO.
|
||||||
|
|
||||||
|
**Scope note (carried to M2):** M1 verified the lint executor + classifier + level math on real
|
||||||
|
abra output and the unit surface. M2 must still prove, on real CI end-to-end: ≥1 genuine L5,
|
||||||
|
≥1 lint-blocked L4, ≥1 N/A-skip climb, drone `!testme` ×2, canaries at designed levels under the
|
||||||
|
NEW formula, old artifacts rendering live, durations not inflated (lint ≤~60s; observed ~0.7s),
|
||||||
|
the before/after level table for ALL enrolled recipes, and card/dashboard/badge visually (PNG/SVG).
|
||||||
Reference in New Issue
Block a user