Files
cc-ci/REVIEW-lvl5.md

5.1 KiB
Raw Blame History

REVIEW — Phase lvl5 (L5 lint rung + de-cap) — Adversary verdicts

Cold-verification ledger (append-only). Each verdict formed from the plan (SSOT), the code/git history, the verification info in STATUS-lvl5.md, and my own cold re-run — NOT from JOURNAL (anti-anchoring, §6.1). JOURNAL not consulted before this verdict.


M1 — Implementation complete (pre-merge): PASS @ 2026-06-11T07:54Z

Branch phase-lvl5 @ 3d8d286cf3f2df7d164bf458f07bbb916cc18f2b (claim 24baac5). Implementation deliberately NOT on main (reverts 589943f/cd62743 hold it pre-merge) — confirmed; only the DECISIONS entry (392f7df) is on main. Verified from a fresh cold clone on the cc-ci host (/tmp/adv-lvl5, cloned from origin, checked out phase-lvl5; HEAD matched 3d8d286).

Acceptance per plan §4 M1 — all satisfied:

  1. Cold clone + HEADgit rev-parse HEAD = 3d8d286 ✓ (matches claim).
  2. Unit suite (CI host venv)cc-ci-run -m pytest tests/unit/ -q246 passed in 5.32s ✓ (matches claimed count).
  3. Repo lintnix develop .#lint --command bash scripts/lint.shlint: PASS ✓.
  4. De-capped compute_level correct on ALL 4 mission worked examples (hand-traced against level.py + verified by the rewritten test_level.py):
    • install✔ upgrade✘ backup✔ functional✔ lint✔ → L1 (fail blocks) ✓
    • install✔ upgrade✔ backup skip functional✔ lint✔ → L5 (intentional skip climbs — the de-cap; was L2 under old rule) ✓
    • install✔ upgrade✔ backup unver functional✔ lint✔ → L2 (unver blocks) ✓
    • all four ✔, lint unver → L4 (unverified top rung not earned) ✓ Formula level = max i: rung_i==pass ∧ all j<i ∈ {pass,skip} implemented exactly (pass→advance, skip→continue, fail/unver→break). 0 if none.
  5. N/A classification table matches code. derive_rungs (results.py) implements the DECISIONS table verbatim, incl. the subtle upgrade split: skip ∧ ¬has_upgrade_targetskip (structural, climbs); a prior-stage abort (skip/None WITH a target, undeclared) → unver (blocks). install never skips; backup_restore skip iff not-capable or EXPECTED_NA; functional skip iff EXPECTED_NA else unver; lint pass/fail-or-unver, NEVER skip (no N/A escape hatch, §2 item 5; EXPECTED_NA["lint"] ignored). Default-unclassifiable = unver. ✓
  6. §2.3 mirror-context decision reviewed — NO rule filtered. Executor (lint.py) lints a pristine scratch clone of the per-run tree at the tested sha; origin→local path makes abra's tag force-fetch work offline (no auth, no go-git "reference not found"), and the run's real tags ride along so R014 evaluates real content. The plumbing pollution is solved by context, not exemptions. Confirmed by real-abra behavioral probe (not just synthetic fixtures):
    • run_lint("hedgedoc", …) clean → {'status':'pass',...} ✓ (proves scratch-clone makes abra lint actually run — no FATA).
    • inject lightweight tag → {'status':'fail','detail':'error rule(s) unsatisfied: R014', 'rules_failed':['R014']} ✓ (proves the classifier has teeth; R014 is NOT suppressed). Classifier correctly recognizes rc=0-with-critical-errors (parses table + "critical errors present" sentinel, fails closed on disagreement); only content-FATA ("unable to validate recipe") → fail, all other non-zero → unver.
  7. Verdict-neutrality — code inspection + targeted tests. run_lint invoked once (run_recipe_ci.py:942), defaults to unver, double-wrapped in try/except (crash → stays unver, non-fatal print), runs BEFORE the tiers at head_ref (the exact tested ref). Its result is consumed ONLY at build_results (line 1278, "non-fatal, verdict unaffected"); NO verdict computation reads it. 60s hard budget, never raises. Targeted tests pass: test_run_lint_missing_recipe_is_unver_not_raise, test_build_results_no_lint_given_is_unverified_never_pass. ✓
  8. cap/cap_reason/capped fully removed from active code/schema/card/dashboard/docs. grep over runner/dashboard/docs/tests finds the words only in (a) the unrelated screenshot timeout-cap, (b) "capable"/max-users, (c) explicit test/doc assertions that the fields are ABSENT in schema 2 and that old schema-1 artifacts (which carry level_cap_reason) still render with no relabeling — history-compat covered by test_card/test_dashboard (green). ✓

No verdict regression, no run-verdict coupling, no rule suppression, no silent pass. M1 PASS. Builder cleared to merge phase-lvl5 → main and proceed to P3/P4 (M2). No VETO.

Scope note (carried to M2): M1 verified the lint executor + classifier + level math on real abra output and the unit surface. M2 must still prove, on real CI end-to-end: ≥1 genuine L5, ≥1 lint-blocked L4, ≥1 N/A-skip climb, drone !testme ×2, canaries at designed levels under the NEW formula, old artifacts rendering live, durations not inflated (lint ≤~60s; observed ~0.7s), the before/after level table for ALL enrolled recipes, and card/dashboard/badge visually (PNG/SVG).