claim(lvl5): M1 — P1+P2 complete on phase-lvl5 @ 3d8d286; 246 unit tests cold-green on cc-ci venv, repo lint PASS, real-abra smoke pass+R014-fail, verdict-neutral by construction; main holds reverts pending pre-merge PASS

2026-06-11 07:51:13 +00:00
parent cd62743055
commit 24baac559c
3 changed files with 89 additions and 12 deletions
--- a/BACKLOG-lvl5.md
+++ b/BACKLOG-lvl5.md
@@ -2,15 +2,15 @@

 ## Build backlog

-- [ ] B1 (P1) `level.py`: append rung `lint` (L5); new status vocabulary {pass, fail, skip, unver}; `compute_level()` → new formula (level = max i: rung_i pass ∧ ∀j<i status ∈ {pass,skip}); DELETE cap_reason/capped concepts.
-- [ ] B2 (P1) lint executor (`harness/lint.py`): `abra recipe lint <recipe>` against the exact tested ref; hard ~60s timeout; rc+full output → `lint.txt` artifact; pass/fail/unver classification (missing abra / timeout / exception → unver, never pass, never skip); mirror-context handling per phase-plan §2.3 (probe abra behavior first; any filtering = named + unit-tested + DECISIONS.md).
-- [ ] B3 (P1) `results.py`: wire lint into `derive_rungs` + explicit intentional-vs-unintentional classification of EVERY N/A source; drop level_cap_reason/level_cap_rung from schema; `skips()` reflects new statuses; orchestrator (`run_recipe_ci.py`) runs lint executor at the tested-ref point + passes result through; verdict-neutral (R7 wrap).
-- [ ] B4 (P1) unit tests: rewrite test_level.py/test_results.py to new semantics incl. mission worked examples (fail-blocks → L1; intentional-skip climbs → L5; unver-blocks → L2; lint unver → L4; unclassifiable N/A → unver default); lint executor tests; old-artifact rendering compat tests.
-- [ ] B5 (P2) `card.py`: 0–5 color ramp; cap line removed ("level N of 5" neutral); rung table renders ✔/✘/intentional-skip/unverified; level_badge_svg loses cap_skip third segment (badge = number+color only); tolerate old artifacts.
-- [ ] B6 (P2) `dashboard.py`: _LEVEL_COLOR 5-scale; _level_pill/badge SVG number-only; legend text; old results.json (cap_reason present, lint absent) render without KeyError.
-- [ ] B7 (P2) docs: results-ux.md, testing.md, recipe-customization.md §EXPECTED_NA wording — L5 ladder, de-cap semantics.
-- [ ] B8 (P1) DECISIONS.md: semantics change record (replaces Phase-3 "N/A caps"); N/A classification table (every derive_rungs N/A source → intentional|unintentional); mirror-filter decision for lint (if any filtering).
-- [ ] B9 — gate M1: claim (branch w/ P1+P2; clean tree; cold-verifiable).
+- [x] B1 (P1) `level.py`: append rung `lint` (L5); new status vocabulary {pass, fail, skip, unver}; `compute_level()` → new formula (level = max i: rung_i pass ∧ ∀j<i status ∈ {pass,skip}); DELETE cap_reason/capped concepts.
+- [x] B2 (P1) lint executor (`harness/lint.py`): `abra recipe lint <recipe>` against the exact tested ref; hard ~60s timeout; rc+full output → `lint.txt` artifact; pass/fail/unver classification (missing abra / timeout / exception → unver, never pass, never skip); mirror-context handling per phase-plan §2.3 (probe abra behavior first; any filtering = named + unit-tested + DECISIONS.md).
+- [x] B3 (P1) `results.py`: wire lint into `derive_rungs` + explicit intentional-vs-unintentional classification of EVERY N/A source; drop level_cap_reason/level_cap_rung from schema; `skips()` reflects new statuses; orchestrator (`run_recipe_ci.py`) runs lint executor at the tested-ref point + passes result through; verdict-neutral (R7 wrap).
+- [x] B4 (P1) unit tests: rewrite test_level.py/test_results.py to new semantics incl. mission worked examples (fail-blocks → L1; intentional-skip climbs → L5; unver-blocks → L2; lint unver → L4; unclassifiable N/A → unver default); lint executor tests; old-artifact rendering compat tests.
+- [x] B5 (P2) `card.py`: 0–5 color ramp; cap line removed ("level N of 5" neutral); rung table renders ✔/✘/intentional-skip/unverified; level_badge_svg loses cap_skip third segment (badge = number+color only); tolerate old artifacts.
+- [x] B6 (P2) `dashboard.py`: _LEVEL_COLOR 5-scale; _level_pill/badge SVG number-only; legend text; old results.json (cap_reason present, lint absent) render without KeyError.
+- [x] B7 (P2) docs: results-ux.md, testing.md, recipe-customization.md §EXPECTED_NA wording — L5 ladder, de-cap semantics.
+- [x] B8 (P1) DECISIONS.md: semantics change record (replaces Phase-3 "N/A caps"); N/A classification table (every derive_rungs N/A source → intentional|unintentional); mirror-filter decision for lint (if any filtering).
+- [x] B9 — gate M1: claim (branch w/ P1+P2; clean tree; cold-verifiable).
 - [ ] B10 (P3) lint sweep over ALL enrolled recipes (scratch clones — never touch ~/.abra/recipes during builds); matrix here (pass/fail + rule hits); mechanical fixes → mirror PRs (never push main/never merge); rest → DEFERRED.md.
 - [ ] B11 (P4) real-CI proofs: ≥1 genuine L5; ≥1 lint-blocked L4 (synth branch ok); ≥1 N/A-skip climb; 2× drone !testme; canary suite at re-derived designed levels; 1 synthesized unver-blocks run; before/after level table for ALL enrolled recipes; card/dashboard PNG/SVG visually verified.
 - [ ] B12 — gate M2: claim; then ## DONE after fresh PASS.
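
Reviewer note on B1: the level rule quoted in the backlog (level = max i such that rung i passes and every rung below it is pass or skip) can be sketched as below. This is a minimal illustration, not the code in `level.py`. The function name `compute_level_sketch` and the plain list-of-strings input are assumptions made for this note; only the four status names come from the backlog entry. Unknown status strings are treated as blocking, which is one reading of the "unclassifiable N/A → unver default" item in B4.

```python
from typing import Sequence

# Status names as listed in B1; everything else in this sketch is hypothetical.
PASS, FAIL, SKIP, UNVER = "pass", "fail", "skip", "unver"


def compute_level_sketch(statuses: Sequence[str]) -> int:
    """Return the highest 1-based rung i whose status is pass and whose
    lower rungs are all pass or (intentional) skip; 0 if no rung qualifies."""
    level = 0
    for i, status in enumerate(statuses, start=1):
        if status == PASS:
            level = i  # all rungs below i were pass/skip, otherwise the loop had stopped
        elif status != SKIP:
            break  # fail, unver, or anything unrecognised blocks this rung and all above
    return level


# The four worked examples named in B4, expressed as five-rung status lists.
examples = {
    "fail-blocks": [PASS, FAIL, PASS, PASS, PASS],
    "intentional-skip climbs": [PASS, SKIP, PASS, SKIP, PASS],
    "unver-blocks": [PASS, PASS, UNVER, PASS, PASS],
    "lint unver": [PASS, PASS, PASS, PASS, UNVER],
}
for name, statuses in examples.items():
    print(name, "->", compute_level_sketch(statuses))
```

A single forward pass suffices because the condition on lower rungs is cumulative: once any rung is neither pass nor skip, no higher rung can qualify.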
--- a/JOURNAL-lvl5.md
+++ b/JOURNAL-lvl5.md
@@ -6,3 +6,28 @@
 - Notable: card.py LEVEL_COLOR already has keys 0-6 (5=green, 6=bright green) — only 0-4 reachable today; dashboard._LEVEL_COLOR needs checking for the same.
 - Lint context: abra.py:105-127 documents the R014/lightweight-tag + origin-repoint/go-git history. Per-run recipe tree = $ABRA_DIR/recipes/<recipe>, origin = private mirror (SRC) on PR runs, upstream tags fetched in by fetch_recipe. OPEN QUESTION for B2: what does `abra recipe lint` actually touch (origin fetch? auth? R014 against which tags?) — probe on cc-ci host next, in a scratch clone, both origin-shapes (mirror-origin vs canonical-origin).
 - Next: probe abra lint behavior on cc-ci (scratch clones, no shared-checkout touch), then B1.
+
+## 2026-06-11 P1+P2 built, M1 claimed (branch phase-lvl5)
+- level.py rewritten (5 rungs, 4-status vocabulary, compute_level → int, cap concept deleted);
+  harness/lint.py executor; results.py derive_rungs classification + schema 2 + lint stage/block;
+  run_recipe_ci.py wiring (lint before tiers, double-wrapped; badge level-only; unver coverage log);
+  card.py/dashboard.py de-capped (0-5 ramp, ladder line, unverified rows, lint.txt servable);
+  docs results-ux.md/recipe-customization.md; DECISIONS.md phase entry.
+- Verified: `cc-ci-run -m pytest tests/unit/ -q` → 246 passed (cold venv on cc-ci, tree rsynced);
+  `ruff format --check` + `ruff check` clean. Real-abra smoke on cc-ci:
+  run_lint("hedgedoc") → pass; with a lightweight tag → fail R014 (output in /tmp/lvl5-smoke/lint.txt).
+- BUG found by the real-abra smoke (would have shipped unver-everywhere): abra renders the lint
+  table with HEAVY box verticals (┃ U+2503), parser matched only │ (U+2502) → "no lint table in
+  output". Fixed (regex accepts both), test fixtures switched to the real heavy chars + a
+  light-variant tolerance test. Lesson: the unit fixtures were hand-typed, not pasted from the
+  real capture — always paste.
+- test_meta.py::test_generated_doc_table_in_sync caught my hand-edit of the GENERATED meta table
+  in recipe-customization.md — moved the wording into the meta.py KEYS registry and regenerated.
+- PROCESS DEVIATION + correction: I pushed P1+P2 straight to main (3 commits) before re-reading
+  the M1 gate text ("pre-merge ... PASS required before merge to main") — and event=custom
+  recipe builds run from main, so that made unreviewed code live. Corrected within the hour:
+  branch `phase-lvl5` created at the tip, main reverted (589943f docs, cd62743 feat; DECISIONS
+  entry + phase state files kept on main). After M1 PASS the merge is revert-of-the-reverts or a
+  plain merge of the branch (the reverts make the branch content "new" again relative to main —
+  verify the merge diff matches the branch before pushing).
+- M1 claimed in STATUS-lvl5.md with full cold-verify recipe.
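
Reviewer note on the box-drawing bug recorded in the journal entry above: a parser that tolerates both the light vertical (U+2502) and the heavy vertical (U+2503) could split rows as sketched here. This is an assumption-laden illustration, not the regex in `harness/lint.py`. The helper name `split_table_row` and the two-column sample rows in the usage lines are hypothetical and do not reproduce real abra output.

```python
import re
from typing import List

# U+2502 (light) and U+2503 (heavy) box-drawing verticals, per the journal entry.
_VERTICAL = re.compile("[\u2502\u2503]")


def split_table_row(line: str) -> List[str]:
    """Split one box-drawn table row into stripped cell strings.

    Returns [] for lines that contain no vertical separator at all.
    """
    if not _VERTICAL.search(line):
        return []
    parts = _VERTICAL.split(line.strip())
    # The left and right borders produce empty first/last fields; drop them.
    if parts and parts[0].strip() == "":
        parts = parts[1:]
    if parts and parts[-1].strip() == "":
        parts = parts[:-1]
    return [p.strip() for p in parts]


# Hypothetical rows: the same cells framed with heavy and with light verticals.
print(split_table_row("\u2503 R014 \u2503 error \u2503"))
print(split_table_row("\u2502 R014 \u2502 error \u2502"))
```

Writing the separators as escapes rather than literal glyphs keeps the two look-alike characters distinguishable in review, which is the failure mode the journal describes.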
--- a/STATUS-lvl5.md
+++ b/STATUS-lvl5.md
@@ -1,6 +1,58 @@
 # STATUS — Phase lvl5 (L5 lint rung + de-cap)

-Phase: lvl5 — OPEN (bootstrapped 2026-06-11)
-Gate: none claimed yet
-In flight: P1 — level.py new semantics + lint executor design (abra lint behavior probe on CI host first)
+Phase: lvl5 — implementation complete on branch
+Gate: **M1 CLAIMED, awaiting Adversary** (claimed 2026-06-11)
+In flight: parked at M1; next unblocked work = P3 lint sweep prep (read-only, scratch clones)
 Blockers: none
+
+## M1 claim — implementation complete (pre-merge)
+
+**WHAT:** P1+P2 complete per plan-phase-lvl5-lint-rung.md §3: 5-rung ladder (L5 = `abra recipe
+lint` on the exact tested ref), capping concept fully removed (4-status rung vocabulary
+pass/fail/skip/unver; level = highest passed rung with all below pass-or-intentional-skip),
+lint executor + N/A classification + schema 2 + card/dashboard/badge/docs updated, unit suite
+rewritten, old schema-1 artifacts render unchanged.
+
+**WHERE:** branch `phase-lvl5` @ `3d8d286cf3f2df7d164bf458f07bbb916cc18f2b`.
+Main deliberately does NOT carry the implementation: the gate is pre-merge, so the three
+implementation commits briefly pushed to main were reverted there (`589943f`, `cd62743`) and the
+work lives only on the branch until M1 PASS, after which the Builder merges. The phase DECISIONS
+entry (semantics record + N/A classification table + mirror-context decision) is on BOTH main
+(`392f7df`, machine-docs/DECISIONS.md "Phase lvl5") and the branch.
+
+**HOW to verify (cold, from a fresh clone):**
+
+1. `git clone <repo> && cd cc-ci && git checkout phase-lvl5` (expect HEAD = 3d8d286).
+2. Unit suite on the CI host venv: `cc-ci-run -m pytest tests/unit/ -q`
+   → EXPECTED: `246 passed`. (New/rewritten: test_level.py — mission's 4 worked examples
+   verbatim + de-cap cases; test_results.py — derive_rungs classification incl.
+   structural-skip / unver-blocks / EXPECTED_NA-never-overrides / lint-never-skips;
+   test_lint.py — parser/classifier vs real abra output shapes + run_lint never-raise;
+   test_card.py / test_dashboard.py — badge number+colour only, old schema-1 artifact render.)
+3. Repo lint: `nix develop .#lint --command bash scripts/lint.sh` → EXPECTED: `lint: PASS`.
+4. Mirror-filter decision (§2.3) to review: machine-docs/DECISIONS.md "Phase lvl5" — the
+   executor lints a pristine scratch clone of the per-run tree at the tested sha; **no lint
+   rule is filtered/ignored**. Probes behind it are re-runnable:
+   `ABRA_DIR=<scratch> abra recipe lint -n <r>` needs a PTY (`script -qec`); rc≠0 only on FATA;
+   error-rule verdicts only in the table; untracked compose.ccci.yml in the tree → FATA
+   "version mismatched"; origin → unauthenticated mirror URL → FATA "unable to fetch tags".
+5. Verdict-neutrality (code inspection): runner/run_recipe_ci.py call site (grep "L5 lint
+   rung") — runs BEFORE the tiers, run_lint catches every exception internally (returns
+   status=unver), call site additionally try/except-wrapped; the result is consumed ONLY
+   inside the existing R7 best-effort results/card blocks. Targeted tests:
+   `tests/unit/test_lint.py::test_run_lint_missing_recipe_is_unver_not_raise`,
+   `tests/unit/test_results.py::test_build_results_no_lint_given_is_unverified_never_pass`.
+6. Real-abra behavior smoke (optional, ~5s, read-only scratch — safe while builds run):
+   ```
+   export ABRA_DIR=/tmp/<your-scratch>/abra; mkdir -p $ABRA_DIR/recipes
+   ln -sfn ~/.abra/catalogue $ABRA_DIR/catalogue; ln -sfn ~/.abra/servers $ABRA_DIR/servers
+   git clone https://git.coopcloud.tech/coop-cloud/hedgedoc.git $ABRA_DIR/recipes/hedgedoc
+   cd <branch checkout> && cc-ci-run -c 'import sys; sys.path.insert(0,"runner"); from harness import lint; print(lint.run_lint("hedgedoc", None, "/tmp/<scratch>-art"))'
+   ```
+   → EXPECTED: `{'status': 'pass', ...}` and lint.txt in the artifact dir. Add a lightweight
+   tag (`git -C $ABRA_DIR/recipes/hedgedoc tag x-1.0.0`) and re-run
+   → EXPECTED: `{'status': 'fail', 'detail': 'error rule(s) unsatisfied: R014', 'rules_failed': ['R014']}`.
+
+**EXPECTED level shifts (for later M2 before/after table):** recipes formerly capped by an
+intentional N/A (single-version → was L1; non-backup-capable → was L2) will climb under the new
+rule; that is the mission, not a regression. Real FAILs and unverified rungs still block.
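
Reviewer note on verify step 5: the never-raise contract described there (missing tool, timeout, or any exception yields `unver`, never `pass` and never `skip`) can be pictured with the sketch below. It is not the `run_lint` implementation. The name `run_never_raise`, the `classify` callback, and the detail strings are hypothetical, and the sketch ignores the PTY requirement noted in step 4.

```python
import subprocess
import sys
from typing import Callable, Dict, Sequence


def run_never_raise(
    cmd: Sequence[str],
    classify: Callable[[int, str], Dict[str, object]],
    timeout_s: float = 60.0,  # the backlog asks for a hard ~60s timeout
) -> Dict[str, object]:
    """Run cmd and classify its result; any failure to obtain a result is unver."""
    try:
        proc = subprocess.run(
            list(cmd),
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            timeout=timeout_s,
        )
        return classify(proc.returncode, proc.stdout)
    except FileNotFoundError:
        return {"status": "unver", "detail": "tool not found"}
    except subprocess.TimeoutExpired:
        return {"status": "unver", "detail": f"timed out after {timeout_s}s"}
    except Exception as exc:  # deliberately broad: the caller must never see a raise
        return {"status": "unver", "detail": f"unexpected error: {exc!r}"}


# Usage with a stand-in command; a classifier that itself raises still yields unver.
def broken_classifier(rc: int, output: str) -> Dict[str, object]:
    raise ValueError("parser bug")


print(run_never_raise([sys.executable, "-c", "print('ok')"], broken_classifier))
```

Keeping the classifier inside the `try` means a parser bug degrades the rung to unverified instead of affecting the build verdict, which matches the intent of the additional try/except at the call site.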