claim(lvl5): M2 — P4 proven in real CI: L5 (398/406/407/413), lint-blocked L4 verdict-neutral (405), N/A-skip climb (399), drone !testme ×3, canaries red @ re-derived L1 (415/416), unver-blocks synthesized run L2, old artifacts render, durations at baseline, visuals verified

2026-06-11 11:18:26 +00:00
parent dc924c679b
commit a521d43a17
3 changed files with 74 additions and 50 deletions
--- a/BACKLOG-lvl5.md
+++ b/BACKLOG-lvl5.md
@@ -12,7 +12,7 @@
 - [x] B8 (P1) DECISIONS.md: semantics change record (replaces Phase-3 "N/A caps"); N/A classification table (every derive_rungs N/A source → intentional|unintentional); mirror-filter decision for lint (if any filtering).
 - [x] B9 — gate M1: claim (branch w/ P1+P2; clean tree; cold-verifiable).
 - [x] B10 (P3) lint sweep over ALL enrolled recipes (scratch clones — never touch ~/.abra/recipes during builds); matrix here (pass/fail + rule hits); mechanical fixes → mirror PRs (never push main/never merge); rest → DEFERRED.md.
-- [ ] B11 (P4) real-CI proofs: ≥1 genuine L5; ≥1 lint-blocked L4 (synth branch ok); ≥1 N/A-skip climb; 2× drone !testme; canary suite at re-derived designed levels; 1 synthesized unver-blocks run; before/after level table for ALL enrolled recipes; card/dashboard PNG/SVG visually verified.
+- [x] B11 (P4) real-CI proofs: ≥1 genuine L5; ≥1 lint-blocked L4 (synth branch ok); ≥1 N/A-skip climb; 2× drone !testme; canary suite at re-derived designed levels; 1 synthesized unver-blocks run; before/after level table for ALL enrolled recipes; card/dashboard PNG/SVG visually verified.
 - [ ] B12 — gate M2: claim; then ## DONE after fresh PASS.

 ## Adversary findings
--- a/JOURNAL-lvl5.md
+++ b/JOURNAL-lvl5.md
@@ -91,3 +91,19 @@
  level 5 with "backup/restore INTENTIONAL SKIP" + declared reason inline; badge SVGs
  number+colour only (405 #a0b93f "level 4", 398 #3fb950 "level 5").
 - Canaries 411 (bkp-bad) + 412 (rst-bad) + mumble cold 413 triggered.
+
+## 2026-06-11 P4 complete — M2 claimed
+- Canaries: first attempts 411/412 died in 1s (FATA no recipe — they are mirror-only, need
+  SRC+REF like prior phases ran them); re-triggered as 415/416 with SRC+REF → both verdict RED,
+  level 1 (re-derived designed level: no version tags on mirror → upgrade skip climbs-but-never-
+  earns; backup_restore fail blocks; functional unver post-abort; lint pass).
+- mumble cold 413: level 5, 80s — first retained mumble artifact, fills its table row.
+- Synthesized unver-blocks: hand-run `RECIPE=custom-html STAGES=install,upgrade,custom
+  CCCI_RUN_ID=lvl5-unver-demo cc-ci-run runner/run_recipe_ci.py` (log /tmp/lvl5-unver-run.log,
+  rc=0) → results.json level=2, backup_restore=unver, functional+lint pass above it — mission
+  worked example #3 on the real harness.
+- OBSERVATION (pre-existing, not phase scope): the green STAGES-filtered hand-run triggered WC5
+  promote (canonical custom-html advanced) — should_promote_canonical doesn't check stage
+  completeness. Surfaced to Adversary in the M2 claim notes; not fixing inside this phase.
+- M2 claimed in STATUS-lvl5 with the full evidence table (runs 398/399/405/406/407/413/415/416 +
+  lvl5-unver-demo). B11 ticked.
--- a/STATUS-lvl5.md
+++ b/STATUS-lvl5.md
@@ -1,58 +1,66 @@
 # STATUS — Phase lvl5 (L5 lint rung + de-cap)

-Phase: lvl5 — M1 PASSED (cfc87fd); merged to main (08e6cc8); dashboard rolled (image 15addbc7bf45)
-Gate: M1 PASS @cfc87fd. Next gate: M2 (P4 proofs in flight)
-In flight: P4 real-CI proofs (P3 sweep already complete — see BACKLOG-lvl5 matrix)
+Phase: lvl5 — M1 PASS (cfc87fd); P3+P4 complete
+Gate: **M2 CLAIMED, awaiting Adversary** (claimed 2026-06-11)
+In flight: parked at M2 (no unblocked items remain)
 Blockers: none

-## M1 claim — implementation complete (pre-merge)
+## M2 claim — proven in real CI

-**WHAT:** P1+P2 complete per plan-phase-lvl5-lint-rung.md §3: 5-rung ladder (L5 = `abra recipe
-lint` on the exact tested ref), capping concept fully removed (4-status rung vocabulary
-pass/fail/skip/unver; level = highest passed rung with all below pass-or-intentional-skip),
-lint executor + N/A classification + schema 2 + card/dashboard/badge/docs updated, unit suite
-rewritten, old schema-1 artifacts render unchanged.
+**WHAT:** plan-phase-lvl5 §4 M2: P3 matrix complete for ALL 19 enrolled recipes; P4 runs done
+(genuine L5, lint-blocked L4, N/A-skip climb, drone path ×3, canaries at re-derived designed
+levels, synthesized unver-blocks run); old artifacts render; durations not inflated;
+before/after table complete; card/dashboard/badge visually verified.

-**WHERE:** branch `phase-lvl5` @ `3d8d286cf3f2df7d164bf458f07bbb916cc18f2b`.
-Main deliberately does NOT carry the implementation: the gate is pre-merge, so the three
-implementation commits briefly pushed to main were reverted there (`589943f`, `cd62743`) and the
-work lives only on the branch until M1 PASS, after which the Builder merges. The phase DECISIONS
-entry (semantics record + N/A classification table + mirror-context decision) is on BOTH main
-(`392f7df`, machine-docs/DECISIONS.md "Phase lvl5") and the branch.
+**WHERE:** main @ `dc924c679b4ae6dd1e21bfe9d231acb28b58ddf8` (implementation merged 08e6cc8 after
+M1 + PR-path fix 68c3486). Evidence runs (all artifacts at
+`https://ci.commoninternet.net/runs/<n>/{results.json,summary.png,badge.svg,lint.txt}`):

-**HOW to verify (cold, from a fresh clone):**
+| run | what it proves | EXPECTED content |
+|---|---|---|
+| 398 hedgedoc cold | genuine L5, full clean climb | level=5, all 5 rungs pass, schema=2, no cap keys, dur 100s |
+| 399 custom-html-tiny cold | N/A-skip climb (was L2 @ #205) | level=5, backup_restore=skip + declared reason in skips.intentional, dur 45s |
+| 405 custom-html PR4 (!testme) | lint-blocked L4 + verdict-neutral | level=4, lint=fail rules_failed=[R011], **drone build status SUCCESS**, dur 61s |
+| 406 immich PR2 (!testme) | drone path L5 on real PR | level=5, dur 199s (shot baseline 198-199s — no inflation) |
+| 407 plausible PR3 (!testme) | drone path L5 on real PR | level=5, dur 164s (shot baseline 166s) |
+| 413 mumble cold | table row (no prior artifact) | level=5, dur 80s |
+| 415/416 bkp-bad/rst-bad (SRC+REF) | canaries at re-derived designed level | **verdict FAILURE (red)**, level=1, rungs {install pass, upgrade skip (no version tags on mirror), backup_restore fail, functional unver, lint pass} |
+| host `/var/lib/cc-ci-runs/lvl5-unver-demo/results.json` | synthesized unver-blocks (mission ex. #3) | hand-run STAGES=install,upgrade,custom on custom-html: level=2, backup_restore=unver in skips.unintentional, functional+lint pass above it |

-1. `git clone <repo> && cd cc-ci && git checkout phase-lvl5` (expect HEAD = 3d8d286).
-2. Unit suite on the CI host venv: `cc-ci-run -m pytest tests/unit/ -q`
-   → EXPECTED: `246 passed`. (New/rewritten: test_level.py — mission's 4 worked examples
-   verbatim + de-cap cases; test_results.py — derive_rungs classification incl.
-   structural-skip / unver-blocks / EXPECTED_NA-never-overrides / lint-never-skips;
-   test_lint.py — parser/classifier vs real abra output shapes + run_lint never-raise;
-   test_card.py / test_dashboard.py — badge number+colour only, old schema-1 artifact render.)
-3. Repo lint: `nix develop .#lint --command bash scripts/lint.sh` → EXPECTED: `lint: PASS`.
-4. Mirror-filter decision (§2.3) to review: machine-docs/DECISIONS.md "Phase lvl5" — the
-   executor lints a pristine scratch clone of the per-run tree at the tested sha; **no lint
-   rule is filtered/ignored**. Probes behind it are re-runnable:
-   `ABRA_DIR=<scratch> abra recipe lint -n <r>` needs a PTY (`script -qec`); rc≠0 only on FATA;
-   error-rule verdicts only in the table; untracked compose.ccci.yml in the tree → FATA
-   "version mismatched"; origin → unauthenticated mirror URL → FATA "unable to fetch tags".
-5. Verdict-neutrality (code inspection): runner/run_recipe_ci.py call site (grep "L5 lint
-   rung") — runs BEFORE the tiers, run_lint catches every exception internally (returns
-   status=unver), call site additionally try/except-wrapped; the result is consumed ONLY
-   inside the existing R7 best-effort results/card blocks. Targeted tests:
-   `tests/unit/test_lint.py::test_run_lint_missing_recipe_is_unver_not_raise`,
-   `tests/unit/test_results.py::test_build_results_no_lint_given_is_unverified_never_pass`.
-6. Real-abra behavior smoke (optional, ~5s, read-only scratch — safe while builds run):
-   ```
-   export ABRA_DIR=/tmp/<your-scratch>/abra; mkdir -p $ABRA_DIR/recipes
-   ln -sfn ~/.abra/catalogue $ABRA_DIR/catalogue; ln -sfn ~/.abra/servers $ABRA_DIR/servers
-   git clone https://git.coopcloud.tech/coop-cloud/hedgedoc.git $ABRA_DIR/recipes/hedgedoc
-   cd <branch checkout> && cc-ci-run -c 'import sys; sys.path.insert(0,"runner"); from harness import lint; print(lint.run_lint("hedgedoc", None, "/tmp/<scratch>-art"))'
-   ```
-   → EXPECTED: `{'status': 'pass', ...}` and lint.txt in the artifact dir. Add a lightweight
-   tag (`git -C $ABRA_DIR/recipes/hedgedoc tag x-1.0.0`) and re-run
-   → EXPECTED: `{'status': 'fail', 'detail': 'error rule(s) unsatisfied: R014', 'rules_failed': ['R014']}`.
+**HOW to verify (cold):**
+1. Fresh clone main; `cc-ci-run -m pytest tests/unit/ -q` → EXPECTED **247 passed** (new since M1:
+   `test_run_lint_detached_pr_tree_lints_exact_ref` — PR-path regression, see fix 68c3486:
+   abra lint checks out the repo's DEFAULT BRANCH, so run_lint forces local `main` AT the tested
+   ref + repoints origin to the scratch itself; found live in builds 400-402 where the rung
+   correctly degraded to unver/level 4 with run verdicts unaffected).
+   `nix develop .#lint --command bash scripts/lint.sh` → PASS.
+2. Fetch each run's results.json above and check the EXPECTED column; drone build statuses via
+   API (only 415/416 red — and red by tier failure, not by lint).
+3. Visuals: Read `summary.png` of 398 (level 5 of 5, lint row PASS, green 5 badge), 399
+   (backup/restore row "INTENTIONAL SKIP" + reason, level 5), 405 (lint row FAIL red, level 4 of
+   5, badge #a0b93f); badges are number+colour ONLY.
+4. Old artifacts: `/runs/370/{results.json,summary.png}` 200 + render (pre-lvl5 schema-1 with cap
+   fields); dashboard `/` and `/recipe/immich` 200 with mixed-schema rows; unit history-compat
+   tests (test_card/test_dashboard old-schema cases).
+5. lint.txt served: `/runs/398/lint.txt` 200 (full abra table; rc/status header).
+6. P3 matrix + §2.9 before/after table: BACKLOG-lvl5.md (19/19 lint pass sweep — re-runnable per
+   the documented scratch method; baseline column from latest artifacts; REAL column from the
+   runs above; canary re-derivation note).
+7. Dashboard runtime is the rolled image `cc-ci-dashboard:15addbc7bf45` (reconcile per DECISIONS
+   Phase 3/U2 — no host switch).

-**EXPECTED level shifts (for later M2 before/after table):** recipes formerly capped by an
-intentional N/A (single-version → was L1; non-backup-capable → was L2) will climb under the new
-rule; that is the mission, not a regression. Real FAILs and unverified rungs still block.
+**Notes for the verdict:**
+- The throwaway lint-violation PR (custom-html#4, branch lvl5-lintdemo) is left OPEN and marked
+  do-not-merge so you can re-run `!testme` independently; Builder will close branch+PR after M2.
+- Level shifts vs baseline are exactly the rule change (table): formerly-capped intentional-N/A
+  recipes climb; nothing else moved.
+- Observation (pre-existing, out of phase scope, noted in JOURNAL): WC5 promote-on-green-cold
+  does not require all stages — the STAGES-filtered green hand-run promoted custom-html's
+  canonical. Filed as a JOURNAL note; flag if you want it as a finding.
+
+---
+
+## (history) M1 claim — implementation complete (pre-merge): PASS @cfc87fd
+
+Branch `phase-lvl5` @ 3d8d286 (claim 24baac5); 246 unit tests cold-green, repo lint PASS,
+mirror-context decision reviewed, verdict-neutral confirmed. Merged to main 08e6cc8.
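
Worked sketch of the level rule the evidence table exercises. The removed M1 text states the rule as "level = highest passed rung with all below pass-or-intentional-skip", with the 4-status vocabulary pass/fail/skip/unver. The snippet below is a minimal illustration under stated assumptions, not the harness code: `derive_level` and `RUNGS` are hypothetical names, the rung order is inferred from the table (lint = L5, functional = L4), and rungs the table does not list for a run are assumed to be `pass`.

```python
from typing import Dict

# Rung order, lowest first. Inferred from the evidence table; the real
# harness may name or store these differently (assumption).
RUNGS = ["install", "upgrade", "backup_restore", "functional", "lint"]


def derive_level(status: Dict[str, str]) -> int:
    """Hypothetical helper: highest passed rung whose lower rungs are all
    pass or intentional skip. A missing rung is treated as unver."""
    level = 0
    for position, rung in enumerate(RUNGS, start=1):
        state = status.get(rung, "unver")
        if state == "pass":
            level = position      # rung earned
        elif state == "skip":
            continue              # intentional skip: climbs but never earns
        else:
            break                 # fail or unver blocks everything above
    return level


if __name__ == "__main__":
    all_pass = {rung: "pass" for rung in RUNGS}
    # 398: full clean climb
    assert derive_level(all_pass) == 5
    # 399: backup_restore intentionally skipped, other rungs assumed pass
    assert derive_level({**all_pass, "backup_restore": "skip"}) == 5
    # 405: lint fails, everything below passes
    assert derive_level({**all_pass, "lint": "fail"}) == 4
    # 415/416: canary rung set as listed in the table
    canary = {"install": "pass", "upgrade": "skip", "backup_restore": "fail",
              "functional": "unver", "lint": "pass"}
    assert derive_level(canary) == 1
    # lvl5-unver-demo: backup_restore unverified blocks at level 2
    assert derive_level({**all_pass, "backup_restore": "unver"}) == 2
```

Each assertion mirrors one EXPECTED cell from the M2 table, so the sketch can serve as a quick sanity check when reading a fetched results.json by hand. It does not model the run verdict, which the table treats separately from the level (405 is level 4 with drone status SUCCESS).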