BACKLOG — Phase lvl5

Build backlog

B1 (P1) level.py: append rung lint (L5); new status vocabulary {pass, fail, skip, unver}; compute_level() → new formula (level = max i: rung_i pass ∧ ∀j<i status ∈ {pass,skip}); DELETE cap_reason/capped concepts.
B2 (P1) lint executor (harness/lint.py): abra recipe lint <recipe> against the exact tested ref; hard ~60s timeout; rc+full output → lint.txt artifact; pass/fail/unver classification (missing abra / timeout / exception → unver, never pass, never skip); mirror-context handling per phase-plan §2.3 (probe abra behavior first; any filtering = named + unit-tested + DECISIONS.md).
B3 (P1) results.py: wire lint into derive_rungs + explicit intentional-vs-unintentional classification of EVERY N/A source; drop level_cap_reason/level_cap_rung from schema; skips() reflects new statuses; orchestrator (run_recipe_ci.py) runs lint executor at the tested-ref point + passes result through; verdict-neutral (R7 wrap).
B4 (P1) unit tests: rewrite test_level.py/test_results.py to new semantics incl. mission worked examples (fail-blocks → L1; intentional-skip climbs → L5; unver-blocks → L2; lint unver → L4; unclassifiable N/A → unver default); lint executor tests; old-artifact rendering compat tests.
B5 (P2) card.py: 0–5 color ramp; cap line removed ("level N of 5" neutral); rung table renders ✔/✘/intentional-skip/unverified; level_badge_svg loses cap_skip third segment (badge = number+color only); tolerate old artifacts.
B6 (P2) dashboard.py: _LEVEL_COLOR 5-scale; _level_pill/badge SVG number-only; legend text; old results.json (cap_reason present, lint absent) render without KeyError.
B7 (P2) docs: results-ux.md, testing.md, recipe-customization.md §EXPECTED_NA wording — L5 ladder, de-cap semantics.
B8 (P1) DECISIONS.md: semantics change record (replaces Phase-3 "N/A caps"); N/A classification table (every derive_rungs N/A source → intentional|unintentional); mirror-filter decision for lint (if any filtering).
B9 — gate M1: claim (branch w/ P1+P2; clean tree; cold-verifiable).
B10 (P3) lint sweep over ALL enrolled recipes (scratch clones — never touch ~/.abra/recipes during builds); matrix here (pass/fail + rule hits); mechanical fixes → mirror PRs (never push main/never merge); rest → DEFERRED.md.
B11 (P4) real-CI proofs: ≥1 genuine L5; ≥1 lint-blocked L4 (synth branch ok); ≥1 N/A-skip climb; 2× drone !testme; canary suite at re-derived designed levels; 1 synthesized unver-blocks run; before/after level table for ALL enrolled recipes; card/dashboard PNG/SVG visually verified.
B12 — gate M2: claim; then ## DONE after fresh PASS.
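
The B1 formula (level = max i: rung_i pass ∧ ∀j<i status ∈ {pass,skip}) can be sketched as below. This is a minimal illustration, not the contents of level.py: the rung names are taken from the run output quoted later in this file, while the `LADDER` tuple, the dict-shaped input and the "missing rung counts as unver" default are assumptions made for the example.

```python
from typing import Mapping

# Ladder order, lowest rung first (assumed key names, see lead-in).
LADDER = ("install", "upgrade", "backup_restore", "functional", "lint")
STATUSES = {"pass", "fail", "skip", "unver"}


def compute_level(rungs: Mapping[str, str]) -> int:
    """Highest 1-based rung index i such that rung i is 'pass' and every
    lower rung is 'pass' or 'skip'; 0 if no rung qualifies."""
    level = 0
    for i, name in enumerate(LADDER, start=1):
        status = rungs.get(name, "unver")  # assumption: missing == unver
        if status not in STATUSES:
            raise ValueError(f"unknown status {status!r} for rung {name!r}")
        if status == "pass":
            level = i   # earned: everything below is pass or skip
        elif status != "skip":
            break       # fail or unver blocks every higher rung
        # skip: climb past the rung without earning it
    return level


all_pass = dict.fromkeys(LADDER, "pass")
print(compute_level(all_pass))                                # 5
print(compute_level({**all_pass, "backup_restore": "skip"}))  # 5 (skip climbs)
print(compute_level({**all_pass, "lint": "fail"}))            # 4 (lint-blocked)
print(compute_level({**all_pass, "backup_restore": "unver"})) # 2 (unver blocks)
```

Because the loop stops at the first fail or unver, there is no separate cap concept to track: a skip never raises the level by itself, it only lets a later pass count.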

Adversary findings

P3 lint sweep matrix (B10) — all 19 enrolled, mirror main HEAD, 2026-06-11

Method: for each recipe, make a fresh scratch clone of its canonical origin (the mirror for the 17 recipe-maintainers recipes; coopcloud upstream for bluesky-pds/custom-html-tiny/mumble) and fetch the upstream version tags (the production fetch_recipe shape). Then run harness.lint.run_lint from phase-lvl5 @ 3d8d286 in a scratch ABRA_DIR (/tmp/lvl5-sweep on cc-ci; full outputs in /tmp/lvl5-sweep/art/<recipe>/lint.txt). Canonical ~/.abra/recipes was never touched.

Result: 19/19 PASS (no error-severity rule unsatisfied anywhere). No recipe-mirror PRs and no DEFERRED entries needed. Warn-severity misses (informational, do not fail the rung):

recipe	lint	warn-rule misses
bluesky-pds	pass	R002 R007 R015
cryptpad	pass	R002 R005 R007
custom-html	pass	R002 R004 R005
custom-html-tiny	pass	R002
discourse	pass	R002 R007 R015
ghost	pass	R015
hedgedoc	pass	R015
immich	pass	R002 R005
keycloak	pass	R002 R015
lasuite-docs	pass	R005
lasuite-drive	pass	R002 R005
lasuite-meet	pass	R002
mailu	pass	R002
matrix-synapse	pass	R002 R015
mattermost-lts	pass	R002 R015
mumble	pass	R002
n8n	pass	R002 R015
plausible	pass	R002 R005 R007
uptime-kuma	pass	R015
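
The pass/fail/unver classification from B2 that this sweep relies on could look roughly like the sketch below. It is not the real harness.lint.run_lint: the function name `classify_lint_run` is hypothetical, and it assumes that exit code 0 means no error-severity rule failed, which B2 says has to be probed against actual abra behavior first.

```python
import subprocess
from typing import Sequence, Tuple


def classify_lint_run(cmd: Sequence[str], timeout_s: float = 60.0) -> Tuple[str, str]:
    """Run a lint command and return (status, output).

    status is 'pass', 'fail' or 'unver'. Anything that prevents the tool
    from producing a verdict is 'unver', never 'pass' and never 'skip'.
    """
    try:
        proc = subprocess.run(
            list(cmd),
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # keep rc + full output together
            text=True,
            timeout=timeout_s,         # hard timeout; the child is killed
        )
    except FileNotFoundError as exc:   # e.g. abra missing on the host
        return "unver", f"lint tool not found: {exc}"
    except subprocess.TimeoutExpired:
        return "unver", f"lint timed out after {timeout_s}s"
    except Exception as exc:           # any other launch failure
        return "unver", f"lint could not run: {exc!r}"
    # Assumption: rc 0 == lint clean, non-zero == lint failure.
    status = "pass" if proc.returncode == 0 else "fail"
    return status, f"rc={proc.returncode}\n{proc.stdout}"
```

A caller would pass something like `["abra", "recipe", "lint", recipe]` and write the returned output to the lint.txt artifact. Taking the command as a parameter keeps the classifier testable without abra installed.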

Note: lasuite-meet's historically-lightweight tag 0.3.0+v1.16.0 is now ANNOTATED upstream (verified git cat-file -t = tag on all three version tags) — R014 passes genuinely; the abra.py:105 lightweight-tag deploy fallback simply no longer triggers for it.
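
The annotated-versus-lightweight check mentioned in the note can be reproduced with `git cat-file -t`, which prints `tag` for an annotated tag object and `commit` for a lightweight tag that points straight at a commit. A small sketch, where `tag_kind` and `git_tag_kind` are hypothetical helper names and not part of the harness:

```python
import subprocess


def tag_kind(cat_file_type: str) -> str:
    """Map the output of `git cat-file -t <ref>` to a tag kind."""
    obj_type = cat_file_type.strip()
    if obj_type == "tag":
        return "annotated"    # ref points at a tag object
    if obj_type == "commit":
        return "lightweight"  # ref points directly at a commit
    return "unknown"


def git_tag_kind(repo_dir: str, tag: str) -> str:
    """Ask git for the object type behind refs/tags/<tag> in repo_dir."""
    out = subprocess.run(
        ["git", "-C", repo_dir, "cat-file", "-t", f"refs/tags/{tag}"],
        capture_output=True,
        text=True,
        check=True,  # raises CalledProcessError if the tag does not exist
    ).stdout
    return tag_kind(out)
```

Running `git_tag_kind(clone_dir, "0.3.0+v1.16.0")` against a scratch clone of lasuite-meet would be one way to repeat the verification, with `clone_dir` standing in for wherever the clone lives.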

Before/after level table skeleton (§2.9 — "after" to be filled by P4 real runs)

Baseline = latest results.json on cc-ci per recipe re-scored under the CURRENT (pre-lvl5, 4-rung) rule; ancient 6-rung artifacts (builds ≤205, integration/recipe_local era) re-read on their four essential rungs. Predicted = same tier outcomes + sweep lint result under the new rule (assumption flagged; P4 produces the real values).

recipe	baseline rungs (latest artifact)	baseline level	predicted new level	REAL new level (P4 run)	why it shifts
bluesky-pds	no artifact (deploy-gated upstream, shot-phase N/A)	—	—	— (still deploy-gated; documented N/A)	still deploy-gated
cryptpad	I✔ U✔ B✔ F✔ (#181)	4	5	(not re-run; analytic 5)	+ lint pass
custom-html	I✔ U✔ B✔ F✔ (#182)	4	5	4 (#405 PR4 lintdemo: lint fail R011; main analytic 5)	+ lint pass
custom-html-tiny	I✔ U✔ B-na F-na (#205, predates functional/)	2	5	5 (#399 — N/A-skip climb, was 2)	de-cap: backup skip declared; functional/ tests exist now; + lint
discourse	I✔ U✔ B✔ F✔ (#184)	4	5	(not re-run; analytic 5)	+ lint pass
ghost	I✔ U✔ B✔ F✔ (#185)	4	5	(not re-run; analytic 5)	+ lint pass
hedgedoc	I✔ U✔ B✔ F✔ (#113)	4	5	5 (#398, 100s)	+ lint pass
immich	I✔ U✔ B✔ F✔ (#370)	4	5	5 (#406, drone !testme PR2, 199s)	+ lint pass
keycloak	I✔ U✔ B✔ F✔ (#187)	4	5	(not re-run; analytic 5)	+ lint pass
lasuite-docs	I✔ U✔ B✔ F✔ (#188)	4	5	(not re-run; analytic 5)	+ lint pass
lasuite-drive	I✔ U✔ B✔ F✔ (#189)	4	5	(not re-run; analytic 5)	+ lint pass
lasuite-meet	I✔ U✔ B✔ F✔ (#204)	4	5	(not re-run; analytic 5)	+ lint pass
mailu	I✔ U✔ B-na F✔ (#191)	2	5	(not re-run; analytic 5 — same de-cap as #399)	de-cap: not backup-capable → skip climbs (the §2.9 N/A-skip demo)
matrix-synapse	I✔ U✔ B✔ F✔ (#203)	4	5	(not re-run; analytic 5)	+ lint pass
mattermost-lts	I✔ U✔ B✔ F✔ (#196)	4	5	(not re-run; analytic 5)	+ lint pass
mumble	no results.json artifact retained	—	—	5 (#413, 80s — first retained artifact)	P4 run to establish
n8n	I✔ U✔ B✔ F✔ (#197)	4	5	(not re-run; analytic 5)	+ lint pass
plausible	I✔ U✔ B✔ F✔ (#371)	4	5	5 (#407, drone !testme PR3, 164s)	+ lint pass
uptime-kuma	I✔ U✔ B✔ F✔ (#165)	4	5	(not re-run; analytic 5)	+ lint pass

Canaries (designed levels under the NEW formula, re-derived): custom-html-bkp-bad / custom-html-rst-bad — backup-capable with a failing backup/restore tier → backup_restore rung FAIL → level 2 (fail still blocks; run verdict red as today). To be proven in P4.

Canary designed-level re-derivation (P4, runs 415/416 — 2026-06-11)

Under the NEW formula the bad canaries' designed level is 1, not the old 2: their mirrors carry no published version tags on the SRC+REF path → upgrade = intentional skip (climbs past but never earns), backup_restore = FAIL blocks → level = install = 1. Verified live: 415 (bkp-bad) + 416 (rst-bad) both verdict FAILURE (red), rungs {install: pass, upgrade: skip, backup_restore: fail, functional: unver (post-failure abort), lint: pass}, LEVEL 1. Backup/restore fail still blocks; verdict logic untouched. (First attempts 411/412 failed in 1s: canaries are mirror-only, not catalogue recipes — they need SRC+REF params, as prior phases ran them.)
