cc-ci/BACKLOG-lvl5.md
2026-06-11 11:29:32 +00:00

BACKLOG — Phase lvl5

Build backlog

  • B1 (P1) level.py: append rung lint (L5); new status vocabulary {pass, fail, skip, unver}; compute_level() → new formula (level = max i: rung_i pass ∧ ∀j<i status ∈ {pass,skip}); DELETE cap_reason/capped concepts.
  • B2 (P1) lint executor (harness/lint.py): abra recipe lint <recipe> against the exact tested ref; hard ~60s timeout; rc+full output → lint.txt artifact; pass/fail/unver classification (missing abra / timeout / exception → unver, never pass, never skip); mirror-context handling per phase-plan §2.3 (probe abra behavior first; any filtering = named + unit-tested + DECISIONS.md).
  • B3 (P1) results.py: wire lint into derive_rungs + explicit intentional-vs-unintentional classification of EVERY N/A source; drop level_cap_reason/level_cap_rung from schema; skips() reflects new statuses; orchestrator (run_recipe_ci.py) runs lint executor at the tested-ref point + passes result through; verdict-neutral (R7 wrap).
  • B4 (P1) unit tests: rewrite test_level.py/test_results.py to new semantics incl. mission worked examples (fail-blocks → L1; intentional-skip climbs → L5; unver-blocks → L2; lint unver → L4; unclassifiable N/A → unver default); lint executor tests; old-artifact rendering compat tests.
  • B5 (P2) card.py: 0-5 color ramp; cap line removed ("level N of 5" neutral); rung table renders ✔/✘/intentional-skip/unverified; level_badge_svg loses cap_skip third segment (badge = number+color only); tolerate old artifacts.
  • B6 (P2) dashboard.py: _LEVEL_COLOR 5-scale; _level_pill/badge SVG number-only; legend text; old results.json (cap_reason present, lint absent) render without KeyError.
  • B7 (P2) docs: results-ux.md, testing.md, recipe-customization.md §EXPECTED_NA wording — L5 ladder, de-cap semantics.
  • B8 (P1) DECISIONS.md: semantics change record (replaces Phase-3 "N/A caps"); N/A classification table (every derive_rungs N/A source → intentional|unintentional); mirror-filter decision for lint (if any filtering).
  • B9 — gate M1: claim (branch w/ P1+P2; clean tree; cold-verifiable).
  • B10 (P3) lint sweep over ALL enrolled recipes (scratch clones — never touch ~/.abra/recipes during builds); matrix here (pass/fail + rule hits); mechanical fixes → mirror PRs (never push main/never merge); rest → DEFERRED.md.
  • B11 (P4) real-CI proofs: ≥1 genuine L5; ≥1 lint-blocked L4 (synth branch ok); ≥1 N/A-skip climb; 2× drone !testme; canary suite at re-derived designed levels; 1 synthesized unver-blocks run; before/after level table for ALL enrolled recipes; card/dashboard PNG/SVG visually verified.
  • B12 — gate M2: claim; then ## DONE after fresh PASS.
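The B1 level formula can be read as a short loop. The following is a minimal sketch, not the actual level.py: the rung names are taken from the rung sets recorded in this file, while the function signature, the RUNGS tuple and the treatment of a missing rung as unver are assumptions made for illustration.

```python
from typing import Mapping

# Rung order, lowest to highest (names as recorded in the run results).
RUNGS = ("install", "upgrade", "backup_restore", "functional", "lint")
CLIMBABLE = {"pass", "skip"}  # statuses that let the level climb past a rung


def compute_level(status: Mapping[str, str]) -> int:
    """Highest 1-based rung index i with status 'pass' such that every
    lower rung is 'pass' or 'skip'; 0 if no rung qualifies."""
    level = 0
    for i, rung in enumerate(RUNGS, start=1):
        s = status.get(rung, "unver")  # assumption: missing rung counts as unver
        if s == "pass":
            level = i   # earned: everything below was pass or skip
        elif s not in CLIMBABLE:
            break       # fail or unver blocks every rung above it
        # 'skip' climbs past the rung without earning it
    return level


if __name__ == "__main__":
    all_pass = dict.fromkeys(RUNGS, "pass")
    print(compute_level({**all_pass, "backup_restore": "skip"}))  # intentional skip climbs
    print(compute_level({**all_pass, "lint": "unver"}))           # lint unver
```

Under this sketch the B4 worked examples come out as listed: a failing upgrade gives level 1, an intentional backup skip still reaches level 5, an unver backup_restore stops at level 2, and an unver lint stops at level 4. A run that ends in skips only keeps the level of its last passing rung, since a skip never earns a level.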

Adversary findings

P3 lint sweep matrix (B10) — all 19 enrolled, mirror main HEAD, 2026-06-11

Method: per recipe, fresh scratch clone of its canonical origin (mirror for the 17 recipe-maintainers recipes; coopcloud upstream for bluesky-pds/custom-html-tiny/mumble) + upstream version tags fetched (production fetch_recipe shape), then harness.lint.run_lint from phase-lvl5 @ 3d8d286 in a scratch ABRA_DIR (/tmp/lvl5-sweep on cc-ci; full outputs in /tmp/lvl5-sweep/art/<recipe>/lint.txt). Canonical ~/.abra/recipes never touched.

Result: 19/19 PASS (no error-severity rule unsatisfied anywhere). No recipe-mirror PRs and no DEFERRED entries needed. Warn-severity misses (informational, do not fail the rung):

| recipe | lint | warn-rule misses |
|---|---|---|
| bluesky-pds | pass | R002 R007 R015 |
| cryptpad | pass | R002 R005 R007 |
| custom-html | pass | R002 R004 R005 |
| custom-html-tiny | pass | R002 |
| discourse | pass | R002 R007 R015 |
| ghost | pass | R015 |
| hedgedoc | pass | R015 |
| immich | pass | R002 R005 |
| keycloak | pass | R002 R015 |
| lasuite-docs | pass | R005 |
| lasuite-drive | pass | R002 R005 |
| lasuite-meet | pass | R002 |
| mailu | pass | R002 |
| matrix-synapse | pass | R002 R015 |
| mattermost-lts | pass | R002 R015 |
| mumble | pass | R002 |
| n8n | pass | R002 R015 |
| plausible | pass | R002 R005 R007 |
| uptime-kuma | pass | R015 |
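The lint column above depends on the B2 classification rule (missing abra, timeout or exception gives unver, never pass and never skip). A minimal sketch of that rule follows. It is not the real harness.lint.run_lint, whose signature is not shown here: the function name run_lint_sketch, its parameters, and the reading of exit code 0 as pass and non-zero as fail are assumptions.

```python
import subprocess
from typing import Sequence, Tuple


def run_lint_sketch(cmd: Sequence[str], timeout_s: float = 60.0) -> Tuple[str, str]:
    """Run a lint command; return (status, text for a lint.txt-style artifact).

    status is 'pass', 'fail' or 'unver'. It is never 'skip'.
    Assumption: exit code 0 means pass, any other exit code means fail.
    """
    try:
        proc = subprocess.run(
            list(cmd),
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # keep the full output in one stream
            text=True,
            timeout=timeout_s,         # hard timeout; the child is killed on expiry
        )
    except FileNotFoundError:
        return "unver", "lint binary not found"
    except subprocess.TimeoutExpired:
        return "unver", f"lint timed out after {timeout_s}s"
    except Exception as exc:  # anything unexpected is unverified, not a pass
        return "unver", f"lint raised {exc!r}"
    status = "pass" if proc.returncode == 0 else "fail"
    return status, f"rc={proc.returncode}\n{proc.stdout}"
```

A call would look like `run_lint_sketch(["abra", "recipe", "lint", "hedgedoc"])`. The point of the design is that only a completed run with a clean exit can produce pass, so an environment problem can block the lint rung but can never earn it.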

Note: lasuite-meet's historically-lightweight tag 0.3.0+v1.16.0 is now ANNOTATED upstream (verified git cat-file -t = tag on all three version tags) — R014 passes genuinely; the abra.py:105 lightweight-tag deploy fallback simply no longer triggers for it.

Before/after level table skeleton (§2.9 — "after" to be filled by P4 real runs)

Baseline = latest results.json on cc-ci per recipe re-scored under the CURRENT (pre-lvl5, 4-rung) rule; ancient 6-rung artifacts (builds ≤205, integration/recipe_local era) re-read on their four essential rungs. Predicted = same tier outcomes + sweep lint result under the new rule (assumption flagged; P4 produces the real values).

| recipe | baseline rungs (latest artifact) | baseline level | predicted new level | REAL new level (P4 run) | why it shifts |
|---|---|---|---|---|---|
| bluesky-pds | no artifact (deploy-gated upstream, shot-phase N/A) | - | | (still deploy-gated; documented N/A) | still deploy-gated |
| cryptpad | I✔ U✔ B✔ F✔ (#181) | 4 | 5 | (not re-run; analytic 5) | + lint pass |
| custom-html | I✔ U✔ B✔ F✔ (#182) | 4 | 5 | 4 (#405 PR4 lintdemo: lint fail R011; main analytic 5) | + lint pass |
| custom-html-tiny | I✔ U✔ B-na F-na (#205, predates functional/) | 2 | 5 | 5 (#399: N/A-skip climb, was 2) | de-cap: backup skip declared; functional/ tests exist now; + lint |
| discourse | I✔ U✔ B✔ F✔ (#184) | 4 | 5 | (not re-run; analytic 5) | + lint pass |
| ghost | I✔ U✔ B✔ F✔ (#185) | 4 | 5 | (not re-run; analytic 5) | + lint pass |
| hedgedoc | I✔ U✔ B✔ F✔ (#113) | 4 | 5 | 5 (#398, 100s) | + lint pass |
| immich | I✔ U✔ B✔ F✔ (#370) | 4 | 5 | 5 (#406, drone !testme PR2, 199s) | + lint pass |
| keycloak | I✔ U✔ B✔ F✔ (#187) | 4 | 5 | (not re-run; analytic 5) | + lint pass |
| lasuite-docs | I✔ U✔ B✔ F✔ (#188) | 4 | 5 | (not re-run; analytic 5) | + lint pass |
| lasuite-drive | I✔ U✔ B✔ F✔ (#189) | 4 | 5 | (not re-run; analytic 5) | + lint pass |
| lasuite-meet | I✔ U✔ B✔ F✔ (#204) | 4 | 5 | (not re-run; analytic 5) | + lint pass |
| mailu | I✔ U✔ B-na F✔ (#191) | 2 | 5 | (not re-run; analytic 5, same de-cap as #399) | de-cap: not backup-capable → skip climbs (the §2.9 N/A-skip demo) |
| matrix-synapse | I✔ U✔ B✔ F✔ (#203) | 4 | 5 | (not re-run; analytic 5) | + lint pass |
| mattermost-lts | I✔ U✔ B✔ F✔ (#196) | 4 | 5 | (not re-run; analytic 5) | + lint pass |
| mumble | no results.json artifact retained | | | 5 (#413, 80s; first retained artifact) | P4 run to establish |
| n8n | I✔ U✔ B✔ F✔ (#197) | 4 | 5 | (not re-run; analytic 5) | + lint pass |
| plausible | I✔ U✔ B✔ F✔ (#371) | 4 | 5 | 5 (#407, drone !testme PR3, 164s) | + lint pass |
| uptime-kuma | I✔ U✔ B✔ F✔ (#165) | 4 | 5 | (not re-run; analytic 5) | + lint pass |
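The "predicted new level" column can be reproduced from the baseline rung cells plus the sweep lint result. The sketch below shows one way to do that. The helper names (parse_baseline, predicted_level) are hypothetical, and it assumes that every "-na" cell is an intentional skip, which is exactly what the B3/B8 classification has to confirm per N/A source.

```python
RUNGS = ("install", "upgrade", "backup_restore", "functional", "lint")
LETTER = {"I": "install", "U": "upgrade", "B": "backup_restore", "F": "functional"}


def parse_baseline(cells: str) -> dict:
    """Turn a cell string such as 'I✔ U✔ B-na F✔' into rung statuses.

    Assumption: '-na' is read as an intentional skip. Any mark other than
    '✔' or '-na' is read as unver, which blocks the same way a fail does.
    """
    status = {}
    for cell in cells.split():
        letter, mark = cell[0], cell[1:]
        if mark == "✔":
            status[LETTER[letter]] = "pass"
        elif mark == "-na":
            status[LETTER[letter]] = "skip"
        else:
            status[LETTER[letter]] = "unver"
    return status


def predicted_level(cells: str, lint: str) -> int:
    """Apply the new rule: the highest passing rung with only pass/skip below it."""
    status = parse_baseline(cells)
    status["lint"] = lint
    level = 0
    for i, rung in enumerate(RUNGS, start=1):
        s = status.get(rung, "unver")
        if s == "pass":
            level = i
        elif s != "skip":
            break  # fail or unver blocks every rung above it
    return level


if __name__ == "__main__":
    print(predicted_level("I✔ U✔ B-na F✔", "pass"))  # mailu-style row
    print(predicted_level("I✔ U✔ B✔ F✔", "fail"))    # lint-blocked row
```

With a passing lint, both the all-pass rows and the "-na" rows come out at 5 under this reading, and a lint fail on an otherwise all-pass row gives 4, in line with the custom-html #405 entry.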

Canaries (designed levels under the NEW formula, re-derived): custom-html-bkp-bad / custom-html-rst-bad — backup-capable with a failing backup/restore tier → backup_restore rung FAIL → level 2 (fail still blocks; run verdict red as today). To be proven in P4.

Canary designed-level re-derivation (P4, runs 415/416 — 2026-06-11)

Under the NEW formula the bad canaries' designed level is 1, not the old 2: their mirrors carry no published version tags on the SRC+REF path → upgrade = intentional skip (climbs past but never earns), backup_restore = FAIL blocks → level = install = 1. Verified live: 415 (bkp-bad) + 416 (rst-bad) both verdict FAILURE (red), rungs {install: pass, upgrade: skip, backup_restore: fail, functional: unver (post-failure abort), lint: pass}, LEVEL 1. Backup/restore fail still blocks; verdict logic untouched. (First attempts 411/412 failed in 1s: canaries are mirror-only, not catalogue recipes — they need SRC+REF params, as prior phases ran them.)