cc-ci/BACKLOG-lvl5.md
2026-06-11 11:29:32 +00:00

# BACKLOG — Phase lvl5
## Build backlog
- [x] B1 (P1) `level.py`: append rung `lint` (L5); new status vocabulary {pass, fail, skip, unver}; `compute_level()` → new formula (level = max i such that rung_i = pass ∧ ∀j<i: status_j ∈ {pass, skip}); DELETE cap_reason/capped concepts.
- [x] B2 (P1) lint executor (`harness/lint.py`): `abra recipe lint <recipe>` against the exact tested ref; hard ~60s timeout; rc+full output → `lint.txt` artifact; pass/fail/unver classification (missing abra / timeout / exception → unver, never pass, never skip); mirror-context handling per phase-plan §2.3 (probe abra behavior first; any filtering = named + unit-tested + DECISIONS.md).
- [x] B3 (P1) `results.py`: wire lint into `derive_rungs` + explicit intentional-vs-unintentional classification of EVERY N/A source; drop level_cap_reason/level_cap_rung from schema; `skips()` reflects new statuses; orchestrator (`run_recipe_ci.py`) runs lint executor at the tested-ref point + passes result through; verdict-neutral (R7 wrap).
- [x] B4 (P1) unit tests: rewrite test_level.py/test_results.py to new semantics incl. mission worked examples (fail-blocks → L1; intentional-skip climbs → L5; unver-blocks → L2; lint unver → L4; unclassifiable N/A → unver default); lint executor tests; old-artifact rendering compat tests.
- [x] B5 (P2) `card.py`: 0-5 color ramp; cap line removed ("level N of 5" neutral); rung table renders ✔/✘/intentional-skip/unverified; level_badge_svg loses cap_skip third segment (badge = number+color only); tolerate old artifacts.
- [x] B6 (P2) `dashboard.py`: _LEVEL_COLOR 5-scale; _level_pill/badge SVG number-only; legend text; old results.json (cap_reason present, lint absent) render without KeyError.
- [x] B7 (P2) docs: results-ux.md, testing.md, recipe-customization.md §EXPECTED_NA wording → L5 ladder, de-cap semantics.
- [x] B8 (P1) DECISIONS.md: semantics change record (replaces Phase-3 "N/A caps"); N/A classification table (every derive_rungs N/A source → intentional|unintentional); mirror-filter decision for lint (if any filtering).
- [x] B9 gate M1: claim (branch w/ P1+P2; clean tree; cold-verifiable).
- [x] B10 (P3) lint sweep over ALL enrolled recipes (scratch clones; never touch ~/.abra/recipes during builds); matrix here (pass/fail + rule hits); mechanical fixes → mirror PRs (never push main/never merge); rest → DEFERRED.md.
- [x] B11 (P4) real-CI proofs: 1 genuine L5; 1 lint-blocked L4 (synth branch ok); 1 N/A-skip climb; 2× drone !testme; canary suite at re-derived designed levels; 1 synthesized unver-blocks run; before/after level table for ALL enrolled recipes; card/dashboard PNG/SVG visually verified.
- [x] B12 gate M2: claim; then ## DONE after fresh PASS.
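
The B1 formula can be sketched as follows. This is a minimal illustration, not the actual `level.py`: the `RUNGS` tuple, the function signature, and the choice to treat a missing rung as `unver` are assumptions; the rung names and their order are taken from the run records further down in this file.

```python
from typing import Mapping

# Rung order, lowest first (names as they appear in the P4 run records).
RUNGS = ("install", "upgrade", "backup_restore", "functional", "lint")


def compute_level(statuses: Mapping[str, str]) -> int:
    """Highest 1-based rung index whose status is 'pass' while every
    rung below it is 'pass' or 'skip'. Returns 0 if nothing is earned."""
    level = 0
    for i, rung in enumerate(RUNGS, start=1):
        # Assumption: a rung with no recorded status counts as 'unver'.
        status = statuses.get(rung, "unver")
        if status == "pass":
            level = i   # earned: all lower rungs were pass/skip
        elif status != "skip":
            break       # fail/unver blocks every higher rung
        # 'skip' climbs past the rung without earning it
    return level


all_pass = dict.fromkeys(RUNGS, "pass")
print(compute_level(all_pass))                                  # 5
print(compute_level({**all_pass, "upgrade": "fail"}))           # 1 (fail blocks)
print(compute_level({**all_pass, "backup_restore": "skip"}))    # 5 (skip climbs)
print(compute_level({**all_pass, "backup_restore": "unver"}))   # 2 (unver blocks)
print(compute_level({**all_pass, "lint": "unver"}))             # 4
```

Breaking out of the loop on the first `fail` or `unver` keeps the "all lower rungs pass or skip" condition implicit, so no separate cap bookkeeping is needed.
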
## Adversary findings
## P3 lint sweep matrix (B10) — all 19 enrolled, mirror main HEAD, 2026-06-11
Method: per recipe, fresh scratch clone of its canonical origin (mirror for the 17
recipe-maintainers recipes; coopcloud upstream for bluesky-pds/custom-html-tiny/mumble) +
upstream version tags fetched (production fetch_recipe shape), then `harness.lint.run_lint`
from phase-lvl5 @ 3d8d286 in a scratch ABRA_DIR (`/tmp/lvl5-sweep` on cc-ci; full outputs in
`/tmp/lvl5-sweep/art/<recipe>/lint.txt`). Canonical `~/.abra/recipes` never touched.
**Result: 19/19 PASS** (no error-severity rule unsatisfied anywhere). No recipe-mirror PRs and
no DEFERRED entries needed. Warn-severity misses (informational, do not fail the rung):
| recipe | lint | warn-rule misses |
|---|---|---|
| bluesky-pds | pass | R002 R007 R015 |
| cryptpad | pass | R002 R005 R007 |
| custom-html | pass | R002 R004 R005 |
| custom-html-tiny | pass | R002 |
| discourse | pass | R002 R007 R015 |
| ghost | pass | R015 |
| hedgedoc | pass | R015 |
| immich | pass | R002 R005 |
| keycloak | pass | R002 R015 |
| lasuite-docs | pass | R005 |
| lasuite-drive | pass | R002 R005 |
| lasuite-meet | pass | R002 |
| mailu | pass | R002 |
| matrix-synapse | pass | R002 R015 |
| mattermost-lts | pass | R002 R015 |
| mumble | pass | R002 |
| n8n | pass | R002 R015 |
| plausible | pass | R002 R005 R007 |
| uptime-kuma | pass | R015 |
Note: lasuite-meet's historically-lightweight tag `0.3.0+v1.16.0` is now ANNOTATED upstream
(verified `git cat-file -t` = tag on all three version tags), so R014 passes genuinely; the
abra.py:105 lightweight-tag deploy fallback simply no longer triggers for it.
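
The pass/fail/unver classification described in B2 and used for this sweep can be sketched roughly as below. It is not the real `harness.lint.run_lint`: the function name `run_lint_sketch`, its parameters, and the mapping of exit code 0 to `pass` and non-zero to `fail` are assumptions made for illustration. Only the timeout, the rc+output artifact, and the "missing abra / timeout / exception → unver" rule come from the backlog items above.

```python
import subprocess
from pathlib import Path
from typing import Sequence


def run_lint_sketch(cmd: Sequence[str], artifact: Path, timeout_s: float = 60.0) -> str:
    """Run a lint command, write rc + full output to `artifact`,
    and return one of 'pass', 'fail', 'unver' (never 'skip')."""
    try:
        proc = subprocess.run(
            list(cmd),
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # keep stdout and stderr together
            text=True,
            timeout=timeout_s,         # hard timeout; the child is killed on expiry
        )
    except FileNotFoundError:
        artifact.write_text("unver: lint binary not found\n")
        return "unver"
    except subprocess.TimeoutExpired:
        artifact.write_text(f"unver: timed out after {timeout_s}s\n")
        return "unver"
    except Exception as exc:  # any other failure is unverified, never a pass
        artifact.write_text(f"unver: {exc!r}\n")
        return "unver"
    artifact.write_text(f"rc={proc.returncode}\n{proc.stdout}")
    # ASSUMPTION: a zero exit code means no error-severity rule was unsatisfied.
    return "pass" if proc.returncode == 0 else "fail"
```

A call for one recipe might look like `run_lint_sketch(["abra", "recipe", "lint", "hedgedoc"], Path("lint.txt"))`, run with the scratch ABRA_DIR in the environment so the canonical recipe directory stays untouched.
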
## Before/after level table skeleton (§2.9 — "after" to be filled by P4 real runs)
Baseline = latest results.json on cc-ci per recipe re-scored under the CURRENT (pre-lvl5,
4-rung) rule; ancient 6-rung artifacts (builds 205, integration/recipe_local era) re-read on
their four essential rungs. Predicted = same tier outcomes + sweep lint result under the new
rule (assumption flagged; P4 produces the real values).
| recipe | baseline rungs (latest artifact) | baseline level | predicted new level | REAL new level (P4 run) | why it shifts |
|---|---|---|---|---|---|
| bluesky-pds | no artifact (deploy-gated upstream, shot-phase N/A) | | | (still deploy-gated; documented N/A) | still deploy-gated |
| cryptpad | I U B F (#181) | 4 | 5 | (not re-run; analytic 5) | + lint pass |
| custom-html | I U B F (#182) | 4 | 5 | **4** (#405 PR4 lintdemo: lint fail R011; main analytic 5) | + lint pass |
| custom-html-tiny | I U B-na F-na (#205, predates functional/) | 2 | 5 | **5** (#399 N/A-skip climb, was 2) | de-cap: backup skip declared; functional/ tests exist now; + lint |
| discourse | I U B F (#184) | 4 | 5 | (not re-run; analytic 5) | + lint pass |
| ghost | I U B F (#185) | 4 | 5 | (not re-run; analytic 5) | + lint pass |
| hedgedoc | I U B F (#113) | 4 | 5 | **5** (#398, 100s) | + lint pass |
| immich | I U B F (#370) | 4 | 5 | **5** (#406, drone !testme PR2, 199s) | + lint pass |
| keycloak | I U B F (#187) | 4 | 5 | (not re-run; analytic 5) | + lint pass |
| lasuite-docs | I U B F (#188) | 4 | 5 | (not re-run; analytic 5) | + lint pass |
| lasuite-drive | I U B F (#189) | 4 | 5 | (not re-run; analytic 5) | + lint pass |
| lasuite-meet | I U B F (#204) | 4 | 5 | (not re-run; analytic 5) | + lint pass |
| mailu | I U B-na F (#191) | 2 | 5 | (not re-run; analytic 5, same de-cap as #399) | de-cap: not backup-capable → skip → climbs (the §2.9 N/A-skip demo) |
| matrix-synapse | I U B F (#203) | 4 | 5 | (not re-run; analytic 5) | + lint pass |
| mattermost-lts | I U B F (#196) | 4 | 5 | (not re-run; analytic 5) | + lint pass |
| mumble | no results.json artifact retained | | | **5** (#413, 80s; first retained artifact) | P4 run to establish |
| n8n | I U B F (#197) | 4 | 5 | (not re-run; analytic 5) | + lint pass |
| plausible | I U B F (#371) | 4 | 5 | **5** (#407, drone !testme PR3, 164s) | + lint pass |
| uptime-kuma | I U B F (#165) | 4 | 5 | (not re-run; analytic 5) | + lint pass |
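
The "why it shifts" column can be illustrated by scoring the same rungs under both rules. This is only a sketch: `old_level` is an assumed reconstruction of the pre-lvl5 "N/A caps" rule, inferred from baseline rows such as `I U B-na F` scoring 2, and both function names and the `"na"` status string are hypothetical.

```python
OLD_RUNGS = ("install", "upgrade", "backup_restore", "functional")
NEW_RUNGS = OLD_RUNGS + ("lint",)


def old_level(statuses):
    # ASSUMPTION: pre-lvl5 rule, where the first non-pass rung (N/A included) caps the level.
    level = 0
    for rung in OLD_RUNGS:
        if statuses.get(rung) != "pass":
            break
        level += 1
    return level


def new_level(statuses):
    # New rule: 'skip' climbs past a rung without earning it; fail/unver blocks.
    level = 0
    for i, rung in enumerate(NEW_RUNGS, start=1):
        status = statuses.get(rung, "unver")
        if status == "pass":
            level = i
        elif status != "skip":
            break
    return level


# mailu: backup N/A capped the old level; declared as an intentional skip it now climbs.
mailu_before = {"install": "pass", "upgrade": "pass", "backup_restore": "na", "functional": "pass"}
mailu_after = {**mailu_before, "backup_restore": "skip", "lint": "pass"}
print(old_level(mailu_before), new_level(mailu_after))  # 2 5

# A four-rung-pass recipe gains one level from a lint pass and stays at 4 on a lint fail.
four_pass = dict.fromkeys(OLD_RUNGS, "pass")
print(old_level(four_pass), new_level({**four_pass, "lint": "pass"}))  # 4 5
print(new_level({**four_pass, "lint": "fail"}))                        # 4
```
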
Canaries (designed levels under the NEW formula, re-derived): custom-html-bkp-bad /
custom-html-rst-bad → backup-capable with a failing backup/restore tier → backup_restore rung
FAIL → level 2 (fail still blocks; run verdict red as today). To be proven in P4.
### Canary designed-level re-derivation (P4, runs 415/416 — 2026-06-11)
Under the NEW formula the bad canaries' designed level is **1**, not the old 2: their mirrors
carry no published version tags on the SRC+REF path → upgrade = intentional skip (climbs past
but never earns), backup_restore = FAIL → blocks → level = install = 1. Verified live: 415
(bkp-bad) + 416 (rst-bad) both **verdict FAILURE (red)**, rungs
{install: pass, upgrade: skip, backup_restore: fail, functional: unver (post-failure abort),
lint: pass}, LEVEL 1. Backup/restore fail still blocks; verdict logic untouched.
(First attempts 411/412 failed in 1s: canaries are mirror-only, not catalogue recipes; they
need SRC+REF params, as prior phases ran them.)