# BACKLOG — Phase lvl5

## Build backlog

- [x] B1 (P1) `level.py`: append rung `lint` (L5); new status vocabulary {pass, fail, skip, unver}; `compute_level()` → new formula (level = max i: rung_i pass ∧ ∀j<i status ∈ {pass,skip}); DELETE cap_reason/capped concepts.
- [x] B2 (P1) lint executor (`harness/lint.py`): `abra recipe lint <recipe>` against the exact tested ref; hard ~60s timeout; rc+full output → `lint.txt` artifact; pass/fail/unver classification (missing abra / timeout / exception → unver, never pass, never skip); mirror-context handling per phase-plan §2.3 (probe abra behavior first; any filtering = named + unit-tested + DECISIONS.md).
- [x] B3 (P1) `results.py`: wire lint into `derive_rungs` + explicit intentional-vs-unintentional classification of EVERY N/A source; drop level_cap_reason/level_cap_rung from schema; `skips()` reflects new statuses; orchestrator (`run_recipe_ci.py`) runs lint executor at the tested-ref point + passes result through; verdict-neutral (R7 wrap).
- [x] B4 (P1) unit tests: rewrite test_level.py/test_results.py to new semantics incl. mission worked examples (fail-blocks → L1; intentional-skip climbs → L5; unver-blocks → L2; lint unver → L4; unclassifiable N/A → unver default); lint executor tests; old-artifact rendering compat tests.
- [x] B5 (P2) `card.py`: 0–5 color ramp; cap line removed ("level N of 5" neutral); rung table renders ✔/✘/intentional-skip/unverified; level_badge_svg loses cap_skip third segment (badge = number+color only); tolerate old artifacts.
- [x] B6 (P2) `dashboard.py`: _LEVEL_COLOR 5-scale; _level_pill/badge SVG number-only; legend text; old results.json (cap_reason present, lint absent) render without KeyError.
- [x] B7 (P2) docs: results-ux.md, testing.md, recipe-customization.md §EXPECTED_NA wording — L5 ladder, de-cap semantics.
- [x] B8 (P1) DECISIONS.md: semantics change record (replaces Phase-3 "N/A caps"); N/A classification table (every derive_rungs N/A source → intentional|unintentional); mirror-filter decision for lint (if any filtering).
- [x] B9 — gate M1: claim (branch w/ P1+P2; clean tree; cold-verifiable).
- [x] B10 (P3) lint sweep over ALL enrolled recipes (scratch clones — never touch ~/.abra/recipes during builds); matrix here (pass/fail + rule hits); mechanical fixes → mirror PRs (never push main/never merge); rest → DEFERRED.md.
- [x] B11 (P4) real-CI proofs: ≥1 genuine L5; ≥1 lint-blocked L4 (synth branch ok); ≥1 N/A-skip climb; 2× drone !testme; canary suite at re-derived designed levels; 1 synthesized unver-blocks run; before/after level table for ALL enrolled recipes; card/dashboard PNG/SVG visually verified.
- [x] B12 — gate M2: claim; then ## DONE after fresh PASS.
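The level formula in B1 can be sketched as below. This is a minimal illustration, not the contents of `level.py`: the rung order is taken from the rung names used elsewhere in this backlog, and treating a missing rung as `unver` is an assumption made here.

```python
# Rung order L1..L5, as named elsewhere in this backlog.
RUNGS = ("install", "upgrade", "backup_restore", "functional", "lint")


def compute_level(statuses: dict) -> int:
    """Return the highest rung index i (1-based) whose status is 'pass'
    while every lower rung is 'pass' or 'skip'. Returns 0 if none qualifies.
    """
    level = 0
    for i, rung in enumerate(RUNGS, start=1):
        # Assumption: a rung absent from the mapping counts as 'unver'.
        status = statuses.get(rung, "unver")
        if status == "pass":
            level = i      # this rung is earned
        elif status == "skip":
            continue       # climbs past without earning a level
        else:
            break          # 'fail' and 'unver' both block everything above
    return level
```

With all five rungs passing this yields 5; with `backup_restore` set to `skip` and the rest passing it still yields 5; with `lint` set to `unver` and the rest passing it yields 4.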
## Adversary findings

## P3 lint sweep matrix (B10) — all 19 enrolled, mirror main HEAD, 2026-06-11

Method: per recipe, fresh scratch clone of its canonical origin (mirror for the 16
recipe-maintainers recipes; coopcloud upstream for bluesky-pds/custom-html-tiny/mumble) +
upstream version tags fetched (production fetch_recipe shape), then `harness.lint.run_lint`
from phase-lvl5 @ 3d8d286 in a scratch ABRA_DIR (`/tmp/lvl5-sweep` on cc-ci; full outputs in
`/tmp/lvl5-sweep/art/<recipe>/lint.txt`). Canonical `~/.abra/recipes` never touched.

**Result: 19/19 PASS** (no error-severity rule unsatisfied anywhere). No recipe-mirror PRs and
no DEFERRED entries needed. Warn-severity misses (informational, do not fail the rung):

| recipe | lint | warn-rule misses |
|---|---|---|
| bluesky-pds | pass | R002 R007 R015 |
| cryptpad | pass | R002 R005 R007 |
| custom-html | pass | R002 R004 R005 |
| custom-html-tiny | pass | R002 |
| discourse | pass | R002 R007 R015 |
| ghost | pass | R015 |
| hedgedoc | pass | R015 |
| immich | pass | R002 R005 |
| keycloak | pass | R002 R015 |
| lasuite-docs | pass | R005 |
| lasuite-drive | pass | R002 R005 |
| lasuite-meet | pass | R002 |
| mailu | pass | R002 |
| matrix-synapse | pass | R002 R015 |
| mattermost-lts | pass | R002 R015 |
| mumble | pass | R002 |
| n8n | pass | R002 R015 |
| plausible | pass | R002 R005 R007 |
| uptime-kuma | pass | R015 |

Note: lasuite-meet's historically-lightweight tag `0.3.0+v1.16.0` is now ANNOTATED upstream
(verified `git cat-file -t` = tag on all three version tags) — R014 passes genuinely; the
abra.py:105 lightweight-tag deploy fallback simply no longer triggers for it.
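The annotated-versus-lightweight check in the note above can be reproduced with a short script. This is a sketch only: the helper names `classify_tag_object` and `tag_kind` are hypothetical and not part of the harness. It relies on `git cat-file -t` printing `tag` for an annotated tag object and `commit` for a lightweight tag that points straight at a commit.

```python
import subprocess


def classify_tag_object(cat_file_type: str) -> str:
    """Map the output of `git cat-file -t <tag ref>` to a tag kind."""
    kind = cat_file_type.strip()
    if kind == "tag":
        return "annotated"      # the ref points at a tag object
    if kind == "commit":
        return "lightweight"    # the ref points directly at a commit
    return "unknown"            # tree/blob targets or unexpected output


def tag_kind(repo_dir: str, tag: str) -> str:
    """Ask git for the object type behind refs/tags/<tag> in repo_dir."""
    out = subprocess.run(
        ["git", "-C", repo_dir, "cat-file", "-t", f"refs/tags/{tag}"],
        capture_output=True,
        text=True,
        check=True,  # raises CalledProcessError if the tag does not exist
    ).stdout
    return classify_tag_object(out)
```

Calling `tag_kind` on a scratch clone with one of its version tags would return `"annotated"` for a tag like the one described in the note.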
## Before/after level table skeleton (§2.9 — "after" to be filled by P4 real runs)

Baseline = latest results.json on cc-ci per recipe re-scored under the CURRENT (pre-lvl5,
4-rung) rule; ancient 6-rung artifacts (builds ≤205, integration/recipe_local era) re-read on
their four essential rungs. Predicted = same tier outcomes + sweep lint result under the new
rule (assumption flagged; P4 produces the real values).

| recipe | baseline rungs (latest artifact) | baseline level | predicted new level | REAL new level (P4 run) | why it shifts |
|---|---|---|---|---|---|
| bluesky-pds | no artifact (deploy-gated upstream, shot-phase N/A) | — | — | — (still deploy-gated; documented N/A) | still deploy-gated |
| cryptpad | I✔ U✔ B✔ F✔ (#181) | 4 | 5 | (not re-run; analytic 5) | + lint pass |
| custom-html | I✔ U✔ B✔ F✔ (#182) | 4 | 5 | **4** (#405 PR4 lintdemo: lint fail R011; main analytic 5) | + lint pass |
| custom-html-tiny | I✔ U✔ B-na F-na (#205, predates functional/) | 2 | 5 | **5** (#399 — N/A-skip climb, was 2) | de-cap: backup skip declared; functional/ tests exist now; + lint |
| discourse | I✔ U✔ B✔ F✔ (#184) | 4 | 5 | (not re-run; analytic 5) | + lint pass |
| ghost | I✔ U✔ B✔ F✔ (#185) | 4 | 5 | (not re-run; analytic 5) | + lint pass |
| hedgedoc | I✔ U✔ B✔ F✔ (#113) | 4 | 5 | **5** (#398, 100s) | + lint pass |
| immich | I✔ U✔ B✔ F✔ (#370) | 4 | 5 | **5** (#406, drone !testme PR2, 199s) | + lint pass |
| keycloak | I✔ U✔ B✔ F✔ (#187) | 4 | 5 | (not re-run; analytic 5) | + lint pass |
| lasuite-docs | I✔ U✔ B✔ F✔ (#188) | 4 | 5 | (not re-run; analytic 5) | + lint pass |
| lasuite-drive | I✔ U✔ B✔ F✔ (#189) | 4 | 5 | (not re-run; analytic 5) | + lint pass |
| lasuite-meet | I✔ U✔ B✔ F✔ (#204) | 4 | 5 | (not re-run; analytic 5) | + lint pass |
| mailu | I✔ U✔ B-na F✔ (#191) | 2 | 5 | (not re-run; analytic 5 — same de-cap as #399) | de-cap: not backup-capable → skip climbs (the §2.9 N/A-skip demo) |
| matrix-synapse | I✔ U✔ B✔ F✔ (#203) | 4 | 5 | (not re-run; analytic 5) | + lint pass |
| mattermost-lts | I✔ U✔ B✔ F✔ (#196) | 4 | 5 | (not re-run; analytic 5) | + lint pass |
| mumble | no results.json artifact retained | — | — | **5** (#413, 80s — first retained artifact) | P4 run to establish |
| n8n | I✔ U✔ B✔ F✔ (#197) | 4 | 5 | (not re-run; analytic 5) | + lint pass |
| plausible | I✔ U✔ B✔ F✔ (#371) | 4 | 5 | **5** (#407, drone !testme PR3, 164s) | + lint pass |
| uptime-kuma | I✔ U✔ B✔ F✔ (#165) | 4 | 5 | (not re-run; analytic 5) | + lint pass |

Canaries (designed levels under the NEW formula, re-derived): custom-html-bkp-bad /
custom-html-rst-bad — backup-capable with a failing backup/restore tier → backup_restore rung
FAIL → level 2 (fail still blocks; run verdict red as today). To be proven in P4.
### Canary designed-level re-derivation (P4, runs 415/416 — 2026-06-11)

Under the NEW formula the bad canaries' designed level is **1**, not the old 2: their mirrors
carry no published version tags on the SRC+REF path → upgrade = intentional skip (climbs past
but never earns), backup_restore = FAIL blocks → level = install = 1. Verified live: 415
(bkp-bad) + 416 (rst-bad) both **verdict FAILURE (red)**, rungs
{install: pass, upgrade: skip, backup_restore: fail, functional: unver (post-failure abort),
lint: pass}, LEVEL 1. Backup/restore fail still blocks; verdict logic untouched.

(First attempts 411/412 failed in 1s: canaries are mirror-only, not catalogue recipes — they
need SRC+REF params, as prior phases ran them.)
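The intentional-versus-unintentional N/A split that produces these rung statuses can be sketched as a small lookup. The reason keys below are hypothetical labels for the cases mentioned in this backlog; the authoritative classification table is the one B8 places in DECISIONS.md, and this sketch does not reproduce it.

```python
from typing import Optional

# Hypothetical reason keys, named after cases described in this backlog.
INTENTIONAL_NA = {
    "not_backup_capable",         # e.g. the mailu backup rung
    "no_published_version_tags",  # e.g. the canaries' upgrade rung
}


def classify_na(reason: Optional[str]) -> str:
    """Map an N/A source to a rung status.

    Intentional N/A becomes 'skip', which lets the level climb past the rung.
    Everything else, including a post-failure abort or an unknown or missing
    reason, defaults to 'unver', which blocks the rungs above it.
    """
    if reason in INTENTIONAL_NA:
        return "skip"
    return "unver"
```

Defaulting to `unver` keeps an unclassified N/A from silently raising a recipe's level, which matches the "unclassifiable N/A → unver default" worked example in B4.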