plan(lvl5): fold in de-capping — level = highest passed rung, N/A skips, fail blocks
Operator decision (explicit Q&A 2026-06-11): remove cap/cap_reason/capped
entirely. New formula: level = max i with rung_i==pass and all j<i in
{pass,na}. N/A no longer stops the climb (the confusing part — e.g.
non-backup-capable recipes were stuck at L2); a real FAIL still blocks.
Per-rung table + verdict carry the completeness story. Added: de-cap
implementation reqs, both-schema rendering, before/after level table for all
recipes, N/A-skip proof run, bad-canary designed-levels re-derivation under
the new formula.
This commit is contained in:
@ -407,3 +407,12 @@ session cc-ci-orchestrator-stale can be killed; recipe-mirrors org still private
|
||||
CI mirror-origin repoint; chaos/PR path skips lint today) — rung must lint recipe
|
||||
content, not mirror plumbing; verdict-neutral; conservative capping; old artifacts render.
|
||||
- .phases-spec now rcust;shot;lvl5 (idx=1, shot active); watchdog bounce to load it.
|
||||
|
||||
## 2026-06-11 ~01:50 — lvl5 plan amended: de-capping folded in (operator decision)
|
||||
- Operator: remove the "capping" notion entirely. Explicit Q&A settled semantics:
|
||||
level = highest PASSED rung where everything below is pass-or-N/A — N/A rungs are
|
||||
skipped (no longer stop the climb), a real FAIL still blocks. cap/cap_reason/capped
|
||||
deleted from code+schema+card+dashboard+docs; rung table is the sole detail carrier.
|
||||
- Deliberate override of Phase-3 "N/A caps" stance — to be recorded in DECISIONS.md by
|
||||
the loops. Before/after level table for all recipes required so the Adversary can
|
||||
attribute every level shift to the rule change.
|
||||
|
||||
@ -1,10 +1,28 @@
|
||||
# Phase `lvl5` — add a 5th level rung: `abra recipe lint` passes on the PR
|
||||
# Phase `lvl5` — level-system changes: 5th rung (`abra recipe lint`) + remove "capping"
|
||||
|
||||
**Mission (operator-specified):** extend the level ladder with one new top rung.
|
||||
**Level 5 = `abra recipe lint` passes against the exact recipe ref under test (the PR
|
||||
head on PR builds)**, earned only after the existing four rungs (install, upgrade,
|
||||
backup/restore, functional) are all clean PASSes — standard ladder semantics, a gap caps.
|
||||
The existing four levels and their meanings are UNCHANGED.
|
||||
**Mission (operator-specified, two changes):**
|
||||
|
||||
1. **New top rung — Level 5 = `abra recipe lint` passes against the exact recipe ref
|
||||
under test (the PR head on PR builds)**, after the existing four rungs (install,
|
||||
upgrade, backup/restore, functional). The existing four rungs' meanings are UNCHANGED.
|
||||
|
||||
2. **Remove the "capping" concept entirely.** The operator finds cap/cap_reason confusing.
|
||||
New level semantics (operator-decided 2026-06-11, explicit Q&A):
|
||||
**level = the highest rung that PASSED, where every rung below it is either "pass" or
|
||||
"na" — N/A rungs are SKIPPED (they no longer stop the climb), a real FAIL still
|
||||
blocks.** Formally: `level = max i such that rung_i == "pass" and all j < i have
|
||||
status in {"pass", "na"}`; 0 if no such i.
|
||||
- install ✔, upgrade ✘, backup ✔, functional ✔, lint ✔ → **level 1** (fail blocks)
|
||||
- install ✔, upgrade ✔, backup N/A, functional ✔, lint ✔ → **level 5** (N/A skipped —
|
||||
previously this capped at 2; that was the confusing part)
|
||||
- all four ✔, lint N/A (e.g. abra missing) → **level 4** (an N/A rung is never EARNED,
|
||||
it just doesn't block the ones above)
|
||||
The words "cap", "capped", "cap_reason" disappear from code, schema, card, dashboard,
|
||||
and docs. The per-rung ✔/✘/— table remains everywhere it exists today, so nothing is
|
||||
hidden — the table is now the SOLE carrier of "why isn't this level higher".
|
||||
NOTE this is a deliberate operator override of the old "N/A caps so the level never
|
||||
overstates" stance from Phase 3: the number now reads "how far did it get", and the
|
||||
rung table + run verdict carry the completeness story. Record this in DECISIONS.md.
|
||||
|
||||
State files (phase-namespaced): `STATUS-lvl5.md`, `BACKLOG-lvl5.md`, `REVIEW-lvl5.md`,
|
||||
`JOURNAL-lvl5.md`. DECISIONS.md shared (append).
|
||||
@ -52,21 +70,36 @@ State files (phase-namespaced): `STATUS-lvl5.md`, `BACKLOG-lvl5.md`, `REVIEW-lvl
|
||||
hang a run — hard timeout, ~60s class).
|
||||
5. **No N/A escape hatch by default:** every recipe can be linted, so the rung is
|
||||
pass/fail in practice (keep "na" handling for totality, e.g. abra binary missing →
|
||||
"na" + loud log, capping at 4 — never silently "pass").
|
||||
6. **All consumers updated coherently:** RUNGS/labels, results schema, card (color map +
|
||||
the hardcoded "top level (4)" line), dashboard pills/badge SVG/legend text, docs
|
||||
"na" + loud log — never silently "pass"; per the new semantics an N/A lint rung is
|
||||
simply not earned, so the level stays 4).
|
||||
6. **De-cap implementation (mission item 2):** `compute_level()` reimplemented to the
|
||||
new formal rule; `cap_reason`/`capped` deleted from level.py, derive_rungs/results.json
|
||||
schema, card (the "capped: …"/"full clean climb — top level (4)" line at card.py:~246
|
||||
is replaced by the rung table alone or a neutral "level N of 5"), dashboard fields,
|
||||
and docs. Unit tests rewritten to the new semantics, INCLUDING the three worked
|
||||
examples from the mission and the old N/A-cases (single-published-version recipe,
|
||||
non-backup-capable recipe) now climbing past their former caps.
|
||||
7. **All consumers updated coherently:** RUNGS/labels, results schema, card (color map +
|
||||
hardcoded top-level line), dashboard pills/badge SVG/legend text, docs
|
||||
(testing.md / results docs / recipe-customization.md §levels if it references L4 as
|
||||
top), and every unit test that assumes 4 is the ceiling.
|
||||
7. **History compatibility:** old results.json artifacts (level ≤ 4, no lint rung) must
|
||||
still render correctly in dashboard/card history views — no KeyErrors, no retroactive
|
||||
relabeling of old runs.
|
||||
top), and every unit test that assumes 4 is the ceiling or asserts cap_reason.
|
||||
8. **History compatibility:** old results.json artifacts (level ≤ 4, lint rung absent,
|
||||
cap_reason PRESENT) must still render correctly in dashboard/card history views — no
|
||||
KeyErrors, no retroactive relabeling of old runs; renderers tolerate both schemas.
|
||||
9. **Expected level shifts are findings, not regressions:** recipes previously capped by
|
||||
an N/A rung will legitimately jump levels (e.g. a non-backup-capable recipe with
|
||||
passing functional goes 2 → 4/5). P3/P4 must produce a before/after level table for
|
||||
ALL enrolled recipes so the Adversary can check every shift against the new rule —
|
||||
any shift NOT explained by the rule change is a real regression.
|
||||
|
||||
## 3. Work plan
|
||||
|
||||
**P1 — Ladder + plumbing.** level.py rung; lint executor (new `harness/lint.py` or a
|
||||
clean home in abra.py) with timeout, artifact capture, mirror-context handling per §2.3;
|
||||
derive_rungs wiring; results schema. Unit tests for: full climb = 5, lint fail caps at 4
|
||||
with reason, lint na caps at 4, old-artifact rendering, mirror-filter decision (if any).
|
||||
**P1 — Ladder + plumbing.** level.py: new rung + the de-capped `compute_level()` per
|
||||
§2 item 2 (delete cap_reason/capped); lint executor (new `harness/lint.py` or a clean
|
||||
home in abra.py) with timeout, artifact capture, mirror-context handling per §2.3;
|
||||
derive_rungs wiring; results schema. Unit tests for: full climb = 5; fail-blocks
|
||||
(upgrade ✘ → L1 even with higher passes); N/A-skip (backup na + functional ✔ → L4+);
|
||||
lint na → stays 4; old-artifact rendering; mirror-filter decision (if any).
|
||||
|
||||
**P2 — Presentation.** Card, dashboard, badge, docs — a level-5 color/legend that reads
|
||||
as "above functional". Regenerate anything generated.
|
||||
@ -79,11 +112,15 @@ blocker). Where the fix is mechanical and safe, open a PR against the recipe mir
|
||||
(NEVER push main / NEVER merge — operator decides), else file the finding in
|
||||
DEFERRED.md with the rule output.
|
||||
|
||||
**P4 — Real-CI proof.** Full-stage runs on enough recipes to prove the rung end-to-end:
|
||||
≥1 recipe reaching a genuine L5 (all five rungs green), ≥1 recipe lint-capped at L4 (a
|
||||
real or synthesized lint failure — a throwaway branch with a deliberate lint violation is
|
||||
fine), ≥2 runs via the drone `!testme` path showing the rung on a real PR, plus the
|
||||
canary suite green. Card + dashboard visually verified (Read the PNGs/SVG output).
|
||||
**P4 — Real-CI proof.** Full-stage runs on enough recipes to prove the changes
|
||||
end-to-end: ≥1 recipe reaching a genuine L5 (all five rungs green), ≥1 recipe blocked at
|
||||
L4 by a lint failure (real or synthesized — a throwaway branch with a deliberate lint
|
||||
violation is fine), ≥1 recipe demonstrating the N/A-skip (formerly capped by an N/A rung,
|
||||
now climbing past it), ≥2 runs via the drone `!testme` path showing the rung on a real
|
||||
PR, plus the canary suite green — the bad canaries must still land at their designed
|
||||
levels under the NEW formula (re-derive what those designed levels are; backup/restore
|
||||
fail still blocks). Card + dashboard visually verified (Read the PNGs/SVG output), and
|
||||
the §2 item 9 before/after level table completed for all enrolled recipes.
|
||||
|
||||
## 4. Gates
|
||||
|
||||
@ -99,8 +136,11 @@ Builder writes `## DONE` to STATUS-lvl5.md.
|
||||
|
||||
## 5. Guardrails (binding)
|
||||
|
||||
- **Never make a run look greener than its tests** (the Phase-3 cardinal rule): the rung
|
||||
caps conservatively; no silent "pass" on lint errors, timeouts, or missing abra.
|
||||
- **Rung statuses stay honest** (the Phase-3 rule, adapted to the new semantics): no
|
||||
rung is ever silently "pass" — lint errors/timeouts/missing abra are fail/na, never
|
||||
pass. The level FORMULA is the operator-decided rule in §2 item 2 (N/A skips); the
|
||||
per-rung table and the run verdict must always remain visible so completeness is
|
||||
never hidden, and the verdict logic itself is untouched by this phase.
|
||||
- **No gate weakening; no verdict changes.** Existing tests/assertions untouchable except
|
||||
where the L4-ceiling assumption itself must change — those edits are mechanical and the
|
||||
Adversary checks each one against the old intent.
|
||||
@ -117,7 +157,10 @@ Builder writes `## DONE` to STATUS-lvl5.md.
|
||||
|
||||
## 6. Definition of Done
|
||||
|
||||
Level 5 (= abra recipe lint passes on the tested ref) live on main and visible end-to-end
|
||||
(results.json → card → dashboard → badge), all enrolled recipes linted and dispositioned,
|
||||
≥1 real L5 and ≥1 genuine L4-cap demonstrated through real CI including the drone path,
|
||||
old artifacts unharmed, M1+M2 fresh Adversary PASSes, no verdict or duration regressions.
|
||||
The new level system live on main and visible end-to-end (results.json → card →
|
||||
dashboard → badge): L5 = abra recipe lint on the tested ref, capping concept fully
|
||||
removed (level = highest passed rung with all lower rungs pass-or-N/A; N/A skips, fail
|
||||
blocks; no cap/cap_reason anywhere), all enrolled recipes linted and dispositioned with
|
||||
the before/after level table adversary-checked, ≥1 real L5 + ≥1 lint-blocked L4 + ≥1
|
||||
N/A-skip climb demonstrated through real CI including the drone path, old artifacts
|
||||
unharmed, M1+M2 fresh Adversary PASSes, no verdict or duration regressions.
|
||||
|
||||
Reference in New Issue
Block a user