plan: phase 'lvl5' — L5 level rung: abra recipe lint passes on the PR (queued after shot)
New top rung after install/upgrade/backup-restore/functional: lint the exact recipe ref under test; gap-caps per ladder semantics; verdict-neutral and time-bounded; mirror-origin R014 plumbing must not pollute recipe lint results (abra.py:109-114); all consumers (results/card/dashboard/badge/docs/tests) updated; old artifacts still render. M1 = adversary-cold-verified implementation pre-merge; M2 = real-CI proof incl. a genuine L5, a genuine lint-capped L4, and 2 drone-path runs. Recipe lint failures -> mirror PRs or DEFERRED, never merged.
This commit is contained in:
@ -399,3 +399,11 @@ session cc-ci-orchestrator-stale can be killed; recipe-mirrors org still private
|
||||
the new limit-wait system (d6e1a70/2e1ab8d) on each wake; remove the line 06-11 daytime.
|
||||
- Orchestrator renamed cc-ci-orchestrator (was -vm); stale Jun01 squatter killed;
|
||||
watchdog bounced twice tonight (limit patch, then hourly-wake-during-limit fallback).
|
||||
|
||||
## 2026-06-11 ~01:35 — phase `lvl5` queued after `shot`
|
||||
- Operator: extend the level ladder — L5 = `abra recipe lint` passes on the tested ref
|
||||
(PR head), after the existing four rungs. Plan: cc-ci-plan/plan-phase-lvl5-lint-rung.md.
|
||||
- Key design hazards captured: abra.py:109-114 (pinned deploy lints + FATAs R014 from the
|
||||
CI mirror-origin repoint; chaos/PR path skips lint today) — rung must lint recipe
|
||||
content, not mirror plumbing; verdict-neutral; conservative capping; old artifacts render.
|
||||
- .phases-spec now rcust;shot;lvl5 (idx=1, shot active); watchdog bounce to load it.
|
||||
|
||||
123
cc-ci-plan/plan-phase-lvl5-lint-rung.md
Normal file
123
cc-ci-plan/plan-phase-lvl5-lint-rung.md
Normal file
@ -0,0 +1,123 @@
|
||||
# Phase `lvl5` — add a 5th level rung: `abra recipe lint` passes on the PR
|
||||
|
||||
**Mission (operator-specified):** extend the level ladder with one new top rung.
|
||||
**Level 5 = `abra recipe lint` passes against the exact recipe ref under test (the PR
|
||||
head on PR builds)**, earned only after the existing four rungs (install, upgrade,
|
||||
backup/restore, functional) are all clean PASSes — standard ladder semantics, a gap caps.
|
||||
The existing four levels and their meanings are UNCHANGED.
|
||||
|
||||
State files (phase-namespaced): `STATUS-lvl5.md`, `BACKLOG-lvl5.md`, `REVIEW-lvl5.md`,
|
||||
`JOURNAL-lvl5.md`. DECISIONS.md shared (append).
|
||||
|
||||
---
|
||||
|
||||
## 1. Current system (file map — verified 2026-06-11)
|
||||
|
||||
- `runner/harness/level.py` (120 lines, PURE) — `RUNGS = ("install", "upgrade",
|
||||
"backup_restore", "functional")`, `RUNG_LABEL` 1–4, `compute_level()` (gap caps; N/A
|
||||
caps with distinct reason), `tier_to_rung`, `backup_restore_status`.
|
||||
- `runner/harness/results.py:137-218` — `derive_rungs()` builds the rung dict from tier
|
||||
results; `compute_level` → results.json `level` + `cap_reason` + `capped`.
|
||||
- `runner/harness/card.py` — `LEVEL_COLOR` map (line ~58), cap line hardcodes
|
||||
*"full clean climb — top level (4)"* (line ~246).
|
||||
- `dashboard/dashboard.py` — `_LEVEL_COLOR` (line ~81), corner level badge `_level_pill`,
|
||||
`/badge/<recipe>.svg`.
|
||||
- `tests/unit/test_level.py`, `test_results.py`, `test_card.py`, `test_dashboard.py`.
|
||||
- **Lint today:** abra's pinned (non-chaos) deploy runs `abra recipe lint` internally and
|
||||
FATALs on R014 — see `runner/harness/abra.py:109-114`: the CI's mirror-origin repointing
|
||||
tripped a go-git path, which is why chaos deploys (the PR-testing path!) deliberately
|
||||
SKIP lint. So lint currently runs implicitly on some paths and not at all on others —
|
||||
the new rung makes it explicit, uniform, and visible.
|
||||
|
||||
## 2. Design requirements
|
||||
|
||||
1. **New rung `lint` appended after `functional`** → ladder install(1) upgrade(2)
|
||||
backup_restore(3) functional(4) **lint(5)**. `RUNG_LABEL[5] = "lint (abra recipe lint)"`
|
||||
(or similar). Full clean climb is now 5.
|
||||
2. **What is linted:** the EXACT recipe tree/ref the run deployed (PR head on
|
||||
`!testme`/PR builds; the tested ref otherwise) — never some other branch. Run
|
||||
`abra recipe lint <recipe>` (the abra on the CI host) against the run's own checkout
|
||||
context. Capture rc + full output into the run artifacts (e.g. `lint.txt`), and put a
|
||||
pass/fail + short excerpt in results.json.
|
||||
3. **Mirror-plumbing must not pollute recipe lint results (CRITICAL, see abra.py:109-114):**
|
||||
the R014/go-git failure caused by CI's origin-repointing is a HARNESS artifact, not a
|
||||
recipe defect. The lint rung must evaluate the recipe's content. Solve it properly
|
||||
(e.g. lint in a context where origin looks canonical, or pre-step the same
|
||||
stash/revert dance abra.py already does) — do NOT blanket-ignore lint rules to make
|
||||
the plumbing pass, and document exactly what (if anything) is filtered and why. Any
|
||||
filtering is a named, unit-tested, Adversary-reviewed decision.
|
||||
4. **Verdict semantics UNCHANGED:** lint is a level rung, not a run gate. A lint failure
|
||||
caps the level at 4 with `cap_reason "L5 ... FAILED"`; it must NOT fail/flip the run
|
||||
verdict, and must be time-bounded + best-effort in execution (a wedged lint can never
|
||||
hang a run — hard timeout, ~60s class).
|
||||
5. **No N/A escape hatch by default:** every recipe can be linted, so the rung is
|
||||
pass/fail in practice (keep "na" handling for totality, e.g. abra binary missing →
|
||||
"na" + loud log, capping at 4 — never silently "pass").
|
||||
6. **All consumers updated coherently:** RUNGS/labels, results schema, card (color map +
|
||||
the hardcoded "top level (4)" line), dashboard pills/badge SVG/legend text, docs
|
||||
(testing.md / results docs / recipe-customization.md §levels if it references L4 as
|
||||
top), and every unit test that assumes 4 is the ceiling.
|
||||
7. **History compatibility:** old results.json artifacts (level ≤ 4, no lint rung) must
|
||||
still render correctly in dashboard/card history views — no KeyErrors, no retroactive
|
||||
relabeling of old runs.
|
||||
|
||||
## 3. Work plan
|
||||
|
||||
**P1 — Ladder + plumbing.** level.py rung; lint executor (new `harness/lint.py` or a
|
||||
clean home in abra.py) with timeout, artifact capture, mirror-context handling per §2.3;
|
||||
derive_rungs wiring; results schema. Unit tests for: full climb = 5, lint fail caps at 4
|
||||
with reason, lint na caps at 4, old-artifact rendering, mirror-filter decision (if any).
|
||||
|
||||
**P2 — Presentation.** Card, dashboard, badge, docs — a level-5 color/legend that reads
|
||||
as "above functional". Regenerate anything generated.
|
||||
|
||||
**P3 — Reality pass over all enrolled recipes.** Run the lint rung against every enrolled
|
||||
recipe's current main/HEAD (cheap — lint only, no deploys, respects the shared-checkout
|
||||
rule below). Matrix in BACKLOG-lvl5.md: pass/fail + rule hits per recipe. For failing
|
||||
recipes: the rung correctly caps them at ≤4 (that is WORKING AS DESIGNED, not a phase
|
||||
blocker). Where the fix is mechanical and safe, open a PR against the recipe mirror
|
||||
(NEVER push main / NEVER merge — operator decides), else file the finding in
|
||||
DEFERRED.md with the rule output.
|
||||
|
||||
**P4 — Real-CI proof.** Full-stage runs on enough recipes to prove the rung end-to-end:
|
||||
≥1 recipe reaching a genuine L5 (all five rungs green), ≥1 recipe lint-capped at L4 (a
|
||||
real or synthesized lint failure — a throwaway branch with a deliberate lint violation is
|
||||
fine), ≥2 runs via the drone `!testme` path showing the rung on a real PR, plus the
|
||||
canary suite green. Card + dashboard visually verified (Read the PNGs/SVG output).
|
||||
|
||||
## 4. Gates
|
||||
|
||||
**M1 — Implementation complete (pre-merge).** Branch with P1+P2; Adversary cold-runs unit
|
||||
suite + lint from a clean checkout; reviews the mirror-filter decision (§2.3) explicitly;
|
||||
confirms verdict-neutrality by code inspection + a targeted test. PASS required before
|
||||
merge to main.
|
||||
|
||||
**M2 — Proven in real CI.** P3 matrix complete for ALL enrolled recipes; P4 runs done
|
||||
(L5 achieved, L4-capped demonstrated, drone path ×2, canaries green); old artifacts still
|
||||
render; run durations not materially inflated (lint adds ≤ ~60s). Fresh Adversary PASS →
|
||||
Builder writes `## DONE` to STATUS-lvl5.md.
|
||||
|
||||
## 5. Guardrails (binding)
|
||||
|
||||
- **Never make a run look greener than its tests** (the Phase-3 cardinal rule): the rung
|
||||
caps conservatively; no silent "pass" on lint errors, timeouts, or missing abra.
|
||||
- **No gate weakening; no verdict changes.** Existing tests/assertions untouchable except
|
||||
where the L4-ceiling assumption itself must change — those edits are mechanical and the
|
||||
Adversary checks each one against the old intent.
|
||||
- **Recipe mirrors:** lint findings in recipes → PRs or DEFERRED entries only; NEVER push
|
||||
recipe-mirror main, NEVER merge their PRs.
|
||||
- **Shared checkout race:** NEVER git-checkout `~/.abra/recipes/<recipe>` on cc-ci while
|
||||
that recipe's CI build is running — the harness deploys from that tree. P3's lint-only
|
||||
sweep must use its own scratch clones or run when the recipe is not mid-build.
|
||||
- Real-CI etiquette: ≤2-3 concurrent live deploys; teardown every dev deploy on every
|
||||
exit path; no secrets in logs/commits. Commit author `autonomic-bot
|
||||
<autonomic-bot@noreply.git.autonomic.zone>`; push after every commit.
|
||||
- CI host has no python3 on default PATH for remote one-liners — use shell or the
|
||||
harness venv (`cc-ci-run`).
|
||||
|
||||
## 6. Definition of Done
|
||||
|
||||
Level 5 (= abra recipe lint passes on the tested ref) live on main and visible end-to-end
|
||||
(results.json → card → dashboard → badge), all enrolled recipes linted and dispositioned,
|
||||
≥1 real L5 and ≥1 genuine L4-cap demonstrated through real CI including the drone path,
|
||||
old artifacts unharmed, M1+M2 fresh Adversary PASSes, no verdict or duration regressions.
|
||||
Reference in New Issue
Block a user