claim(3 U0): results.json + level ladder — gate CLAIMED

U0 (R1) done: pure level() mapper (L0-L6, gap-caps) + per-test JUnit results + results.json, all emitted best-effort (never changes the verdict, R7). Two real runs bracket the gate: custom-html-tiny=L2 (functional N/A; backup N/A caps it at L2) and uptime-kuma=L4 (full climb; no SSO surface, so L5 is N/A). 28 unit tests + Adversary fuzz-clean. Rung-mapping contract in DECISIONS. Verify: STATUS-3.md HOW/EXPECTED. Awaiting Adversary cold-verify.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@@ -6,22 +6,68 @@ State files (this phase): `machine-docs/{STATUS,BACKLOG,REVIEW,JOURNAL}-3.md`. D
**WHAT + HOW + EXPECTED + WHERE live here; WHY → JOURNAL-3.md.**

## Phase context
- Phase 2b is `## DONE` (Adversary-verified, no VETO). Phase 3 kicked off **manually by the operator**
  (plan-phase3 transition = manual). Note for honesty: Phase-2 (recipe-tests) `## DONE` is not yet
  flipped and REVIEW-2 carries a standing VETO on full Phase-2 DONE authorization; cross-phase
  sequencing is an operator call, and Phase 3 proceeds per the operator kickoff. Adversary concurs this
  is not a Phase-3 blocker (REVIEW-3 @05:42Z).
- Phase 2b is `## DONE` (Adversary-verified, no VETO). Phase 3 kicked off **manually by the operator**.
  Note for honesty: Phase-2 `## DONE` is not yet flipped (REVIEW-2 carries a standing VETO on full
  Phase-2 DONE authorization); cross-phase sequencing is an operator call. Adversary concurs it is not
  a Phase-3 blocker (REVIEW-3 @05:42Z).
- **Pre-existing repo-wide lint is RED on origin/main** (94 files `ruff format`-dirty + 36 `ruff check`
  errors; confirmed on cc-ci CI devshell against clean `origin/main`, ruff 0.7.3). This predates Phase 3
  and is NOT introduced by my work: my NEW Phase-3 files are fully `ruff`-clean, and I left
  `run_recipe_ci.py` with fewer ruff errors than main (1 vs 4). Flagged for the operator; not a Phase-3
  DoD item, and the U0 gate is verified by unit tests + real-run results.json, not repo-wide lint.

## Current state
- Phase-3 loop live. Bootstrapping state files + settling open decisions, then executing **U0**.
- No gate claimed yet.
---

## In flight
- **U0: Results schema + level (R1).** Building: pure `level()` mapper (L0–L6, gap-caps),
  per-test structured results, `results.json` per run, artifact hosting path.
## Gate: U0 — CLAIMED, awaiting Adversary (Results schema + level; R1)

## Gate
(none claimed)

**WHAT.** `run_recipe_ci.py` now emits a per-run `results.json` with a per-stage AND per-test ✔/✘
breakdown and a computed integer **level** (L0–L6, YunoHost gap-caps semantics). DoD R1 (level ladder)
is satisfied; U0 milestone acceptance ("level correct for a recipe through L4 and one capped at L2")
is demonstrated on two real end-to-end runs.

**WHERE (commits / files).**
- `9773e3f` `runner/harness/level.py`: pure `compute_level(rungs)->(level,cap_reason)` + helpers
  `backup_restore_status`, `tier_to_rung`. `tests/unit/test_level.py` (15 tests).
- `52e5d21` `runner/harness/results.py`: JUnit-XML parse, `collect_stages`, `derive_rungs` (the
  tier+deps/SSO→rung translation), `build_results`, `write_results`. `tests/unit/test_results.py`
  (13 tests). `runner/run_recipe_ci.py`: tiers emit `--junitxml` + append `{tier,source,file,rc,junit}`
  records; `main()` assembles+writes results.json wrapped so a failure NEVER changes the verdict (R7),
  incl. a narrow self leak-scan of the serialised artifact.
- `757511e` `machine-docs/DECISIONS.md` (Phase-3 section): the documented ladder + exact rung-mapping
  contract `derive_rungs` implements + results.json schema + artifact-hosting decision.

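
For readers who have not opened `runner/harness/results.py`, the two ideas in the list above (per-test rows parsed from JUnit XML, and a write that can never change the verdict) can be sketched with the standard library alone. This is an illustration under assumptions, not the committed code: the names `parse_junit` and `write_results_best_effort`, the row fields, and the sample XML are all hypothetical.

```python
import json
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import Dict, List


def parse_junit(xml_text: str) -> List[Dict[str, str]]:
    """Turn JUnit XML into per-test rows (hypothetical row shape)."""
    rows = []
    root = ET.fromstring(xml_text)
    # iter() finds <testcase> whether the root is <testsuites> or <testsuite>.
    for case in root.iter("testcase"):
        if case.find("failure") is not None or case.find("error") is not None:
            outcome = "fail"
        elif case.find("skipped") is not None:
            outcome = "skip"
        else:
            outcome = "pass"
        rows.append({
            "name": case.get("name", ""),
            "classname": case.get("classname", ""),
            "outcome": outcome,
        })
    return rows


def write_results_best_effort(path: Path, results: dict) -> bool:
    """Write results.json; report a failure instead of raising.

    Swallowing the exception is what keeps artifact emission from
    influencing the run's verdict.
    """
    try:
        payload = json.dumps(results, indent=2, sort_keys=True)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(payload, encoding="utf-8")
        return True
    except Exception as exc:  # deliberately broad: emission is best-effort
        print(f"results.json not written: {exc}")
        return False


# Hypothetical sample, shaped like pytest's --junitxml output.
SAMPLE = """<testsuites><testsuite name="install">
  <testcase classname="t" name="test_up"/>
  <testcase classname="t" name="test_broken"><failure message="boom"/></testcase>
  <testcase classname="t" name="test_later"><skipped/></testcase>
</testsuite></testsuites>"""

print([row["outcome"] for row in parse_junit(SAMPLE)])  # ['pass', 'fail', 'skip']
```

The caller would ignore the boolean when deciding the verdict and use it only for logging.
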
**HOW to verify (cold, from your clone on cc-ci).**
1. **Unit tests** (deterministic; also fuzz-verifiable):
   `cc-ci-run -m pytest tests/unit/test_level.py tests/unit/test_results.py -q`
2. **Real-run L2-cap** (stateless, not backup-capable, ≥2 versions):
   `RECIPE=custom-html-tiny STAGES=install,upgrade,backup,restore,custom CCCI_RUN_ID=adv-cht cc-ci-run runner/run_recipe_ci.py`
   then read `/var/lib/cc-ci-runs/adv-cht/results.json`.
3. **Real-run L4-pass** (backup-capable, 3 functional tests, no deps):
   `RECIPE=uptime-kuma STAGES=install,upgrade,backup,restore,custom CCCI_RUN_ID=adv-uk cc-ci-run runner/run_recipe_ci.py`
   then read `/var/lib/cc-ci-runs/adv-uk/results.json`.
(Compare the `level`/`rungs` against the `results` dict + DECISIONS contract: a level greener than
the tiers would be a FAIL. Verify clean teardown: no orphan `*-pr*`/recipe service after.)

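
The "level greener than the tiers" comparison can also be scripted. The sketch below is one possible check, not part of the harness: it assumes `level` and `rungs` are top-level keys of `results.json` and that L1 to L6 follow the key order of `rungs` (install, upgrade, backup_restore, functional, integration, recipe_local). The authoritative mapping is the DECISIONS contract, and the function names here are hypothetical. It checks `level` against `rungs` only; checking `rungs` against the `results` dict still needs that contract.

```python
import json

# Assumed rung order for L1..L6, inferred from the key order of `rungs`.
LADDER = ["install", "upgrade", "backup_restore",
          "functional", "integration", "recipe_local"]


def level_is_consistent(doc: dict) -> bool:
    """True when every rung up to `level` passed and the next rung did not."""
    level, rungs = doc["level"], doc["rungs"]
    climbed = all(rungs.get(name) == "pass" for name in LADDER[:level])
    # A level below L6 must be justified by a non-passing next rung.
    capped = level == len(LADDER) or rungs.get(LADDER[level]) != "pass"
    return climbed and capped


def load_and_check(path: str) -> bool:
    """Read a results.json file and apply the consistency check."""
    with open(path, encoding="utf-8") as fh:
        return level_is_consistent(json.load(fh))


ok = {"level": 2, "rungs": {"install": "pass", "upgrade": "pass",
                            "backup_restore": "na", "functional": "na",
                            "integration": "na", "recipe_local": "na"}}
inflated = dict(ok, level=4)  # same rungs, but a greener level
print(level_is_consistent(ok), level_is_consistent(inflated))  # True False
```

On cc-ci this would be called as `load_and_check("/var/lib/cc-ci-runs/adv-cht/results.json")` after step 2.
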
**EXPECTED.**
1. `28 passed`.
2. custom-html-tiny: `level=2`, `level_cap_reason="L3 backup/restore (data integrity) N/A"`,
   `rungs={install:pass, upgrade:pass, backup_restore:na, functional:na, integration:na, recipe_local:na}`,
   `results={install:pass, upgrade:pass, backup:skip, restore:skip, custom:skip}`,
   `flags={clean_teardown:true, no_secret_leak:true}`, stages=[install,upgrade], each with per-test rows.
   (My run: `/var/lib/cc-ci-runs/u0-cht-L2/results.json`.)
3. uptime-kuma: `level=4`, `level_cap_reason="L5 integration (SSO/OIDC + cross-app) N/A"`,
   `rungs={install:pass, upgrade:pass, backup_restore:pass, functional:pass, integration:na, recipe_local:na}`,
   all five tiers pass, `flags.clean_teardown=true`, stages=[install,upgrade,backup,restore,custom]
   with per-test rows (incl. 3 uptime-kuma functional tests, source `cc-ci`).
   (My run: `/var/lib/cc-ci-runs/u0-uk-L4/results.json`.)

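
To compare a real `results.json` against the expectations above without eyeballing it, a subset comparison is enough. The following is a minimal sketch: the values are copied from item 2, but treating `level`, `level_cap_reason`, `rungs`, `results` and `flags` as top-level keys is an assumption about the schema, and `mismatches` is a hypothetical helper. Keys that are not listed, such as `stages`, are ignored.

```python
EXPECTED_CHT = {
    "level": 2,
    "level_cap_reason": "L3 backup/restore (data integrity) N/A",
    "rungs": {"install": "pass", "upgrade": "pass", "backup_restore": "na",
              "functional": "na", "integration": "na", "recipe_local": "na"},
    "results": {"install": "pass", "upgrade": "pass", "backup": "skip",
                "restore": "skip", "custom": "skip"},
    "flags": {"clean_teardown": True, "no_secret_leak": True},
}


def mismatches(actual, expected, path=""):
    """List the paths where `actual` disagrees with the `expected` subset."""
    if isinstance(expected, dict):
        if not isinstance(actual, dict):
            return [f"{path or '<root>'}: expected an object"]
        found = []
        for key, want in expected.items():
            child = f"{path}.{key}" if path else key
            if key not in actual:
                found.append(f"{child}: missing")
            else:
                found.extend(mismatches(actual[key], want, child))
        return found
    return [] if actual == expected else [f"{path}: {actual!r} != {expected!r}"]


# A made-up document that drifted on `level` and carries an extra key.
drifted = {**EXPECTED_CHT, "level": 3, "stages": []}
print(mismatches(EXPECTED_CHT, EXPECTED_CHT))  # []
print(mismatches(drifted, EXPECTED_CHT))       # ['level: 3 != 2']
```

An empty list means the run matches; any entry names the field to inspect.
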
These two bracket the gate. custom-html-tiny is capped at **L2** because a lower rung (L3
backup/restore) is N/A; under gap-caps, a recipe whose functional tests **pass** would still stop
there, so the level never inflates. uptime-kuma, a full clean climb with no SSO surface, caps at
**L4**.

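
The gap-caps rule is small enough to show in a few lines. The sketch below is not `compute_level` from `runner/harness/level.py`; it is a hypothetical `sketch_level` that assumes the ladder order install, upgrade, backup_restore, functional, integration, recipe_local. Only the L3 and L5 labels come from the cap reasons quoted in EXPECTED; the other labels and the "failed" suffix are placeholders.

```python
from typing import Dict, Optional, Tuple

# Assumed ladder: (level, rung key, label).
LADDER = [
    (1, "install", "L1 install"),
    (2, "upgrade", "L2 upgrade"),
    (3, "backup_restore", "L3 backup/restore (data integrity)"),
    (4, "functional", "L4 functional"),
    (5, "integration", "L5 integration (SSO/OIDC + cross-app)"),
    (6, "recipe_local", "L6 recipe-local"),
]


def sketch_level(rungs: Dict[str, str]) -> Tuple[int, Optional[str]]:
    """Climb until the first rung that is not a pass; that rung caps the level."""
    level = 0
    for number, key, label in LADDER:
        status = rungs.get(key, "na")
        if status != "pass":
            suffix = "N/A" if status == "na" else "failed"
            return level, f"{label} {suffix}"
        level = number
    return level, None


CHT = {"install": "pass", "upgrade": "pass", "backup_restore": "na",
       "functional": "na", "integration": "na", "recipe_local": "na"}
UK = {"install": "pass", "upgrade": "pass", "backup_restore": "pass",
      "functional": "pass", "integration": "na", "recipe_local": "na"}
# Hypothetical gap: functional passes but the lower backup rung is N/A.
GAP = dict(CHT, functional="pass")

print(sketch_level(CHT))  # (2, 'L3 backup/restore (data integrity) N/A')
print(sketch_level(UK))   # (4, 'L5 integration (SSO/OIDC + cross-app) N/A')
print(sketch_level(GAP))  # (2, 'L3 backup/restore (data integrity) N/A')
```

The third call is the point of gap-caps: a pass above an N/A rung does not raise the level.
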
## In flight (next, post-gate)
- U1: app screenshot (Playwright, post-login, secret-safe). Will start once U0 PASSes; meanwhile I
  hold the U1 design as the next unblocked item.

## Blocked
(none)