Files
cc-ci/machine-docs/STATUS-3.md
autonomic-bot 5b6b378ade claim(3 U0): results.json + level ladder — gate CLAIMED
U0 (R1) done: pure level() mapper (L0-L6 gap-caps) + per-test JUnit results + results.json, all
emitted best-effort (never changes verdict, R7). Two real runs bracket the gate:
custom-html-tiny=L2 (functional N/A, backup N/A caps at L2) and uptime-kuma=L4 (full climb, no SSO
surface caps at L5). 28 unit tests + Adversary fuzz-clean. Rung-mapping contract in DECISIONS.
Verify: STATUS-3.md HOW/EXPECTED. Awaiting Adversary cold-verify.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 06:03:49 +00:00

74 lines
4.6 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# Phase 3 — Beautiful YunoHost-style results — STATUS
SSOT: `/srv/cc-ci/cc-ci-plan/plan-phase3-results-ux.md`. DoD = R1R8. Milestones U0U5.
State files (this phase): `machine-docs/{STATUS,BACKLOG,REVIEW,JOURNAL}-3.md`. DECISIONS.md shared.
**WHAT + HOW + EXPECTED + WHERE live here; WHY → JOURNAL-3.md.**
## Phase context
- Phase 2b is `## DONE` (Adversary-verified, no VETO). Phase 3 kicked off **manually by the operator**.
Note for honesty: Phase-2 `## DONE` not yet flipped (REVIEW-2 standing VETO on full Phase-2 DONE
authorization); cross-phase sequencing is an operator call. Adversary concurs it's not a P3 blocker
(REVIEW-3 @05:42Z).
- **Pre-existing repo-wide lint is RED on origin/main** (94 files `ruff format`-dirty + 36 `ruff check`
errors; confirmed on cc-ci CI devshell against clean `origin/main`, ruff 0.7.3). This predates Phase 3
and is NOT introduced by my work — my NEW Phase-3 files are fully `ruff`-clean, and I left
`run_recipe_ci.py` with fewer ruff errors than main (1 vs 4). Flagged for the operator; not a Phase-3
DoD item, and the U0 gate is verified by unit tests + real-run results.json, not repo-wide lint.
---
## Gate: U0 — CLAIMED, awaiting Adversary (Results schema + level; R1)
**WHAT.** `run_recipe_ci.py` now emits a per-run `results.json` with per-stage AND per-test ✔/✘
breakdown and a computed integer **level** (L0L6, YunoHost gap-caps semantics). DoD R1 (level ladder)
satisfied; U0 milestone acceptance ("level correct for a recipe through L4 and one capped at L2")
demonstrated on two real end-to-end runs.
**WHERE (commits / files).**
- `9773e3f` `runner/harness/level.py` — pure `compute_level(rungs)->(level,cap_reason)` + helpers
`backup_restore_status`, `tier_to_rung`. `tests/unit/test_level.py` (15 tests).
- `52e5d21` `runner/harness/results.py` — JUnit-XML parse, `collect_stages`, `derive_rungs` (the
tier+deps/SSO→rung translation), `build_results`, `write_results`. `tests/unit/test_results.py`
(13 tests). `runner/run_recipe_ci.py` — tiers emit `--junitxml` + append `{tier,source,file,rc,junit}`
records; `main()` assembles+writes results.json wrapped so a failure NEVER changes the verdict (R7),
incl. a narrow self leak-scan of the serialised artifact.
- `757511e` `machine-docs/DECISIONS.md` (Phase-3 section) — the documented ladder + exact rung-mapping
contract `derive_rungs` implements + results.json schema + artifact-hosting decision.
**HOW to verify (cold, from your clone on cc-ci).**
1. **Unit tests** (deterministic; also fuzz-verifiable):
`cc-ci-run -m pytest tests/unit/test_level.py tests/unit/test_results.py -q`
2. **Real-run L2-cap** (stateless, not backup-capable, ≥2 versions):
`RECIPE=custom-html-tiny STAGES=install,upgrade,backup,restore,custom CCCI_RUN_ID=adv-cht cc-ci-run runner/run_recipe_ci.py`
then read `/var/lib/cc-ci-runs/adv-cht/results.json`.
3. **Real-run L4-pass** (backup-capable, 3 functional tests, no deps):
`RECIPE=uptime-kuma STAGES=install,upgrade,backup,restore,custom CCCI_RUN_ID=adv-uk cc-ci-run runner/run_recipe_ci.py`
then read `/var/lib/cc-ci-runs/adv-uk/results.json`.
(Compare the `level`/`rungs` against the `results` dict + DECISIONS contract — a level greener than
the tiers would be a FAIL. Verify clean teardown: no orphan `*-pr*`/recipe service after.)
**EXPECTED.**
1. `28 passed`.
2. custom-html-tiny: `level=2`, `level_cap_reason="L3 backup/restore (data integrity) N/A"`,
`rungs={install:pass, upgrade:pass, backup_restore:na, functional:na, integration:na, recipe_local:na}`,
`results={install:pass, upgrade:pass, backup:skip, restore:skip, custom:skip}`,
`flags={clean_teardown:true, no_secret_leak:true}`, stages=[install,upgrade] each w/ per-test rows.
(My run: `/var/lib/cc-ci-runs/u0-cht-L2/results.json`.)
3. uptime-kuma: `level=4`, `level_cap_reason="L5 integration (SSO/OIDC + cross-app) N/A"`,
`rungs={install:pass, upgrade:pass, backup_restore:pass, functional:pass, integration:na, recipe_local:na}`,
all five tiers pass, `flags.clean_teardown=true`, stages=[install,upgrade,backup,restore,custom]
with per-test rows (incl. 3 uptime-kuma functional tests, source `cc-ci`).
(My run: `/var/lib/cc-ci-runs/u0-uk-L4/results.json`.)
These two bracket the gate: a recipe whose functional tests **pass** is still capped at **L2** when a
lower rung (L3 backup) is N/A (gap-caps; never inflates), and a full clean climb with no SSO surface
caps at **L4**.
## In flight (next, post-gate)
- U1 — app screenshot (Playwright, post-login, secret-safe). Will start once U0 PASSes; meanwhile I
hold U1 design as the next unblocked item.
## Blocked
(none)