cc-ci/machine-docs/STATUS-3.md
autonomic-bot 5b6b378ade claim(3 U0): results.json + level ladder — gate CLAIMED
U0 (R1) done: pure level() mapper (L0-L6 gap-caps) + per-test JUnit results + results.json, all
emitted best-effort (never changes verdict, R7). Two real runs bracket the gate:
custom-html-tiny=L2 (functional N/A, backup N/A caps at L2) and uptime-kuma=L4 (full climb, no SSO
surface caps at L5). 28 unit tests + Adversary fuzz-clean. Rung-mapping contract in DECISIONS.
Verify: STATUS-3.md HOW/EXPECTED. Awaiting Adversary cold-verify.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 06:03:49 +00:00


Phase 3 — Beautiful YunoHost-style results — STATUS

SSOT: /srv/cc-ci/cc-ci-plan/plan-phase3-results-ux.md. DoD = R1-R8. Milestones U0-U5. State files (this phase): machine-docs/{STATUS,BACKLOG,REVIEW,JOURNAL}-3.md. DECISIONS.md shared.

WHAT + HOW + EXPECTED + WHERE live here; WHY → JOURNAL-3.md.

Phase context

  • Phase 2b is ## DONE (Adversary-verified, no VETO). Phase 3 was kicked off manually by the operator. For the record: the Phase-2 ## DONE marker has not been flipped yet (REVIEW-2 holds a standing VETO on full Phase-2 DONE authorization); cross-phase sequencing is an operator call. The Adversary concurs that this is not a P3 blocker (REVIEW-3 @05:42Z).
  • Pre-existing repo-wide lint is RED on origin/main (94 files ruff format-dirty + 36 ruff check errors; confirmed on cc-ci CI devshell against clean origin/main, ruff 0.7.3). This predates Phase 3 and is NOT introduced by my work — my NEW Phase-3 files are fully ruff-clean, and I left run_recipe_ci.py with fewer ruff errors than main (1 vs 4). Flagged for the operator; not a Phase-3 DoD item, and the U0 gate is verified by unit tests + real-run results.json, not repo-wide lint.

Gate: U0 — CLAIMED, awaiting Adversary (Results schema + level; R1)

WHAT. run_recipe_ci.py now emits a per-run results.json with a per-stage AND per-test ✔/✘ breakdown and a computed integer level (L0-L6, YunoHost gap-caps semantics). DoD R1 (level ladder) is satisfied; the U0 milestone acceptance ("level correct for a recipe through L4 and one capped at L2") is demonstrated on two real end-to-end runs.
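
The gap-caps idea can be illustrated with a minimal sketch. This is not the code in runner/harness/level.py: the function name `compute_level_sketch`, the rung order (inferred from the EXPECTED rungs in this document), and the cap-reason wording are all assumptions made for illustration. The authoritative contract is the one documented in DECISIONS.md.

```python
from typing import Dict, Optional, Tuple

# Rung order is an assumption inferred from the EXPECTED section of this
# status file (L3 = backup/restore, L5 = integration); see DECISIONS.md
# for the real ladder.
LADDER = [
    "install",         # L1
    "upgrade",         # L2
    "backup_restore",  # L3
    "functional",      # L4
    "integration",     # L5
    "recipe_local",    # L6
]


def compute_level_sketch(rungs: Dict[str, str]) -> Tuple[int, Optional[str]]:
    """Climb the ladder from L1 and stop at the first rung that is not 'pass'.

    A missing rung is treated as 'na'. Anything above the first gap is
    ignored, so a passing higher rung can never inflate the level.
    """
    level = 0
    for position, rung in enumerate(LADDER, start=1):
        status = rungs.get(rung, "na")
        if status != "pass":
            # Hypothetical reason format; the real strings differ.
            return level, f"L{position} {rung} {status}"
        level = position
    return level, None


# Passing functional tests do not lift the level past the L3 gap.
print(compute_level_sketch({
    "install": "pass",
    "upgrade": "pass",
    "backup_restore": "na",
    "functional": "pass",
}))  # (2, 'L3 backup_restore na')
```

Treating "fail" and "na" identically as a stop condition is a simplification here; the real mapper may distinguish them in its cap reason.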

WHERE (commits / files).

  • 9773e3f runner/harness/level.py — pure compute_level(rungs)->(level,cap_reason) + helpers backup_restore_status, tier_to_rung. tests/unit/test_level.py (15 tests).
  • 52e5d21 runner/harness/results.py — JUnit-XML parse, collect_stages, derive_rungs (the tier+deps/SSO→rung translation), build_results, write_results. tests/unit/test_results.py (13 tests). runner/run_recipe_ci.py — tiers emit --junitxml + append {tier,source,file,rc,junit} records; main() assembles+writes results.json wrapped so a failure NEVER changes the verdict (R7), incl. a narrow self leak-scan of the serialised artifact.
  • 757511e machine-docs/DECISIONS.md (Phase-3 section) — the documented ladder + exact rung-mapping contract derive_rungs implements + results.json schema + artifact-hosting decision.
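
The "best-effort, never changes the verdict" wrapping described for results.json could look roughly like the sketch below. The function name `write_results_best_effort`, its parameters, and the substring-based leak check are hypothetical; the real logic lives in runner/run_recipe_ci.py and runner/harness/results.py and may differ in every detail.

```python
import json
import os
from pathlib import Path
from typing import Any, Dict, Iterable


def write_results_best_effort(
    path: str, results: Dict[str, Any], secrets: Iterable[str] = ()
) -> bool:
    """Try to write results.json; report success, never raise.

    The caller computes its verdict independently and ignores the return
    value for pass/fail purposes, so a reporting bug cannot flip a run.
    """
    try:
        text = json.dumps(results, indent=2, sort_keys=True)
        # Narrow self leak-scan (assumed form): a plain substring check of
        # the serialised artifact against known secret values.
        if any(secret and secret in text for secret in secrets):
            return False
        target = Path(path)
        target.parent.mkdir(parents=True, exist_ok=True)
        tmp = target.with_name(target.name + ".tmp")
        tmp.write_text(text, encoding="utf-8")
        os.replace(tmp, target)  # replace in one step to avoid partial files
        return True
    except Exception:
        # Deliberately broad: reporting is best-effort by design.
        return False
```

A caller would compute `verdict` first and then call `write_results_best_effort(...)` without letting its outcome touch the exit code.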

HOW to verify (cold, from your clone on cc-ci).

  1. Unit tests (deterministic; also fuzz-verifiable): cc-ci-run -m pytest tests/unit/test_level.py tests/unit/test_results.py -q
  2. Real-run L2-cap (stateless, not backup-capable, ≥2 versions): RECIPE=custom-html-tiny STAGES=install,upgrade,backup,restore,custom CCCI_RUN_ID=adv-cht cc-ci-run runner/run_recipe_ci.py then read /var/lib/cc-ci-runs/adv-cht/results.json.
  3. Real-run L4-pass (backup-capable, 3 functional tests, no deps): RECIPE=uptime-kuma STAGES=install,upgrade,backup,restore,custom CCCI_RUN_ID=adv-uk cc-ci-run runner/run_recipe_ci.py then read /var/lib/cc-ci-runs/adv-uk/results.json. (Compare the level/rungs against the results dict + DECISIONS contract — a level greener than the tiers would be a FAIL. Verify clean teardown: no orphan *-pr*/recipe service after.)
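
The "level greener than the tiers" comparison in step 3 can be scripted. The sketch below is one possible cold check, not part of the repository. The top-level keys (`level`, `rungs`, `results`) are the ones shown in EXPECTED; the tier-to-rung dependencies in `RUNG_TIERS` (in particular, custom feeding functional) and the rung order are assumptions that should be checked against the DECISIONS contract.

```python
from typing import Any, Dict, List

# Assumed rung order and assumed tier dependencies; verify both against
# the rung-mapping contract in DECISIONS.md before relying on this.
LADDER = ["install", "upgrade", "backup_restore",
          "functional", "integration", "recipe_local"]
RUNG_TIERS = {
    "install": ["install"],
    "upgrade": ["upgrade"],
    "backup_restore": ["backup", "restore"],
    "functional": ["custom"],
}


def find_inflation(doc: Dict[str, Any]) -> List[str]:
    """Return human-readable problems where the report looks too green."""
    problems: List[str] = []
    rungs = doc.get("rungs", {})
    tiers = doc.get("results", {})

    # A rung may only be 'pass' if every tier it depends on passed.
    for rung, needed in RUNG_TIERS.items():
        if rungs.get(rung) == "pass":
            for tier in needed:
                if tiers.get(tier) != "pass":
                    problems.append(
                        f"rung {rung} is pass but tier {tier} is {tiers.get(tier)!r}"
                    )

    # The level may not exceed the number of consecutive passing rungs.
    climbed = 0
    for rung in LADDER:
        if rungs.get(rung) != "pass":
            break
        climbed += 1
    if doc.get("level", 0) > climbed:
        problems.append(f"level {doc.get('level')} exceeds climbed rungs {climbed}")
    return problems
```

To use it, load a run's artifact with `json.load` (for example /var/lib/cc-ci-runs/adv-uk/results.json) and pass the resulting dict in; an empty list means no inflation was detected under these assumptions.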

EXPECTED.

  1. 28 passed.
  2. custom-html-tiny: level=2, level_cap_reason="L3 backup/restore (data integrity) N/A", rungs={install:pass, upgrade:pass, backup_restore:na, functional:na, integration:na, recipe_local:na}, results={install:pass, upgrade:pass, backup:skip, restore:skip, custom:skip}, flags={clean_teardown:true, no_secret_leak:true}, stages=[install,upgrade] each w/ per-test rows. (My run: /var/lib/cc-ci-runs/u0-cht-L2/results.json.)
  3. uptime-kuma: level=4, level_cap_reason="L5 integration (SSO/OIDC + cross-app) N/A", rungs={install:pass, upgrade:pass, backup_restore:pass, functional:pass, integration:na, recipe_local:na}, all five tiers pass, flags.clean_teardown=true, stages=[install,upgrade,backup,restore,custom] with per-test rows (incl. 3 uptime-kuma functional tests, source cc-ci). (My run: /var/lib/cc-ci-runs/u0-uk-L4/results.json.)
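
For orientation, the per-test rows mentioned above could be derived from a pytest `--junitxml` report along these lines. This is a sketch using only the standard library; the function name `parse_junit_rows`, the row fields, and the status labels are hypothetical and not taken from runner/harness/results.py.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Union


def parse_junit_rows(xml_data: Union[str, bytes]) -> List[Dict[str, str]]:
    """Turn JUnit XML into one row per <testcase>.

    A testcase with a <failure> or <error> child counts as 'fail', one
    with a <skipped> child as 'skip', anything else as 'pass'.
    """
    root = ET.fromstring(xml_data)
    rows: List[Dict[str, str]] = []
    for case in root.iter("testcase"):
        if case.find("failure") is not None or case.find("error") is not None:
            status = "fail"
        elif case.find("skipped") is not None:
            status = "skip"
        else:
            status = "pass"
        rows.append({
            "name": case.get("name", ""),
            "classname": case.get("classname", ""),
            "status": status,
        })
    return rows
```

Reading the report with `Path(...).read_bytes()` and passing the bytes through lets the XML parser handle the declared encoding itself.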

These two runs bracket the gate: custom-html-tiny is capped at L2 because the L3 backup/restore rung is N/A (gap-caps: the level never climbs past an unsatisfied rung, so a passing higher rung cannot inflate it), and uptime-kuma makes a full clean climb that caps at L4 because it has no SSO surface for L5.

In flight (next, post-gate)

  • U1 — app screenshot (Playwright, post-login, secret-safe). Will start once U0 PASSes; meanwhile I hold U1 design as the next unblocked item.

Blocked

(none)