decisions(3): settle level ladder + rung-mapping contract + artifact hosting (U0)

Records the exact tier+deps/SSO -> rung translation that derive_rungs uses (the layer the level depends on), the gap-caps semantics (N/A caps like fail; conservative, never inflates), the results.json schema, the flags (clean_teardown/no_secret_leak), and the artifact dir ${CCCI_RUNS_DIR:-/var/lib/cc-ci-runs}/<run_id>/ (dashboard serves /runs/<id>/ in U2/U4), so the Adversary can verify the level against a documented contract.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 06:01:38 +00:00
parent 52e5d210d8
commit 757511e4e7
1 changed files with 50 additions and 0 deletions
--- a/machine-docs/DECISIONS.md
+++ b/machine-docs/DECISIONS.md
@@ -1156,3 +1156,53 @@ deploys == 1 (base) + N_cold_deps
  not a test-sequence deploy.

 Full record: `docs/perf/deploys.md`.
+
+---
+
+## Phase 3 — Level ladder + rung mapping + artifact hosting (SETTLED 2026-05-31)
+
+**Level ladder (R1, plan-phase3 §4.1).** A single integer `level` 0–6, YunoHost gap-caps semantics:
+`level = highest rung L such that rungs 1..L are ALL a clean PASS`. The first rung that is not a clean
+PASS — a real **FAIL** *or* genuinely **N/A** for this recipe — stops the climb; `level_cap_reason`
+records which rung and why. **N/A caps just like FAIL** (the only worked example in §4.1, "recipes
+with no integration surface cap at L4 by definition", is exactly N/A-caps, with a recorded reason so
+the level is *fair*, not inflated). Conservative by construction: presentation can only ever
+**understate**, never overstate, the tested quality (plan §6 cardinal guardrail). Pure mapper:
+`runner/harness/level.py::compute_level(rungs)->(level,cap_reason)`; unit-tested + Adversary
+fuzz-clean (REVIEW-3 @df54693, 729/729 no inflation).
+
+  L0 install failed/never healthy · L1 install · L2 upgrade · L3 backup/restore · L4 functional
+  · L5 integration (SSO/OIDC) · L6 recipe-local (repo's own tests/).
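The gap-caps rule above can be sketched as a small pure function. This is an illustration under stated assumptions, not the contents of `runner/harness/level.py`: the rung key names are taken from the rung-mapping section, while the wording of the cap reason and the treatment of a missing rung as N/A are assumptions.

```python
from typing import Optional

# Rung order L1..L6 as listed in the ladder; key names follow the rung mapping.
RUNG_ORDER = [
    "install",
    "upgrade",
    "backup_restore",
    "functional",
    "integration",
    "recipe_local",
]


def compute_level(rungs: dict[str, str]) -> tuple[int, Optional[str]]:
    """Return (level, cap_reason) under gap-caps semantics.

    Illustrative sketch: the real mapper may word cap_reason differently.
    """
    level = 0
    for index, name in enumerate(RUNG_ORDER, start=1):
        # Assumption: a rung missing from the dict is treated as N/A.
        status = rungs.get(name, "na")
        if status != "pass":
            # FAIL and N/A both stop the climb, so a later pass cannot
            # raise the level past this rung.
            return level, f"L{index} {name}: {status}"
        level = index
    return level, None
```

With this shape a recipe that passes L1 to L4, has no integration surface, and passes its repo-local tests still reports level 4, which is the "never inflate" property the record describes.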
+
+**Rung mapping (the translation layer the level depends on).** `run_recipe_ci.py` holds the run's
+per-tier results + deps/SSO signals; `results.derive_rungs(...)` maps them to the rung-status dict
+`compute_level` consumes (each rung ∈ {pass,fail,na}):
+- **install** = install tier pass→pass / fail→fail.
+- **upgrade** = upgrade tier (skip → **na**: only one published version, nothing to upgrade from).
+- **backup_restore** = backup AND restore tiers both pass→pass; either fail→fail; not backup-capable
+  (both skip) → **na**. (One rung for the L3 data-integrity claim — needs both halves.)
+- **functional** (L4) = the custom tier minus its SSO tests: custom pass→pass, fail→fail (conservative:
+  with declared deps we don't split functional-vs-SSO failure, so a custom fail fails functional →
+  caps at L3, never inflates), no custom tests → **na**.
+- **integration** (L5) = applies ONLY if the recipe declares deps (else **na** → the "no integration
+  surface caps at L4" rule). pass iff deps wired (`deps_ready`) AND not `sso_dep_unverified` (F2-11)
+  AND custom didn't fail; else fail.
+- **recipe_local** (L6) = the recipe repo's own `tests/` (discovery source `repo-local`) ran and all
+  passed → pass; any repo-local file failed → fail; none present → **na**.
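The mapping rules in the list above could look roughly like the following. The signature is hypothetical (the record only shows `results.derive_rungs(...)`), tier statuses are assumed to be the strings `pass`, `fail`, and `skip`, and two cases the record leaves open are resolved conservatively and marked as assumptions in the comments.

```python
from typing import Optional


def derive_rungs(
    tiers: dict[str, str],
    *,
    has_deps: bool,
    deps_ready: bool = False,
    sso_dep_unverified: bool = False,
    repo_local: Optional[list[str]] = None,
) -> dict[str, str]:
    """Map per-tier results plus deps/SSO signals to rung statuses."""

    def tier_rung(name: str) -> str:
        # pass/fail carry over; skip (or a missing tier) becomes N/A.
        status = tiers.get(name, "skip")
        return status if status in ("pass", "fail") else "na"

    # Assumption: an install tier that did not pass is a fail, never N/A.
    install = "pass" if tiers.get("install") == "pass" else "fail"

    backup, restore = tier_rung("backup"), tier_rung("restore")
    if backup == "pass" and restore == "pass":
        backup_restore = "pass"
    elif "fail" in (backup, restore):
        backup_restore = "fail"
    else:
        # Both skipped means not backup-capable. Assumption: a mixed
        # pass/skip result is also N/A (it caps the level either way).
        backup_restore = "na"

    custom = tier_rung("custom")

    if not has_deps:
        integration = "na"  # no integration surface, so the level caps at L4
    elif deps_ready and not sso_dep_unverified and custom != "fail":
        integration = "pass"
    else:
        integration = "fail"

    local = repo_local or []
    if not local:
        recipe_local = "na"
    elif all(status == "pass" for status in local):
        recipe_local = "pass"
    else:
        recipe_local = "fail"

    return {
        "install": install,
        "upgrade": tier_rung("upgrade"),
        "backup_restore": backup_restore,
        "functional": custom,
        "integration": integration,
        "recipe_local": recipe_local,
    }
```

Every branch that is not a clean pass yields `fail` or `na`, so whichever way an ambiguous input is resolved, the resulting level can only be capped, not raised.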
+
+Surfaced as **flags, not levels** (gating invariants from Phase 1, shown not climbed): `clean_teardown`
+(deploy-count == expected AND no dep-teardown error) and `no_secret_leak` (no known infra-secret value
+appears in the serialised results.json — a narrow self-scan; the Adversary's broader leak scan is the
+authority, R7/U5).
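A minimal sketch of how the two flags might be computed. The function name and parameters are hypothetical; only the two conditions (deploy count matches and no dep-teardown error; no known infra-secret value in the serialised results) come from the record.

```python
import json
from typing import Any, Iterable


def compute_flags(
    results: dict[str, Any],
    *,
    deploy_count: int,
    expected_deploys: int,
    dep_teardown_error: bool,
    known_secrets: Iterable[str],
) -> dict[str, bool]:
    """Compute the gating flags that are shown next to the level."""
    # Scan the serialised form, since that is what gets published.
    # ensure_ascii=False keeps non-ASCII secret values unescaped; values
    # containing quotes or backslashes are still escaped by JSON and would
    # be missed, which is one reason this is only a narrow self-scan.
    serialised = json.dumps(results, ensure_ascii=False)
    leaked = any(secret and secret in serialised for secret in known_secrets)
    return {
        "clean_teardown": deploy_count == expected_deploys and not dep_teardown_error,
        "no_secret_leak": not leaked,
    }
```

Empty secret values are skipped deliberately: an empty string is a substring of everything and would otherwise flag every run as leaking.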
+
+**results.json** (`runner/harness/results.py::build_results`) carries:
+`{schema,run_id,recipe,version,pr,ref,finished,level,level_cap_reason,rungs,stages:[{name,status,
+tests:[{name,classname,status,ms,message,source}]}],results,flags,screenshot,summary_card}`.
+Per-test rows come from per-tier pytest `--junitxml` (stdlib XML parse — no new dep). Assembly is
+**best-effort, wrapped so a failure NEVER changes the run's exit code** (R7 — cosmetics never block).
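The per-test rows and the best-effort wrapping can be sketched with the standard library alone. The helper names and the per-test status vocabulary (`pass`/`fail`/`skip`) are assumptions; the row keys match the schema above, and the element names are the ones pytest's `--junitxml` report uses.

```python
import xml.etree.ElementTree as ET


def parse_junit(xml_text: str, source: str) -> list[dict]:
    """Turn one junit XML report into per-test rows."""
    rows = []
    root = ET.fromstring(xml_text)
    for case in root.iter("testcase"):
        status, message = "pass", ""
        for tag in ("failure", "error", "skipped"):
            node = case.find(tag)
            if node is not None:
                status = "skip" if tag == "skipped" else "fail"
                message = node.get("message", "")
                break
        rows.append({
            "name": case.get("name", ""),
            "classname": case.get("classname", ""),
            "status": status,
            # junit reports seconds; the schema field is milliseconds.
            "ms": round(float(case.get("time", "0")) * 1000),
            "message": message,
            "source": source,
        })
    return rows


def safe_rows(xml_text: str, source: str) -> list[dict]:
    """Best-effort wrapper: a malformed report yields no rows, never an error."""
    try:
        return parse_junit(xml_text, source)
    except Exception:
        return []
```

Swallowing every exception is normally poor practice; here it mirrors the stated rule that results assembly is cosmetic and must not change the run's exit code.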
+
+**Artifact hosting (U0.4).** Runner writes per-run artifacts to
+`${CCCI_RUNS_DIR:-/var/lib/cc-ci-runs}/<run_id>/` (results.json, junit/, later screenshot.png + summary.png). `run_id` = Drone build number
+when present (what the PR comment + dashboard link to), else the unique run domain. The dashboard
+service will serve this dir read-only at `/runs/<run_id>/...` (wired in U2/U4 via a host bind-mount on
+the dashboard swarm service). Decided here; serving deferred to U2/U4 where the card/screenshot need it.
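The artifact path and `run_id` choice described above might be resolved as follows. Both function names are hypothetical, and reading the build number from `DRONE_BUILD_NUMBER` is an assumption about how the runner detects Drone; the default directory and the fallback to the run domain come from the record.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def pick_run_id(run_domain: str, environ: Optional[Mapping[str, str]] = None) -> str:
    """Prefer the Drone build number; fall back to the unique run domain."""
    env = os.environ if environ is None else environ
    # Assumption: the build number arrives via DRONE_BUILD_NUMBER.
    return env.get("DRONE_BUILD_NUMBER") or run_domain


def run_artifact_dir(run_id: str, environ: Optional[Mapping[str, str]] = None) -> Path:
    """Resolve ${CCCI_RUNS_DIR:-/var/lib/cc-ci-runs}/<run_id>/."""
    env = os.environ if environ is None else environ
    # Shell ${VAR:-default} falls back when VAR is unset OR empty, hence `or`.
    base = env.get("CCCI_RUNS_DIR") or "/var/lib/cc-ci-runs"
    return Path(base) / str(run_id)
```

Passing the environment in as a mapping keeps both helpers easy to test without touching the process environment.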