decisions(3): settle level ladder + rung-mapping contract + artifact hosting (U0)

Records the exact tier+deps/SSO -> rung translation derive_rungs uses (the layer the level depends
on), gap-caps semantics (N/A caps like fail, conservative/never-inflate), the results.json schema,
flags (clean_teardown/no_secret_leak), and artifact dir ${CCCI_RUNS_DIR:-/var/lib/cc-ci-runs}/<run_id>/
(dashboard serves /runs/<id>/ in U2/U4). So the Adversary can verify the level against a documented contract.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
autonomic-bot
2026-05-31 06:01:38 +00:00
parent 52e5d210d8
commit 757511e4e7

@@ -1156,3 +1156,53 @@ deploys == 1 (base) + N_cold_deps
not a test-sequence deploy.
Full record: `docs/perf/deploys.md`.
---
## Phase 3 — Level ladder + rung mapping + artifact hosting (SETTLED 2026-05-31)
**Level ladder (R1, plan-phase3 §4.1).** A single integer `level` from 0 to 6, with YunoHost gap-caps semantics:
`level = highest rung L such that rungs 1..L are ALL a clean PASS`. The first rung that is not a clean
PASS — a real **FAIL** *or* genuinely **N/A** for this recipe — stops the climb; `level_cap_reason`
records which rung and why. **N/A caps just like FAIL** (the only worked example in §4.1, "recipes
with no integration surface cap at L4 by definition", is exactly N/A-caps, with a recorded reason so
the level is *fair*, not inflated). Conservative by construction: presentation can only ever
**understate**, never overstate, the tested quality (plan §6 cardinal guardrail). Pure mapper:
`runner/harness/level.py::compute_level(rungs)->(level,cap_reason)`; unit-tested + Adversary
fuzz-clean (REVIEW-3 @df54693, 729/729 no inflation).
L0 install failed/never healthy · L1 install · L2 upgrade · L3 backup/restore · L4 functional
· L5 integration (SSO/OIDC) · L6 recipe-local (repo's own tests/).
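
The gap-caps rule above can be sketched as a pure function. This is a minimal illustration, not the contents of `runner/harness/level.py`: the rung names come from the mapping in this record, while the `RUNG_ORDER` constant, the treatment of a missing rung as N/A, and the wording of the cap reason are assumptions.

```python
from typing import Dict, Optional, Tuple

# Rungs L1..L6 in climb order. Names follow this record; the constant is hypothetical.
RUNG_ORDER = [
    "install", "upgrade", "backup_restore",
    "functional", "integration", "recipe_local",
]


def compute_level(rungs: Dict[str, str]) -> Tuple[int, Optional[str]]:
    """Return (level, cap_reason): the highest L such that rungs 1..L are all 'pass'."""
    level = 0
    for index, name in enumerate(RUNG_ORDER, start=1):
        # Assumption: a rung missing from the dict is treated as N/A.
        status = rungs.get(name, "na")
        if status != "pass":
            # Both 'fail' and 'na' stop the climb; later rungs are never consulted,
            # so a pass above a gap cannot inflate the level.
            return level, f"L{index} {name}: {status}"
        level = index
    return level, None  # every rung passed, nothing capped the level
```

For example, a recipe with no declared deps has `integration` = `na`, so it stops at L4 even if its repo-local tests pass.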
**Rung mapping (the translation layer the level depends on).** `run_recipe_ci.py` holds the run's
per-tier results + deps/SSO signals; `results.derive_rungs(...)` maps them to the rung-status dict
`compute_level` consumes (each rung ∈ {pass,fail,na}):
- **install** = install tier pass→pass / fail→fail.
- **upgrade** = upgrade tier (skip → **na**: only one published version, nothing to upgrade from).
- **backup_restore** = backup AND restore tiers both pass→pass; either fail→fail; not backup-capable
(both skip) → **na**. (One rung for the L3 data-integrity claim — needs both halves.)
- **functional** (L4) = the custom tier minus its SSO tests: custom pass→pass, fail→fail, no custom
tests → **na**. Conservative: with declared deps we do not split functional failures from SSO
failures, so any custom fail also fails functional and caps the level at L3 (never inflates).
- **integration** (L5) = applies ONLY if the recipe declares deps (else **na** → the "no integration
surface caps at L4" rule). pass iff deps wired (`deps_ready`) AND not `sso_dep_unverified` (F2-11)
AND custom didn't fail; else fail.
- **recipe_local** (L6) = the recipe repo's own `tests/` (discovery source `repo-local`) ran and all
passed → pass; any repo-local file failed → fail; none present → **na**.
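
The six rules above can be sketched as one function. The real signature of `results.derive_rungs(...)` is not given in this record, so every parameter name here (`tiers`, `has_deps`, `repo_local`) is hypothetical, as is the assumption that tier statuses are `pass`/`fail`/`skip`. The handling of a mixed pass/skip backup pair is also an assumption (treated as fail, the conservative choice).

```python
from typing import Dict, List, Optional


def _tier_to_rung(status: Optional[str]) -> str:
    """pass/fail carry over; skip or a missing tier becomes na."""
    return status if status in ("pass", "fail") else "na"


def derive_rungs(
    tiers: Dict[str, str],      # hypothetical: tier name -> pass/fail/skip
    has_deps: bool,             # recipe declares deps
    deps_ready: bool,           # deps were wired
    sso_dep_unverified: bool,   # F2-11 signal
    repo_local: List[str],      # statuses of the repo's own tests/ files
) -> Dict[str, str]:
    backup = tiers.get("backup", "skip")
    restore = tiers.get("restore", "skip")
    if backup == "pass" and restore == "pass":
        backup_restore = "pass"
    elif backup == "skip" and restore == "skip":
        backup_restore = "na"  # recipe is not backup-capable
    else:
        # Either half failed. Assumption: a mixed pass/skip pair also fails.
        backup_restore = "fail"

    custom = tiers.get("custom")  # None means no custom tests
    if not has_deps:
        integration = "na"  # no integration surface, so the level caps at L4
    elif deps_ready and not sso_dep_unverified and custom != "fail":
        integration = "pass"
    else:
        integration = "fail"

    if not repo_local:
        recipe_local = "na"
    else:
        recipe_local = "pass" if all(s == "pass" for s in repo_local) else "fail"

    return {
        "install": "pass" if tiers.get("install") == "pass" else "fail",
        "upgrade": _tier_to_rung(tiers.get("upgrade")),
        "backup_restore": backup_restore,
        "functional": _tier_to_rung(custom),
        "integration": integration,
        "recipe_local": recipe_local,
    }
```

The returned dict has exactly the keys `compute_level` is described as consuming, each in {pass, fail, na}.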
Surfaced as **flags, not levels** (gating invariants from Phase 1, shown not climbed): `clean_teardown`
(deploy-count == expected AND no dep-teardown error) and `no_secret_leak` (no known infra-secret value
appears in the serialised results.json — a narrow self-scan; the Adversary's broader leak scan is the
authority, R7/U5).
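
A minimal sketch of how the two flags could be computed, assuming the harness already knows the deploy counts, any dep-teardown error, and a list of infra-secret values. The function name `compute_flags` and all of its parameters are hypothetical. Each secret is matched in its JSON-escaped form so that a value containing quotes or backslashes is still found in the serialised text.

```python
import json
from typing import Any, Dict, Iterable, Optional


def compute_flags(
    results: Dict[str, Any],
    deploy_count: int,
    expected_deploys: int,
    dep_teardown_error: Optional[str],
    known_secrets: Iterable[str],
) -> Dict[str, bool]:
    # Scan the serialised document, as described for the narrow self-scan.
    serialised = json.dumps(results)
    leaked = any(
        # json.dumps(secret)[1:-1] is the secret as it would appear inside a JSON string.
        secret and json.dumps(secret)[1:-1] in serialised
        for secret in known_secrets
    )
    return {
        "clean_teardown": deploy_count == expected_deploys and dep_teardown_error is None,
        "no_secret_leak": not leaked,
    }
```

This only catches verbatim values, which is why the record treats the Adversary's broader scan as the authority.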
**results.json** (`runner/harness/results.py::build_results`) carries:
`{schema,run_id,recipe,version,pr,ref,finished,level,level_cap_reason,rungs,stages:[{name,status,
tests:[{name,classname,status,ms,message,source}]}],results,flags,screenshot,summary_card}`.
Per-test rows come from per-tier pytest `--junitxml` (stdlib XML parse — no new dep). Assembly is
**best-effort, wrapped so a failure NEVER changes the run's exit code** (R7 — cosmetics never block).
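
As an illustration of the two points above, the sketch below parses a pytest `--junitxml` report with the standard library and wraps assembly so that an error cannot propagate. The helper names `parse_junit` and `safe_build` are hypothetical, and the per-test `status` values (`pass`/`fail`/`skip`) are an assumption; the row keys follow the schema listed above.

```python
import xml.etree.ElementTree as ET
from typing import Any, Callable, Dict, List, Optional


def parse_junit(xml_text: str, source: str) -> List[Dict[str, Any]]:
    """Turn a junit XML report into per-test rows."""
    rows = []
    for case in ET.fromstring(xml_text).iter("testcase"):
        status, message = "pass", ""
        for tag in ("failure", "error", "skipped"):
            node = case.find(tag)
            if node is not None:
                status = "skip" if tag == "skipped" else "fail"
                message = node.get("message", "")
                break
        rows.append({
            "name": case.get("name", ""),
            "classname": case.get("classname", ""),
            "status": status,
            # junit reports seconds; the schema's 'ms' is taken to mean milliseconds.
            "ms": round(float(case.get("time", "0")) * 1000),
            "message": message,
            "source": source,
        })
    return rows


def safe_build(build: Callable[..., Any], *args: Any, **kwargs: Any) -> Optional[Any]:
    """Best-effort wrapper: report the error and return None instead of raising."""
    try:
        return build(*args, **kwargs)
    except Exception as exc:  # deliberately broad: cosmetics never block
        print(f"results assembly failed (ignored): {exc}")
        return None
```

Because `safe_build` never raises, the caller's exit code stays whatever the test tiers decided.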
**Artifact hosting (U0.4).** Runner writes per-run artifacts to
`${CCCI_RUNS_DIR:-/var/lib/cc-ci-runs}/<run_id>/` (results.json, junit/, later screenshot.png +
summary.png). `run_id` = Drone build number
when present (what the PR comment + dashboard link to), else the unique run domain. The dashboard
service will serve this dir read-only at `/runs/<run_id>/...` (wired in U2/U4 via a host bind-mount on
the dashboard swarm service). Decided here; serving deferred to U2/U4 where the card/screenshot need it.
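
A minimal sketch of the path and `run_id` resolution described above. The helper names are hypothetical, and reading the build number from Drone's `DRONE_BUILD_NUMBER` variable is an assumption, since this record only says "Drone build number". The `or` fallback mirrors the shell `${VAR:-default}` form, which applies the default when the variable is unset or empty.

```python
import os
from pathlib import PurePosixPath
from typing import Mapping, Optional

DEFAULT_RUNS_DIR = "/var/lib/cc-ci-runs"


def pick_run_id(env: Mapping[str, str], run_domain: str) -> str:
    """Prefer the Drone build number; fall back to the unique run domain."""
    # Assumption: the build number is exposed as DRONE_BUILD_NUMBER.
    return env.get("DRONE_BUILD_NUMBER") or run_domain


def run_artifact_dir(run_id: str, env: Optional[Mapping[str, str]] = None) -> PurePosixPath:
    """Resolve ${CCCI_RUNS_DIR:-/var/lib/cc-ci-runs}/<run_id>."""
    env = os.environ if env is None else env
    base = env.get("CCCI_RUNS_DIR") or DEFAULT_RUNS_DIR
    return PurePosixPath(base) / run_id
```

The dashboard URL `/runs/<run_id>/...` then maps one-to-one onto this directory once the bind-mount is wired in U2/U4.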