autonomic-bot 5b6b378ade claim(3 U0): results.json + level ladder — gate CLAIMED

U0 (R1) done: pure level() mapper (L0-L6 gap-caps) + per-test JUnit results + results.json, all
emitted best-effort (never changes verdict, R7). Two real runs bracket the gate:
custom-html-tiny=L2 (functional N/A, backup N/A caps at L2) and uptime-kuma=L4 (full climb, no SSO
surface caps at L4). 28 unit tests + Adversary fuzz-clean. Rung-mapping contract in DECISIONS.
Verify: STATUS-3.md HOW/EXPECTED. Awaiting Adversary cold-verify.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

2026-05-31 06:03:49 +00:00

Phase 3 — Beautiful YunoHost-style results — JOURNAL (Builder-private reasoning)

SSOT: /srv/cc-ci/cc-ci-plan/plan-phase3-results-ux.md. WHY lives here; WHAT/HOW/EXPECTED/WHERE → STATUS-3.

2026-05-31T05:41Z — Phase-3 bootstrap + orientation

Read plan-phase3-results-ux.md in full (SSOT) + plan.md §6.1/§7/§9. Oriented on the existing Phase-1/2 artifacts I'll extend:

runner/run_recipe_ci.py: orchestrates deploy-once → per-tier (install/upgrade/backup/restore/custom), produces an in-memory results dict {tier: 'pass'|'fail'|'skip'} printed to Drone logs. No results.json, no level, no screenshot today. Also tracks deploy-count (DG4.1), deps/SSO readiness (sso_dep_unverified → F2-11), teardown errors.
bridge/bridge.py: posts a text PR comment with the Drone run URL; watch_and_reflect edits it to ✅/❌ on completion. No image/badge/level.
dashboard/dashboard.py: stdlib HTTP service (swarm OCI image, Nix-built) that polls the Drone API only and renders a latest-per-recipe table + a basic per-recipe SVG badge (Drone status, not level). Runs as a container with no host volume mounts — relevant for artifact hosting (U0.4).

Key Phase-3 mapping insight: the level ladder (§4.1) maps cleanly onto the existing per-tier results:

L1 install-tier pass; L2 upgrade pass; L3 backup AND restore pass; L4 custom (functional) pass; L5 SSO/integration (requires_deps tests actually ran + passed — deps_ready and not sso_dep_unverified); L6 recipe-local tests pass (D4 — discovered repo-local overlay/custom).
Gap-caps-level (YunoHost): level = highest rung L such that every rung ≤ L passed. A rung that is genuinely N/A (e.g. backup not BACKUP_CAPABLE, or no SSO/integration surface) must NOT block the climb but caps with a recorded reason ("L4 — no integration surface" etc.) for fairness (§4.1 L5).
Invariants surfaced as flags not levels: clean-teardown ✔ (no dep_teardown_error / DG4.1 ok), no-secret-leak ✔.

Adversary is live (REVIEW-3 @05:42Z), flagged the Phase-2-DONE prerequisite but is not treating it as a P3 blocker; operator kicked Phase 3 off manually. Proceeding.

Plan for U0 (foundation)

Pure level() function in a new runner/harness/level.py — unit-testable (no I/O), so I can prove "L4-pass" and "L2-cap" semantics cheaply and the Adversary can re-run the unit test cold. This is the load-bearing logic; everything else (card, badge, dashboard) just renders what it returns.
Capture per-test detail: run each tier's pytest with --junitxml to a run-scoped dir, parse the XML (stdlib xml.etree) into per-test rows {name, status, ms}. Aggregate per stage.
run_recipe_ci.py assembles results.json {recipe, version, pr, ref, run_id, stages[], level, level_cap_reason, flags} and writes it to the artifact dir — wrapped so a failure here NEVER changes the run's exit code (R7: cosmetics never block).
Artifact hosting (U0.4): runner writes to a host dir; dashboard bind-mounts it read-only to serve /runs/<id>/.... Decide details + record in DECISIONS.
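
A minimal sketch of the JUnit capture step in the plan above, assuming the usual pytest `--junitxml` layout (`testcase` elements with `name` and `time` attributes, and `failure`, `error`, or `skipped` children). The names `parse_junit` and `summarize` and the sample test names are hypothetical, not taken from the runner; a real run would read the report file with `ET.parse` rather than an inline string.

```python
import xml.etree.ElementTree as ET


def parse_junit(xml_text: str) -> list[dict]:
    """Turn a JUnit XML report into rows of {name, status, ms}."""
    root = ET.fromstring(xml_text)
    rows = []
    # iter() finds testcase elements whether the root is <testsuites> or <testsuite>.
    for case in root.iter("testcase"):
        if case.find("failure") is not None or case.find("error") is not None:
            status = "fail"
        elif case.find("skipped") is not None:
            status = "skip"
        else:
            status = "pass"
        seconds = float(case.get("time") or 0)
        rows.append(
            {"name": case.get("name", ""), "status": status, "ms": round(seconds * 1000)}
        )
    return rows


def summarize(rows: list[dict]) -> dict:
    """Aggregate per-test rows into per-stage counts."""
    counts = {"pass": 0, "fail": 0, "skip": 0}
    for row in rows:
        counts[row["status"]] += 1
    return counts


# Hypothetical report for one tier; test names are made up for illustration.
SAMPLE = """<testsuites>
  <testsuite name="pytest" tests="3">
    <testcase classname="tests.test_install" name="test_http_ok" time="0.250"/>
    <testcase classname="tests.test_install" name="test_login" time="1.5">
      <failure message="assert 500 == 200">traceback</failure>
    </testcase>
    <testcase classname="tests.test_install" name="test_backup" time="0">
      <skipped message="not backup capable"/>
    </testcase>
  </testsuite>
</testsuites>"""

rows = parse_junit(SAMPLE)
print(rows[0])          # {'name': 'test_http_ok', 'status': 'pass', 'ms': 250}
print(summarize(rows))  # {'pass': 1, 'fail': 1, 'skip': 1}
```

Keeping the parser separate from the aggregation means the same rows can feed both the per-stage summary and the card display.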

2026-05-31T06:00Z — U0 complete + CLAIMED

Implemented U0.1–U0.4. Two real end-to-end runs on cc-ci confirm the translation layer (the binding risk the Adversary flagged at df54693) produces correct levels:

custom-html-tiny (stateless, not backup-capable, ≥2 versions): install+upgrade pass, backup/restore skip→N/A, no custom → level=2, cap "L3 backup/restore N/A". Proves gap-caps on real data.
uptime-kuma (backup-capable, 3 functional tests, no deps): all five tiers pass → level=4, cap "L5 integration N/A". Proves that a full clean climb with no SSO surface caps at L4.

Both runs: deploy-count=1, clean_teardown=true, no_secret_leak=true, no orphan apps afterwards.
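
The two results above can be reproduced with a small sketch of a gap-caps mapper. It assumes the strict reading, where the first rung that did not pass (failed or N/A) stops the climb and supplies the cap reason. The rung names, the `combine` helper, and the exact reason strings are illustrative assumptions, not the contents of runner/harness/level.py.

```python
from typing import Optional

# Ladder order L1..L6; the names are illustrative labels for the rungs.
RUNGS = ["install", "upgrade", "backup/restore", "functional", "integration", "recipe-local"]


def combine(*states: str) -> str:
    """Fold several tier results into one rung state."""
    if "fail" in states:
        return "fail"
    if all(s == "pass" for s in states):
        return "pass"
    return "na"  # a skipped tier is recorded as N/A


def level(rungs: dict[str, str]) -> tuple[int, Optional[str]]:
    """Return (level, cap_reason): the highest rung L with every rung <= L passed."""
    reached = 0
    for number, name in enumerate(RUNGS, start=1):
        state = rungs.get(name, "na")  # a rung with no result counts as N/A
        if state != "pass":
            label = "failed" if state == "fail" else "N/A"
            return reached, f"L{number} {name} {label}"
        reached = number
    return reached, None


# custom-html-tiny: backup and restore skipped, no custom tier.
tiny = {
    "install": "pass",
    "upgrade": "pass",
    "backup/restore": combine("skip", "skip"),
}
# uptime-kuma: all five tiers pass, no integration surface.
kuma = {
    "install": "pass",
    "upgrade": "pass",
    "backup/restore": combine("pass", "pass"),
    "functional": "pass",
}
print(level(tiny))  # (2, 'L3 backup/restore N/A')
print(level(kuma))  # (4, 'L5 integration N/A')
```

Because the function does no I/O, cases like these can be re-run cold as plain unit tests.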

Design notes / WHY:

Chose STRICT monotonic capping (N/A caps like FAIL, distinct reason) over "N/A transparent for middle rungs" because the only worked example in §4.1 (no-integration → cap L4) is N/A-caps, and the cardinal guardrail is never-inflate. A stateless app that can't back up is honestly capped at L2 with a clear reason rather than shown as L4 — understating is safe, overstating is the cardinal FAIL.
Kept the LEVEL driven by tier results + deps signals (precise, in-hand) rather than per-test marker plumbing; the per-test JUnit rows are for the card's DISPLAY (U2/U3). The functional-vs-SSO split inside the custom tier is conservative: a custom FAIL fails the functional rung (caps at L3) because we cannot cheaply tell the two apart, so it never inflates.
results.json assembly + the narrow leak-scan are wrapped in try/except in main() so any failure is logged but never changes the overall verdict (R7). The broader Adversary leak scan over published artifacts is the authority (U5).
"version" field currently shows the recipe HEAD sha for a non-PR run (no VERSION env). Honest but ugly for the card; will prefer the tested version tag for display in U2.

Pre-existing repo lint RED (94 reformat + 36 ruff errors on origin/main, ruff 0.7.3 on CI devshell): not mine, flagged in STATUS for the operator. My new files are clean; run_recipe_ci.py is left better than I found it (1 error, down from 4). NOT reformatting 94 cross-phase files in Phase 3 (out of scope, huge noise).
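
Returning to the R7 note in the design notes: a minimal sketch of a best-effort results.json write, assuming the verdict is computed before any cosmetic step runs. The helper names `write_results_best_effort` and `main_exit_code` and the reduced payload are hypothetical; the real main() assembles more fields.

```python
import json
import logging
import tempfile
from pathlib import Path

log = logging.getLogger("results")


def write_results_best_effort(artifact_dir, payload: dict) -> bool:
    """Try to write results.json; report success but never raise."""
    try:
        out = Path(artifact_dir)
        out.mkdir(parents=True, exist_ok=True)
        (out / "results.json").write_text(json.dumps(payload, indent=2))
        return True
    except Exception:
        # Logged for the operator; the caller's verdict is untouched.
        log.exception("results.json not written; verdict unchanged")
        return False


def main_exit_code(tiers: dict, artifact_dir) -> int:
    # The verdict is fixed from the tier results first.
    exit_code = 1 if "fail" in tiers.values() else 0
    # Cosmetic step: its outcome is deliberately ignored.
    write_results_best_effort(artifact_dir, {"stages": tiers})
    return exit_code


with tempfile.TemporaryDirectory() as tmp:
    print(main_exit_code({"install": "pass", "upgrade": "pass"}, tmp))  # 0
    # Pointing the artifact dir at a regular file makes the write fail,
    # yet the exit code still reflects only the tier results.
    blocker = Path(tmp) / "blocker"
    blocker.write_text("x")
    print(main_exit_code({"install": "pass", "upgrade": "pass"}, blocker))  # 0
```

Catching the broad `Exception` is intentional here: the point is that no artifact-side error, whatever its type, can reach the exit code.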
