From 9e7d76ca1faab181902992db8cdd9098c531b5cb Mon Sep 17 00:00:00 2001 From: autonomic-bot Date: Wed, 17 Jun 2026 04:29:52 +0000 Subject: [PATCH] =?UTF-8?q?plan:=20queue=20dash=20=E2=80=94=20fix=20incomp?= =?UTF-8?q?lete=20per-recipe=20run=20history=20on=20the=20CI=20dashboard?= =?UTF-8?q?=20(opus,=20after=20canon)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Operator 2026-06-17. /recipe/ shows only the latest run for most recipes because history is built from a single page of the latest 100 Drone builds, while 362 runs exist on the host. Source per-recipe history from the local /var/lib/cc-ci-runs artifacts (already bind-mounted read-only) instead — full, durable history. Deploy + verify live on bluesky-pds. --- cc-ci-plan/agents.toml | 2 + cc-ci-plan/plan-phase-dash-recipe-history.md | 81 ++++++++++++++++++++ 2 files changed, 83 insertions(+) create mode 100644 cc-ci-plan/plan-phase-dash-recipe-history.md diff --git a/cc-ci-plan/agents.toml b/cc-ci-plan/agents.toml index 0fa9d5a..039944a 100644 --- a/cc-ci-plan/agents.toml +++ b/cc-ci-plan/agents.toml @@ -162,4 +162,6 @@ phases = [ { id = "samever", plan = "plan-phase-samever-older-base-fallback.md", status = "STATUS-samever.md", models = { builder = "claude-opus-4-8", adversary = "claude-opus-4-8" } }, # make the canonical sweep ACTUALLY work (substitute for the hollow nightly sweep) + upstream-sync + skip-unchanged; verify end-to-end (opus) — see plan-phase-canon-*.md (operator 2026-06-17) { id = "canon", plan = "plan-phase-canon-canonical-sweep.md", status = "STATUS-canon.md", models = { builder = "claude-opus-4-8", adversary = "claude-opus-4-8" } }, + # fix incomplete per-recipe run history on the CI dashboard (capped at latest 100 Drone builds; 362 runs exist) — source from local /var/lib/cc-ci-runs (opus) — see plan-phase-dash-*.md (operator 2026-06-17) + { id = "dash", plan = "plan-phase-dash-recipe-history.md", status = "STATUS-dash.md", models = { builder = 
"claude-opus-4-8", adversary = "claude-opus-4-8" } }, ] diff --git a/cc-ci-plan/plan-phase-dash-recipe-history.md b/cc-ci-plan/plan-phase-dash-recipe-history.md new file mode 100644 index 0000000..12c9dcc --- /dev/null +++ b/cc-ci-plan/plan-phase-dash-recipe-history.md @@ -0,0 +1,81 @@ +# Phase `dash` — fix incomplete per-recipe run history on the CI dashboard + +**Mission (operator-specified 2026-06-17):** the dashboard's per-recipe history page +(`https://ci.commoninternet.net/recipe/`, e.g. `/recipe/bluesky-pds`) shows only the latest run +for most recipes. Make it show the **full run history** per recipe. + +State files: `STATUS-dash.md`, `BACKLOG-dash.md`, `REVIEW-dash.md`, `JOURNAL-dash.md`. DECISIONS.md shared. + +## 1. Root cause (verified 2026-06-17) + +`dashboard/dashboard.py` builds the per-recipe history **solely from the Drone API, capped at a single +page of the latest 100 builds**: +```python +builds = _drone(f"/api/repos/{CI_REPO}/builds?per_page=100") # single page, no pagination +def history_for(recipe): + builds = _custom_recipe_builds() + return [_build_row(b) for b in builds if (b.get("params") or {}).get("RECIPE") == recipe] +``` +But there are **362 actual runs** on the host (`/var/lib/cc-ci-runs` has 362 run dirs). So ~262 runs are +older than the 100-build window and never fetched. The recent `regall` sweep `!testme`'d each of the 21 +recipes once, filling the latest-100 window and pushing each recipe's older runs out of view → most +recipes show exactly one run. (The overview/latest-per-recipe page is unaffected — it only needs the +recent window.) + +## 2. Design — source history from the local run artifacts + +The dashboard already **bind-mounts `/var/lib/cc-ci-runs` read-only** (see `nix/modules/dashboard.nix`) +and reads each run's `results.json` (`_results_for`). 
Build the per-recipe history from THAT, not from +the 100-build Drone slice — it's complete (362 runs), durable (independent of Drone pagination/retention), +and already available. + +- **`history_for(recipe)` → enumerate `/var/lib/cc-ci-runs/*/results.json`**, keep those whose recipe + matches, sort newest-first (by run id / timestamp), and render the existing history table (status, + level, version, ref, when, link to `/runs//…`). Apply a sane **display cap** (e.g. the last ~30 per + recipe) so a long-lived recipe's page stays bounded. +- First **confirm the `results.json` schema** carries what's needed (recipe, version, level/status, ref, + timestamp) — adapt if a field is named differently or read the run id from the dir name; skip a run dir + with no/malformed `results.json` gracefully (don't 500). +- **Keep Drone only where it adds value** — e.g. the live "currently running" status for the most recent + run (a run mid-flight has no final `results.json` yet). The *historical* list comes from local + artifacts. Keep the overview + `/badge/.svg` working exactly as today. +- **Retention check:** 362 runs implies adequate retention, but confirm nothing (cleanlogs / docker-prune) + trims `/var/lib/cc-ci-runs` so aggressively that history vanishes; if it does, note it in DECISIONS and + keep a Drone-pagination fallback. Do not add unbounded growth — if retention needs a cap, record it. + +## 3. Gates + +**M1 — fix implemented + locally verified.** `history_for` (and any helper) sources per-recipe history +from `/var/lib/cc-ci-runs`, newest-first, display-capped; `results.json` schema confirmed; malformed/empty +run dirs handled without erroring. **Python stdlib only** (the dashboard's standing constraint); the +existing path-traversal guard + the `/recipe/` name validation preserved. Unit/local render test shows a +recipe with many runs now lists them all (up to the cap). 
Adversary cold-verifies: the rendered history +matches the actual run dirs for that recipe (count + order), no security regression (path traversal, arg +injection), overview + badge routes unchanged. + +**M2 — deployed + verified live.** Rebuild/redeploy the dashboard service (the `deploy-dashboard` +reconcile; the content-hash image tag rolls on `dashboard.py` change). Then confirm on the live site: +`/recipe/bluesky-pds` and ≥2 other recipes show their **full** run history (multiple runs, matching the +local run count), with correct status/level/links; the overview and badges still render. Fresh Adversary +PASS on both milestones → `## DONE`. + +## 4. Guardrails + +- **Read-only dashboard** — it never writes run artifacts; the mount stays read-only. +- **Python stdlib only** (no new deps — it's a stdlib HTTP server packaged into an OCI image). +- **Preserve security + the other routes:** keep the path-traversal guard and the `/recipe/` validation; + do not regress the overview, `/badge/.svg`, `/runs//`, or the bridge's `/hook` routing + (traefik priority). +- **Host/deploy change** (redeploying the dashboard service via its reconcile / a nixos-rebuild): loops may + deploy if clean and **verify host health after** (dashboard service N/N, `ci.commoninternet.net` 200); + else file for the orchestrator. Commit author `autonomic-bot `; + push every commit. +- Bounded scope — this is a history-page fix, not a dashboard redesign. + +## 5. Definition of Done + +The per-recipe history page sources its run list from the local `/var/lib/cc-ci-runs` artifacts and shows +the **full** (display-capped) history per recipe — deployed and verified live on `bluesky-pds` + ≥2 other +recipes (multiple runs each, matching the host's run count), with the overview/badges/other routes +unaffected and the dashboard still stdlib-only + read-only. Retention confirmed adequate (or recorded). +M1 + M2 fresh Adversary PASSes in REVIEW-dash.md.
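A minimal, stdlib-only sketch of the section 2 design, for orientation only and not the actual `dashboard.py` change. It assumes each run's `results.json` is a JSON object with a top-level `recipe` key and that run directory names can serve as run ids. The plan asks for both to be confirmed first, so treat the key name, the helper names (`_load_results`, `_run_sort_key`) and the constants (`RUNS_DIR`, `DISPLAY_CAP`) as hypothetical.

```python
import json
from pathlib import Path

RUNS_DIR = Path("/var/lib/cc-ci-runs")  # read-only bind mount named in the plan
DISPLAY_CAP = 30  # hypothetical value; the plan suggests "the last ~30 per recipe"


def _load_results(run_dir):
    """Return the parsed results.json of one run dir, or None if missing/malformed."""
    try:
        with open(run_dir / "results.json", encoding="utf-8") as fh:
            data = json.load(fh)
    except (OSError, ValueError):  # missing/unreadable file, bad JSON, bad encoding
        return None
    return data if isinstance(data, dict) else None


def _run_sort_key(row):
    """Order by run id: numeric ids compare as integers, anything else as text."""
    run_id = row["run_id"]
    numeric = run_id.isdigit()
    return (numeric, int(run_id) if numeric else 0, run_id)


def history_for(recipe, runs_dir=RUNS_DIR, cap=DISPLAY_CAP):
    """List up to `cap` runs of `recipe`, newest first, from local artifacts only."""
    try:
        run_dirs = [p for p in runs_dir.iterdir() if p.is_dir()]
    except OSError:  # mount missing or unreadable: show an empty history, no 500
        return []
    rows = []
    for run_dir in run_dirs:
        data = _load_results(run_dir)
        # ASSUMPTION: the recipe name is stored under the key "recipe".
        if data is None or data.get("recipe") != recipe:
            continue
        rows.append({**data, "run_id": run_dir.name})
    rows.sort(key=_run_sort_key, reverse=True)
    return rows[:cap]
```

Sorting on the directory name avoids depending on a timestamp field whose name and format are not yet confirmed. If the schema check turns up a reliable timestamp, it could replace `_run_sort_key`. The function only reads from the mount, which keeps the read-only guardrail intact.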