cc-ci/machine-docs/STATUS-dash.md
autonomic-bot f68f1c56d9
status(dash): ## DONE — M1+M2 fresh Adversary PASS (3595e80, 4c0b289), no VETO
Per-recipe history now sources the full run list from local /var/lib/cc-ci-runs
artifacts; deployed (image 11ac2a1e6c07, 1/1) + verified live: bluesky-pds 8 in
exact host ts order, ghost 24/immich 28/discourse 25, plausible/custom-html
capped 30 newest; overview+badges 200; traversal/injection rejected; retention
no-trim. DoD plan §5 met.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-17 16:40:29 +00:00

# STATUS — phase `dash` (per-recipe run history fix)
SSOT: /srv/cc-ci/cc-ci-plan/plan-phase-dash-recipe-history.md
Gates: M1 (fix implemented + locally verified) · M2 (deployed + verified live)
## DONE
Both gates fresh Adversary PASS in REVIEW-dash.md, no VETO:
- **M1 PASS** @2026-06-17T16:30Z (claim 3595e80, cold) — local-artifact history sourcing.
- **M2 PASS** @2026-06-17T16:40Z (claim 4c0b289, cold live) — deployed `cc-ci-dashboard:11ac2a1e6c07`
1/1; live `/recipe/<recipe>` shows full per-recipe history (bluesky-pds 8 in exact host ts order,
ghost 24, immich 28, discourse 25; plausible/custom-html capped 30 keeping newest); overview +
badges 200; live traversal/injection rejected; retention confirmed no-trim.
DoD (plan §5) met: per-recipe history sources its run list from local `/var/lib/cc-ci-runs` artifacts,
shows the full display-capped history, deployed + verified live on bluesky-pds + ≥2 recipes, other
routes unaffected, dashboard stdlib-only + read-only, retention confirmed.
## Gate: M1 PASS (84ac65f)
**WHAT** `history_for(recipe)` in `dashboard/dashboard.py` now sources the FULL per-recipe run
history from the local run artifacts under `/var/lib/cc-ci-runs` (each run dir's `results.json`),
newest-first by the `finished` timestamp, display-capped at `HISTORY_CAP` (default 30). It no longer
reads the Drone `…/builds?per_page=100` slice (the root cause: that window dropped a recipe's older
runs out of view, so most recipes showed 1 run). Overview (`/`), `/badge/<recipe>.svg`,
`/runs/<id>/<file>`, security guards, and stdlib-only constraint are unchanged.
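
A minimal sketch of the local-artifact sourcing described above, not the actual `dashboard.py` code. It assumes each `results.json` is a JSON object with a `recipe` string and a `finished` timestamp that sorts chronologically as a string (the field names come from this document; their exact types are an assumption). The helper names `load_results` and `local_history_for` are hypothetical.

```python
import json
import os

HISTORY_CAP = 30  # display cap named in this document (default 30)


def load_results(run_dir):
    """Return the parsed results.json of one run dir, or {} if missing/malformed."""
    path = os.path.join(run_dir, "results.json")
    try:
        with open(path, encoding="utf-8") as fh:
            data = json.load(fh)
    except (OSError, ValueError):  # missing file, non-dir entry, bad JSON
        return {}
    return data if isinstance(data, dict) else {}


def local_history_for(recipe, runs_dir, cap=HISTORY_CAP):
    """Collect runs of `recipe` under runs_dir, newest-first by `finished`, capped."""
    rows = []
    for run_id in os.listdir(runs_dir):
        data = load_results(os.path.join(runs_dir, run_id))
        if data.get("recipe") != recipe:
            continue  # other recipes, and dirs with no usable results.json
        # Assumption: `finished` values sort chronologically as strings.
        rows.append({"number": run_id, "finished": str(data.get("finished", ""))})
    rows.sort(key=lambda r: r["finished"], reverse=True)
    return rows[:cap]  # keep only the newest `cap` rows
```

Because the cap is applied after sorting, a recipe with more runs than the cap keeps its newest entries, which is the behaviour this gate claims for `plausible`.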
**WHERE**
- Commit: see `git log` on origin/main for the `claim(M1)` commit (this push).
- Changed files: `dashboard/dashboard.py` (new `_run_status`, `_numeric_id`, `_local_history_row`,
`_local_history`; rewritten `history_for`; new `HISTORY_CAP`; new `_LOCAL` cache), and
`tests/unit/test_dashboard.py` (new `test_history_sourced_from_local_artifacts`).
- Host artifacts the page reads: `/var/lib/cc-ci-runs/<id>/results.json` (bind-mounted read-only into
the dashboard container, unchanged from before).
**HOW to verify (cold, from a fresh clone)**
1. Unit suite (stdlib render + new local-sourcing test):
```
nix-shell -p 'python3.withPackages(ps:[ps.pytest])' --run \
'DRONE_TOKEN_FILE=$(mktemp) python3 -m pytest tests/unit/test_dashboard.py -q'
```
EXPECTED: `13 passed`.
2. Verify against the REAL host artifacts. Build a fixture of every `results.json` and run
`history_for` against it (no Drone, no network):
```
FIX=/tmp/advfix; rm -rf $FIX; mkdir -p $FIX
ssh cc-ci 'cd /var/lib/cc-ci-runs && tar -cf - */results.json 2>/dev/null' | tar -xf - -C $FIX
printf x > /tmp/t.tok
DRONE_TOKEN_FILE=/tmp/t.tok CCCI_RUNS_DIR=$FIX python3 -c '
import sys; sys.path.insert(0,"dashboard"); import dashboard as d
r=d.history_for("bluesky-pds")
print("count", len(r), [x["number"] for x in r])
print("total parseable", sum(len(v) for v in d._local_history().values()))
print("plausible cap", len(d.history_for("plausible")))'
```
EXPECTED:
- `bluesky-pds` count **8**, order EXACTLY
`['753','556','435','427','423','ab-bluesky-pds-oldmain','m2rr-bluesky-pds','m2r-bluesky-pds']`
(newest-first by `finished`; note 423 sorts BELOW 427 though id 423<427, and named ids land in
their timestamp positions — the mixed numeric+named id trap).
- total parseable grouped rows **308** (matches host: 432 dirs, 308 with parseable `results.json`).
- `plausible` capped at **30** (of 33), newest kept.
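
The mixed numeric+named id trap can be illustrated with a small, self-contained comparison. The ids and timestamps below are invented for illustration only (they are not real run data); the point is that sorting by the id string and sorting by `finished` give different orders once named ids and multi-digit numbers are mixed.

```python
# Hypothetical (id, finished) pairs, invented for illustration.
runs = [
    ("9", "2026-06-01T09:00:00Z"),
    ("10", "2026-06-02T09:00:00Z"),
    ("named-run", "2026-06-01T12:00:00Z"),
]

# Newest-first by timestamp: the named id lands in its time position.
by_finished = [rid for rid, _ in sorted(runs, key=lambda r: r[1], reverse=True)]

# Descending by id string: named ids float to the top and "9" outranks "10".
by_id = [rid for rid, _ in sorted(runs, key=lambda r: r[0], reverse=True)]

print(by_finished)  # ['10', 'named-run', '9']
print(by_id)        # ['named-run', '9', '10']
```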
**EXPECTED — invariants the Adversary's break-tests should confirm hold**
- The 124 run dirs with no/malformed `results.json` are skipped (no 500, no garbage row): `_results_for`
returns `{}` on miss/malformed/non-dir, `_local_history` skips any row with no `recipe`.
- Security preserved (untouched code paths): `/recipe/<name>` still gated by `_RUN_ID_RE`
(`^[A-Za-z0-9][A-Za-z0-9._-]*$` → rejects `../..`, `foo/..`, spaces, `;`); `_results_for` /
`serve_run_file` still realpath-guarded against escaping `/var/lib/cc-ci-runs`.
- stdlib-only: no new imports (still `html,json,os,re,sys,time,urllib,http.server`).
- Overview (`/`) and `/badge/<recipe>.svg` still sourced from Drone latest-per-recipe (`_custom_recipe_builds`
/ `latest_per_recipe` unchanged) — only the *history* page changed source.
- Run-link resolution: numeric id → `{DRONE_URL}/{CI_REPO}/<id>`; named id (`m2r-*`, `ab-*`) →
`/runs/<id>/summary.html` (local, since no Drone build number exists).
- Status pill derived from the per-stage `results` map (`results.json` has no top-level status):
any `fail`/`error` → failure; all `pass`/`skip` → success; else unknown.
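
The id guard, run-link resolution, and status-pill rules listed above can be sketched as follows. This is an illustration under stated assumptions, not the code in `dashboard.py`: the function names `is_safe_id`, `run_link`, and `run_status` are hypothetical, the `results` map is assumed to map stage names to outcome strings, and treating an empty map as `unknown` is an assumption. Only the regular expression is quoted from this document.

```python
import re

# Pattern quoted in this document for gating ids taken from URL paths.
_RUN_ID_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*$")


def is_safe_id(value):
    """True if the value is a plain id with no slashes, spaces, or shell metacharacters."""
    return bool(_RUN_ID_RE.match(value))


def run_link(run_id, drone_url, ci_repo):
    """Numeric ids link to the Drone build; named ids link to the local summary."""
    if run_id.isascii() and run_id.isdigit():
        return f"{drone_url}/{ci_repo}/{run_id}"
    return f"/runs/{run_id}/summary.html"


def run_status(results):
    """Collapse per-stage outcomes into one pill: failure, success, or unknown."""
    outcomes = list(results.values())
    if any(o in ("fail", "error") for o in outcomes):
        return "failure"
    if outcomes and all(o in ("pass", "skip") for o in outcomes):
        return "success"
    return "unknown"  # assumption: empty or unrecognised outcomes are unknown
```

Checking failure before success means a single failed stage decides the pill even when every other stage passed.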
## Gate: M2 PASS (7507cf4)
**WHAT** — the dashboard service is rebuilt + redeployed with the M1 fix; the LIVE per-recipe
history page now shows the full (display-capped) local-artifact history. Verified on `bluesky-pds`
(8 runs) + `plausible` (30, capped from 33) + `ghost` (24); overview + badges + host health intact.
**WHERE** —
- Deployed image: `cc-ci-dashboard:11ac2a1e6c07` (content hash of the M1 dashboard.py; rolled FROM
`15addbc7bf45`). Source built from commit `84ac65f`+ (origin/main; this push adds the M2 status).
- Deploy: host flake clone `/etc/cc-ci` pulled, then `nixos-rebuild switch` from a `path:` flake of
the synced working tree (`path:/root/ccci-build#cc-ci`). A plain git-flake build drops the
`secrets/` submodule (gitlink), whereas the `path:` copy includes the on-disk `secrets/secrets.yaml`. The
`deploy-dashboard` reconcile rolled the swarm service on the new content-hash tag.
- Live: `https://ci.commoninternet.net/recipe/<recipe>`.
**HOW to verify (cold)** —
1. Deployed image + service health:
```
ssh cc-ci 'docker service ls --filter name=ccci-dashboard --format "{{.Replicas}} {{.Image}}"'
```
EXPECTED: `1/1 cc-ci-dashboard:11ac2a1e6c07`.
2. Live full history (count rows = run count on host):
```
for r in bluesky-pds plausible ghost; do
echo -n "$r: "; curl -s https://ci.commoninternet.net/recipe/$r \
| grep -coE '<tr><td><a href'; done
```
EXPECTED: `bluesky-pds: 8`, `plausible: 30` (capped from 33), `ghost: 24`, matching the host run
counts (`history_for` cap = 30).
3. Live order matches host timestamp order (mixed numeric+named id trap):
```
curl -s https://ci.commoninternet.net/recipe/bluesky-pds | grep -oE '>#[^<]+</a>' \
| sed 's/[>#<]//g; s|/a||'
```
EXPECTED exactly: `753 556 435 427 423 ab-bluesky-pds-oldmain m2rr-bluesky-pds m2r-bluesky-pds`.
4. Other routes unaffected:
```
curl -s -o /dev/null -w '%{http_code}\n' https://ci.commoninternet.net/ # 200 overview
curl -s -o /dev/null -w '%{http_code}\n' https://ci.commoninternet.net/badge/bluesky-pds.svg # 200
```
EXPECTED: both `200`; overview still latest-per-recipe (Drone-sourced, unchanged).
**EXPECTED — retention** confirmed adequate: no nix module/tmpfiles/cron trims `/var/lib/cc-ci-runs`
(`grep -rn cc-ci-runs nix/` shows no rm/find-delete/prune/maxage). Host: 439 run dirs spanning
2026-05-31 → 2026-06-17 (17 days). No growth cap needed now (recorded in DECISIONS).
## Blocked
(none)