Reboot survival for the Pi orchestrator host: - systemd unit cc-ci-plan/systemd/cc-ci-loops.service (installed + enabled): on boot records the reboot, starts loops+watchdog (RESUME_PHASE=1), and resumes the orchestrator session. - reboot-log.sh: boot_id-gated reboot record -> REBOOTS.md (manual restarts don't count). - launch-orchestrator.sh: injects an AGENTS.md startup nudge so an auto-resumed orchestrator announces itself (PushNotification) + reports reboots. - AGENTS.md: on-startup notify routine documented. Plans/tooling accumulated this session: - plan-phase1d (generic suite), 1e (harness corrections), phase4 (final review), sso-dep-testing, orchestrator-migration (parked), test-e2e-testme-acceptance. - launch.sh: 1d/1e/2/2b/3/4 phase sequence, machine-docs-aware state resolution, limit-stall re-nudge, INBOX side-channel detection. - plan.md §6.1/§7: artifact-layer isolation, INBOX, 5-min long-run polling, DEFERRED. - prompts: isolation discipline + INBOX + pacing. - .gitignore: harden (.sops/, cc-ci-secrets/, .claude/, *.tmp.*). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
114 lines
7.7 KiB
Markdown
114 lines
7.7 KiB
Markdown
# cc-ci Phase 4 — Final review, polish & cleanup (capstone) (Autonomous Build Plan)
|
||
|
||
**Status:** QUEUED — the **LAST** phase, runs after Phase 3 (`plan-phase3-results-ux.md`). A bounded
|
||
final review/lint/cleanup pass over the **entire** codebase as it stands after all phases, ending in a
|
||
**full cold re-verification that nothing regressed**.
|
||
**Transition:** auto (last in the launcher sequence); after it, the whole build is done.
|
||
**Builds on:** everything — Phase 1 + 1c + 1b + 1d + 1e + 2 + 2b + 3 (flake/`nix/` modules, the
|
||
runner/harness + generic suite + recipe overlays, the comment-bridge, Drone, the dashboard/results
|
||
UX, docs, `machine-docs/`).
|
||
**Owner agents:** same Builder + Adversary loops (`plan.md` §6/§7); Adversary cold-verifies.
|
||
**This file's path:** `/srv/cc-ci/cc-ci-plan/plan-phase4-final-review-polish-cleanup.md`
|
||
**Phase order:** 1c → 1b → 1d → 1e → 2 → 2b → 3 → **4 (final)**.
|
||
|
||
---
|
||
|
||
## 0. Why this phase (and why it's bounded)
|
||
|
||
This is the analogue of Phase 1b, but final and whole-codebase. By now the tree has grown a lot —
|
||
recipe overlays (Phase 2), performance changes (2b), and results/dashboard UX (3) — all layered on
|
||
the foundation. Before calling the build done, do one **bounded** pass to clean and harden it, and —
|
||
critically — **re-verify from a cold start that none of the growth/cleanup regressed any earlier
|
||
guarantee.** Same discipline as 1b: **good-enough + enforceable**, style→tooling, judgment→checklist,
|
||
don't reopen settled design, and **never weaken a test** to satisfy a nit.
|
||
|
||
---
|
||
|
||
## 1. Definition of Done (Phase 4 exit condition)
|
||
|
||
Terminates when every item holds **and the Adversary has independently cold-verified** (logged in
|
||
`machine-docs/REVIEW-4.md`):
|
||
|
||
- [ ] **F1 — Lint/format green across the whole codebase.** Re-run the 1b toolchain (alejandra/
|
||
statix/deadnix, ruff, shellcheck/shfmt, yamllint) over everything added since 1b (Phase-2
|
||
overlays, 2b changes, 3 UX/dashboard); extend the lint config to any new languages/areas (e.g.
|
||
dashboard front-end) so it's covered going forward. The `.drone.yml` lint stage still passes
|
||
from a clean checkout; prove with a break-it probe.
|
||
- [ ] **F2 — White-box review checklist over all post-1b code.** Run the §3 checklist; fix every
|
||
**blocking** finding, triage advisories to `BACKLOG`/`IDEAS`. Findings + resolutions in
|
||
`machine-docs/REVIEW-4.md`.
|
||
- [ ] **F3 — Cleanup.** Remove dead code/scaffolding and stale TODOs; consistent naming/structure;
|
||
reconcile `machine-docs/` (BACKLOG/IDEAS/DECISIONS current, no contradictions); docs match the
|
||
final state. No behavior change beyond what F2 mandates.
|
||
- [ ] **F4 — FULL cold re-verification (the final gate).** *After* F1–F3 land, the Adversary
|
||
**independently re-verifies every prior Definition-of-Done from a cold start**, to the same bar
|
||
each phase used — fresh PASS + evidence + timestamps in `machine-docs/REVIEW-4.md` within 24h,
|
||
**nothing weakened/skipped/softened** by the cleanup:
|
||
- **Phase 1 D1–D10** (incl. the genuine **D8** byte-identical fresh-clone rebuild + a
|
||
category-spanning live `!testme` e2e through the public gateway).
|
||
- **Phase 1c C1–C7** (secrets-in-git, cert-in-sops, honest reproducibility).
|
||
- **Phase 1d DG1–DG8** (generic install/upgrade/backup/restore, deploy-once `DG4.1`, override
|
||
floor) **as amended by 1e**.
|
||
- **Phase 1e HC1–HC3** (upgrade→PR-head via `deploy --chaos`; repo-local gated to approved
|
||
recipes; generic-by-default + explicit opt-out).
|
||
- **Phase 2** recipe-coverage criteria (every enrolled recipe's overlays/ported tests real,
|
||
DRY, green).
|
||
- **Phase 2b** performance claims (the measured improvements still hold; no test weakened to
|
||
get them).
|
||
- **Phase 3** results/level/UX criteria (per-run level honest, PR comment + dashboard correct).
|
||
- [ ] **F5 — Documented + cold-verified.** Final `docs/` accurate (install reproduces from scratch;
|
||
enroll-recipe + overlay/approval flow correct); accepted deviations in `DECISIONS.md`; the
|
||
Adversary confirms F1–F4 with no standing VETO and no open `[adversary]` finding.
|
||
|
||
When F1–F5 hold and are confirmed, write `## DONE` to `machine-docs/STATUS-4.md` — the build is
|
||
complete.
|
||
|
||
---
|
||
|
||
## 2. Method
|
||
1. **Lint/format first** (F1) — re-run + extend; auto-fix style, don't deliberate.
|
||
2. **Review checklist** (F2, §3) — classify blocking vs advisory; fix blocking, triage rest.
|
||
3. **Cleanup** (F3) — dead code, naming, docs, `machine-docs/` reconciliation.
|
||
4. **Full cold re-verification LAST** (F4) — once everything has landed, the Adversary re-runs the
|
||
entire cross-phase acceptance from cold. Order matters: tooling → review/fixes → cleanup → *then*
|
||
full re-verify. Cleanup must regress nothing.
|
||
5. **Bound it** — a pass, not a rewrite; record dead-ends/deviations and stop.
|
||
|
||
## 3. White-box review checklist (teeth, not taste) — whole codebase
|
||
Blocking unless noted (plan-relevant invariants visible only by reading code):
|
||
- **Tests are real** (blocking) — every generic/overlay/custom test asserts actual app state; no
|
||
`skip`/`xfail`/can't-fail; per-op `pass/fail/skip` honest; the 1d/1e anti-vacuous guards
|
||
(`assert_serving` routing proof, `do_upgrade` "moved", deploy-count==1) intact.
|
||
- **1e corrections intact** (blocking) — repo-local code still gated to approved recipes; generic
|
||
still runs by default (opt-out explicit); upgrade still targets the PR head.
|
||
- **Generic-first / custom-additive invariant** (blocking — `docs/testing.md`). Confirm no path
|
||
makes the generic tier depend on custom: deps deploy + `setup_custom_tests` run **after** all
|
||
generic tiers, never before; a forced `setup_custom_tests` failure still yields a clean
|
||
generic-tier `pass/pass/pass/pass` + `skip(deps-not-ready)` for `@requires_deps` custom tests
|
||
(re-exercise the forced-failure case). Future maintainers must be able to operate cc-ci with
|
||
the generic tier alone — verify that path stays viable.
|
||
- **Harness DRY** (blocking-ish) — recipe quirks are data (`recipe_meta.py`), not shared-harness
|
||
conditionals; overlays are thin; no per-recipe copy-paste of lifecycle logic.
|
||
- **Server state Nix-declared & idempotent** (blocking) — no imperative drift / run-once sentinels /
|
||
manual post-rebuild steps; the `nix/` layout clean.
|
||
- **No footguns** (blocking) — no bare `sleep` for readiness (poll); teardown in `finally`; secrets
|
||
reused per run not regenerated; no hardcoded versions/domains that break upstream.
|
||
- **No secrets in code/committed files** (blocking) — grep source/configs/`.drone.yml`/fixtures;
|
||
log/dashboard redaction real (incl. any new Phase-3 UX surface that echoes run data).
|
||
- **Phase-3 UX correctness** (advisory→blocking on real drift) — the displayed level/badge/screenshot
|
||
reflect the true per-op results; no misleading "pass".
|
||
- **Architecture matches the plans; deviations in `DECISIONS.md`** (advisory→blocking on real drift).
|
||
- **Readability & docs** (advisory) — clear names, dead code removed, docs reproduce from scratch.
|
||
|
||
## 4. Guardrails
|
||
- **Never weaken a test** to satisfy a lint/review/cleanup nit (cardinal rule wins).
|
||
- **Don't reopen settled design** — clean + harden + re-verify; bigger ideas → `IDEAS.md`.
|
||
- **Bounded** — one pass; cap iterations; record + stop.
|
||
- **Cleanup regresses nothing** — F4 is the proof; if a cleanup breaks a prior guarantee, revert the
|
||
cleanup, not the guarantee.
|
||
|
||
## 5. Open decisions (log in machine-docs/DECISIONS.md)
|
||
- Any new linters/formatters for Phase-3 front-end / new areas, and their strictness.
|
||
- The precise blocking-vs-advisory split for the §3 checklist on the new code.
|
||
- Whether to add Python type-checking now or defer to `IDEAS.md` (carried from 1b).
|