Files
cc-ci-orchestrator/cc-ci-plan/plan-phase4-final-review-polish-cleanup.md
autonomic-bot 36a6c9872a orchestrator: reboot-resilience + session auto-resume + full session plan/tooling
Reboot survival for the Pi orchestrator host:
- systemd unit cc-ci-plan/systemd/cc-ci-loops.service (installed + enabled): on boot
  records the reboot, starts loops+watchdog (RESUME_PHASE=1), and resumes the
  orchestrator session.
- reboot-log.sh: boot_id-gated reboot record -> REBOOTS.md (manual restarts don't count).
- launch-orchestrator.sh: injects an AGENTS.md startup nudge so an auto-resumed
  orchestrator announces itself (PushNotification) + reports reboots.
- AGENTS.md: on-startup notify routine documented.

Plans/tooling accumulated this session:
- plan-phase1d (generic suite), 1e (harness corrections), phase4 (final review),
  sso-dep-testing, orchestrator-migration (parked), test-e2e-testme-acceptance.
- launch.sh: 1d/1e/2/2b/3/4 phase sequence, machine-docs-aware state resolution,
  limit-stall re-nudge, INBOX side-channel detection.
- plan.md §6.1/§7: artifact-layer isolation, INBOX, 5-min long-run polling, DEFERRED.
- prompts: isolation discipline + INBOX + pacing.
- .gitignore: harden (.sops/, cc-ci-secrets/, .claude/, *.tmp.*).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-28 20:28:10 +01:00

114 lines
7.7 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# cc-ci Phase 4 — Final review, polish & cleanup (capstone) (Autonomous Build Plan)
**Status:** QUEUED — the **LAST** phase, runs after Phase 3 (`plan-phase3-results-ux.md`). A bounded
final review/lint/cleanup pass over the **entire** codebase as it stands after all phases, ending in a
**full cold re-verification that nothing regressed**.
**Transition:** auto (last in the launcher sequence); after it, the whole build is done.
**Builds on:** everything — Phase 1 + 1c + 1b + 1d + 1e + 2 + 2b + 3 (flake/`nix/` modules, the
runner/harness + generic suite + recipe overlays, the comment-bridge, Drone, the dashboard/results
UX, docs, `machine-docs/`).
**Owner agents:** same Builder + Adversary loops (`plan.md` §6/§7); Adversary cold-verifies.
**This file's path:** `/srv/cc-ci/cc-ci-plan/plan-phase4-final-review-polish-cleanup.md`
**Phase order:** 1c → 1b → 1d → 1e → 2 → 2b → 3 → **4 (final)**.
---
## 0. Why this phase (and why it's bounded)
This is the analogue of Phase 1b, but final and whole-codebase. By now the tree has grown a lot —
recipe overlays (Phase 2), performance changes (2b), and results/dashboard UX (3) — all layered on
the foundation. Before calling the build done, do one **bounded** pass to clean and harden it, and —
critically — **re-verify from a cold start that none of the growth/cleanup regressed any earlier
guarantee.** Same discipline as 1b: **good-enough + enforceable**, style→tooling, judgment→checklist,
don't reopen settled design, and **never weaken a test** to satisfy a nit.
---
## 1. Definition of Done (Phase 4 exit condition)
Terminates when every item holds **and the Adversary has independently cold-verified** (logged in
`machine-docs/REVIEW-4.md`):
- [ ] **F1 — Lint/format green across the whole codebase.** Re-run the 1b toolchain (alejandra/
statix/deadnix, ruff, shellcheck/shfmt, yamllint) over everything added since 1b (Phase-2
overlays, 2b changes, 3 UX/dashboard); extend the lint config to any new languages/areas (e.g.
dashboard front-end) so it's covered going forward. The `.drone.yml` lint stage still passes
from a clean checkout; prove with a break-it probe.
- [ ] **F2 — White-box review checklist over all post-1b code.** Run the §3 checklist; fix every
**blocking** finding, triage advisories to `BACKLOG`/`IDEAS`. Findings + resolutions in
`machine-docs/REVIEW-4.md`.
- [ ] **F3 — Cleanup.** Remove dead code/scaffolding and stale TODOs; consistent naming/structure;
reconcile `machine-docs/` (BACKLOG/IDEAS/DECISIONS current, no contradictions); docs match the
final state. No behavior change beyond what F2 mandates.
- [ ] **F4 — FULL cold re-verification (the final gate).** *After* F1F3 land, the Adversary
**independently re-verifies every prior Definition-of-Done from a cold start**, to the same bar
each phase used — fresh PASS + evidence + timestamps in `machine-docs/REVIEW-4.md` within 24h,
**nothing weakened/skipped/softened** by the cleanup:
- **Phase 1 D1D10** (incl. the genuine **D8** byte-identical fresh-clone rebuild + a
category-spanning live `!testme` e2e through the public gateway).
- **Phase 1c C1C7** (secrets-in-git, cert-in-sops, honest reproducibility).
- **Phase 1d DG1DG8** (generic install/upgrade/backup/restore, deploy-once `DG4.1`, override
floor) **as amended by 1e**.
- **Phase 1e HC1HC3** (upgrade→PR-head via `deploy --chaos`; repo-local gated to approved
recipes; generic-by-default + explicit opt-out).
- **Phase 2** recipe-coverage criteria (every enrolled recipe's overlays/ported tests real,
DRY, green).
- **Phase 2b** performance claims (the measured improvements still hold; no test weakened to
get them).
- **Phase 3** results/level/UX criteria (per-run level honest, PR comment + dashboard correct).
- [ ] **F5 — Documented + cold-verified.** Final `docs/` accurate (install reproduces from scratch;
enroll-recipe + overlay/approval flow correct); accepted deviations in `DECISIONS.md`; the
Adversary confirms F1F4 with no standing VETO and no open `[adversary]` finding.
When F1F5 hold and are confirmed, write `## DONE` to `machine-docs/STATUS-4.md` — the build is
complete.
---
## 2. Method
1. **Lint/format first** (F1) — re-run + extend; auto-fix style, don't deliberate.
2. **Review checklist** (F2, §3) — classify blocking vs advisory; fix blocking, triage rest.
3. **Cleanup** (F3) — dead code, naming, docs, `machine-docs/` reconciliation.
4. **Full cold re-verification LAST** (F4) — once everything has landed, the Adversary re-runs the
entire cross-phase acceptance from cold. Order matters: tooling → review/fixes → cleanup → *then*
full re-verify. Cleanup must regress nothing.
5. **Bound it** — a pass, not a rewrite; record dead-ends/deviations and stop.
## 3. White-box review checklist (teeth, not taste) — whole codebase
Blocking unless noted (plan-relevant invariants visible only by reading code):
- **Tests are real** (blocking) — every generic/overlay/custom test asserts actual app state; no
`skip`/`xfail`/can't-fail; per-op `pass/fail/skip` honest; the 1d/1e anti-vacuous guards
(`assert_serving` routing proof, `do_upgrade` "moved", deploy-count==1) intact.
- **1e corrections intact** (blocking) — repo-local code still gated to approved recipes; generic
still runs by default (opt-out explicit); upgrade still targets the PR head.
- **Generic-first / custom-additive invariant** (blocking — `docs/testing.md`). Confirm no path
makes the generic tier depend on custom: deps deploy + `setup_custom_tests` run **after** all
generic tiers, never before; a forced `setup_custom_tests` failure still yields a clean
generic-tier `pass/pass/pass/pass` + `skip(deps-not-ready)` for `@requires_deps` custom tests
(re-exercise the forced-failure case). Future maintainers must be able to operate cc-ci with
the generic tier alone — verify that path stays viable.
- **Harness DRY** (blocking-ish) — recipe quirks are data (`recipe_meta.py`), not shared-harness
conditionals; overlays are thin; no per-recipe copy-paste of lifecycle logic.
- **Server state Nix-declared & idempotent** (blocking) — no imperative drift / run-once sentinels /
manual post-rebuild steps; the `nix/` layout clean.
- **No footguns** (blocking) — no bare `sleep` for readiness (poll); teardown in `finally`; secrets
reused per run not regenerated; no hardcoded versions/domains that break upstream.
- **No secrets in code/committed files** (blocking) — grep source/configs/`.drone.yml`/fixtures;
log/dashboard redaction real (incl. any new Phase-3 UX surface that echoes run data).
- **Phase-3 UX correctness** (advisory→blocking on real drift) — the displayed level/badge/screenshot
reflect the true per-op results; no misleading "pass".
- **Architecture matches the plans; deviations in `DECISIONS.md`** (advisory→blocking on real drift).
- **Readability & docs** (advisory) — clear names, dead code removed, docs reproduce from scratch.
## 4. Guardrails
- **Never weaken a test** to satisfy a lint/review/cleanup nit (cardinal rule wins).
- **Don't reopen settled design** — clean + harden + re-verify; bigger ideas → `IDEAS.md`.
- **Bounded** — one pass; cap iterations; record + stop.
- **Cleanup regresses nothing** — F4 is the proof; if a cleanup breaks a prior guarantee, revert the
cleanup, not the guarantee.
## 5. Open decisions (log in machine-docs/DECISIONS.md)
- Any new linters/formatters for Phase-3 front-end / new areas, and their strictness.
- The precise blocking-vs-advisory split for the §3 checklist on the new code.
- Whether to add Python type-checking now or defer to `IDEAS.md` (carried from 1b).