Before scaling to many recipes: (1) deterministic style/hygiene via linters/formatters (alejandra/statix/deadnix, ruff, shellcheck/shfmt) wired as a .drone.yml stage so commits stay clean; (2) a white-box review checklist with teeth (real tests not health-only/skipped, DRY harness, Nix-declared idempotent bring-up, no footguns/secrets-in-code, architecture matches plan) — blocking fixed, advisory triaged. Bounded pass; never weaken a test for a nit. Phase 2 now follows 1b. Linked in README. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
7.4 KiB
cc-ci Phase 1b — Review & lint pass (Autonomous Build Plan)
Status: QUEUED — a bounded pass that runs after Phase 1 (plan.md) reaches ## DONE, and
before Phase 2 (plan-phase2-recipe-tests.md).
Transition: manual (operator kicks it off).
Builds on: the complete Phase-1 codebase (flake/modules, runner/ + harness, the comment-bridge,
dashboard, scripts, the first recipes' tests, docs).
Owner agents: same Builder + Adversary loops (plan.md §6/§7). Here the Adversary also acts as
white-box reviewer (reads the code, not just black-box behavior).
This file's path: /srv/cc-ci/cc-ci-plan/plan-phase1b-review-lint.md
0. Why this phase (and why it's bounded)
Before scaling to ~18 recipes (Phase 2), clean and harden the foundation built in Phase 1 — it's far cheaper to fix the harness/modules now than after every recipe copies their patterns. This is a one-time, bounded pass, not an open-ended loop. Two distinct workstreams that go to two places, to avoid bikeshedding:
- Style & hygiene → deterministic tooling (formatters/linters in CI). No judgment calls, no churn, enforceable going forward.
- Code health a linter can't judge → a white-box review checklist with teeth (§3) — anchored to plan-relevant invariants, not taste. Blocking items get fixed; the rest are advisory.
Discipline: good-enough + enforceable, not gold-plating. Don't reopen settled design; don't weaken any test in the name of "cleanliness." When in doubt, lint it deterministically or leave it.
1. Definition of Done (Phase 1b exit condition)
Terminates when every item holds and the Adversary/reviewer has independently confirmed (logged
in REVIEW.md):
- RL1 — Lint/format tooling in place + green. Formatters + linters are added to the repo (a
lintentrypoint + a Nix devshell) and wired as a stage in cc-ci's own.drone.ymlso future commits stay clean. The whole Phase-1 codebase passes them. Tooling (defaults; settle inDECISIONS.md): - Nix:alejandra(ornixpkgs-fmt) format ·statix(lints) ·deadnix(dead code). - Python (runner/, harness,bridge/,tests/):ruff(lint + format). - Shell:shellcheck+shfmt. - (optional) YAML:yamllintfor.drone.yml/compose. - RL2 — Review checklist run, blocking items fixed. The white-box checklist (§3) has been run
over the codebase; every blocking finding is fixed; every advisory finding is either
fixed or moved to
BACKLOG.md/IDEAS.mdas[idea]with a one-line rationale. Findings + resolutions recorded inREVIEW.md. - RL3 — Still green, nothing weakened. After the cleanup, the full Phase-1 Definition of Done
(
plan.md§2 D1–D10) still holds — the Adversary re-confirms a representative!testmerun is still green and no test/assertion was softened or skipped to satisfy a lint/review nit. - RL4 — Documented.
docs/notes how to run lint/format locally and that CI enforces it; any accepted deviations from the review checklist are inDECISIONS.md.
When RL1–RL4 hold and are confirmed, write ## DONE to Phase-1b STATUS.md.
2. Method
- Tooling first. Add the formatters/linters + a
lintentrypoint + devshell; run format, then fix lint findings. Wire the.drone.ymllint stage so it's enforced from here on. (Auto-fixable style should just be auto-fixed — don't deliberate over it.) - Then the review checklist (§3). Read the code against the checklist; classify each finding blocking vs advisory; fix blocking, triage advisory.
- Re-verify. Confirm Phase-1 §2 still passes (RL3) — cleanup must not regress behavior or tests.
- Bound it. Cap iterations; this is a pass, not a rewrite. Record dead-ends/deviations and stop.
3. The white-box review checklist (teeth, not taste)
Blocking unless noted; these are plan-relevant invariants visible only by reading code:
- Tests are real (blocking). Every recipe/harness test asserts actual app state — not
health-only, not
skip/xfail, not assertions that can't fail. (Pre-empts Phase-2 §7.1 at the foundation.) - Harness is DRY (blocking-ish). Recipe-specific quirks live in the shared harness, not copy-pasted per recipe (the M6.5 "no harness surgery / no one-off hacks" rule) — so 18 recipes don't multiply the mess.
- Server state is Nix-declared & idempotent (blocking). Bring-up is declarative idempotent
reconciliation (the
swarm-init/deploy-proxyoneshot pattern, §9) — no imperative drift, no run-once sentinels, no manual post-rebuild steps ininstall.md. - No footguns (blocking). No bare
sleepfor readiness (use polling); correct--chaosusage; no hardcoded versions/domains that break across upstream versions; teardown intry/finally; per-run secrets reused across stages (not regenerated). - No secrets in code or committed files (blocking). Grep + read for tokens/passwords/keys in
source, configs,
.drone.yml, fixtures (complements the Adversary's behavioral leak test). - Architecture matches the plan (advisory→blocking on real drift). Layout/》 modules follow
plan.md§3; the trigger is poll-primary + optional manual webhook (§4.1); Traefik is the coop-cloud recipe (§4.2); deviations are recorded inDECISIONS.md(not silent). - Log redaction is real (blocking). The log/dashboard pipeline scrubs secrets before publish.
- Readability & error handling (advisory). Clear names, sensible structure, errors surfaced not swallowed, dead code removed — advisory; auto-fixable parts belong to the linters, not debate.
- Docs accurate (advisory).
install.mdreproduces from scratch;enroll-recipe.mdmatches the real enroll flow.
4. Milestones (bounded)
- W0 — Tooling + format. Linters/formatters +
lintentrypoint + devshell added; codebase auto-formatted;.drone.ymllint stage wired and green. Accept:lintpasses in CI from a clean checkout. - W1 — Review + fixes. Checklist (§3) run; blocking findings fixed; advisory triaged. Accept:
REVIEW.mdlists findings + resolutions; no blocking item open. - W2 — Re-verify + document. Phase-1 §2 still green; docs updated; deviations in
DECISIONS.md. Accept: a!testmerun is still green post-cleanup; flip Phase-1bSTATUS.mdto## DONE.
5. Guardrails
- Never weaken a test to satisfy a lint/review nit (the cardinal rule still wins).
- Style → tooling, judgment → checklist. Don't spend agent effort (or loop iterations) on what a formatter/linter settles deterministically; don't let the review drift into subjective taste.
- Bounded. This is a pass, not a refactor project — fix invariants + enforce linting, then stop;
bigger improvements go to
IDEAS.md, not this phase. - Don't reopen settled design. Architecture decisions from Phase 1 stand unless there's real drift from the plan; record, don't relitigate.
6. Open decisions (log in DECISIONS.md)
- Exact formatter choices (alejandra vs nixpkgs-fmt; ruff-format vs black) and lint strictness.
- Whether to add Python type-checking (mypy/pyright) now or defer to
IDEAS.md. - The precise blocking vs advisory split for the checklist.
- Whether the
.drone.ymllint stage should fail the build or just warn initially.