Files
cc-ci/REVIEW-1b.md

15 KiB
Raw Blame History

REVIEW — Phase 1b (review & lint pass) — Adversary ledger

Phase plan (SSOT): /srv/cc-ci/cc-ci-plan/plan-phase1b-review-lint.md Loop state for THIS phase: STATUS-1b / BACKLOG-1b / JOURNAL-1b (Builder) · REVIEW-1b (Adversary, this file) · DECISIONS.md shared. Phase-1 STATUS.md/BACKLOG.md/REVIEW.md and the Phase-1c *-1c.md files are HISTORY — not this phase's state.

This phase the Adversary is also the white-box reviewer (§3 checklist), so this ledger holds both white-box review findings and the eventual cold RL3 re-verification of D1D10.

DoD I must independently confirm (RL1 lint-in-CI-green · RL2 §3 checklist run, blocking fixed · RL3 full cold D1D10 re-verify — the final gate · RL4 docs). Order per §2: tooling → review fixes → then RL3. Cardinal rule: never weaken a test to satisfy a lint/review nit; RL3 must confirm cleanup softened/skipped/regressed nothing.


Phase-1b orientation @2026-05-27 (Adversary cold start)

  • Pulled clean; Phase 1c is signed-off DONE (commit 6d2bc3d). Phase 1b kicked off by operator (manual transition).
  • Builder has not yet seeded STATUS-1b/BACKLOG-1b/JOURNAL-1b and has not claimed W0. No gate pending.
  • I began the independent white-box §3 review immediately (it's my role this phase and needs no Builder gate).

White-box §3 prep pass #1 @2026-05-27 — over post-1c codebase (PRE-cleanup baseline; advisory until RL3)

Recording the baseline state before any W0/W1 cleanup, so I can later confirm the cleanup regressed nothing.

  • Tests are real — PASS (provisional). Swept all 6 recipe suites (custom-html, lasuite-docs, keycloak, matrix-synapse, n8n, cryptpad) × install/upgrade/backup + conftest + runner/harness. No @pytest.mark.skip/xfail/skipif, no commented-out asserts, no tautologies. Install tests assert real app content (matrix: parses versions JSON non-empty; keycloak: admin DOM; others: Playwright body). Upgrade tests deploy v(n-1) → write marker → upgrade → assert exact marker survives. Backup tests establish+verify state → backup → mutate+verify → restore → assert exact pre-mutation state (keycloak deletes a realm). Watch-item (to re-check black-box at RL3): every upgrade test has a conditional pytest.skip() when no previous published version exists (e.g. custom-html test_upgrade.py:17-18). Valid by design, but if it ALWAYS skips, the upgrade stage would be silently fake — RL3 must confirm the upgrade stage actually RUNS (prev version found) for ≥1 recipe, not just skips. (1c E2E exercised this.)
  • Server state Nix-declared & idempotent — PASS (provisional). No .bootstrapped/run-once sentinels in modules/ or scripts/ (grep clean). Convergence/oneshot pattern per §9 to be re-read fully in pass #2.
  • No footguns / sleep — PASS (provisional). All time.sleep() in runner/harness/lifecycle.py (147,157, 212,238) + bridge.py (280) are poll-loop intervals inside while time.time() < deadline: loops, not bare readiness waits. wait_healthy polls converge-then-HTTP with timeouts. Teardown (lifecycle.py:215) is correctly ordered (undeploy → docker stack rm fallback → volumes/secrets while .env exists → drop .env last), retries volume removal, and verifies residual is empty (raises TeardownError otherwise).
  • No secrets in code/committed files — PASS (provisional). Grep for inline passwords/tokens/private-key blocks across .py/.nix/.sh/.yml clean (only env/file references + generators). Full leak re-verify (incl. published logs + dashboard, and generated app passwords) belongs to RL3 D6.

Still owed in white-box pass #2 (after I read the rest): harness DRY (recipe quirks in shared harness, not per-recipe copy-paste), log redaction real (bridge/dashboard/log pipeline), architecture matches plan (layout/§3, poll-primary trigger §4.1, traefik-is-coop-cloud-recipe §4.2; drift → DECISIONS.md).

W0 (RL1 — lint/format tooling + green) : PASS @2026-05-27 (Adversary cold)

Gate claimed in STATUS-1b. Acceptance: clean checkout → nix develop .#lint --command bash scripts/lint.shlint: PASS; lint stage wired in .drone.yml push pipeline. Verified cold, independently (no nix on sandbox; ran on cc-ci over a pristine tree, not the Builder's working copy):

  • Cold checkout = exact reviewed SHA. git archive 233939a (= my origin/main HEAD) piped to cc-ci → /tmp/ccci-cold (clean tree, no untracked/cached state, secrets submodule empty as lint excludes it). Not cloned from /root/cc-ci (that's a non-git plain copy) — archived from my own clone.
  • Lint PASS cold. HOME=/root nix develop .#lint --command bash scripts/lint.shexit 0, lint: PASS. All 8 linters ran clean: nixpkgs-fmt (0/14 reformat), statix, deadnix, ruff format (32 files), ruff check (all passed), shfmt, shellcheck, yamllint.
  • Stage real, not rigged. scripts/lint.sh genuinely invokes each linter in check mode and accumulates a fail flag → exit "$fail" (correct set -uo pipefail, no -e, so all run). The .drone.yml self-test push pipeline runs the exact command nix develop .#lint --command bash scripts/lint.sh and FAILs the build on non-zero. Toolchain pinned from nixpkgs in flake.nix (devShells.lint), so CI == local.
  • Gate has TEETH (break-it probe). Injected violations into the cold tree (a .py with import os,sys + x=1+2, and a mis-formatted .nix) → re-ran lint → exit 1, lint: FAIL (ruff E401/I001/F401 + nixpkgs-fmt). So the stage is not vacuously green.

Verdict: W0 PASS. Builder may proceed to W1. Advisory (not W0-blocking; re-confirm at RL3): Builder notes the Gitea→Drone push webhook is flaky (§4.1), so the lint stage may not auto-fire as a real Drone build on every push — RL1's intent ("future commits stay clean") depends on that path actually firing. The stage IS wired and proven green via its exact command; I'll confirm a real push triggers the Drone lint build when I re-verify M2/D-gates at RL3 (it overlaps). Not filing a finding now — bounded phase, acceptance-as-stated is met.

White-box §3 pass #2 @2026-05-27 (Adversary, post-W0 formatted code) — RL2 input

Remaining §3 checklist items. No blocking findings.

  • Harness is DRY — PASS. Recipe quirks live in shared harness + per-recipe declarative metadata (tests/<recipe>/recipe_meta.py: HEALTH_PATH/HEALTH_OK/timeouts/EXTRA_ENV), consumed uniformly by tests/conftest.py (_recipe_meta, deployed/deployed_app fixtures) and runner/harness/lifecycle.py (_recipe_extra_env). No if recipe == "..." branches in the shared harness (the M6.5 no-surgery rule holds). Recipe-specific logic is isolated to that recipe's dir (e.g. keycloak kc_admin.py, cryptpad's derived SANDBOX_DOMAIN). Only smell: the ~6-8-line old_app upgrade fixture is copy-pasted across recipes — thin boilerplate over shared metadata; advisory, not a violation (factoring it would just add another per-recipe injection point). → IDEAS, not blocking.
  • Architecture matches plan — PASS. §4.1 trigger is poll-primary (bridge/bridge.py poll_loop runs unconditionally every ≤60s; webhook is optional + dedup'd by comment id; exact trimmed !testme; commenter-auth via read-level GET /orgs/{owner}/members/{user} 204=allow, fail-closed). §4.2 Traefik is the real coop-cloud/traefik recipe via abra (modules/proxy.nix: abra recipe fetch/app new traefik, WILDCARDS_ENABLED=1, compose.wildcard.yml, LETS_ENCRYPT_ENV="" → no ACME, cert as ssl_cert/ssl_key swarm secrets) — no hand-rolled traefik.nix. §3 layout matches.
  • Server state Nix-declared & idempotent — PASS. modules/proxy.nix deploy-proxy is Type=oneshot+RemainAfterExit, re-runs every activation and converges (insert secret only if absent, deploy). No .bootstrapped/run-once sentinels anywhere (grep clean, pass #1). Leans on 1c's already-proven D8 (byte-identical closure + live throwaway rebuild, no manual post-step).
  • Log redaction is real — PASS for infra secrets; one advisory gap to verify behaviorally at RL3/D6. runner/run_recipe_ci.py _redact_values() reads /run/secrets/* (≥8-char values) and run_stage_redacted() masks them in live-streamed stage output (sorted longest-first → no partial leak). But class-B generated app passwords are NOT under /run/secrets/*, so they are NOT in the _REDACT list — their non-leak rests entirely on the "harness never prints them / abra doesn't echo generated ones" assumption (code comment, run_recipe_ci.py:59-60). Also: the runner's own stdout (the cc-ci-run … Drone step) bypasses run_stage_redacted. This is exactly what my behavioral D6 leak test must catch at RL3 (grep published Drone logs and the dashboard for a known generated app password). Phase-1 D6 passed that test once; recording the white-box shape so RL3 re-checks it, not a new blocking finding. → WATCH-ITEM for RL3/D6.
  • Readability / docs accuracy — advisory; defer to RL4 (docs) + the ruff/lint pass already covers dead code / style deterministically.

Net of §3 white-box review (RL2 input): no blocking findings; 2 advisories (old_app copy-paste → IDEAS; app-secret redaction → RL3/D6 watch-item). I expect Builder's W1 to be light. I have NOT filed [adversary] BACKLOG items since nothing is blocking — will file if W1/RL3 surfaces a real defect.

Operator added RL5 + RL6 (plan §7, 2026-05-27) — both BLOCKING for 1b DONE. Noted; verification plan:

  • RL5 (Builder moves; Adversary verifies cold): modules/nix/modules/, hosts/nix/hosts/; flake.nix/flake.lock STAY at root so build ref #cc-ci is unchanged; fix flake internal paths + .drone.yml/scripts refs; update docs/architecture.md. Verification folds into RL3: a fresh recursive clone must still rebuild byte-identical to the running system (toplevel store hash WILL change — expected; what must hold is build==running + reproducible). I'll re-confirm cold at RL3.
  • RL6 (coordinated near-END-of-1b): move STATUS*/REVIEW*/JOURNAL*/BACKLOG*/DECISIONS.mdmachine-docs/; README.md stays at root (operator decision — human readme, not protocol). Update ALL refs (cc-ci-plan plans, AGENTS.md, .drone.yml, scripts). I verify refs updated + nothing broken. ⚠ CAVEAT affecting ME: the watchdog (launch.sh) reads STATUS-<id>.md/REVIEW-<id>.md at repo ROOT for handoffs/transitions — moving breaks it until launch.sh updated + watchdog restarted IN LOCKSTEP (orchestrator handles that). So I keep writing REVIEW-1b.md at root until the coordinated cutover, and at that moment I git mv my own REVIEW files (single-writer rule) in lockstep. Will NOT move them unilaterally or while a phase transition is pending.

RL2 (§3 white-box checklist) : PASS @2026-05-27 (Adversary)

My white-box passes #1+#2 found no blocking findings; Builder's own §3 self-review agrees. Advisories triaged (old_app copy-paste → IDEAS; generated-app-secret redaction → RL3/D6 watch-item). RL2 confirmed.

RL5 (nix/ consolidation) — structural PASS @2026-05-27; build-proof folds into RL3 below

  • modules/ and hosts/ gone from root; nix/modules/ (12 .nix) + nix/hosts/cc-ci/ (configuration.nix, hardware.nix) present; flake.nix + flake.lock stay at root (build ref #cc-ci unchanged). flake.nix imports ./nix/hosts/cc-ci/configuration.nix. No dangling ./modules/./hosts refs in flake.nix/.drone.yml/scripts (grep clean). docs/architecture.md + DECISIONS updated per Builder. The "flake still evaluates + builds byte-identical with new paths" proof = the cold rebuild in RL3 (below).

RL3 (final gate) — IN PROGRESS @2026-05-27 (Adversary cold). Re-verifying all D1D10; partial so far:

  • Cardinal rule — tests NOT weakened : PASS. Diffed every tests/**/test_*.py + runner/harness/ between pre-1b (6d2bc3d, the 1c-DONE commit) and HEAD. Every change is ruff line-wrapping only — assertion predicates, comparison operators (==, in), expected values, marker/SQL strings, and wait_healthy params are all byte-for-byte preserved (verified by reading the -w diff in full). No assertion removed/softened, no pytest.skip/xfail/assert True added, no test_ fn deleted. The format+RL5 cleanup regressed no test logic.

  • System health (cc-ci canonical) : confirmed. readlink /run/current-system == 8i3jcad9mrr01558lqckpi26nxn2ra3m-nixos-system-…50ab793 (matches claim); systemctl is-system-runningrunning; 5 infra stacks up (traefik[2 svc]/drone/ccci-bridge/ccci-dashboard/backups), no leftover test app (idle). [Note: "6 stacks" in 1c included a transient test app; 5 infra stacks is the idle baseline.]

  • D8 + RL5 byte-identical cold rebuild : PASS @2026-05-27 (Adversary cold, independent). On cc-ci: fresh git clone --recurse-submodules of origin to /tmp/ccci-rl3 (HEAD aa120d1, submodule secrets @2312f1c clean, secrets/secrets.yaml present) → nixos-rebuild build --flake "git+file:///tmp/ccci-rl3?submodules=1#cc-ci"toplevel 8i3jcad9mrr01558lqckpi26nxn2ra3m… == running (byte-identical, build==running). Proves D8 (reproducible from a fresh clone) and RL5 (new nix/ layout evaluates+builds, #cc-ci ref unchanged). Sanity: a build without ?submodules=1 fails secrets/secrets.yaml does not exist — confirms secrets genuinely come from the submodule, not baked in. Token used via transient -c http.extraHeader (not persisted in clone config — verified); temp clone removed.

  • Still owed for RL3 PASS: live !testme e2e on the cleaned closure (D1D4/D7) incl. upgrade-stage- actually-runs · D6 behavioral leak test (Drone logs + dashboard, incl. a generated app password) · D5/D9/D10 evidence refresh (lean on byte-identical harness/test code + prior Phase-1/1c green runs + spot checks). Pacing across wakes.

  • Live !testme e2e (D1D4/D7) : IN FLIGHT @2026-05-27. Posted exact !testme (comment id 13743, by authorized org-member bot) on recipe-maintainers/custom-html PR #2 at 20:33:16 UTC. Pre- trigger latest Drone build = #150. Bridge polls 30s. Background watcher (cc-ci) measuring trigger latency (D1 <60s), then watching install/upgrade/backup stages to completion (D2/D3/D4) + run URL (D7). Result logged on completion. Then D6 leak test over THIS run's published logs + dashboard. (Side note for the RL1 advisory: no push-triggered Drone build exists for recent 1b commits — latest push build is #149 [a 1c commit] — consistent with the flaky Gitea→Drone push webhook; the lint stage is wired + proven via its exact command but the auto-fire path needs the operator's webhook. Will note as a documented advisory, not a 1b blocker.)

Status: RL1 PASS · RL2 PASS · RL4 done(Builder) · RL5 structural PASS · RL3 IN PROGRESS (cardinal-rule

PASS + byte-identical cold rebuild PASS; live e2e + D6 leak test in flight) · RL6 deferred(coordinated).