JOURNAL — phase nixenv (Builder)
2026-06-17 — M1: single-source the harness runtime env
Why this design
The phase plan §2 wants ONE definition of "what's needed to run a recipe test", referenced from
three places, so DEFECT-3 (a dep present for one path, missing for another) becomes structurally
impossible. I put the single source in nix/modules/packages.nix because it is the existing
"shared pkgs" overlay module already imported by both host configs — so pkgs.ccciRuntimeTools
and pkgs.cc-ci-run are reachable from every module/host without a fragile cross-module let.
Three overlay defs:
- ccciPyEnv (let-bound, internal): python3.withPackages [pytest playwright], the ONLY pyEnv now.
- ccciRuntimeTools (overlay attr): the union tool set.
- cc-ci-run (overlay attr): writeShellApplication with runtimeInputs = [ccciPyEnv] ++ ccciRuntimeTools.
Consumers:
- harness.nix → environment.systemPackages = [ pkgs.cc-ci-run ] (installs the entrypoint).
- nightly-sweep.nix → wrapper execs cc-ci-run (same binary the Drone pipeline runs), so pyEnv + tooling + PLAYWRIGHT env are identical to the Drone path by construction. Dropped: the duplicate pyEnv, the parallel runtimeInputs tool list, and the DEFECT-3 export PATH=/run/current-system/sw/bin… prepend. git-lfs/bash/util-linux/openssl now come from cc-ci-run's runtimeInputs.
- both host configuration.nix → systemPackages = pkgs.ccciRuntimeTools ++ [ pkgs.openssh ].
Why the union is a superset (nothing dropped)
- old cc-ci-run: abra docker git coreutils util-linux ⊂ set.
- old sweep: bash abra docker git curl jq gnused gnugrep gnutar coreutils util-linux procps ⊂ set; its host-PATH-derived git-lfs/openssl are now EXPLICIT in the set.
- old host PATH: curl git jq (+ git-lfs on hetzner only) ⊂ set; openssh kept as host-only add.
- pyEnv (python3+pytest+playwright) + playwright browsers (via PLAYWRIGHT_BROWSERS_PATH) preserved.
Additions vs any single prior list: git-lfs, openssl (plan §2). The cc-ci host GAINS git-lfs, killing
the one-off hetzner-only divergence; both host configs are now byte-identical.
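The superset claim is easy to state as a set check. A minimal sketch in Python, not code from the repo: the old lists are copied from the bullets above, while RUNTIME_TOOLS is my reconstruction of the union from this entry and is an assumption (the authoritative set is ccciRuntimeTools in packages.nix and may differ).

```python
# Old per-path tool lists, copied from the journal bullets above.
OLD_CC_CI_RUN = {"abra", "docker", "git", "coreutils", "util-linux"}
OLD_SWEEP = {
    "bash", "abra", "docker", "git", "curl", "jq", "gnused", "gnugrep",
    "gnutar", "coreutils", "util-linux", "procps",
}
OLD_HOST_PATH = {"curl", "git", "jq", "git-lfs"}  # git-lfs was hetzner-only

# ASSUMPTION: reconstructed union, not read from packages.nix.
RUNTIME_TOOLS = {
    "bash", "abra", "docker", "git", "git-lfs", "curl", "jq", "gnused",
    "gnugrep", "gnutar", "coreutils", "util-linux", "procps", "openssl",
}


def dropped_tools(old, new):
    """Return the tools present in `old` but missing from `new`, sorted."""
    return sorted(set(old) - set(new))


def is_superset_of_all(new, *old_lists):
    """True when no prior list loses a tool under the new set."""
    return all(not dropped_tools(old, new) for old in old_lists)
```

Calling is_superset_of_all(RUNTIME_TOOLS, OLD_CC_CI_RUN, OLD_SWEEP, OLD_HOST_PATH) is the "nothing dropped" property; dropped_tools names the offender if a future edit to the set breaks it.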
Why writeShellApplication makes this work
writeShellApplication emits export PATH="<runtimeInputs>:$PATH" (confirmed on the live wrapper).
So cc-ci-run's full tool set is the PATH prefix regardless of caller. Under Drone the inherited
suffix is /run/current-system/sw/bin:/run/wrappers/bin; under the sweep it's the systemd-minimal
PATH — but the harness tools all resolve from the shared prefix either way, which is the parity the
plan wants. The host systemPackages reference is the belt-and-suspenders path for direct
.drone.yml shell-outs (abra --version, docker info) that don't go through cc-ci-run.
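The prefix argument can be illustrated without Nix. A rough Python sketch, assuming stub executables in temp dirs stand in for the real store paths (directory names are hypothetical): the same prepended directory resolves git-lfs whether the inherited suffix is a full host PATH or a minimal one.

```python
import os
import shutil
import stat
import tempfile


def make_tool(directory, name):
    """Create an executable stub so shutil.which can find it."""
    path = os.path.join(directory, name)
    with open(path, "w") as fh:
        fh.write("#!/bin/sh\nexit 0\n")
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return path


def resolve(tool, prefix_dirs, inherited_path):
    """Resolve `tool` against prefix_dirs prepended to the inherited PATH."""
    full_path = os.pathsep.join(list(prefix_dirs) + [inherited_path])
    return shutil.which(tool, path=full_path)


def demo():
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical stand-ins for runtimeInputs, host sw/bin, systemd PATH.
        prefix = os.path.join(root, "runtime-inputs")
        host = os.path.join(root, "host-sw-bin")
        minimal = os.path.join(root, "systemd-minimal")
        for directory in (prefix, host, minimal):
            os.makedirs(directory)
        wanted = make_tool(prefix, "git-lfs")
        make_tool(host, "git-lfs")  # the host copy exists only on this suffix
        return {
            "drone": resolve("git-lfs", [prefix], host) == wanted,
            "sweep": resolve("git-lfs", [prefix], minimal) == wanted,
            "sweep_without_prefix": resolve("git-lfs", [], minimal),
        }
```

The third key is the old DEFECT-3 shape: with no prefix and a minimal inherited PATH, the lookup returns None.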
buildEnv collision watch (resolved)
Worry: adding coreutils/util-linux/procps/bash/gnu* to host systemPackages could collide with the
NixOS base requiredPackages. It did not — base requiredPackages are lowPrio, so the normal-prio
additions override cleanly. Both #cc-ci and #cc-ci-hetzner built with no collision error.
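For the record, the priority rule behind "override cleanly" as a simplified Python model. Assumptions: lower meta.priority number wins, nixpkgs' default priority is 5, and lib.lowPrio assigns 10. I use 10 only as an illustrative lowered value and do not assert the exact number NixOS gives requiredPackages. Real buildEnv has more options and edge cases than this.

```python
DEFAULT_PRIORITY = 5  # nixpkgs default meta.priority
LOWERED_PRIORITY = 10  # illustrative; the value lib.lowPrio assigns


def pick_provider(file_name, providers):
    """Pick which package supplies `file_name` in a merged environment.

    `providers` is a list of (package_name, priority) pairs that all ship
    the same file. The lowest priority number wins; a tie at the winning
    priority is reported as a collision.
    """
    best = min(priority for _, priority in providers)
    winners = [name for name, priority in providers if priority == best]
    if len(winners) > 1:
        raise ValueError(
            f"collision on {file_name}: {', '.join(sorted(winners))}"
        )
    return winners[0]
```

With a lowered base package and a normal-priority addition shipping the same bin/ls, the addition wins; two normal-priority packages shipping the same file would raise, which is the error I was watching for.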
Note on other modules' tool lists
backupbot/docker-prune/drone/proxy/warm-keycloak.nix still list gnused/gnugrep/etc. in their OWN
runtimeInputs; those are independent reconcile-service scripts, never part of the harness/recipe-test
env, never part of the DEFECT-3 divergence. Single-sourcing is scoped to the harness env
(pyEnv + recipe-test tooling consumed by cc-ci-run / sweep / host PATH), which is now packages.nix only.
Verification (local, dirty tree needs ?submodules=1 — secrets/ is a submodule)
- nixos-rebuild build --flake '.?submodules=1#cc-ci-hetzner' → built nixos-system-…dhmpm232….
- nixos-rebuild build --flake '.?submodules=1#cc-ci' → built OK.
- cc-ci-run store zxlx9jnylh7la5m48bsqb1wfm5l9r0bd; PATH carries all 15 tools incl git-lfs-3.6.1 + openssl-3.3.3.
- sweep wrapper gh02w1kc… execs the SAME zxlx9j…/bin/cc-ci-run.
- cc-ci host sw/bin now lists git-lfs + openssl (was missing git-lfs pre-refactor).
- grep -rn withPackages nix/ → 1 hit (packages.nix:17).
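The single-pyEnv check could be kept as a regression guard. A sketch in Python, not something in the repo: find_hits is a hypothetical helper, and unlike grep -rn it only scans *.nix files.

```python
import pathlib


def find_hits(root, needle="withPackages"):
    """Return (path, line_number) for every line under `root` containing `needle`.

    Only *.nix files are scanned, which is narrower than `grep -rn`.
    """
    hits = []
    for path in sorted(pathlib.Path(root).rglob("*.nix")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for lineno, line in enumerate(lines, start=1):
            if needle in line:
                hits.append((str(path), lineno))
    return hits


def assert_single_source(root):
    """Fail loudly if more than one pyEnv definition creeps back in."""
    hits = find_hits(root)
    if len(hits) != 1:
        raise AssertionError(f"expected exactly 1 withPackages hit, got {hits}")
    return hits[0]
```

Run against nix/, assert_single_source returns the one (path, line) pair, or raises with the full list if a second pyEnv is ever added.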
2026-06-17T18:17Z — M2 claim (both live parity witnesses green)
Drone-path witness (build #871)
Why REF=357926f2 PR=1 SRC=recipe-maintainers/gitea: this is the lfs-plain-gitea capstone ref (the
gtea-phase Build #685 ref). PR #1 is now merged so compose.lfs.yml is also on main, but pinning the
PR head guarantees _lfs_enabled() is true (compose.lfs.yml in checkout + RECIPE=gitea) so the LFS
test RUNS rather than skips. fetch_recipe takes the SRC+REF mirror-clone path; EXTRA_ENV adds
compose.lfs.yml to install+custom tiers so the deployed gitea has LFS on for the round-trip. Triggered
via the Drone API with the bridge's drone token (kept on-host). Build went green in ~3 min;
test_lfs_roundtrip PASSED. This is the SAME cc-ci-run store path the timer sweep execs, so the two
witnesses prove parity by both construction (M1) and observation (M2).
Why the timer fire is the harder witness
The systemd unit PATH is systemd-minimal (coreutils/findutils/gnugrep/gnused/systemd) — NO git-lfs, NO /run/current-system/sw/bin. So a green LFS test there can ONLY come from cc-ci-run's runtimeInputs prepending git-lfs-3.6.1 to PATH. Confirmed by reading /proc/<run_recipe_ci pid>/environ live: PATH starts with the cc-ci-run tool prefix incl git-lfs. This is exactly the DEFECT-3 condition the phase set out to make structurally impossible.
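The /proc check above can be scripted. A minimal Python sketch, assuming Linux procfs and that the git-lfs store directory name contains "git-lfs"; the helper names are mine and the sample store paths in the usage note are made up.

```python
def parse_environ(raw):
    """Parse the NUL-separated bytes of /proc/<pid>/environ into a dict."""
    env = {}
    for entry in raw.split(b"\0"):
        key, sep, value = entry.partition(b"=")
        if key and sep:
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def read_environ(pid):
    """Read a live process environment (needs permission to read procfs)."""
    with open(f"/proc/{pid}/environ", "rb") as fh:
        return parse_environ(fh.read())


def first_entry_matching(path_value, marker):
    """Return (index, entry) of the first PATH entry containing `marker`, else None."""
    for index, entry in enumerate(path_value.split(":")):
        if marker in entry:
            return index, entry
    return None
```

For example, with raw bytes like b"PATH=/nix/store/placeholder-git-lfs-3.6.1/bin:/usr/bin\0HOME=/root\0", first_entry_matching(env["PATH"], "git-lfs") returns index 0, meaning git-lfs comes from the wrapper prefix rather than from anything the unit inherited.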
GREEN-BUT-PROMOTE-FAILED is not mine
Spent effort confirming the gitea promote-fail (abra app deploy warm-gitea -o -n → "already
deployed") is pre-existing: it appears identically in the two pre-deploy sweep fires (14:28Z, 15:56Z,
OLD env) and the promote path (runner/nightly_sweep.py) is unchanged by nixenv (last touched canon
f94de22). It's an abra deploy-idempotency limitation on the persistent warm canonical (warm-gitea up
since 08:39Z); it is non-fatal and leaves known-good unchanged. discourse/mattermost-lts reds are likewise recipe-level
and pre-existing (mattermost: postgres restore marker assertion; docker resolved fine → not a dropped
tool). nixenv changes only WHICH tools are on PATH; it dropped nothing (M1 superset proof), so it
cannot have caused an app-level red.