# STATUS — phase `nixenv` (Builder)
Phase plan: `/srv/cc-ci/cc-ci-plan/plan-phase-nixenv-shared-runtime-env.md`
## Phase
Single-source the harness/recipe-test runtime env so the Drone runner, the nightly/weekly sweep
timer, and host `systemPackages` share ONE declaration (no duplicate `pyEnv`, no divergent
`runtimeInputs`, DEFECT-3 host-PATH patch removed/subsumed).
## M1 — PASS @ 2026-06-17T17:40Z (REVIEW-nixenv.md, claim 8b8fc1f). No VETO.
## Gate: M2 — IN FLIGHT (deploy + live parity witness)
**Deploy DONE** @ 2026-06-17T17:34Z. `nixos-rebuild switch --flake 'git+file:///etc/cc-ci?submodules=1#cc-ci-hetzner'`
(live host = hetzner; `/etc/cc-ci` @ d11f8f5). Deployed system `/nix/store/dhmpm232r6m0sq3s7y5r5jpyv5kxgzwi-nixos-system-…`
is BYTE-IDENTICAL to the M1-reviewed local build. Health: `systemctl --failed` empty; deploy-proxy /
warm-keycloak / swarm-init / drone-runner-exec all active; `nightly-sweep.timer` active;
drone healthz + ci.commoninternet.net → 200. Live `cc-ci-run` = `zxlx9jnylh7la5m48bsqb1wfm5l9r0bd`
(the M1-reviewed path); git-lfs/openssl/script/bash resolve on host PATH (openssl was MISSING pre-deploy).
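A minimal sketch of how the unit health checks above could be scripted, assuming the listed names are systemd units on the live host (the exact commands used are not recorded here). `inactive_units` is a hypothetical helper, not part of the repo.

```shell
inactive_units() {
  # Usage: inactive_units UNIT...
  # Prints every unit that `systemctl is-active` does not report as active;
  # returns 1 if at least one unit is not active, 0 otherwise.
  local rc=0 unit
  for unit in "$@"; do
    if ! systemctl is-active --quiet "$unit"; then
      printf '%s\n' "$unit"
      rc=1
    fi
  done
  return "$rc"
}
```

On the host, `inactive_units deploy-proxy warm-keycloak swarm-init drone-runner-exec nightly-sweep.timer` should print nothing, alongside an empty `systemctl --failed`.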
**Live parity witness — timer fire GREEN; Drone path pending.** Diff scope: ONLY nix/ changed
(dd6712c..d11f8f5: 5 nix files, zero runner/tests) → sweep SKIP/promote logic byte-identical to
canon's PASSed sweep.
- **Real timer fire — PASS** @ 2026-06-17T17:57:54Z. `systemctl start nightly-sweep.service` @
17:35:38Z (PID 2743890; child run_recipe_ci PID 2808444). The unit's systemd PATH contains ONLY
coreutils/findutils/gnugrep/gnused/systemd — NOT git-lfs, NOT /run/current-system/sw/bin — so
git-lfs resolved from cc-ci-run's runtimeInputs (the DEFECT-3 condition). Verified live: the running
run_recipe_ci process PATH (`/proc/<pid>/environ`) carries `…-git-lfs-3.6.1/bin` from cc-ci-run.
The gitea RUN (canonical 3.5.3+1.24.2 < tag 3.6.0+1.24.2) exercised LFS (the upgrade-env COMPOSE_FILE
includes compose.lfs.yml): `tests/gitea/custom/test_lfs_roundtrip.py::test_lfs_roundtrip PASSED`
(18.66s); all other gitea tiers PASSED.
- HOW (Adversary re-run): `ssh cc-ci 'journalctl -u nightly-sweep.service -o short-iso --since
"2026-06-17 17:55:57" --until "2026-06-17 17:58:07"' | grep -iE "lfs_roundtrip|PASSED|rc="`.
EXPECTED: `test_lfs_roundtrip PASSED` then `sweep: gitea rc=0`.
- NOTE (not a regression): the sweep line reads `rc=0 GREEN-BUT-PROMOTE-FAILED` — all TESTS green;
the WC5 promote (`abra app deploy warm-gitea… -o -n`) fails with `FATA warm-gitea… is already
deployed`. This is an abra deploy-idempotency quirk on the warm canonical (already running, volume
retained), NON-FATAL (known-good unchanged), and it occurred IDENTICALLY in the pre-deploy runs
(PID 2149231 @ 14:28Z, PID 2248547 @ 15:56Z) — orthogonal to the runtime-env refactor (abra is on
PATH unchanged in both). SKIPs in this fire are all correct (cryptpad/ghost/drone/hedgedoc/immich
no-new-version SKIP; custom-html RUN→promoted 1.13.0+1.31.1).
- Drone-path gitea witness: pending (trigger after the sweep completes, to avoid run-active contention).
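The `/proc/<pid>/environ` check from the timer-fire item can be reproduced roughly as follows. This is a sketch with hypothetical helper names (`environ_path_dirs`, `environ_has_store_pkg`); it assumes store directories of the form `/nix/store/<hash>-<pkg>-<version>/bin`, and reading another process's environ requires root or the process owner.

```shell
environ_path_dirs() {
  # Usage: environ_path_dirs FILE   (FILE is e.g. /proc/<pid>/environ)
  # Entries in environ are NUL-separated; print each PATH directory on its own line.
  tr '\000' '\n' < "$1" | sed -n 's/^PATH=//p' | tr ':' '\n'
}

environ_has_store_pkg() {
  # Usage: environ_has_store_pkg FILE PKG
  # Succeeds if some PATH directory looks like /nix/store/<hash>-<PKG>-<version>/bin.
  # PKG is interpolated into a regex, so keep it to plain package names.
  environ_path_dirs "$1" | grep -q -- "^/nix/store/[^/]*-$2-[0-9][^/]*/bin\$"
}
```

For a live run, `environ_has_store_pkg /proc/<pid>/environ git-lfs` succeeding while the unit's own PATH lacks git-lfs is the DEFECT-3 condition being witnessed. The version-digit anchor keeps `git` from matching a `git-lfs` directory.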
### (prior M1 claim block retained below for the record)
## M1 details — PASS
**WHAT (M1 DoD).** The harness/recipe-test runtime env is declared ONCE and referenced by all
consumers; `nixos-rebuild build` succeeds for both hosts; the shared set is superset-or-equal of
every prior list (nothing dropped); the sweep and the Drone runner resolve the same tooling; a
future dep added to the shared set reaches all consumers.
**WHERE (inputs).** All changes at the tip of `main` (commit pushed with this claim).
- Single source: `nix/modules/packages.nix` — overlay defines `ccciPyEnv` (let), `ccciRuntimeTools`
(overlay attr), `cc-ci-run` (overlay attr, `runtimeInputs = [ccciPyEnv] ++ ccciRuntimeTools`).
- Consumers: `nix/modules/harness.nix` (`systemPackages = [ pkgs.cc-ci-run ]`),
`nix/modules/nightly-sweep.nix` (wrapper execs `cc-ci-run`),
`nix/hosts/cc-ci/configuration.nix` + `nix/hosts/cc-ci-hetzner/configuration.nix`
(`systemPackages = pkgs.ccciRuntimeTools ++ [ pkgs.openssh ]`).
- `nix/modules/drone-runner.nix` unchanged (still `PATH=/run/current-system/sw/bin:/run/wrappers/bin`;
it consumes the host PATH, which now references the shared set).
**HOW + EXPECTED (cold-verifiable; `secrets/` is a git submodule → use `?submodules=1` for a dirty
tree, or build from a `git clone --recursive`).**
1. Builds succeed (both hosts):
- `nixos-rebuild build --flake '.?submodules=1#cc-ci-hetzner'` → builds
`nixos-system-nixos-24.11.…` (locally: `/nix/store/dhmpm232r6m0sq3s7y5r5jpyv5kxgzwi-nixos-system-nixos-24.11.20250630.50ab793`;
store hash may differ on a fresh clone if paths differ, but it MUST build with no collision error).
- `nixos-rebuild build --flake '.?submodules=1#cc-ci'` → builds OK (no collision error).
2. Single source (grep proofs):
- `grep -rn withPackages nix/` → EXACTLY 1 hit: `nix/modules/packages.nix` (`ccciPyEnv`).
- `grep -rn "pytest playwright" nix/` → EXACTLY 1 hit: same line. (No duplicate pyEnv.)
- `grep -rn ccciRuntimeTools nix/` → defined once (packages.nix), referenced by both host configs.
- `nightly-sweep.nix` contains NO `withPackages`, NO `python3`, and NO `/run/current-system/sw/bin`
PATH prepend; its `runtimeInputs` is `[ pkgs.cc-ci-run ]` only, and it `exec`s `cc-ci-run`.
3. Superset-or-equal — `cc-ci-run` carries every tool (inspect the built wrapper's PATH):
- `CCRUN=$(nix eval --raw '.?submodules=1#nixosConfigurations.cc-ci-hetzner.pkgs.cc-ci-run'); grep '^export PATH' "$CCRUN/bin/cc-ci-run"`
- EXPECTED store dirs on PATH (15): python3-3.12.8-env, abra-0.13.0-beta, docker-27.5.1,
git-2.47.2, **git-lfs-3.6.1**, bash-5.2p37, coreutils-9.5, util-linux-2.39.4, curl-8.12.1,
jq-1.7.1, gnused-4.9, gnugrep-3.11, gnutar-1.35, **openssl-3.3.3**, procps-4.0.4.
- git-lfs + openssl are the additions vs prior lists; nothing from any prior list is dropped.
4. Sweep ≡ Drone entrypoint (parity by construction):
- The built `cc-ci-nightly-sweep` wrapper's `exec cc-ci-run` resolves to the BYTE-IDENTICAL
cc-ci-run store path that the `.drone.yml` `cc-ci-run runner/run_recipe_ci.py` step runs
(locally `/nix/store/zxlx9jnylh7la5m48bsqb1wfm5l9r0bd-cc-ci-run`). Same store path ⇒ same
pyEnv, same tooling, same PLAYWRIGHT_BROWSERS_PATH.
5. Host divergence removed:
- Both host `configuration.nix` `systemPackages` lines are textually identical
(`pkgs.ccciRuntimeTools ++ [ pkgs.openssh ]`). The `cc-ci` host now GAINS `git-lfs`+`openssl`
on its system PATH (`ls $(nix eval --raw '.?submodules=1#nixosConfigurations.cc-ci.config.system.build.toplevel')/sw/bin/ | grep -E '^(git-lfs|openssl)$'` → both present; pre-refactor cc-ci lacked git-lfs).
6. Future-dep propagation: adding a pkg to `ccciRuntimeTools` in packages.nix lands in cc-ci-run's
runtimeInputs (Drone + sweep) AND both hosts' systemPackages from the single edit.
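Steps 2 and 3 above can be scripted roughly as follows. This is a sketch: `hit_count`, `wrapper_store_names`, and `missing_tools` are hypothetical helpers, and the parser assumes the wrapper's PATH line has the form `export PATH="/nix/store/<hash>-<name>/bin:...:$PATH"`, which is not shown verbatim in this doc.

```shell
hit_count() {
  # Usage: hit_count PATTERN DIR
  # Number of lines under DIR containing the fixed string PATTERN (like `grep -rn | wc -l`).
  grep -rnF -- "$1" "$2" | wc -l | tr -d ' '
}

wrapper_store_names() {
  # Usage: wrapper_store_names WRAPPER
  # Print the <name> part of every /nix/store/<hash>-<name>/bin entry on the
  # wrapper's `export PATH` line, one per line. Non-store entries ($PATH) are skipped.
  grep '^export PATH' "$1" \
    | sed -e 's/^export PATH=//' -e 's/"//g' \
    | tr ':' '\n' \
    | sed -n 's|^/nix/store/[^-/]*-\(.*\)/bin$|\1|p'
}

missing_tools() {
  # Usage: missing_tools WRAPPER NAME...
  # Prints each expected NAME (e.g. git-lfs-3.6.1) that is absent from the
  # wrapper's PATH; returns 1 if any is missing, 0 otherwise.
  local wrapper=$1 rc=0 name have
  shift
  have=$(wrapper_store_names "$wrapper")
  for name in "$@"; do
    if ! printf '%s\n' "$have" | grep -qxF -- "$name"; then
      printf '%s\n' "$name"
      rc=1
    fi
  done
  return "$rc"
}
```

With `CCRUN` set as in step 3, `[ "$(hit_count withPackages nix/)" = 1 ]` should hold and `missing_tools "$CCRUN/bin/cc-ci-run" git-lfs-3.6.1 openssl-3.3.3` should print nothing. Matching whole names rather than substrings keeps `git-2.47.2` from being satisfied by the `git-lfs` directory.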
## Build backlog
See `BACKLOG-nixenv.md`. M2 (deploy + live parity witness) is gated behind the M1 PASS.