# STATUS — phase `nixenv` (Builder)

Phase plan: `/srv/cc-ci/cc-ci-plan/plan-phase-nixenv-shared-runtime-env.md`

## Phase
Single-source the harness/recipe-test runtime env so the Drone runner, the nightly/weekly sweep
timer, and host `systemPackages` share ONE declaration (no duplicate `pyEnv`, no divergent
`runtimeInputs`, DEFECT-3 host-PATH patch removed/subsumed).

## DONE

Phase `nixenv` complete. The harness/recipe-test runtime env is single-sourced in
`nix/modules/packages.nix` (`ccciPyEnv` + `ccciRuntimeTools` + `cc-ci-run`) and referenced by the
Drone runner entrypoint (`cc-ci-run`), the nightly/weekly sweep (execs `cc-ci-run`), and both hosts'
`systemPackages` — no duplicate `pyEnv`, no divergent `runtimeInputs`, the DEFECT-3 host-PATH patch
removed. Deployed (`nixos-rebuild switch`, d11f8f5), host healthy. Live parity proven: gitea
`test_lfs_roundtrip` GREEN under BOTH a real timer fire (@17:57:54Z) and the Drone path (build #871) —
git-lfs/openssl resolve from the single shared declaration on every path. The DEFECT-3 divergence
class is structurally impossible.

- **M1 — PASS** @ 2026-06-17T17:40Z (REVIEW-nixenv.md, claim `8b8fc1f`).
- **M2 — PASS** @ 2026-06-17T18:20Z (REVIEW-nixenv.md, claim `f7b6f26`).
- No VETO; no standing defects.

## M1 — PASS @ 2026-06-17T17:40Z (REVIEW-nixenv.md, claim 8b8fc1f). No VETO.

## Gate: M2 — PASS @ 2026-06-17T18:20Z (REVIEW-nixenv.md, claim f7b6f26). No VETO.

**WHAT (M2 DoD).** (1) Deployed via `nixos-rebuild switch`, host verified healthy. (2) Live parity:
gitea `test_lfs_roundtrip` GREEN under BOTH a real timer fire AND the Drone path, from the shared
env (git-lfs resolves on both — DEFECT-3 condition met live). (3) A canon-style sweep still
promotes/SKIPs correctly under the unified env — no regression to canon's result.

**WHERE (inputs).** Deployed system from `/etc/cc-ci` @ d11f8f5 (= M1-reviewed tree). nixenv diff
`dd6712c..d11f8f5` = nix/ modules + machine-docs ONLY; **zero `runner/`/`tests/` changes** (verify:
`git diff --name-only dd6712c..d11f8f5 | grep -E 'runner/|tests/'` → empty). `runner/nightly_sweep.py`
(the promote path) last touched by canon commit `f94de22` — byte-identical to canon.
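
The diff-scope check above can also be scripted for a cold re-verification. The following is a minimal sketch, assuming the changed paths have already been collected (for example from `git diff --name-only dd6712c..d11f8f5`); the helper name `harness_paths_touched` and the sample file list are hypothetical, not part of the repository.

```python
import re

# Same pattern the status entry passes to `grep -E`.
HARNESS_PATTERN = re.compile(r"runner/|tests/")


def harness_paths_touched(changed_paths):
    """Return the changed paths that match the harness pattern (hypothetical helper)."""
    return [path for path in changed_paths if HARNESS_PATTERN.search(path)]


# Illustrative input only: placeholder names, not the real diff output.
sample_diff = [
    "nix/modules/packages.nix",
    "nix/modules/harness.nix",
    "machine-docs/STATUS-nixenv.md",
]
print(harness_paths_touched(sample_diff))  # []
```

Like the `grep -E` pattern, this matches `runner/` or `tests/` anywhere in a path, so it errs on the side of flagging too much rather than too little.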

### M2 result summary (both witnesses PASS, host healthy, no regression)
- **(2a) Drone-path witness — PASS.** Drone build **#871** (event=custom, RECIPE=gitea REF=357926f2
  PR=1 SRC=recipe-maintainers/gitea), status=success, 18:11→18:14Z. The Drone exec pipeline runs
  `cc-ci-run runner/run_recipe_ci.py` (`.drone.yml:83`). compose.lfs.yml present at that ref →
  `_lfs_enabled()` true → LFS test RAN (not skipped):
  `tests/gitea/custom/test_lfs_roundtrip.py::test_lfs_roundtrip PASSED`;
  all install/upgrade/backup/restore/custom tiers PASSED.
  - HOW (Adversary re-run): `ssh cc-ci 'TOK=$(cat /run/secrets/bridge_drone_token); curl -s -H
    "Authorization: Bearer $TOK" https://drone.ci.commoninternet.net/api/repos/recipe-maintainers/cc-ci/builds/871/logs/1/2 | jq -r ".[].out"' | grep test_lfs_roundtrip`.
    EXPECTED: `test_lfs_roundtrip PASSED`. (Or trigger your OWN build with the same params and re-run.)
- **(2b) Real timer fire witness — PASS** (details retained in the block below): `test_lfs_roundtrip
  PASSED` @17:57:54Z under `systemctl start nightly-sweep.service`, git-lfs resolved from cc-ci-run's
  runtimeInputs while the systemd unit PATH has NO git-lfs / no /run/current-system/sw/bin.
- **(3) No regression.** Sweep (PID 2743890, 17:35→18:0xZ) completed all 20 enrolled recipes; SKIPs
  all correct (cryptpad/ghost/drone/hedgedoc/immich/lasuite-*/mailu/matrix-synapse/n8n/plausible/uptime-kuma
  no-new-version SKIP), promotes correct (custom-html→1.13.0+1.31.1, mumble→1.0.0+v1.6.870-0).
  Three results need explicit non-regression context, ALL pre-existing (identical in the pre-deploy
  fires PID 2149231@14:xx / 2248547@15:xx, OLD env):
  - gitea `rc=0 GREEN-BUT-PROMOTE-FAILED` — tests green; WC5 promote fails `FATA warm-gitea… is
    already deployed` (abra deploy-idempotency on the persistent warm canonical, up since 08:39Z;
    non-fatal). promote path = canon `nightly_sweep.py` f94de22, unchanged by nixenv.
  - discourse `rc=1` and mattermost-lts `rc=1` — recipe-level red (mattermost: `test_restore_returns_state`
    → `docker exec … postgres … relation "ci_marker" does not exist`; docker resolved fine → NOT a
    missing-tool/dropped-dep failure). Both failed identically pre-deploy → not caused by the env change.
- **Host health (re-verified post-sweep @18:16Z).** `systemctl --failed` empty; `nightly-sweep.timer`
  + deploy-proxy/deploy-drone/deploy-bridge/drone-runner-exec/swarm-init/warm-keycloak all active;
  drone `/healthz` 200, ci.commoninternet.net 200; live `cc-ci-run` = `zxlx9jnylh7la5m48bsqb1wfm5l9r0bd`
  (M1-reviewed path).
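
For the (2a) re-run, the `jq`/`grep` step can be reproduced in Python once the log payload has been fetched. This is a minimal sketch, assuming the payload is a JSON array of objects with an `out` field (which is what the `jq -r ".[].out"` filter implies); the helper name `find_test_lines` and the sample payload are hypothetical and are not copied from build #871.

```python
import json


def find_test_lines(log_json, test_name):
    """Return the log lines that mention test_name (hypothetical helper).

    Assumes a JSON array of objects carrying an "out" field, mirroring
    the `jq -r ".[].out"` filter used in the re-run command.
    """
    entries = json.loads(log_json)
    lines = [entry.get("out", "").rstrip("\n") for entry in entries]
    return [line for line in lines if test_name in line]


# Illustrative two-entry payload; a placeholder, not real build output.
sample = json.dumps([
    {"out": "tests/example/test_other.py::test_other PASSED\n"},
    {"out": "tests/gitea/custom/test_lfs_roundtrip.py::test_lfs_roundtrip PASSED\n"},
])
hits = find_test_lines(sample, "test_lfs_roundtrip")
print(hits)
print(all(line.endswith("PASSED") for line in hits))
```

An empty `hits` list should be treated as a failed witness, since it would mean the LFS test did not run at all rather than that it passed.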

### M2 deploy + timer-fire details (retained for the record)

**Deploy DONE** @ 2026-06-17T17:34Z. `nixos-rebuild switch --flake 'git+file:///etc/cc-ci?submodules=1#cc-ci-hetzner'`
(live host = hetzner; `/etc/cc-ci` @ d11f8f5). Deployed system `/nix/store/dhmpm232r6m0sq3s7y5r5jpyv5kxgzwi-nixos-system-…`
is BYTE-IDENTICAL to the M1-reviewed local build. Health: `systemctl --failed` empty; deploy-proxy /
warm-keycloak / swarm-init / drone-runner-exec all active; `nightly-sweep.timer` active;
drone healthz + ci.commoninternet.net → 200. Live `cc-ci-run` = `zxlx9jnylh7la5m48bsqb1wfm5l9r0bd`
(the M1-reviewed path); git-lfs/openssl/script/bash resolve on host PATH (openssl was MISSING pre-deploy).

**Live parity witness — BOTH paths GREEN** (Drone #871 + timer fire; summarised above). Diff scope: ONLY nix/ changed
(dd6712c..d11f8f5: 5 nix files, zero runner/tests) → sweep SKIP/promote logic byte-identical to
canon's PASSed sweep.
- **Real timer fire — PASS** @ 2026-06-17T17:57:54Z. `systemctl start nightly-sweep.service` @
  17:35:38Z (PID 2743890; child run_recipe_ci PID 2808444). The unit's systemd PATH contains ONLY
  coreutils/findutils/gnugrep/gnused/systemd — NOT git-lfs, NOT /run/current-system/sw/bin — so
  git-lfs resolved from cc-ci-run's runtimeInputs (the DEFECT-3 condition). Verified live: the running
  run_recipe_ci process PATH (`/proc/<pid>/environ`) carries `…-git-lfs-3.6.1/bin` from cc-ci-run.
  gitea RUN (canonical 3.5.3+1.24.2 < tag 3.6.0+1.24.2) exercised LFS (upgrade-env COMPOSE_FILE
  includes compose.lfs.yml) → `tests/gitea/custom/test_lfs_roundtrip.py::test_lfs_roundtrip PASSED`
  (18.66s); all other gitea tiers PASSED.
  - HOW (Adversary re-run): `ssh cc-ci 'journalctl -u nightly-sweep.service -o short-iso --since
    "2026-06-17 17:55:57" --until "2026-06-17 17:58:07"' | grep -iE "lfs_roundtrip|PASSED|rc="`.
    EXPECTED: `test_lfs_roundtrip PASSED` then `sweep: gitea rc=0`.
  - NOTE (not a regression): the sweep line reads `rc=0 GREEN-BUT-PROMOTE-FAILED` — all TESTS green;
    the WC5 promote (`abra app deploy warm-gitea… -o -n`) fails with `FATA warm-gitea… is already
    deployed`. This is an abra deploy-idempotency quirk on the warm canonical (already running, volume
    retained), NON-FATAL (known-good unchanged), and it occurred IDENTICALLY in the pre-deploy runs
    (PID 2149231 @ 14:28Z, PID 2248547 @ 15:56Z) — orthogonal to the runtime-env refactor (abra is on
    PATH unchanged in both). SKIPs in this fire are all correct (cryptpad/ghost/drone/hedgedoc/immich
    no-new-version SKIP; custom-html RUN→promoted 1.13.0+1.31.1).
- Drone-path gitea witness: DONE — build #871 PASS (see "(2a)" above).
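
The `/proc/<pid>/environ` check in the timer-fire witness can be repeated with a short script. This is a minimal sketch, assuming the usual Linux layout of that file (NUL-separated `KEY=VALUE` pairs); both helper names and the sample blob, including its store hashes, are hypothetical placeholders.

```python
def path_entries_from_environ(raw_environ):
    """Split a /proc/<pid>/environ blob (NUL-separated KEY=VALUE) and return PATH entries."""
    for item in raw_environ.split(b"\0"):
        key, sep, value = item.partition(b"=")
        if sep and key == b"PATH":
            return value.decode("utf-8", "replace").split(":")
    return []


def store_dirs_providing(entries, package_prefix):
    """Return PATH entries under /nix/store whose name (hash stripped) starts with package_prefix."""
    hits = []
    for entry in entries:
        if not entry.startswith("/nix/store/"):
            continue
        # Store dir names look like <hash>-<name>-<version>; drop the hash part.
        store_name = entry[len("/nix/store/"):].split("/", 1)[0]
        _hash, _, name = store_name.partition("-")
        if name.startswith(package_prefix):
            hits.append(entry)
    return hits


# Placeholder blob with made-up hashes. On a live host, read the real one with
# open(f"/proc/{pid}/environ", "rb").read() instead.
sample = (
    b"HOME=/root\0PATH=/nix/store/" + b"0" * 32 + b"-git-lfs-3.6.1/bin"
    b":/nix/store/" + b"1" * 32 + b"-coreutils-9.5/bin\0"
)
entries = path_entries_from_environ(sample)
print(store_dirs_providing(entries, "git-lfs"))
```

Note that a prefix such as `git` would also match `git-lfs`, so pass the most specific prefix available (for example `git-lfs-`).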

### (prior M1 claim block retained below for the record)
## M1 details — PASS

**WHAT (M1 DoD).** The harness/recipe-test runtime env is declared ONCE and referenced by all
consumers; `nixos-rebuild build` succeeds for both hosts; the shared set is superset-or-equal of
every prior list (nothing dropped); the sweep and the Drone runner resolve the same tooling; a
future dep added to the shared set reaches all consumers.

**WHERE (inputs).** All changes at the tip of `main` (commit pushed with this claim).
- Single source: `nix/modules/packages.nix` — overlay defines `ccciPyEnv` (let), `ccciRuntimeTools`
  (overlay attr), `cc-ci-run` (overlay attr, `runtimeInputs = [ccciPyEnv] ++ ccciRuntimeTools`).
- Consumers: `nix/modules/harness.nix` (`systemPackages = [ pkgs.cc-ci-run ]`),
  `nix/modules/nightly-sweep.nix` (wrapper execs `cc-ci-run`),
  `nix/hosts/cc-ci/configuration.nix` + `nix/hosts/cc-ci-hetzner/configuration.nix`
  (`systemPackages = pkgs.ccciRuntimeTools ++ [ pkgs.openssh ]`).
- `nix/modules/drone-runner.nix` unchanged (still `PATH=/run/current-system/sw/bin:/run/wrappers/bin`;
  it consumes the host PATH, which now references the shared set).

**HOW + EXPECTED (cold-verifiable; `secrets/` is a git submodule → use `?submodules=1` for a dirty
tree, or build from a `git clone --recursive`).**

1. Builds succeed (both hosts):
   - `nixos-rebuild build --flake '.?submodules=1#cc-ci-hetzner'` → builds
     `nixos-system-nixos-24.11.…` (locally: `/nix/store/dhmpm232r6m0sq3s7y5r5jpyv5kxgzwi-nixos-system-nixos-24.11.20250630.50ab793`;
     store hash may differ on a fresh clone if paths differ, but it MUST build with no collision error).
   - `nixos-rebuild build --flake '.?submodules=1#cc-ci'` → builds OK (no collision error).

2. Single source (grep proofs):
   - `grep -rn withPackages nix/` → EXACTLY 1 hit: `nix/modules/packages.nix` (`ccciPyEnv`).
   - `grep -rn "pytest playwright" nix/` → EXACTLY 1 hit: same line. (No duplicate pyEnv.)
   - `grep -rn ccciRuntimeTools nix/` → defined once (packages.nix), referenced by both host configs.
   - `nightly-sweep.nix` contains NO `withPackages`, NO `python3`, NO `/run/current-system/sw/bin`
     PATH prepend, and its `runtimeInputs = [ pkgs.cc-ci-run ]` only; it `exec cc-ci-run …`.

3. Superset-or-equal — `cc-ci-run` carries every tool (inspect the built wrapper's PATH):
   - `CCRUN=$(nix eval --raw '.?submodules=1#nixosConfigurations.cc-ci-hetzner.pkgs.cc-ci-run'); grep '^export PATH' "$CCRUN/bin/cc-ci-run"`
   - EXPECTED store dirs on PATH (15): python3-3.12.8-env, abra-0.13.0-beta, docker-27.5.1,
     git-2.47.2, **git-lfs-3.6.1**, bash-5.2p37, coreutils-9.5, util-linux-2.39.4, curl-8.12.1,
     jq-1.7.1, gnused-4.9, gnugrep-3.11, gnutar-1.35, **openssl-3.3.3**, procps-4.0.4.
   - git-lfs + openssl are the additions vs prior lists; nothing from any prior list is dropped.

4. Sweep ≡ Drone entrypoint (parity by construction):
   - The built `cc-ci-nightly-sweep` wrapper `exec cc-ci-run …` resolves the BYTE-IDENTICAL
     cc-ci-run store path that the `.drone.yml` `cc-ci-run runner/run_recipe_ci.py` step runs
     (locally `/nix/store/zxlx9jnylh7la5m48bsqb1wfm5l9r0bd-cc-ci-run`). Same store path ⇒ same
     pyEnv, same tooling, same PLAYWRIGHT_BROWSERS_PATH.

5. Host divergence removed:
   - Both host `configuration.nix` `systemPackages` lines are textually identical
     (`pkgs.ccciRuntimeTools ++ [ pkgs.openssh ]`). The `cc-ci` host now GAINS `git-lfs`+`openssl`
     on its system PATH (`ls $(nix eval --raw '.?submodules=1#nixosConfigurations.cc-ci.config.system.build.toplevel')/sw/bin/ | grep -E '^(git-lfs|openssl)$'` → both present; pre-refactor cc-ci lacked git-lfs).

6. Future-dep propagation: adding a pkg to `ccciRuntimeTools` in packages.nix lands in cc-ci-run's
   runtimeInputs (Drone + sweep) AND both hosts' systemPackages from the single edit.
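
The superset check in step 3 is done by eye against the `export PATH` line; it can also be automated. This is a minimal sketch, assuming the wrapper's line has the shape `export PATH="<dir>:<dir>:…:$PATH"` with each `<dir>` of the form `/nix/store/<32-char hash>-<name>/bin`; the helper names, the dummy hashes, and the three-entry sample line are hypothetical and do not reproduce the real wrapper.

```python
import re

# /nix/store/<32-char hash>-<name>/bin; the character class is a superset of Nix's hash alphabet.
STORE_BIN = re.compile(r'/nix/store/[0-9a-z]{32}-([^/:"]+)/bin')


def store_names(export_line):
    """Return the store-dir names (hash stripped) found on an `export PATH=...` line."""
    return STORE_BIN.findall(export_line)


def missing_tools(export_line, required_prefixes):
    """Return the required name prefixes that have no matching store dir on the line."""
    names = store_names(export_line)
    return [req for req in required_prefixes
            if not any(name.startswith(req) for name in names)]


# Placeholder line: dummy hashes, and only three of the 15 expected dirs are shown.
fake = "a" * 32
line = (
    f'export PATH="/nix/store/{fake}-git-2.47.2/bin:'
    f'/nix/store/{fake}-git-lfs-3.6.1/bin:'
    f'/nix/store/{fake}-openssl-3.3.3/bin:$PATH"'
)
print(store_names(line))  # ['git-2.47.2', 'git-lfs-3.6.1', 'openssl-3.3.3']
print(missing_tools(line, ["git-lfs-", "openssl-", "procps-"]))  # ['procps-']
```

Feeding it the real line from `grep '^export PATH' "$CCRUN/bin/cc-ci-run"` together with the 15 expected names should yield an empty list if nothing was dropped.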

## Build backlog
See `BACKLOG-nixenv.md`. M2 (deploy + live parity witness) is gated behind the M1 PASS.