# REVIEW — phase `nixenv` (Adversary)

Phase plan: `/srv/cc-ci/cc-ci-plan/plan-phase-nixenv-shared-runtime-env.md`
SSOT for verification. Verdicts below; cold-runs only.

Status: **M1 PASS** @ 17:40Z (`8b8fc1f`) + **M2 PASS** @ 18:20Z (`f7b6f26`). Both milestones fresh
Adversary PASS, no VETO → Builder cleared to write `## DONE`.

---

## M2 — PASS @ 2026-06-17T18:20Z — claim `f7b6f26` (deployed `/etc/cc-ci`@d11f8f5 = M1-reviewed tree)

**Deploy + live parity proven — cold-verified.** Verdict from the plan (SSOT), the code, the claim's
verification info, and my OWN live re-runs (Drone API, journald, host probes). JOURNAL-nixenv.md NOT
read before this verdict (anti-anchoring preserved).

**(1) Deploy clean + host healthy (re-verified live post-sweep @18:16–18:18Z).**
- Deployed system `dhmpm232r6m0sq3s7y5r5jpyv5kxgzwi-nixos-system-…` BYTE-IDENTICAL to my M1 build.
- `systemctl --failed` EMPTY; `nightly-sweep.timer` active+enabled; drone-runner-exec / deploy-proxy /
  warm-keycloak / swarm-init all active; `nightly-sweep.service` finished Result=success
  ExecMainStatus=0. drone `/healthz`→200, `ci.commoninternet.net`→200.
- Live `cc-ci-run` = `zxlx9jnylh7la5m48bsqb1wfm5l9r0bd` (M1-reviewed path). git-lfs/openssl/script/bash
  resolve on host PATH AND inside cc-ci-run (git-lfs→`33ikv…-git-lfs-3.6.1`, openssl→`48p8b…-openssl-3.3.3`
  from runtimeInputs, NOT host PATH). openssl was MISSING on this host pre-deploy.
- NO orphan ephemeral test stacks left by the sweep (no `gite-/matt-/disc-` per-run stacks); only the
  expected warm canonicals (bluesky-pds, gitea, keycloak) remain — clean teardown.

**(2) Live LFS parity — GREEN on BOTH paths (the DEFECT-3 witness).**
- **Real timer fire:** `systemctl start nightly-sweep.service` @17:35:38Z; gitea RUN-eligible
  (canonical 3.5.3 < tag 3.6.0) → `tests/gitea/custom/test_lfs_roundtrip.py::test_lfs_roundtrip
  PASSED` @17:57:54Z (+ install/upgrade/backup/restore all PASS). The systemd unit PATH carries NO
  git-lfs and NO /run/current-system/sw/bin, so git-lfs MUST have resolved from cc-ci-run's
  runtimeInputs — exactly the old DEFECT-3 condition, now satisfied by the shared env.
- **Drone path:** independently inspected build **#871** via Drone API (status=success): stage
  recipe-ci → step `ci` runs `cc-ci-run runner/run_recipe_ci.py` (`.drone.yml:83`). Log shows LFS
  RAN not skipped: `test_lfs_roundtrip PASSED`; RUN SUMMARY install/upgrade/backup/restore/custom all
  pass, level=5 of 5.
- Both paths exec the SAME `zxlx9jn` cc-ci-run ⇒ git-lfs resolves identically. DEFECT-3 class
  structurally eliminated, demonstrated live.

**(3) No regression — sweep SKIPs/promotes correct; the 3 non-green results ALL pre-existing.**
- **Regression canary:** scanned the ENTIRE post-deploy sweep journal for missing-tool signatures
  (`command not found` / `not found` / `executable file not found` / `No such file`) → **ZERO**.
  Nothing got dropped from the env (consistent with the M1 superset proof). No recipe went GREEN→RED.
- SKIPs all correct (cryptpad/ghost/drone/hedgedoc/immich/lasuite-*/mailu/matrix-synapse/n8n/
  plausible/uptime-kuma — no-new-version); promotes correct (custom-html, mumble).
- **gitea GREEN-BUT-PROMOTE-FAILED**: tests green; WC5 promote `abra app deploy warm-gitea… -o -n`
  fails `FATA … is already deployed` — abra idempotency on the persistent warm canonical (warm-gitea
  confirmed still up). canonical.json unchanged (3.5.3, ts 08:39Z). Promote path = `nightly_sweep.py`
  @canon f94de22, UNCHANGED by nixenv (diff dd6712c..d11f8f5 is nix/+machine-docs only, zero
  runner/tests) → behaviour identical to canon by construction.
- **discourse rc=1 / mattermost-lts rc=1**: recipe-level reds, env-independent —
  discourse `test_head_runs_official_image_not_bitnamilegacy` + `test_sidekiq_service_dropped_by_head`
  (HEAD-image/service assertions); mattermost `test_restore_returns_state` → `docker exec … postgres …
  relation "ci_marker" does not exist` (docker RESOLVED and ran — a restore-data failure, not a
  missing tool). **Corroborated pre-existing:** the SAME reds occur in BOTH OLD-env pre-deploy fires
  today (PID 2149231@14:xx, PID 2248547@15:xx) — mattermost byte-identical postgres error; discourse
  red in all fires (never green). Not caused by the env change.

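
The regression canary in (3) amounts to a substring scan over the sweep journal. A minimal sketch follows, assuming the journal has already been captured as text (for example with `journalctl -u nightly-sweep.service`; the exact command used for this review is not recorded here). The function name and the sample lines are illustrative, not taken from the harness.

```python
# Signatures quoted in the review. Matching is a plain, case-sensitive substring test.
# Note that "not found" already subsumes the two longer "... not found" signatures.
MISSING_TOOL_SIGNATURES = (
    "command not found",
    "not found",
    "executable file not found",
    "No such file",
)


def find_missing_tool_lines(journal_text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) for every journal line containing a signature."""
    hits = []
    for lineno, line in enumerate(journal_text.splitlines(), start=1):
        if any(sig in line for sig in MISSING_TOOL_SIGNATURES):
            hits.append((lineno, line))
    return hits


if __name__ == "__main__":
    # Illustrative sample only: one missing-tool line and one data-level error.
    sample = (
        "step install ok\n"
        "bash: git-lfs: command not found\n"
        'ERROR:  relation "ci_marker" does not exist\n'
    )
    for lineno, line in find_missing_tool_lines(sample):
        print(f"{lineno}: {line}")
```

A data-level error such as the mattermost `relation "ci_marker" does not exist` message matches none of these signatures, which is why that failure is classified as a restore-data red rather than a missing tool.
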
**No defects, no VETO.** M2 DoD fully met live. The harness runtime env is single-sourced and proven
identical across the Drone runner, the timer sweep, and host systemPackages, with git-lfs/openssl now
guaranteed from one declaration — the DEFECT-3 divergence class is structurally impossible.

**M1 + M2 fresh Adversary PASS → DONE is cleared.** (Consulted JOURNAL-nixenv.md? No — verdict stands
on plan + code + my own live re-runs.)

---

## M1 — PASS @ 2026-06-17T17:40Z — claim `8b8fc1f`

**Single-source harness runtime env — cold-verified, all 6 DoD items.** Verdict formed from the
phase plan (SSOT), the code, and my OWN cold builds/evals — JOURNAL-nixenv.md NOT consulted
(anti-anchoring preserved).

1. **Builds succeed, both hosts (no collision).** `nix build .?submodules=1#…cc-ci-hetzner…toplevel`
   → EXIT 0; `…#…cc-ci…toplevel` → EXIT 0. (A transient SQLite eval-cache "busy" from running both
   in parallel was `error (ignored)`, not a build failure.)
2. **Single source (greps).** `withPackages` → 1 hit (`packages.nix:17` `ccciPyEnv`); `pytest
   playwright` → 1 hit (same line); `ccciRuntimeTools` defined once (`packages.nix:45`), referenced
   by `cc-ci-run` (`:68`) + both host configs. `nightly-sweep.nix` has NO `withPackages`, NO
   `python3`, NO `/run/current-system/sw/bin` PATH prepend — `runtimeInputs = [ pkgs.cc-ci-run ]`
   and `exec cc-ci-run …`. The DEFECT-3 host-PATH patch is GONE.
3. **Superset-or-equal — inspected the BUILT wrapper PATH.** `cc-ci-run` store
   `zxlx9jnylh7la5m48bsqb1wfm5l9r0bd` `export PATH` carries all 15 store dirs:
   python3-3.12.8-env, abra-0.13.0-beta, docker-27.5.1, git-2.47.2, **git-lfs-3.6.1**, bash-5.2p37,
   coreutils-9.5, util-linux-2.39.4, curl-8.12.1, jq-1.7.1, gnused-4.9, gnugrep-3.11, gnutar-1.35,
   **openssl-3.3.3**, procps-4.0.4 — and ends `:$PATH` (PREPEND, inherited PATH retained → nothing
   from any path lost). Covers the full union of all 3 prior lists; `git-lfs`+`openssl` are the only
   additions. Nothing dropped.
4. **Sweep ≡ Drone entrypoint (parity by construction).** Built `cc-ci-nightly-sweep` references the
   BYTE-IDENTICAL `zxlx9jnylh7la5m48bsqb1wfm5l9r0bd-cc-ci-run`; both hosts'
   `pkgs.cc-ci-run` resolve that SAME store path; `.drone.yml:83` runs `cc-ci-run
   runner/run_recipe_ci.py` (host systemPackages wrapper = same path). Same store path ⇒ identical
   pyEnv + tooling + PLAYWRIGHT_BROWSERS_PATH on Drone path AND timer sweep.
5. **Host divergence removed.** Both `configuration.nix` systemPackages lines are textually identical
   (`pkgs.ccciRuntimeTools ++ [ pkgs.openssh ]`). The pre-refactor `cc-ci`-vs-`hetzner` `git-lfs`
   one-off divergence (my prep flag #1) is ELIMINATED: built `cc-ci` toplevel `sw/bin` now contains
   `git-lfs`, `openssl`, `script` (util-linux) — tools it previously lacked. `openssh` correctly kept
   host-only (ssh client, not a recipe tool); it remains on both hosts so the Drone path's inherited
   PATH is unchanged for it.
6. **Future-dep propagation (by construction).** `ccciRuntimeTools` is the lone definition; it feeds
   `cc-ci-run.runtimeInputs` (→ Drone path via `.drone.yml`, → sweep via `exec cc-ci-run`) AND both
   hosts' `systemPackages` (→ Drone runner host PATH). One edit to that list reaches every consumer.
   Proven structurally via the reference graph; no working-tree mutation needed.

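
The item 3 inspection of the built wrapper can be sketched as a small parser over its `export PATH` line. This is a sketch under stated assumptions: the line is taken to have the shape `export PATH="<store dirs>:$PATH"` with each entry ending in `/bin`, the store hashes in the example are placeholders, and the function names are hypothetical. Only the package names and versions come from the review.

```python
import re

# Assumed entry shape: /nix/store/<32-char hash>-<name>-<version>/bin
STORE_BIN = re.compile(r"^/nix/store/[0-9a-z]{32}-(?P<name>.+)/bin$")


def parse_wrapper_path(export_line: str) -> tuple[set[str], bool]:
    """Return (package names on PATH, whether the inherited $PATH is kept last)."""
    match = re.match(r'^export PATH="(?P<value>[^"]*)"$', export_line.strip())
    if match is None:
        raise ValueError("expected a line of the form: export PATH=\"...\"")
    entries = match.group("value").split(":")
    keeps_inherited = entries[-1] == "$PATH"
    names = set()
    for entry in entries:
        store = STORE_BIN.match(entry)
        if store:
            # Strip the version suffix: "git-lfs-3.6.1" -> "git-lfs".
            pname = re.match(r"^(.+?)-\d", store.group("name"))
            names.add(pname.group(1) if pname else store.group("name"))
    return names, keeps_inherited


def missing_tools(export_line: str, required: set[str]) -> set[str]:
    """Return the required tools that have no store dir on the wrapper PATH."""
    names, _ = parse_wrapper_path(export_line)
    return required - names


# Placeholder hashes; real store paths would come from the built wrapper.
HASH_A = "a" * 32
HASH_B = "b" * 32
EXAMPLE_LINE = (
    f'export PATH="/nix/store/{HASH_A}-git-lfs-3.6.1/bin'
    f':/nix/store/{HASH_B}-openssl-3.3.3/bin:$PATH"'
)
```

Matching on the parsed package name rather than on a string prefix keeps `git` and `git-lfs` distinct, so a wrapper that carries only `git-lfs` is still reported as missing `git`.
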
**No defects, no VETO.** Faithful refactor — one shared definition, three references, DEFECT-3 class
structurally eliminated. M2 (deploy via `nixos-rebuild switch` + live parity witness: gitea LFS
roundtrip green under BOTH Drone path and a real timer fire) remains to be claimed/verified.

---

## (prior) Cold-prep notes

---

## Cold-prep — enumeration of the CURRENT (pre-refactor) declarations @ HEAD dd6712c

The M1 superset-or-equal proof must show the new shared set ⊇ the union of all of these. Captured
from the code (SSOT), independent of any Builder narrative:

**(A) `nix/modules/harness.nix` — `cc-ci-run` (Drone entrypoint) `runtimeInputs`:**
`pyEnv abra docker git coreutils util-linux`
- `pyEnv = python3.withPackages [ pytest playwright ]`
- env: `PLAYWRIGHT_BROWSERS_PATH=${playwright-driver.browsers}`, `PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD=1`

**(B) `nix/modules/nightly-sweep.nix` — sweep `runtimeInputs`:**
`bash abra docker git curl jq gnused gnugrep gnutar coreutils util-linux procps`
- DUPLICATE `pyEnv = python3.withPackages [ pytest playwright ]`
- same PLAYWRIGHT env
- DEFECT-3 patch: `export PATH="/run/current-system/sw/bin:/run/wrappers/bin:$PATH"` (host-PATH prepend)

**(C) Drone runner path — `nix/modules/drone-runner.nix`:**
`PATH = mkForce "/run/current-system/sw/bin:/run/wrappers/bin"` → recipe shell-outs resolve from
**host `environment.systemPackages`**, NOT a runtimeInputs list.

**(D) Host `systemPackages` (feeds C):**
- `nix/hosts/cc-ci/configuration.nix`: `curl git jq openssh` ← **NO git-lfs**
- `nix/hosts/cc-ci-hetzner/configuration.nix`: `curl git git-lfs jq openssh`

### UNION the shared set must cover (⊇):
`python3+pytest+playwright` (pyEnv) · playwright browsers · `abra docker git git-lfs coreutils
util-linux bash curl jq gnused gnugrep gnutar procps openssh`
Plan §2 also names `openssl` as a recipe shell-out → expect it present too.

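
The coverage requirement can be sketched as a set comparison. The pre-refactor lists are copied from (A), (B), and (D); the candidate shared set is taken from the 15 store dirs recorded in the M1 verdict, with `pyEnv` standing in for the `python3-3.12.8-env` entry. The variable and function names are illustrative and do not appear in the repository.

```python
# Pre-refactor declarations, copied from (A), (B), and (D).
HARNESS = set("pyEnv abra docker git coreutils util-linux".split())
SWEEP = set(
    "bash abra docker git curl jq gnused gnugrep gnutar coreutils util-linux procps".split()
)
HOST_CC_CI = set("curl git jq openssh".split())
HOST_HETZNER = set("curl git git-lfs jq openssh".split())

# Candidate shared set, per the 15 store dirs listed in the M1 verdict.
SHARED = set(
    "pyEnv abra docker git git-lfs bash coreutils util-linux curl jq "
    "gnused gnugrep gnutar openssl procps".split()
)
# Kept on the hosts only, per M1 item 5.
HOST_ONLY = {"openssh"}


def coverage_gaps(shared: set[str], host_only: set[str], *prior_lists: set[str]) -> set[str]:
    """Return every previously declared tool that no longer has a home."""
    union = set().union(*prior_lists)
    return union - (shared | host_only)


if __name__ == "__main__":
    gaps = coverage_gaps(SHARED, HOST_ONLY, HARNESS, SWEEP, HOST_CC_CI, HOST_HETZNER)
    print("dropped:", sorted(gaps))
    print("added vs. old runtimeInputs:", sorted(SHARED - (HARNESS | SWEEP)))
```

With these inputs the gap set is empty once `openssh` is counted as host-only, and the additions relative to the two old `runtimeInputs` lists are `git-lfs` and `openssl`, which matches the M1 finding.
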
### Pre-noted suspicions to break on M1/M2 (cold, not yet verdicts):
1. **Host divergence**: `cc-ci` config lacks `git-lfs` but `hetzner` has it. Which config is the
   LIVE `ssh cc-ci` server running, and does `git-lfs` actually resolve there today? If the shared
   set is applied to both host configs, cc-ci should GAIN git-lfs. Verify both configs end identical.
2. **Nothing dropped**: any token in the union missing from the shared set = blast-radius break.
3. **Sweep parity by construction**: plan wants sweep to invoke `cc-ci-run` (same entrypoint) — if
   it instead keeps a parallel list, "single source" is not actually achieved; grep must prove no
   module declares its own harness dep list.
4. **DEFECT-3 patch removal**: the host-PATH prepend should be gone/subsumed; if removed, git-lfs
   etc. must now come from the shared runtimeInputs, else the sweep regresses.
5. **Live witness**: gitea `test_lfs_roundtrip` must stay GREEN under BOTH Drone path and a real
   timer fire from the unified env.