BACKLOG — Phase 2pc (sane image-prune policy)
SSOT: /srv/cc-ci/cc-ci-plan/plan-phase2pc-image-cache.md.
Scope (post operator correction 2026-05-29): PC1 prune policy + confirm local-store
retention/auth ONLY. The registry:2 pull-through cache is dropped (deferred to IDEAS /
Phase 2b — revisit only if multi-node OR a measured cold-deploy bottleneck on recreate-surviving
storage).
Build backlog
- PC1: Conservative prune policy. Remove `virtualisation.docker.autoPrune` (`--all` evicts in-use base images → forced cold re-pull → rate-limit). Replace with a surgical, gated prune: dangling + `until=24h` only, NEVER `--all`/`--volumes`; gated on (a) genuine disk pressure (`/` ≥ 80%), (b) no run-app stack live, (c) no swarm service converging (mid-pull). Teardown already removes only services/volumes/secrets/.env, NOT images (verified); keep it that way.
- PC2: Confirm local cache retained + authenticated. Daemon stays PAT-authenticated (`docker info` Username=nptest2, sops `dockerhub_auth` → `/root/.docker/config.json`); local image store `/var/lib/docker` persists across runs/teardowns/reboots. No code change expected: confirm + document.
- PC3: Verify + document. Deploy → teardown → redeploy reuses local layers (no re-download); disk bounded without `-af`. Update `docs/runbook.md` + `docs/` prune note; record the policy + the dropped-registry-cache deviation in `DECISIONS.md`.
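
The three PC1 gates can be sketched as a small decision function. This is an illustrative sketch only, not the committed `cc-ci-docker-prune` script: the function names are hypothetical, and how "stack live" and "service converging" are detected is left to the caller as plain booleans. The 80% threshold and the `until=24h` filter come from the backlog item above.

```python
import shutil
from typing import List

DISK_PRESSURE_PCT = 80.0  # gate (a): / >= 80%, per PC1


def disk_used_pct(path: str = "/") -> float:
    """Share of the filesystem at `path` in use, as a percentage.

    Uses used/total, which can differ slightly from the Use% that df prints.
    """
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def should_prune(used_pct: float, stack_live: bool, service_converging: bool) -> bool:
    """Return True only when all three PC1 gates pass."""
    under_pressure = used_pct >= DISK_PRESSURE_PCT      # gate (a)
    return under_pressure and not stack_live and not service_converging  # (b), (c)


def prune_command() -> List[str]:
    # Without --all, `docker image prune` removes dangling images only;
    # the until filter restricts it to images created more than 24h ago.
    return ["docker", "image", "prune", "--force", "--filter", "until=24h"]
```

A caller would evaluate `should_prune(disk_used_pct("/"), stack_live, service_converging)` and run `prune_command()` only on `True`; the command list deliberately never contains `--all` or `--volumes`.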
Adversary findings
- F2pc-1 [adversary] CLOSED @2026-05-29 (re-verified, re-claim `9e73ebd`). Builder renamed committed units `docker-prune` → `ci-docker-prune` (`b9bbd25`; NixOS reserves `docker-prune`). Re-verified: `git show HEAD:nix/modules/{docker-prune,swarm}.nix` byte-identical to host `/root/cc-ci`; committed units = `ci-docker-prune.*` = live (enabled+active); old `docker-prune.timer` not-found. git now reproduces the verified system → CLOSED by Adversary.
  F2pc-1 [adversary] BLOCKING: committed code ≠ deployed/"verified" host (gate 2pc, claim `de6103d`). The verified prune behavior is correct, but git does not reproduce the verified system.
  - Observed. origin/main HEAD `de6103d`: `nix/modules/docker-prune.nix:56,67` defines `systemd.services.docker-prune` / `systemd.timers.docker-prune`. The live host runs `ci-docker-prune.service`/`.timer` (enabled+active), built from uncommitted source in `/root/cc-ci` (not a git repo; its module names units `ci-docker-prune`). STATUS-2pc's verify commands also use `ci-docker-prune.timer`.
  - Repro. `cd /srv/cc-ci/cc-ci-adv && grep -nE 'systemd\.(services|timers)\.' nix/modules/docker-prune.nix` → `docker-prune`. `ssh cc-ci 'systemctl is-active ci-docker-prune.timer; systemctl is-enabled docker-prune.timer'` → `active` / `not-found`. So a from-git rebuild creates `docker-prune.*` (≠ verified `ci-docker-prune.*`); a verifier following STATUS against a git-built host gets a false FAIL.
  - Impact. D8/fresh-rebuild contract: the "deployed+verified" artifact was never committed. Functionally equivalent (same `cc-ci-docker-prune` script body), so this is a reproducibility/integrity defect, not a behavioral one.
  - To clear (Builder). Make git == host: commit the deployed `ci-docker-prune` naming (push `/root/cc-ci`'s module), OR rename module units to `docker-prune` + `nixos-rebuild switch` + fix STATUS verify cmds. Confirm the stale `docker-prune.service` (linked, ignored) leftover is garbage-collected cleanly. Then re-claim; only the Adversary closes this after re-verifying that the committed rev builds the units STATUS documents.
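
The repro above amounts to comparing the unit names a Nix module declares with the units the host actually runs. A minimal sketch of that comparison follows, assuming the module text and the list of live unit names have already been collected (for example via the `grep` and `systemctl` commands in the finding). The helper names and the sample module text are hypothetical and not taken from the repository.

```python
import re

# Matches attribute paths such as systemd.services.docker-prune or
# systemd.timers."ci-docker-prune" (quoted names are accepted as well).
UNIT_RE = re.compile(r'systemd\.(services|timers)\."?([A-Za-z0-9_@-]+)"?')
SUFFIX = {"services": "service", "timers": "timer"}


def declared_units(module_text):
    """Unit names (name.service / name.timer) declared in a Nix module's text."""
    return {f"{name}.{SUFFIX[kind]}" for kind, name in UNIT_RE.findall(module_text)}


def unit_drift(module_text, live_units):
    """Units present on only one side; both sets empty means git matches the host."""
    declared = declared_units(module_text)
    live = set(live_units)
    return {"only_in_git": declared - live, "only_on_host": live - declared}


# Hypothetical module text mirroring the naming observed at de6103d.
sample_module = """
systemd.services.docker-prune = { };
systemd.timers.docker-prune = { };
"""
live = ["ci-docker-prune.service", "ci-docker-prune.timer"]
drift = unit_drift(sample_module, live)
```

With the sample above, `drift` reports `docker-prune.*` as present only in git and `ci-docker-prune.*` as present only on the host, which is the mismatch the finding describes. After the rename, both sets would be empty.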