Files
cc-ci/machine-docs/REVIEW-2pc.md

2.8 KiB

REVIEW-2pc — Adversary verdicts for Phase 2pc (image pull-through cache + sane prune)

SSOT: /srv/cc-ci/cc-ci-plan/plan-phase2pc-image-cache.md. DoD = PC1 + PC2 + PC3, each Adversary cold-verified here before Builder may write ## DONE to STATUS-2pc.md.

Status: AWAITING CLAIM

Phase 2pc opened 2026-05-29 (operator interjection into paused Phase 2). As of this file's creation the Builder has not bootstrapped 2pc (no STATUS-2pc.md, no claim(2pc…)). No gate is claimed → no verdict yet. Watching origin/main; will cold-verify on first claim.

Pre-claim baseline recon (read-only; NOT a verdict — just what "before" looks like)

  • PC1 / prune. Current prune policy lives in nix/modules/swarm.nix:15-19: autoPrune = { enable=true; dates="daily"; flags=["--all" "--filter" "until=24h"]; }.
    • --all removes any image not used by a container in 24h → would evict warm base images between runs (PC1's complaint). No --volumes (correct — warm volumes survive).
    • The destructive docker image prune -af churn cited in JOURNAL-2 was a manual operator action mid-deploy (JOURNAL-2:507,690-693), not this systemd unit.
    • PC1 acceptance to run cold: confirm (a) no reflexive -af remains in harness/janitor code paths; (b) prune never fires during an active deploy/run; (c) a normal run does NOT evict cached base images; (d) disk stays bounded without -af.
  • PC2 / pull-through cache. Does NOT exist yet — no registry:2, registry-mirrors, registry-1.docker.io, or pull-through config anywhere in repo (nix/, runner/). Expect a new nix/modules/registry-cache.nix (Nix-reconciled service) + daemon virtualisation.docker.daemon.settings.registry-mirrors + sops PAT (nptest2) for upstream auth.
    • PC2 acceptance to run cold: (a) 2nd deploy of an image pulls from cache not Docker Hub (cache logs / measured pull-time drop); (b) survives a prune (re-pull is local, not a Hub hit); (c) measured cold-vs-warm deploy speedup; (d) cache-miss pulls authenticate.
  • PC3 / bounded+documented. Cache must have disk cap / own GC (LRU/old eviction); scope = docker.io only; docs/ notes cache+prune policy; deviations in DECISIONS.md.

Break-it probes to run once PC2 lands (anti-anchoring checklist)

  • Cache must NOT mask a genuinely-broken image pull (cardinal rule — don't weaken a test). Probe: request a nonexistent/garbage tag through the mirror → must still FAIL, not serve stale.
  • registry-mirrors must be transparent to abra/swarm — verify a real abra deploy's pulls traverse the cache with NO command change / no pull special-casing.
  • Cache survives a D8-style rebuild as a service (Nix-reconciled), but its contents are NOT in the git closure (re-warmed by pulls) — verify both halves.
  • PAT secret must not leak into published logs / dashboard / world-readable registry config.