
REVIEW-1c.md — Adversary ledger for Phase 1c (Full reproducibility + genuine D8 live rebuild)

Phase plan: /srv/cc-ci/cc-ci-plan/plan-phase1c-full-reproducibility.md

Definition of Done: C1–C7 (each must be Adversary-verified cold within 24h before DONE).

  • C1 — Secrets-repo split (cc-ci-secrets private repo, secrets-only, consumed via flake input; base stays one well-parameterized repo; nixosConfigurations.cc-ci still byte-identical to running).
  • C2 — Cert in git (wildcard cert+key are sops secrets in cc-ci-secrets, decrypted at activation; "operator drops a cert file" step gone; rebuild serves valid TLS from git-sourced cert).
  • C3 — All secrets in git, one exception (only out-of-band secret = bootstrap age key; everything else sops-encrypted in git).
  • C4 — Genuine throwaway-VM live rebuild (blank NixOS VM in terraform-ci, only bootstrap age key provisioned; clone base+secrets, nixos-rebuild switch, oneshots converge, cert+secrets decrypt, no manual step outside docs/install.md; Adversary performs cold).
  • C5 — Honest D8 (evidence rewritten: static byte-identical closure + live throwaway rebuild; "infeasible by design" removed; any limitation narrow + Adversary-signed-off).
  • C6 — Resource fit + cleanup (cc-nix-test 6→4 GB; throwaway VM at 4 GB; ≤~12 GB running guideline; throwaway destroyed after test; final sizing recorded in DECISIONS.md).
  • C7 — Docs (install.md/secrets.md/architecture.md + plan refs updated to new model; fresh engineer can stand up an instance).

Mapping to method milestones: W1→C6(headroom), W2→C1/C2/C3, W3→C4(VM), W4→C4(rebuild), W5→C4/C5(cold proof+honest D8), W6→C6/C7(cleanup+docs).

Standing rules: verify every claim from a COLD START (fresh shell, own clone, no cached state). Re-run the acceptance check myself. Veto power: ## VETO <reason> forbids DONE until cleared.


Cold-start baseline @2026-05-27 (Phase 1c kickoff)

Adversary loop entered. Observations from cold start:

  • git pull --rebase → up to date @ 492fa23 (Phase-1 DONE sign-off). No Phase-1c state files yet (STATUS-1c.md / BACKLOG-1c.md / JOURNAL-1c.md absent) — Builder has not begun 1c bootstrap. Nothing CLAIMED.
  • ssh cc-ci 'hostname && systemctl is-system-running' → nixos / running (healthy, pre-refactor baseline).
  • SOCKS proxy 127.0.0.1:1055 and ssh cc-ci working. Incus skill present at /srv/incus-terraform-nix-vm-creator/skills/incus-terraform/SKILL.md.

No gates to verify yet. Idling until the Builder seeds 1c state and claims the first gate (watchdog will ping on CLAIM). Will keep break-it probes ready (greps for plaintext secrets in base + store; cert-in-git decrypt path; byte-identical drift; throwaway-VM rebuild cold-repro).

Pre-W2 cold baselines @2026-05-27 16:10Z (reference values for verifying C1/C2/C3 after W2)

Builder has bootstrapped 1c state; W2 in flight, not yet CLAIMED. Decisions recorded by Builder (DECISIONS.md): secrets linkage = git submodule (deviates from flake-input default — rationale: no private-repo fetch cred at nix-eval, keeps defaultSopsFile a local path = minimal change + trivially byte-identical); bootstrap key for throwaway = recovery age key via sops.age.keyFile.

Reference values to compare against after W2:

  • C1 byte-identical — running system toplevel: /nix/store/m1pdvbhlmlj3x3gn0x83rgwcgssks7qs-nixos-system-nixos-24.11.20250630.50ab793 (booted: 09ia5qd0jw0nghx83b4fijcg2jak9cp4-…). nixos-version 24.11.20250630.50ab793 (Vicuna). After the refactor, nixos-rebuild build --flake .#cc-ci must produce the same toplevel (pure structural move ⇒ identical closure).
  • C2 cert content — out-of-band cert at cc-ci:/var/lib/ci-certs/live/: fullchain.pem 2909 B sha256 c1d96d61a43bfec10716e18d13832bd325ef173e9af01f197a48490481300080; privkey.pem 227 B sha256 9ec25d00910677718762713717b8c763da46fa7489e292b057e916a252d0ca42 (EC key). After W2 these must be sops-decrypted from git to the same path with the same hashes, and the operator-cert-drop precondition framing in proxy.nix must be gone.
  • C3 no-plaintext — base repo clean: secrets/secrets.yaml is sops ENC[AES256_GCM,…]; git grep for BEGIN … PRIVATE KEY|BEGIN CERTIFICATE outside secrets/ = 0 matches; no *.pem/*.key/*.crt/*.p12/*.pfx tracked. After W2: cert+key must be ENC[…] in cc-ci-secrets, never plaintext; base must stay clean; also grep the Nix store for decrypted secret material at activation.
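The C2 baseline comparison can be scripted. A minimal sketch, assuming GNU coreutils `sha256sum` is available on the host; `check_sha256` is a hypothetical helper name, not something in the repo:

```shell
# Hypothetical helper: compare a file's sha256 against an expected hex digest.
# Prints MATCH/DRIFT/MISSING and returns 0/1/2 respectively.
check_sha256() {
    path=$1
    expected=$2
    if [ ! -r "$path" ]; then
        echo "MISSING $path"
        return 2
    fi
    # sha256sum prints "<digest>  <path>"; keep only the digest field.
    actual=$(sha256sum "$path" | cut -d' ' -f1)
    if [ "$actual" = "$expected" ]; then
        echo "MATCH $path"
    else
        echo "DRIFT $path expected=$expected actual=$actual"
        return 1
    fi
}

# Intended use on cc-ci after W2, with the baseline digests recorded above:
#   check_sha256 /var/lib/ci-certs/live/fullchain.pem c1d96d61a43bfec10716e18d13832bd325ef173e9af01f197a48490481300080
#   check_sha256 /var/lib/ci-certs/live/privkey.pem   9ec25d00910677718762713717b8c763da46fa7489e292b057e916a252d0ca42
```

Because `sha256sum` reads through symlinks, the same call should work whether the path is a regular file or a link to the decrypted secret.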

Things to scrutinize hard when W2 is CLAIMED:

  1. Submodule actually points at a private recipe-maintainers/cc-ci-secrets holding only encrypted secrets (no code/config logic).
  2. Byte-identical: same toplevel store path (or differences are only expected & explained — zero functional drift).
  3. Cert genuinely served from the git-sourced cert after switch (live TLS handshake on a *.ci.commoninternet.net host), not the stale out-of-band file.
  4. All D1–D10 still hold after the refactor (no regression) — spot-check the live system health + a !testme-path sanity check before DONE.
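Item 2 reduces to comparing two resolved store paths. A minimal sketch, assuming the build leaves a `result` symlink in the working directory (the usual behaviour of `nixos-rebuild build`); `same_toplevel` is a hypothetical helper name:

```shell
# Hypothetical helper: succeed only when both arguments resolve to the same path.
same_toplevel() {
    built=$(readlink -f "$1")
    running=$(readlink -f "$2")
    if [ -n "$built" ] && [ "$built" = "$running" ]; then
        echo "IDENTICAL $built"
    else
        echo "DRIFT built=$built running=$running"
        return 1
    fi
}

# Intended use from a cold clone on the host:
#   nixos-rebuild build --flake .#cc-ci
#   same_toplevel ./result /run/current-system
```

A store-path match is a stronger result than a diff of closures: the path hash covers every input, so identical paths imply an identical closure.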

Interim probe @2026-05-27 16:22Z — cc-ci-secrets repo (pre-W2-gate; not a gate verdict)

Independent cold check of the new secrets repo (Builder W2 step 1, commit f972bc1), via Gitea API with bot creds:

  • recipe-maintainers/cc-ci-secrets exists, private: True, non-empty. Top-level: .sops.yaml, README.md, secrets.yaml (no code / no config logic — matches §2's "encrypted secrets only"; README is doc-only and leak-clean).
  • secrets.yaml: all 8 keys ENC[...] — 6 infra (test_secret, drone_rpc_secret, drone_gitea_client_secret, bridge_drone_token, bridge_gitea_token, bridge_webhook_hmac) + wildcard_cert + wildcard_key. 0 plaintext PEM/cert markers; sops mac metadata present. → cert+key genuinely moved into sops-in-git (C2/C3 secrets-side looks good).
  • Layout nuance: secrets file is at repo root secrets.yaml; Builder will mount the submodule at base secrets/ so it resolves to secrets/secrets.yaml. OK for the submodule linkage.
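The "all keys ENC[...]" check above can be repeated mechanically. A rough sketch, assuming a flat sops YAML file where every secret is a top-level scalar and the only nested block is the `sops:` metadata; `unencrypted_keys` is a hypothetical helper name:

```shell
# Hypothetical helper: print every top-level key whose value does not start
# with "ENC[". Prints nothing when all secrets are encrypted.
unencrypted_keys() {
    awk '
        /^sops:/ { next }                     # skip the sops metadata header
        /^[A-Za-z0-9_]+:/ {                   # unindented "key: value" lines only
            key = $0; sub(/:.*/, "", key)
            val = $0; sub(/^[^:]*:[ \t]*/, "", val)
            if (val !~ /^ENC\[/) print key
        }
    ' "$1"
}

# Intended use against a checkout of the secrets repo:
#   [ -z "$(unencrypted_keys secrets.yaml)" ] && echo "all encrypted"
```

This only inspects the value prefix. It does not validate the sops MAC, so it complements a real decrypt test rather than replacing it.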

Not yet verifiable (needs W2 base-switch + activation): byte-identical build==running (C1), cert sops-decrypts to the same hashes at /var/lib/ci-certs/live/ (C2 — must match fullchain c1d96d61…, privkey 9ec25d00…), no plaintext leak into the Nix store, live TLS from git-cert, and no D1–D10 regression. Will run these when Gate W2 is CLAIMED.

W2: PASS @2026-05-27 16:55Z — secrets-split + cert-in-git (verifies C1, C2, C3) — COLD

Gate W2 CLAIMED by Builder (commits f972bc1/f79e542/faa3709; running toplevel vh6vwxbl…). Verified independently from a cold start (fresh clone on cc-ci, own checks, no reliance on the Builder's /root/cc-ci):

(1) Byte-identical build==running (C1) — PASS. Fresh recursive clone of origin/main (HEAD 0633aa7) on cc-ci into /tmp/advverify, submodule secrets @ 2312f1c initialized with bot creds (via http.extraheader, not URL/args), secrets/secrets.yaml present + ENC[…]. nixos-rebuild build --flake 'git+file:///tmp/advverify?submodules=1#cc-ci' → /nix/store/vh6vwxbl4qr9whzpwgjimhf9gn4329p8-nixos-system-… == /run/current-system (readlink -f identical). Zero drift: the currently published repo+submodule reproduces the currently running system byte-for-byte. Base stays one parameterized repo; only secrets/ is the external private submodule.

(2) Cert in git + live TLS (C2) — PASS. /var/lib/ci-certs/live/{fullchain.pem,privkey.pem} are now symlinks → /run/secrets/{wildcard_cert,wildcard_key} (sops-decrypted at activation), not out-of-band files. File sha256 c1d96d61…/9ec25d00… == my pre-W2 operator-cert baseline (byte-identical cert, now git-sourced). secrets.nix adds wildcard_cert(0444)/wildcard_key(0400) with a comment that this "Replaces the prior operator-drops-a-cert-file step." Live HTTPS https://ci.commoninternet.net via proxy → http_code=200, ssl_verify_result=0, served leaf = LE *.ci.commoninternet.net (SAN *.ci+bare), valid 2026-05-26→08-24. Served leaf fingerprint 57:8D:67:9E:FE:89:…:B8:A6 == the git-sourced cert's leaf fingerprint (computed locally from the decrypted file) → live TLS provably served from the git cert, full chain of custody intact.

(3) No plaintext leak (C3) — PASS. Base repo: secrets/ is a gitlink (.gitmodules→ private cc-ci-secrets); no *.pem/*.key tracked; git grep BEGIN…PRIVATE KEY|CERTIFICATE outside REVIEW text = 0. cc-ci-secrets: all 8 secrets ENC[…] (6 infra + cert + key), 0 plaintext PEM, valid sops MAC, private repo. On the host: secrets decrypt to /run/secrets.d (ramfs, in-memory), not the world-readable store; no private key found in the system-closure store dirs.
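The plaintext-marker probe in (3) can be approximated on any directory tree. A minimal sketch using plain `grep -r` instead of `git grep`, so it also covers untracked files; `find_plaintext_pem` is a hypothetical helper name:

```shell
# Hypothetical helper: list files under a directory that contain a plaintext
# PEM private-key or certificate header. Returns 1 when any are found.
find_plaintext_pem() {
    # -r recurse, -l print file names only, -E extended regex.
    # The optional word group covers headers such as "EC PRIVATE KEY".
    hits=$(grep -rlE -e '-----BEGIN ([A-Z]+ )*(PRIVATE KEY|CERTIFICATE)-----' "$1" || true)
    if [ -n "$hits" ]; then
        echo "$hits"
        return 1
    fi
    echo "CLEAN $1"
}

# Intended use against a fresh clone (review text quoting markers would need excluding):
#   find_plaintext_pem /tmp/advverify
```

Unlike `git grep`, this walks `.git/` as well, which is useful for catching a key that was committed and later removed but is still present in loose objects only if those objects are uncompressed; packed or zlib-compressed objects would not be matched.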

Non-regression: systemctl is-system-running=running, 0 failed units; swarm stack all 1/1 (traefik v3.6.15, drone 2.26.0, ccci-bridge, ccci-dashboard, backups), drone-runner-exec running; reconcile oneshots converged. No D1–D10 regression observed.

C1, C2, C3 Adversary-PASS (24h freshness clock starts now; will be re-exercised on the blank host at C4). Remaining for DONE: C4 (genuine throwaway-VM live rebuild), C5 (honest D8), C6 (resize+cleanup), C7 (docs). No VETO.