The orchestrator Pi is retired (2026-05-31). All agents now run on the cc-ci-orchestrator VM (NixOS, loops user, /srv/cc-ci). The VM is a direct tailnet peer to cc-ci: no SOCKS proxy, no userspace tailscaled, no ProxyCommand.

Updated across all affected files:

AGENTS.md
- Remove Pi from reboot description; migration complete (not "parked")
- cc-ci access: direct ssh, not via proxy

kickoff.md
- Prerequisites: direct tailnet peer, not proxy
- Host deps: NixOS (not apt)
- Fallback/Incus: b1 reachable directly, no --proxy curl flag

plan.md §1 + §1.5
- §1 bootstrap: direct SSH, check tailscale status (not restart proxy)
- §1.5 intro: "VM" not "sandbox host"; no proxy
- Credentials table: remove TS_AUTH_KEY row; update cc-ci SSH row
- Replace "Tailscale connection (proxy)" subsection with direct-peer description

plan-orchestrator-migration.md
- Mark COMPLETE (2026-05-31); historical record only

plan-phase1c-full-reproducibility.md
- Incus access: direct, not via SOCKS proxy

prompts/builder.md + prompts/adversary.md
- cc-ci access language only: direct ssh, no proxy restart instructions
- adversary: *.ci.commoninternet.net via plain curl, no proxy flag

REBOOTS.md
- Retitle for VM; note Pi retired; Pi entries marked historical

systemd/cc-ci-loops.service
- User/Group/HOME/PATH: notplants → loops
- Remove cc-ci-tailscaled.service dependency (no proxy on VM)
- Add note about nix/configuration.nix as the authoritative VM declaration

test-e2e-testme-acceptance.md
- tailscale status: no --socket flag
- ssh to throwaway: no ProxyCommand

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Acceptance test — real end-to-end !testme on the clean-room-rebuilt VM
Owner: the Builder + Adversary loops (they execute and independently verify this).
When: after C4/C5 PASS (genuine throwaway-VM clean-room rebuild verified). The Builder then
performs the tailnet swap (§1) and runs the e2e; the Adversary independently verifies. It is the
functional acceptance of D8/clean-room: proof that the rebuilt-from-git VM doesn't just match
byte-for-byte, but actually serves a real CI run end-to-end through the public domain.
This file: /srv/cc-ci/cc-ci-plan/test-e2e-testme-acceptance.md
0. Why
The reproducibility gates (C1–C5) prove the rebuilt VM is structurally identical and boots clean.
This test proves it is operationally a working CI server: a maintainer comment triggers a build,
the app deploys and is reachable on its real public URL through the operator's gateway, the test
passes, and it tears down — the whole !testme pipeline, on the from-git VM, over the real domain.
1. Setup — the Builder performs the tailnet swap (then the e2e)
The rebuilt throwaway must become the live cc-nix-test so that the public gateway routes real
ci.commoninternet.net traffic to it (the gateway TLS-passthroughs via MagicDNS to
cc-nix-test.taila4a0bf.ts.net and re-resolves every ~10s, so it auto-follows the name). The swap is
two reversible tailscale set --hostname commands on VMs you already control — the Builder does
it. Do this only after C4/C5 PASS and after the rebuilt VM's full stack
(traefik + bridge + drone + dashboard) is up and serving locally.
Order matters (rename the original aside first, or the throwaway will get cc-nix-test-1):
- Rename the original prod VM aside (it stays running; do NOT destroy it, it is needed for swap-back):
    ssh cc-ci 'tailscale set --hostname=cc-nix-test-orig'
  (ssh cc-ci is pinned to the original's IP 100.90.116.4, so it keeps reaching the original regardless of the name change.)
- Rename the rebuilt throwaway → cc-nix-test. Re-derive its current tailscale IP (throwaways get a fresh IP each rebuild): pick the ONLINE throwaway node from tailscale status | grep -i throwaway, then:
    ssh -i /srv/incus-terraform-nix-vm-creator/terraform-secrets/vm_ssh_key \
      root@<throwaway-ip> \
      'tailscale set --hostname=cc-nix-test'
  (The orchestrator VM is a direct tailnet peer, so no ProxyCommand is needed.)
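Picking the ONLINE throwaway node by eye is easy to get wrong when stale throwaways from earlier rebuilds are still listed. A minimal sketch of that selection follows. The function names are hypothetical, and it assumes the usual tailscale status layout: IP in column 1, hostname in column 2, and the word "offline" on the line of any offline peer.

```shell
# Print the tailnet IP of every ONLINE throwaway node.
# Reads `tailscale status` text on stdin; runs no network command itself.
# Assumption: column 1 = IP, column 2 = hostname, offline peers say "offline".
pick_online_throwaway() {
  awk 'tolower($2) ~ /throwaway/ && $0 !~ /offline/ { print $1 }'
}

# Succeed only if exactly one candidate IP was found, so that an
# ambiguous or empty match never reaches the rename step.
require_single_ip() {
  candidates=$1
  count=$(printf '%s\n' "$candidates" | grep -c .)
  [ "$count" -eq 1 ]
}
```

One way to use it: ip=$(tailscale status | pick_online_throwaway), then require_single_ip "$ip" before substituting $ip for <throwaway-ip> in the ssh rename above.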
Heads-up — tailnet-wide effect: after the swap, cc-nix-test.taila4a0bf.ts.net resolves to the
rebuilt VM for everyone on the tailnet, so any of your own tooling that targets cc-nix-test by
MagicDNS name will now hit the rebuilt VM (tooling pinned to the raw IP 100.90.116.4 still hits
the original). Account for that when you point !testme/deploys.
Verify the swap took (P1+P2) before starting the e2e — must pass:
tailscale status | grep cc-nix-test # → the throwaway's IP
curl -sS -o /dev/null -w '%{http_code} ssl_verify=%{ssl_verify_result}\n' https://ci.commoninternet.net/
# expect: 200 ssl_verify=0 (real public path now served by the rebuilt VM, valid cert)
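Note that grep cc-nix-test also matches the renamed original (cc-nix-test-orig), so the first check needs a human to read the right line. A small sketch of a stricter check, under the same assumed tailscale status layout (IP in column 1, hostname in column 2); the function names are hypothetical:

```shell
# Succeed only if the node named exactly "cc-nix-test" carries the expected IP.
# Reads `tailscale status` text on stdin; $1 is the throwaway's IP.
swap_points_at() {
  awk -v ip="$1" '$2 == "cc-nix-test" { ok = ($1 == ip) } END { exit (ok ? 0 : 1) }'
}

# Succeed only if the curl -w output line is the expected one.
public_path_ok() {
  [ "$1" = "200 ssl_verify=0" ]
}
```

For example: tailscale status | swap_points_at <throwaway-ip>, and public_path_ok "$(curl -sS -o /dev/null -w '%{http_code} ssl_verify=%{ssl_verify_result}' https://ci.commoninternet.net/)".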
Swap-back when testing is done (reversible): rename the throwaway back to its old name, then
ssh cc-ci 'tailscale set --hostname=cc-nix-test' to restore the original; the gateway re-follows.
2. Procedure
- Pick one fast, already-enrolled recipe. Prefer the lightest enrolled app (e.g. custom-html) so the run is quick and resource-cheap. Note the recipe + the repo/issue or PR where !testme is recognised (the same place prior runs were triggered).
- Record the baseline. Capture the recipe's current latest Drone run number and the dashboard row (https://ci.commoninternet.net/ and https://drone.ci.commoninternet.net/...) so you can prove the run you trigger is new.
- Trigger via the real path. Post !testme as the bot (the normal maintainer-comment trigger) on that recipe, exactly as a real maintainer would. Do not invoke Drone directly or shortcut the bridge; the comment→bridge→Drone path is part of what's under test.
- Confirm the bridge picked it up. Within the bridge's poll interval, a new Drone build for that recipe starts. Capture the new run number (must be > the baseline from step 2).
- Confirm the app deploys and is reachable on its PUBLIC URL. While the build runs, the app is deployed to its *.ci.commoninternet.net test domain. From off the VM (external, through the gateway, not localhost/127.0.0.1), confirm a real request succeeds:
    curl -sS -D- -o /dev/null https://<app-test-subdomain>.ci.commoninternet.net/
    # expect: HTTP 200 (or the app's expected status), valid *.ci.commoninternet.net cert,
    # served content from the deployed app, NOT a Traefik 404 / default-cert.
  This is the crux: it proves routing public-DNS → gateway → MagicDNS → rebuilt VM → Traefik → deployed app all works on the rebuilt server.
- Confirm the test logic passed. The Drone build runs the recipe's real test assertions (app state, not health-only) and finishes success.
- Confirm teardown. After the run, the app is undeployed (no leftover stack/containers), per the standard post-run cleanup; verify it's gone.
- Confirm the result was reported. The outcome posts back to the trigger location and the dashboard row updates to the new run with success.
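Two of the checks above are mechanical and can be scripted so the evidence is unambiguous: the run-number comparison against the baseline, and reading the status code out of the curl -D- headers. A minimal sketch with hypothetical function names; it only parses values already captured and does not talk to Drone or the gateway itself:

```shell
# Succeed only if the new Drone run number is strictly greater than the baseline.
is_new_run() {
  baseline=$1
  new=$2
  [ "$new" -gt "$baseline" ]
}

# Print the HTTP status code from `curl -D-` headers on stdin.
# Uses the first line only (e.g. "HTTP/2 200") and strips the trailing CR.
http_status() {
  awk 'NR == 1 { print $2 }' | tr -d '\r'
}
```

For example: is_new_run "$baseline" "$new_run" after the bridge picks up the comment, and curl -sS -D- -o /dev/null https://<app-test-subdomain>.ci.commoninternet.net/ | http_status for the external request. The status code alone does not distinguish the deployed app from a Traefik default response, so the headers and certificate still need to be kept as evidence.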
3. Pass criteria (all must hold; Adversary verifies independently)
- E1. Self-check §1 passed (ci.commoninternet.net = 200, valid cert, on the rebuilt VM).
- E2. Posting !testme produced a new Drone build (run # > baseline) via the bridge, not a manual Drone trigger.
- E3. The deployed app answered an external request on its real <app>.ci.commoninternet.net URL (through the gateway) with the expected response + valid cert, captured with headers/body evidence.
- E4. The Drone build's real test assertions ran and the build finished success (no skipped/softened tests).
- E5. The app undeployed cleanly afterward (no residual stack).
- E6. Result reported back + dashboard updated to the new successful run.
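Because all six criteria must hold, a single failed criterion fails the whole test. A small sketch of that rule follows. The function name and the "ok" convention are hypothetical, not part of the existing tooling; each argument is the recorded outcome of E1 to E6 in order.

```shell
# Print PASS only when all six criteria were recorded as "ok";
# otherwise name the first criterion that failed.
# Arguments: six values in order E1..E6 (hypothetical convention: "ok" = held).
e2e_verdict() {
  i=1
  for result in "$@"; do
    if [ "$result" != "ok" ]; then
      echo "FAIL (E$i)"
      return 1
    fi
    i=$((i + 1))
  done
  # Fewer or more than six results means a criterion was skipped or miscounted.
  [ "$i" -eq 7 ] || { echo "FAIL (expected 6 results)"; return 1; }
  echo "PASS"
}
```

For example, e2e_verdict ok ok bad ok ok ok prints FAIL (E3), which points straight at the external-request evidence.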
Evidence (run #, the external curl headers/body, dashboard before/after, undeploy proof) is logged
in JOURNAL-1c.md, and the verdict in REVIEW-1c.md / STATUS-1c.md as E2E-TESTME — PASS.
4. If it fails
Treat as a clean-room finding, not a config patch: a failure here means the from-git rebuild is
missing something the running server had out-of-band (a secret, a manual step, drift). Capture the
failing stage + logs in JOURNAL-1c.md, raise it as a blocker, and fix it in the git source
(base or cc-ci-secrets) so the next rebuild includes it — do not hand-fix the live VM. Re-run
this test after the fix.
5. Bound
One recipe, one green run. This is a functional smoke test of the rebuilt VM, not a full recipe-test campaign (that's Phase 2). Don't expand scope here.