# Acceptance test: real end-to-end `!testme` on the clean-room-rebuilt VM

**Owner:** the Builder + Adversary loops (they execute *and* independently verify this).
**When:** after **C4/C5 PASS** (genuine throwaway-VM clean-room rebuild verified). The Builder then
performs the tailnet swap (§1) and runs the e2e; the Adversary independently verifies. It is the
**functional acceptance** of D8/clean-room: proof that the rebuilt-from-git VM doesn't just match
byte-for-byte, but actually *serves a real CI run end-to-end through the public domain*.
**This file:** `/srv/cc-ci/cc-ci-plan/test-e2e-testme-acceptance.md`

---

## 0. Why

The reproducibility gates (C1–C5) prove the rebuilt VM is structurally identical and boots clean.
This test proves it is **operationally** a working CI server: a maintainer comment triggers a build,
the app deploys and is reachable on its real public URL through the operator's gateway, the test
passes, and it tears down. That is the whole `!testme` pipeline, on the from-git VM, over the real domain.

---

## 1. Setup: the Builder performs the tailnet swap (then the e2e)

The rebuilt throwaway must become the live `cc-nix-test` so that the public gateway routes real
`ci.commoninternet.net` traffic to it (the gateway TLS-passthroughs via MagicDNS to
`cc-nix-test.taila4a0bf.ts.net` and re-resolves every ~10s, so it auto-follows the name). The swap is
**two reversible `tailscale set --hostname` commands** on VMs you already control, so the Builder does
it. **Do this only after C4/C5 PASS** and after the rebuilt VM's full stack
(traefik + bridge + drone + dashboard) is up and serving locally.

**Order matters** (rename the original *aside first*, or the throwaway will get `cc-nix-test-1`):

1. **Rename the original prod VM aside** (it stays running; do NOT destroy it, it is needed for swap-back):
   ```
   ssh cc-ci 'tailscale set --hostname=cc-nix-test-orig'
   ```
   (`ssh cc-ci` is pinned to the original's IP `100.90.116.4`, so it keeps reaching the original
   regardless of the name change.)
2. **Rename the rebuilt throwaway → `cc-nix-test`.** Re-derive its current tailscale IP (throwaways
   get a fresh IP each rebuild): pick the ONLINE throwaway node from
   `tailscale status | grep -i throwaway`, then:
   ```
   ssh -i /srv/incus-terraform-nix-vm-creator/terraform-secrets/vm_ssh_key \
     root@<throwaway-ip> \
     'tailscale set --hostname=cc-nix-test'
   ```
   (The orchestrator VM is a direct tailnet peer, so no ProxyCommand is needed.)
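
Picking the ONLINE throwaway in step 2 can be done mechanically. The following is a minimal sketch, not the project's own tooling: the function name `online_throwaway_ips` is hypothetical, and it assumes `tailscale status` prints one peer per row as IP, hostname, owner, OS, status, with the word `offline` appearing in the status text of unreachable peers.

```sh
# Hypothetical helper: print the tailscale IP of every throwaway node that is
# not marked offline. Reads `tailscale status`-style rows on stdin.
# Assumed row layout: <ip> <hostname> <owner> <os> <status...>
online_throwaway_ips() {
  awk 'tolower($2) ~ /throwaway/ && $0 !~ /offline/ { print $1 }'
}

# Usage (on the orchestrator VM):
#   tailscale status | online_throwaway_ips
# Expect exactly one IP; if several are printed, confirm which rebuild is
# current before renaming anything.
```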

**Heads-up, tailnet-wide effect:** after the swap, `cc-nix-test.taila4a0bf.ts.net` resolves to the
rebuilt VM for *everyone* on the tailnet, so any of your own tooling that targets cc-nix-test **by
MagicDNS name** will now hit the rebuilt VM (tooling pinned to the raw IP `100.90.116.4` still hits
the original). Account for that when you point `!testme`/deploys.

**Verify the swap took (P1+P2) before starting the e2e.** Both checks must pass:
```
tailscale status | grep cc-nix-test   # → the cc-nix-test row shows the throwaway's IP
# (the renamed original, cc-nix-test-orig, also matches this grep; read the exact-name row)
curl -sS -o /dev/null -w '%{http_code} ssl_verify=%{ssl_verify_result}\n' https://ci.commoninternet.net/
# expect: 200 ssl_verify=0  (real public path now served by the rebuilt VM, valid cert)
```
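
Because the gateway only re-resolves the MagicDNS name every ~10s, the public check can briefly lag the rename. A minimal polling sketch follows, assuming six attempts ten seconds apart is an acceptable wait; those defaults and the function names `swap_check_ok` and `wait_for_swap` are illustrative and not part of the plan.

```sh
# Succeed only when the curl write-out line reports HTTP 200 and a verified
# certificate (ssl_verify_result 0), i.e. the "expect" line of the check.
swap_check_ok() {
  [ "$1" = "200 ssl_verify=0" ]
}

# Poll the public URL until the check passes or the attempts run out.
# $1 = URL, $2 = max attempts (default 6), $3 = seconds between attempts (default 10).
wait_for_swap() {
  url=$1; tries=${2:-6}; pause=${3:-10}
  i=0
  while [ "$i" -lt "$tries" ]; do
    # A failed curl (DNS, TLS, connect) leaves result empty, which never passes.
    result=$(curl -sS -o /dev/null \
      -w '%{http_code} ssl_verify=%{ssl_verify_result}' "$url" 2>/dev/null) || result=
    if swap_check_ok "$result"; then
      echo "ok after $((i + 1)) attempt(s): $result"
      return 0
    fi
    i=$((i + 1))
    sleep "$pause"
  done
  echo "swap not visible: last result '$result'" >&2
  return 1
}

# Usage: wait_for_swap https://ci.commoninternet.net/
```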

**Swap-back when testing is done** (reversible): rename the throwaway back to its old name, then
`ssh cc-ci 'tailscale set --hostname=cc-nix-test'` to restore the original; the gateway re-follows.
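
By the same name-collision logic as the forward swap, the swap-back presumably has to free the `cc-nix-test` name before the original reclaims it. The sketch below only prints the two commands in that order for review and runs nothing; `swap_back_commands` and its arguments are hypothetical, and the throwaway's previous hostname is whatever you noted before step 2.

```sh
# Hypothetical dry-run helper: print the swap-back commands in order, without
# executing them. $1 = throwaway tailscale IP, $2 = throwaway's previous hostname.
swap_back_commands() {
  key=/srv/incus-terraform-nix-vm-creator/terraform-secrets/vm_ssh_key
  # 1. Move the throwaway off the live name first, so the name is free.
  printf "ssh -i %s root@%s 'tailscale set --hostname=%s'\n" "$key" "$1" "$2"
  # 2. Then give the original its name back; the gateway re-follows it.
  printf "ssh cc-ci 'tailscale set --hostname=cc-nix-test'\n"
}

# Usage: swap_back_commands <throwaway-ip> <old-hostname>
```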

---

## 2. Procedure

1. **Pick one fast, already-enrolled recipe.** Prefer the lightest enrolled app (e.g. `custom-html`)
   so the run is quick and resource-cheap. Note the recipe + the repo/issue or PR where `!testme` is
   recognised (the same place prior runs were triggered).
2. **Record the baseline.** Capture the recipe's *current* latest Drone run number and the dashboard
   row (`https://ci.commoninternet.net/` and `https://drone.ci.commoninternet.net/...`) so you can
   prove the run you trigger is **new**.
3. **Trigger via the real path.** Post `!testme` as the **bot** (the normal maintainer-comment
   trigger) on that recipe, exactly as a real maintainer would. Do **not** invoke Drone directly or
   shortcut the bridge; the comment→bridge→Drone path is part of what's under test.
4. **Confirm the bridge picked it up.** Within the bridge's poll interval, a **new** Drone build for
   that recipe starts. Capture the new run number (must be > the baseline from step 2).
5. **Confirm the app deploys and is reachable on its PUBLIC URL.** While the build runs, the app is
   deployed to its `*.ci.commoninternet.net` test domain. From **off the VM** (external: through the
   gateway, not `localhost`/`127.0.0.1`), confirm a real request succeeds:
   ```
   curl -sS -D- -o /dev/null https://<app-test-subdomain>.ci.commoninternet.net/
   # expect: HTTP 200 (or the app's expected status), valid *.ci.commoninternet.net cert,
   # served content from the deployed app, NOT a Traefik 404 / default-cert.
   # (-o /dev/null discards the body; write it to a file instead when capturing body evidence for E3)
   ```
   This is the crux: it proves routing public-DNS → gateway → MagicDNS → rebuilt VM → Traefik →
   deployed app all works on the rebuilt server.
6. **Confirm the test logic passed.** The Drone build runs the recipe's real test assertions (app
   state, not health-only) and finishes **success**.
7. **Confirm teardown.** After the run, the app is **undeployed** (no leftover stack/containers), per
   the standard post-run cleanup. Verify it's gone.
8. **Confirm the result was reported.** The outcome posts back to the trigger location and the
   dashboard row updates to the new run with `success`.
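
Two of the checks in this procedure reduce to small comparisons that are easy to script. The following is a minimal sketch with hypothetical helper names; how the run numbers are obtained (Drone UI, API, or CLI) is left open, since the plan does not prescribe it.

```sh
# True when $1 is a non-empty string of digits.
is_uint() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Steps 2 and 4: succeed only if the new run number ($2) is strictly greater
# than the recorded baseline ($1). Returns 2 on non-numeric input.
is_new_run() {
  is_uint "$1" && is_uint "$2" || return 2
  [ "$2" -gt "$1" ]
}

# Step 5: read `curl -D-` header output on stdin and print the status code of
# the last HTTP status line (handles "HTTP/1.1 200 OK" and "HTTP/2 200").
status_from_headers() {
  awk '{ sub(/\r$/, "") } /^HTTP\// { code = $2 } END { print code }'
}

# Usage:
#   is_new_run "$baseline" "$new" && echo "run is new"
#   curl -sS -D- -o /dev/null https://<app-test-subdomain>.ci.commoninternet.net/ | status_from_headers
# curl verifies the certificate by default (no -k), so a default or invalid
# cert makes curl itself fail before any status code is printed.
```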

---

## 3. Pass criteria (all must hold; Adversary verifies independently)

- [ ] **E1.** Self-check §1 passed (`ci.commoninternet.net` = 200, valid cert, on the rebuilt VM).
- [ ] **E2.** Posting `!testme` produced a **new** Drone build (run # > baseline) via the bridge,
      not a manual Drone trigger.
- [ ] **E3.** The deployed app answered an **external** request on its real
      `<app>.ci.commoninternet.net` URL (through the gateway) with the expected response + valid cert,
      captured with headers/body evidence.
- [ ] **E4.** The Drone build's **real test assertions** ran and the build finished **success**
      (no skipped/softened tests).
- [ ] **E5.** The app **undeployed** cleanly afterward (no residual stack).
- [ ] **E6.** Result reported back + dashboard updated to the new successful run.

Evidence (run #, the external `curl` headers/body, dashboard before/after, undeploy proof) is logged
in `JOURNAL-1c.md`, and the verdict in `REVIEW-1c.md` / `STATUS-1c.md` as **E2E-TESTME — PASS**.

## 4. If it fails

Treat as a clean-room finding, not a config patch: a failure here means the from-git rebuild is
missing something the running server had out-of-band (a secret, a manual step, drift). Capture the
failing stage + logs in `JOURNAL-1c.md`, raise it as a blocker, and fix it in the **git source**
(base or `cc-ci-secrets`) so the next rebuild includes it; do **not** hand-fix the live VM. Re-run
this test after the fix.

## 5. Bound

One recipe, one green run. This is a functional smoke test of the rebuilt VM, not a full recipe-test
campaign (that's Phase 2). Don't expand scope here.