cc-ci-orchestrator/cc-ci-plan/test-e2e-testme-acceptance.md
autonomic-bot 01874821f2 decommission Pi: update all docs for VM-only setup
The orchestrator Pi is retired (2026-05-31). All agents now run on the
cc-ci-orchestrator VM (NixOS, loops user, /srv/cc-ci). The VM is a
direct tailnet peer to cc-ci — no SOCKS proxy, no userspace tailscaled,
no ProxyCommand. Updated across all affected files:

AGENTS.md
  - Remove Pi from reboot description; migration complete (not "parked")
  - cc-ci access: direct ssh, not via proxy

kickoff.md
  - Prerequisites: direct tailnet peer, not proxy
  - Host deps: NixOS (not apt)
  - Fallback/Incus: b1 reachable directly, no --proxy curl flag

plan.md §1 + §1.5
  - §1 bootstrap: direct SSH, check tailscale status (not restart proxy)
  - §1.5 intro: "VM" not "sandbox host"; no proxy
  - Credentials table: remove TS_AUTH_KEY row; update cc-ci SSH row
  - Replace "Tailscale connection (proxy)" subsection with direct-peer description

plan-orchestrator-migration.md
  - Mark COMPLETE (2026-05-31); historical record only

plan-phase1c-full-reproducibility.md
  - Incus access: direct, not via SOCKS proxy

prompts/builder.md + prompts/adversary.md
  - cc-ci access language only: direct ssh, no proxy restart instructions
  - adversary: *.ci.commoninternet.net via plain curl, no proxy flag

REBOOTS.md
  - Retitle for VM; note Pi retired; Pi entries marked historical

systemd/cc-ci-loops.service
  - User/Group/HOME/PATH: notplants → loops
  - Remove cc-ci-tailscaled.service dependency (no proxy on VM)
  - Add note about nix/configuration.nix as the authoritative VM declaration

test-e2e-testme-acceptance.md
  - tailscale status: no --socket flag
  - ssh to throwaway: no ProxyCommand

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-31 00:16:37 +00:00


# Acceptance test — real end-to-end `!testme` on the clean-room-rebuilt VM
**Owner:** the Builder + Adversary loops (they execute *and* independently verify this).
**When:** after **C4/C5 PASS** (genuine throwaway-VM clean-room rebuild verified). The Builder then
performs the tailnet swap (§1) and runs the e2e; the Adversary independently verifies. It is the
**functional acceptance** of D8/clean-room: proof that the rebuilt-from-git VM doesn't just match
byte-for-byte, but actually *serves a real CI run end-to-end through the public domain*.
**This file:** `/srv/cc-ci/cc-ci-plan/test-e2e-testme-acceptance.md`
---
## 0. Why
The reproducibility gates (C1–C5) prove the rebuilt VM is structurally identical and boots clean.
This test proves it is **operationally** a working CI server: a maintainer comment triggers a build,
the app deploys and is reachable on its real public URL through the operator's gateway, the test
passes, and it tears down — the whole `!testme` pipeline, on the from-git VM, over the real domain.
---
## 1. Setup — the Builder performs the tailnet swap (then the e2e)
The rebuilt throwaway must become the live `cc-nix-test` so that the public gateway routes real
`ci.commoninternet.net` traffic to it (the gateway TLS-passthroughs via MagicDNS to
`cc-nix-test.taila4a0bf.ts.net` and re-resolves every ~10s, so it auto-follows the name). The swap is
**two reversible `tailscale set --hostname` commands** on VMs you already control — the Builder does
it. **Do this only after C4/C5 PASS** and after the rebuilt VM's full stack
(traefik + bridge + drone + dashboard) is up and serving locally.
**Order matters** (rename the original *aside first*, or the throwaway will get `cc-nix-test-1`):
1. **Rename the original prod VM aside** (it stays running — do NOT destroy it; needed for swap-back):
```
ssh cc-ci 'tailscale set --hostname=cc-nix-test-orig'
```
(`ssh cc-ci` is pinned to the original's IP `100.90.116.4`, so it keeps reaching the original
regardless of the name change.)
2. **Rename the rebuilt throwaway → `cc-nix-test`.** Re-derive its current tailscale IP (throwaways
get a fresh IP each rebuild): pick the ONLINE throwaway node from
`tailscale status | grep -i throwaway`, then:
```
ssh -i /srv/incus-terraform-nix-vm-creator/terraform-secrets/vm_ssh_key \
root@<throwaway-ip> \
'tailscale set --hostname=cc-nix-test'
```
(The orchestrator VM is a direct tailnet peer — no ProxyCommand needed.)
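
Picking the ONLINE throwaway in step 2 can be scripted. The following is a minimal sketch, assuming `tailscale status` prints one peer per line with the IP in the first column and the hostname in the second, and that the word `offline` appears on the line of any peer that is down. The function name `pick_online_throwaway` is hypothetical and not part of this repo:

```shell
# Hypothetical helper: read `tailscale status` output on stdin and print the
# IP of every peer whose hostname contains "throwaway" and is not offline.
# Assumes column 1 = IP, column 2 = hostname, "offline" marks peers that are down.
pick_online_throwaway() {
  awk 'tolower($2) ~ /throwaway/ && $0 !~ /offline/ { print $1 }'
}

# Usage (on the orchestrator VM):
#   tailscale status | pick_online_throwaway
```

If it prints more than one IP, several throwaways are online and the right one still has to be chosen by hand.
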
**Heads-up — tailnet-wide effect:** after the swap, `cc-nix-test.taila4a0bf.ts.net` resolves to the
rebuilt VM for *everyone* on the tailnet, so any of your own tooling that targets cc-nix-test **by
MagicDNS name** will now hit the rebuilt VM (tooling pinned to the raw IP `100.90.116.4` still hits
the original). Account for that when you point `!testme`/deploys.
**Verify the swap took (P1+P2) before starting the e2e** — must pass:
```
tailscale status | grep cc-nix-test # → the throwaway's IP (the cc-nix-test-orig line also matches; ignore it)
curl -sS -o /dev/null -w '%{http_code} ssl_verify=%{ssl_verify_result}\n' https://ci.commoninternet.net/
# expect: 200 ssl_verify=0 (real public path now served by the rebuilt VM, valid cert)
```
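
The two checks above can be turned into pass/fail helpers. This is a sketch under the same assumption about the `tailscale status` column layout (IP first, hostname second). The names `name_points_to` and `gateway_ok` are hypothetical. Matching the hostname exactly keeps the renamed `cc-nix-test-orig` from being mistaken for the live node:

```shell
# Hypothetical helper: succeed only if the peer named exactly "cc-nix-test"
# in `tailscale status` output (stdin) has the IP given as $1.
name_points_to() {
  awk -v ip="$1" '$2 == "cc-nix-test" { found = ($1 == ip) } END { exit found ? 0 : 1 }'
}

# Hypothetical helper: succeed only if curl's -w output is the expected line.
gateway_ok() {
  [ "$1" = "200 ssl_verify=0" ]
}

# Usage:
#   tailscale status | name_points_to <throwaway-ip> || echo "name not swapped"
#   out=$(curl -sS -o /dev/null -w '%{http_code} ssl_verify=%{ssl_verify_result}' https://ci.commoninternet.net/)
#   gateway_ok "$out" || echo "gateway check failed: $out"
```
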
**Swap-back when testing is done** (reversible): rename the throwaway back to its old name, then
`ssh cc-ci 'tailscale set --hostname=cc-nix-test'` to restore the original; the gateway re-follows.
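
The swap-back order can be written out as a dry run. This sketch only prints the two commands and executes nothing. The function name `swap_back_cmds` is hypothetical, and both arguments (the throwaway's current IP and the name it had before the swap) are placeholders to fill in:

```shell
# Hypothetical helper: print the swap-back commands in the required order.
# $1 = throwaway's current tailscale IP, $2 = the name it had before the swap.
# The throwaway is renamed first so that "cc-nix-test" is free for the original.
swap_back_cmds() {
  key=/srv/incus-terraform-nix-vm-creator/terraform-secrets/vm_ssh_key
  printf "ssh -i %s root@%s 'tailscale set --hostname=%s'\n" "$key" "$1" "$2"
  printf "ssh cc-ci 'tailscale set --hostname=cc-nix-test'\n"
}

# Usage: review the output, then run the two lines by hand.
#   swap_back_cmds <throwaway-ip> <throwaway-old-name>
```
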
---
## 2. Procedure
1. **Pick one fast, already-enrolled recipe.** Prefer the lightest enrolled app (e.g. `custom-html`)
so the run is quick and resource-cheap. Note the recipe + the repo/issue or PR where `!testme` is
recognised (the same place prior runs were triggered).
2. **Record the baseline.** Capture the recipe's *current* latest Drone run number and the dashboard
row (`https://ci.commoninternet.net/` and `https://drone.ci.commoninternet.net/...`) so you can
prove the run you trigger is **new**.
3. **Trigger via the real path.** Post `!testme` as the **bot** (the normal maintainer-comment
trigger) on that recipe — exactly as a real maintainer would. Do **not** invoke Drone directly or
shortcut the bridge; the comment→bridge→Drone path is part of what's under test.
4. **Confirm the bridge picked it up.** Within the bridge's poll interval, a **new** Drone build for
that recipe starts. Capture the new run number (must be > the baseline from step 2).
5. **Confirm the app deploys and is reachable on its PUBLIC URL.** While the build runs, the app is
deployed to its `*.ci.commoninternet.net` test domain. From **off the VM** (external — through the
gateway, not `localhost`/`127.0.0.1`), confirm a real request succeeds:
```
curl -sS -D- -o /dev/null https://<app-test-subdomain>.ci.commoninternet.net/
# expect: HTTP 200 (or the app's expected status), valid *.ci.commoninternet.net cert,
# served content from the deployed app — NOT a Traefik 404 / default-cert.
```
This is the crux: it proves routing public-DNS → gateway → MagicDNS → rebuilt VM → Traefik →
deployed app all works on the rebuilt server.
6. **Confirm the test logic passed.** The Drone build runs the recipe's real test assertions (app
state, not health-only) and finishes **success**.
7. **Confirm teardown.** After the run, the app is **undeployed** (no leftover stack/containers), per
the standard post-run cleanup — verify it's gone.
8. **Confirm the result was reported.** The outcome posts back to the trigger location and the
dashboard row updates to the new run with `success`.
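
Two of the checks in this procedure are mechanical: the run-number comparison in step 4 and reading the status code in step 5. A minimal sketch follows. The helper names `is_new_run` and `http_status` are hypothetical, and how the run numbers are obtained from Drone is left open:

```shell
# Step 4: the triggered run number ($2) must be strictly greater than the
# baseline recorded in step 2 ($1).
is_new_run() {
  [ "$2" -gt "$1" ]
}

# Step 5: print the status code from the first line of `curl -D-` headers
# read on stdin (works for "HTTP/1.1 200 OK" and "HTTP/2 200").
http_status() {
  tr -d '\r' | awk 'NR == 1 { print $2 }'
}

# Usage:
#   is_new_run "$baseline" "$new_run" || echo "not a new run"
#   curl -sS -D- -o /dev/null https://<app-test-subdomain>.ci.commoninternet.net/ | http_status
```
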
---
## 3. Pass criteria (all must hold; Adversary verifies independently)
- [ ] **E1.** Self-check §1 passed (`ci.commoninternet.net` = 200, valid cert, on the rebuilt VM).
- [ ] **E2.** Posting `!testme` produced a **new** Drone build (run # > baseline) via the bridge —
not a manual Drone trigger.
- [ ] **E3.** The deployed app answered an **external** request on its real
`<app>.ci.commoninternet.net` URL (through the gateway) with the expected response + valid cert
— captured with headers/body evidence.
- [ ] **E4.** The Drone build's **real test assertions** ran and the build finished **success**
(no skipped/softened tests).
- [ ] **E5.** The app **undeployed** cleanly afterward (no residual stack).
- [ ] **E6.** Result reported back + dashboard updated to the new successful run.
Evidence (run #, the external `curl` headers/body, dashboard before/after, undeploy proof) is logged
in `JOURNAL-1c.md`, and the verdict in `REVIEW-1c.md` / `STATUS-1c.md` as **E2E-TESTME — PASS**.
## 4. If it fails
Treat as a clean-room finding, not a config patch: a failure here means the from-git rebuild is
missing something the running server had out-of-band (a secret, a manual step, drift). Capture the
failing stage + logs in `JOURNAL-1c.md`, raise it as a blocker, and fix it in the **git source**
(base or `cc-ci-secrets`) so the next rebuild includes it — do **not** hand-fix the live VM. Re-run
this test after the fix.
## 5. Bound
One recipe, one green run. This is a functional smoke test of the rebuilt VM, not a full recipe-test
campaign (that's Phase 2). Don't expand scope here.