# Acceptance test: real end-to-end `!testme` on the clean-room-rebuilt VM

**Owner:** the Builder + Adversary loops (they execute *and* independently verify this).

**When:** after **C4/C5 PASS** (genuine throwaway-VM clean-room rebuild verified). The Builder then
performs the tailnet swap (§1) and runs the e2e; the Adversary independently verifies. It is the
**functional acceptance** of D8/clean-room: proof that the rebuilt-from-git VM doesn't just match
byte-for-byte, but actually *serves a real CI run end-to-end through the public domain*.

**This file:** `/srv/cc-ci/cc-ci-plan/test-e2e-testme-acceptance.md`

---

## 0. Why

The reproducibility gates (C1–C5) prove the rebuilt VM is structurally identical and boots clean.
This test proves it is **operationally** a working CI server: a maintainer comment triggers a build,
the app deploys and is reachable on its real public URL through the operator's gateway, the test
passes, and it tears down. That is the whole `!testme` pipeline, on the from-git VM, over the real
domain.

---

## 1. Setup: the Builder performs the tailnet swap (then the e2e)

The rebuilt throwaway must become the live `cc-nix-test` so that the public gateway routes real
`ci.commoninternet.net` traffic to it (the gateway TLS-passthroughs via MagicDNS to
`cc-nix-test.taila4a0bf.ts.net` and re-resolves every ~10s, so it auto-follows the name). The swap is
**two reversible `tailscale set --hostname` commands** on VMs you already control, and the Builder
does it. **Do this only after C4/C5 PASS** and after the rebuilt VM's full stack
(traefik + bridge + drone + dashboard) is up and serving locally.

**Order matters** (rename the original *aside first*, or the throwaway will get `cc-nix-test-1`):

1. **Rename the original prod VM aside** (it stays running; do NOT destroy it, it is needed for swap-back):
   ```
   ssh cc-ci 'tailscale set --hostname=cc-nix-test-orig'
   ```
   (`ssh cc-ci` is pinned to the original's IP `100.90.116.4`, so it keeps reaching the original
   regardless of the name change.)
2. **Rename the rebuilt throwaway → `cc-nix-test`.** Re-derive its current tailscale IP (throwaways
   get a fresh IP each rebuild): pick the ONLINE throwaway node from
   `tailscale --socket=$HOME/.cc-ci-ts/tailscaled.sock status | grep -i throwaway`, then:
   ```
   ssh -i /srv/incus-terraform-nix-vm-creator/terraform-secrets/vm_ssh_key \
     -o ProxyCommand='nc -X 5 -x 127.0.0.1:1055 %h %p' root@<throwaway-ip> \
     'tailscale set --hostname=cc-nix-test'
   ```
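
Picking the ONLINE throwaway by eye from the `grep` output in step 2 is easy to get wrong when stale nodes linger. The following is a minimal sketch of that selection, not part of the plan's tooling. It assumes the usual `tailscale status` text layout (IP in column 1, hostname in column 2, the word `offline` on disconnected peers), and the function name `pick_online_throwaway_ip` is hypothetical. Pipe the `tailscale --socket=$HOME/.cc-ci-ts/tailscaled.sock status` output into it and use the printed IP as `<throwaway-ip>`.

```shell
# Print the tailnet IP of the first throwaway node that is not marked offline.
# Reads `tailscale status` text on stdin.
# ASSUMPTION (not stated in this plan): column 1 is the IP, column 2 is the
# hostname, and disconnected peers carry the word "offline" on their row.
pick_online_throwaway_ip() {
  awk 'tolower($2) ~ /throwaway/ && !/offline/ { print $1; exit }'
}
```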

**Heads-up: tailnet-wide effect.** After the swap, `cc-nix-test.taila4a0bf.ts.net` resolves to the
rebuilt VM for *everyone* on the tailnet, so any of your own tooling that targets cc-nix-test **by
MagicDNS name** will now hit the rebuilt VM (tooling pinned to the raw IP `100.90.116.4` still hits
the original). Account for that when you point `!testme`/deploys.

**Verify the swap took (P1+P2) before starting the e2e.** These checks must pass:

```
tailscale --socket=$HOME/.cc-ci-ts/tailscaled.sock status | grep cc-nix-test
# → the row named exactly cc-nix-test shows the throwaway's IP (the cc-nix-test-orig row also matches the grep)
curl -sS -o /dev/null -w '%{http_code} ssl_verify=%{ssl_verify_result}\n' https://ci.commoninternet.net/
# expect: 200 ssl_verify=0 (real public path now served by the rebuilt VM, valid cert)
```
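
Because the e2e must not start unless this check passes, it can help to turn the expected `200 ssl_verify=0` line into a hard gate. This is a sketch under assumptions: the function names `public_path_ok` and `gate_on_public_path` are hypothetical, and the gate simply reuses the `curl` invocation shown above. Calling `gate_on_public_path && echo "swap verified"` from the Builder's shell would be one way to use it.

```shell
# Succeed only when the summary line from the curl check is exactly
# "200 ssl_verify=0" (HTTP 200 and a certificate that verified).
public_path_ok() {
  [ "$1" = "200 ssl_verify=0" ]
}

# Hypothetical gate: run the same curl as the check above and fail on any mismatch.
# Defined only; nothing contacts the network until you call it.
gate_on_public_path() {
  result=$(curl -sS -o /dev/null \
    -w '%{http_code} ssl_verify=%{ssl_verify_result}' \
    https://ci.commoninternet.net/) || return 1
  public_path_ok "$result"
}
```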

**Swap-back when testing is done** (reversible): rename the throwaway back to its old name, then run
`ssh cc-ci 'tailscale set --hostname=cc-nix-test'` to restore the original; the gateway re-follows.

---

## 2. Procedure

1. **Pick one fast, already-enrolled recipe.** Prefer the lightest enrolled app (e.g. `custom-html`)
   so the run is quick and resource-cheap. Note the recipe + the repo/issue or PR where `!testme` is
   recognised (the same place prior runs were triggered).
2. **Record the baseline.** Capture the recipe's *current* latest Drone run number and the dashboard
   row (`https://ci.commoninternet.net/` and `https://drone.ci.commoninternet.net/...`) so you can
   prove the run you trigger is **new**.
3. **Trigger via the real path.** Post `!testme` as the **bot** (the normal maintainer-comment
   trigger) on that recipe, exactly as a real maintainer would. Do **not** invoke Drone directly or
   shortcut the bridge; the comment→bridge→Drone path is part of what's under test.
4. **Confirm the bridge picked it up.** Within the bridge's poll interval, a **new** Drone build for
   that recipe starts. Capture the new run number (must be > the baseline from step 2).
5. **Confirm the app deploys and is reachable on its PUBLIC URL.** While the build runs, the app is
   deployed to its `*.ci.commoninternet.net` test domain. From **off the VM** (external, through the
   gateway, not `localhost`/`127.0.0.1`), confirm a real request succeeds:
   ```
   curl -sS -D- -o /dev/null https://<app-test-subdomain>.ci.commoninternet.net/
   # expect: HTTP 200 (or the app's expected status), valid *.ci.commoninternet.net cert,
   # served content from the deployed app, NOT a Traefik 404 / default-cert.
   # (-o /dev/null discards the body; drop it when capturing the body evidence for E3.)
   ```
   This is the crux: it proves routing public-DNS → gateway → MagicDNS → rebuilt VM → Traefik →
   deployed app all works on the rebuilt server.
6. **Confirm the test logic passed.** The Drone build runs the recipe's real test assertions (app
   state, not health-only) and finishes **success**.
7. **Confirm teardown.** After the run, the app is **undeployed** (no leftover stack/containers), per
   the standard post-run cleanup. Verify it's gone.
8. **Confirm the result was reported.** The outcome posts back to the trigger location and the
   dashboard row updates to the new run with `success`.
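
Steps 2, 4 and 5 each reduce to a small mechanical comparison, sketched here so the Builder and the Adversary apply the same rule. Everything in this block is illustrative: the function names are hypothetical, and how the run numbers and the HTTP status are obtained is left to the operator. The status and `ssl_verify_result` values are the ones `curl -w '%{http_code} %{ssl_verify_result}'` prints. A passing `external_check_ok` does not replace inspecting the body, since it cannot tell the app's content from another 200 response.

```shell
# Steps 2 and 4: the triggered run counts as new only if its number is
# strictly greater than the recorded baseline.
# $1 = baseline run number, $2 = run number observed after posting !testme.
is_new_run() {
  [ "$2" -gt "$1" ]
}

# Step 5: the external request passes only with the expected status AND a
# certificate that verified (curl's ssl_verify_result of 0).
# $1 = HTTP status, $2 = ssl_verify_result, $3 = expected status (default 200).
external_check_ok() {
  [ "$1" = "${3:-200}" ] && [ "$2" = "0" ]
}
```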

---

## 3. Pass criteria (all must hold; Adversary verifies independently)

- [ ] **E1.** Self-check §1 passed (`ci.commoninternet.net` = 200, valid cert, on the rebuilt VM).
- [ ] **E2.** Posting `!testme` produced a **new** Drone build (run # > baseline) via the bridge,
  not a manual Drone trigger.
- [ ] **E3.** The deployed app answered an **external** request on its real
  `<app>.ci.commoninternet.net` URL (through the gateway) with the expected response + valid cert,
  captured with headers/body evidence.
- [ ] **E4.** The Drone build's **real test assertions** ran and the build finished **success**
  (no skipped/softened tests).
- [ ] **E5.** The app **undeployed** cleanly afterward (no residual stack).
- [ ] **E6.** Result reported back + dashboard updated to the new successful run.
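
For E5, "no residual stack" can be checked mechanically rather than by eye. The sketch below assumes the app is deployed as a Docker Swarm stack, which this plan does not state, so treat both that and the function name `no_residual_stack` as assumptions. It reads one stack name per line on stdin, for example from `docker stack ls --format '{{.Name}}'` run on the rebuilt VM, and succeeds only if none of them contains the app name you pass.

```shell
# Succeed only if no stack name on stdin contains the given app name.
# $1 = app name fragment to look for (e.g. the recipe name).
# ASSUMPTION: deployments appear as Docker Swarm stacks; adapt the input
# source if the rebuilt VM tracks deployments differently.
no_residual_stack() {
  ! grep -qF -- "$1"
}
```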

Evidence (run #, the external `curl` headers/body, dashboard before/after, undeploy proof) is logged
in `JOURNAL-1c.md`, and the verdict in `REVIEW-1c.md` / `STATUS-1c.md` as **E2E-TESTME — PASS**.

## 4. If it fails

Treat as a clean-room finding, not a config patch: a failure here means the from-git rebuild is
missing something the running server had out-of-band (a secret, a manual step, drift). Capture the
failing stage + logs in `JOURNAL-1c.md`, raise it as a blocker, and fix it in the **git source**
(base or `cc-ci-secrets`) so the next rebuild includes it. Do **not** hand-fix the live VM. Re-run
this test after the fix.

## 5. Bound

One recipe, one green run. This is a functional smoke test of the rebuilt VM, not a full recipe-test
campaign (that's Phase 2). Don't expand scope here.