cc-ci-orchestrator/cc-ci-plan/test-e2e-testme-acceptance.md
autonomic-bot 36a6c9872a orchestrator: reboot-resilience + session auto-resume + full session plan/tooling
Reboot survival for the Pi orchestrator host:
- systemd unit cc-ci-plan/systemd/cc-ci-loops.service (installed + enabled): on boot
  records the reboot, starts loops+watchdog (RESUME_PHASE=1), and resumes the
  orchestrator session.
- reboot-log.sh: boot_id-gated reboot record -> REBOOTS.md (manual restarts don't count).
- launch-orchestrator.sh: injects an AGENTS.md startup nudge so an auto-resumed
  orchestrator announces itself (PushNotification) + reports reboots.
- AGENTS.md: on-startup notify routine documented.

Plans/tooling accumulated this session:
- plan-phase1d (generic suite), 1e (harness corrections), phase4 (final review),
  sso-dep-testing, orchestrator-migration (parked), test-e2e-testme-acceptance.
- launch.sh: 1d/1e/2/2b/3/4 phase sequence, machine-docs-aware state resolution,
  limit-stall re-nudge, INBOX side-channel detection.
- plan.md §6.1/§7: artifact-layer isolation, INBOX, 5-min long-run polling, DEFERRED.
- prompts: isolation discipline + INBOX + pacing.
- .gitignore: harden (.sops/, cc-ci-secrets/, .claude/, *.tmp.*).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-28 20:28:10 +01:00


# Acceptance test — real end-to-end `!testme` on the clean-room-rebuilt VM
**Owner:** the Builder + Adversary loops (they execute *and* independently verify this).
**When:** after **C4/C5 PASS** (genuine throwaway-VM clean-room rebuild verified). The Builder then
performs the tailnet swap (§1) and runs the e2e; the Adversary independently verifies. It is the
**functional acceptance** of D8/clean-room: proof that the rebuilt-from-git VM doesn't just match
byte-for-byte, but actually *serves a real CI run end-to-end through the public domain*.
**This file:** `/srv/cc-ci/cc-ci-plan/test-e2e-testme-acceptance.md`
---
## 0. Why
The reproducibility gates (C1–C5) prove the rebuilt VM is structurally identical and boots clean.
This test proves it is **operationally** a working CI server: a maintainer comment triggers a build,
the app deploys and is reachable on its real public URL through the operator's gateway, the test
passes, and it tears down — the whole `!testme` pipeline, on the from-git VM, over the real domain.
---
## 1. Setup — the Builder performs the tailnet swap (then the e2e)
The rebuilt throwaway must become the live `cc-nix-test` so that the public gateway routes real
`ci.commoninternet.net` traffic to it (the gateway TLS-passthroughs via MagicDNS to
`cc-nix-test.taila4a0bf.ts.net` and re-resolves every ~10s, so it auto-follows the name). The swap is
**two reversible `tailscale set --hostname` commands** on VMs you already control — the Builder does
it. **Do this only after C4/C5 PASS** and after the rebuilt VM's full stack
(traefik + bridge + drone + dashboard) is up and serving locally.
**Order matters** (rename the original *aside first*, or the throwaway will get `cc-nix-test-1`):
1. **Rename the original prod VM aside** (it stays running — do NOT destroy it; needed for swap-back):
```
ssh cc-ci 'tailscale set --hostname=cc-nix-test-orig'
```
(`ssh cc-ci` is pinned to the original's IP `100.90.116.4`, so it keeps reaching the original
regardless of the name change.)
2. **Rename the rebuilt throwaway → `cc-nix-test`.** Re-derive its current tailscale IP (throwaways
get a fresh IP each rebuild): pick the ONLINE throwaway node from
`tailscale --socket=$HOME/.cc-ci-ts/tailscaled.sock status | grep -i throwaway`, then:
```
ssh -i /srv/incus-terraform-nix-vm-creator/terraform-secrets/vm_ssh_key \
    -o ProxyCommand='nc -X 5 -x 127.0.0.1:1055 %h %p' root@<throwaway-ip> \
    'tailscale set --hostname=cc-nix-test'
```
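Picking the ONLINE throwaway by eye is easy to get wrong when stale nodes from earlier rebuilds are still listed. A minimal sketch of a filter for that step follows. It assumes `tailscale status` prints one peer per line with the IP in the first column and the hostname in the second, and that unreachable peers carry the word `offline`. The function name `pick_online_throwaway` is hypothetical and not part of the existing tooling.

```sh
#!/bin/sh
# pick_online_throwaway: read `tailscale status` output on stdin and print the
# IP of the first throwaway peer that is not marked offline.
# Assumptions (not confirmed by the plan): column 1 is the IP, column 2 is the
# hostname, and offline peers contain the word "offline" somewhere on the line.
pick_online_throwaway() {
    awk 'tolower($2) ~ /throwaway/ && $0 !~ /offline/ { print $1; exit }'
}

# Intended use (left commented out so the sketch has no side effects):
#   tailscale --socket="$HOME/.cc-ci-ts/tailscaled.sock" status | pick_online_throwaway
```

If the function prints nothing, no online throwaway was found and the rename in step 2 should not be attempted.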
**Heads-up — tailnet-wide effect:** after the swap, `cc-nix-test.taila4a0bf.ts.net` resolves to the
rebuilt VM for *everyone* on the tailnet, so any of your own tooling that targets cc-nix-test **by
MagicDNS name** will now hit the rebuilt VM (tooling pinned to the raw IP `100.90.116.4` still hits
the original). Account for that when you point `!testme`/deploys.
**Verify the swap took (P1+P2) before starting the e2e** — must pass:
```
tailscale --socket=$HOME/.cc-ci-ts/tailscaled.sock status | grep cc-nix-test | grep -v cc-nix-test-orig # → the throwaway's IP
curl -sS -o /dev/null -w '%{http_code} ssl_verify=%{ssl_verify_result}\n' https://ci.commoninternet.net/
# expect: 200 ssl_verify=0 (real public path now served by the rebuilt VM, valid cert)
```
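The `curl` check above prints a single line, so the pass condition can be made mechanical. The sketch below parses that line and fails loudly on either a non-200 status or a non-zero certificate verification result. The function name `check_gateway_line` is hypothetical, and the input is assumed to be exactly the output of the `-w` format string used above.

```sh
#!/bin/sh
# check_gateway_line LINE: LINE is the output of
#   curl -w '%{http_code} ssl_verify=%{ssl_verify_result}\n'
# Prints PASS and returns 0 only for HTTP 200 with a verified certificate.
check_gateway_line() {
    line=$1
    code=${line%% *}              # text before the first space, e.g. "200"
    verify=${line##*ssl_verify=}  # text after "ssl_verify=", e.g. "0"
    if [ "$code" != "200" ]; then
        echo "FAIL: HTTP $code"
        return 1
    fi
    if [ "$verify" != "0" ]; then
        echo "FAIL: cert verify result $verify"
        return 1
    fi
    echo "PASS"
}
```

Capturing the `curl` output into a variable and passing it to this function gives a single exit status that can gate the start of the e2e.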
**Swap-back when testing is done** (reversible): rename the throwaway back to its old name, then
`ssh cc-ci 'tailscale set --hostname=cc-nix-test'` to restore the original; the gateway re-follows.
---
## 2. Procedure
1. **Pick one fast, already-enrolled recipe.** Prefer the lightest enrolled app (e.g. `custom-html`)
so the run is quick and resource-cheap. Note the recipe + the repo/issue or PR where `!testme` is
recognised (the same place prior runs were triggered).
2. **Record the baseline.** Capture the recipe's *current* latest Drone run number and the dashboard
row (`https://ci.commoninternet.net/` and `https://drone.ci.commoninternet.net/...`) so you can
prove the run you trigger is **new**.
3. **Trigger via the real path.** Post `!testme` as the **bot** (the normal maintainer-comment
trigger) on that recipe — exactly as a real maintainer would. Do **not** invoke Drone directly or
shortcut the bridge; the comment→bridge→Drone path is part of what's under test.
4. **Confirm the bridge picked it up.** Within the bridge's poll interval, a **new** Drone build for
that recipe starts. Capture the new run number (must be > the baseline from step 2).
5. **Confirm the app deploys and is reachable on its PUBLIC URL.** While the build runs, the app is
deployed to its `*.ci.commoninternet.net` test domain. From **off the VM** (external — through the
gateway, not `localhost`/`127.0.0.1`), confirm a real request succeeds:
```
curl -sS -D- -o /dev/null https://<app-test-subdomain>.ci.commoninternet.net/
# expect: HTTP 200 (or the app's expected status), valid *.ci.commoninternet.net cert,
# served content from the deployed app — NOT a Traefik 404 / default-cert.
```
This is the crux: it proves routing public-DNS → gateway → MagicDNS → rebuilt VM → Traefik →
deployed app all works on the rebuilt server.
6. **Confirm the test logic passed.** The Drone build runs the recipe's real test assertions (app
state, not health-only) and finishes **success**.
7. **Confirm teardown.** After the run, the app is **undeployed** (no leftover stack/containers), per
the standard post-run cleanup — verify it's gone.
8. **Confirm the result was reported.** The outcome posts back to the trigger location and the
dashboard row updates to the new run with `success`.
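Two of the steps above reduce to simple comparisons that are worth scripting so the evidence is unambiguous: step 4 (the new run number must exceed the baseline) and step 5 (the external request must return the expected status). A minimal sketch follows. Both function names are hypothetical, and how the run numbers are obtained from Drone is deliberately left out because the plan does not specify it.

```sh
#!/bin/sh
# is_new_run BASELINE NEW: succeed only if both are non-negative integers and
# NEW is strictly greater than BASELINE. Returns 2 on malformed input.
is_new_run() {
    for n in "$1" "$2"; do
        case "$n" in
            ''|*[!0-9]*) return 2 ;;
        esac
    done
    [ "$2" -gt "$1" ]
}

# http_status: read response headers (as printed by `curl -D-`) on stdin and
# print the status code of the last response seen, so redirects are followed
# through to the final status line.
http_status() {
    awk '/^HTTP\// { sub(/\r$/, ""); code = $2 } END { print code }'
}
```

For example, piping the headers from the step 5 `curl` command into `http_status` yields the code to record as evidence for E3, and `is_new_run "$baseline" "$new"` gives the check behind E2.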
---
## 3. Pass criteria (all must hold; Adversary verifies independently)
- [ ] **E1.** Self-check §1 passed (`ci.commoninternet.net` = 200, valid cert, on the rebuilt VM).
- [ ] **E2.** Posting `!testme` produced a **new** Drone build (run # > baseline) via the bridge —
not a manual Drone trigger.
- [ ] **E3.** The deployed app answered an **external** request on its real
`<app>.ci.commoninternet.net` URL (through the gateway) with the expected response + valid cert
— captured with headers/body evidence.
- [ ] **E4.** The Drone build's **real test assertions** ran and the build finished **success**
(no skipped/softened tests).
- [ ] **E5.** The app **undeployed** cleanly afterward (no residual stack).
- [ ] **E6.** Result reported back + dashboard updated to the new successful run.
Evidence (run #, the external `curl` headers/body, dashboard before/after, undeploy proof) is logged
in `JOURNAL-1c.md`, and the verdict in `REVIEW-1c.md` / `STATUS-1c.md` as **E2E-TESTME — PASS**.
## 4. If it fails
Treat as a clean-room finding, not a config patch: a failure here means the from-git rebuild is
missing something the running server had out-of-band (a secret, a manual step, drift). Capture the
failing stage + logs in `JOURNAL-1c.md`, raise it as a blocker, and fix it in the **git source**
(base or `cc-ci-secrets`) so the next rebuild includes it — do **not** hand-fix the live VM. Re-run
this test after the fix.
## 5. Bound
One recipe, one green run. This is a functional smoke test of the rebuilt VM, not a full recipe-test
campaign (that's Phase 2). Don't expand scope here.