decommission Pi: update all docs for VM-only setup
The orchestrator Pi is retired (2026-05-31). All agents now run on the
cc-ci-orchestrator VM (NixOS, loops user, /srv/cc-ci). The VM is a direct
tailnet peer to cc-ci: no SOCKS proxy, no userspace tailscaled, no
ProxyCommand.

Updated across all affected files:

AGENTS.md
- Remove Pi from reboot description; migration complete (not "parked")
- cc-ci access: direct ssh, not via proxy

kickoff.md
- Prerequisites: direct tailnet peer, not proxy
- Host deps: NixOS (not apt)
- Fallback/Incus: b1 reachable directly, no --proxy curl flag

plan.md §1 + §1.5
- §1 bootstrap: direct SSH, check tailscale status (not restart proxy)
- §1.5 intro: "VM" not "sandbox host"; no proxy
- Credentials table: remove TS_AUTH_KEY row; update cc-ci SSH row
- Replace "Tailscale connection (proxy)" subsection with direct-peer description

plan-orchestrator-migration.md
- Mark COMPLETE (2026-05-31); historical record only

plan-phase1c-full-reproducibility.md
- Incus access: direct, not via SOCKS proxy

prompts/builder.md + prompts/adversary.md
- cc-ci access language only: direct ssh, no proxy restart instructions
- adversary: *.ci.commoninternet.net via plain curl, no proxy flag

REBOOTS.md
- Retitle for VM; note Pi retired; Pi entries marked historical

systemd/cc-ci-loops.service
- User/Group/HOME/PATH: notplants → loops
- Remove cc-ci-tailscaled.service dependency (no proxy on VM)
- Add note about nix/configuration.nix as the authoritative VM declaration

test-e2e-testme-acceptance.md
- tailscale status: no --socket flag
- ssh to throwaway: no ProxyCommand

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
10
AGENTS.md
@@ -20,7 +20,7 @@ watches from outside.

 **Every time you (the orchestrator) start or resume, send a `PushNotification`** that you are online —
 the operator wants to know the supervising session is back (especially after a reboot, which kills
-this session along with the Pi). Include the current phase and the reboot count. Steps on startup:
+this session). Include the current phase and the reboot count. Steps on startup:
 1. Read `cc-ci-plan/REBOOTS.md` (count the `## Reboots` entries) and `cc-ci-plan/launch.sh status`
 (current phase + whether the loops/watchdog are running).
 2. `PushNotification` (proactive), e.g.: *"cc-ci orchestrator online — phase 2, loops+watchdog
@@ -32,8 +32,8 @@ this session along with the Pi). Include the current phase and the reboot count.
 Reboot resilience is handled by **`cc-ci-loops.service`** (system unit): on boot it logs the reboot
 to `REBOOTS.md` (boot_id-gated) and runs `launch.sh start` with `RESUME_PHASE=1`, so the loops +
 watchdog auto-resume the saved phase. The orchestrator session itself is NOT auto-started — the
-operator reconnects to it (that's why the startup notification matters). The fuller "move the
-orchestrator onto its own VM" plan is parked at `cc-ci-plan/plan-orchestrator-migration.md`.
+operator reconnects to it (that's why the startup notification matters). The VM migration is
+complete; see `cc-ci-plan/plan-orchestrator-migration.md` (historical record).

 ## Keep the orchestrator open, under remote-control

@@ -71,8 +71,8 @@ cc-ci VM"). The orchestrator is the human's steering wheel; the loops are the en

 - `.testenv` (**NOT committed**): Tailscale auth key + Gitea bot creds. Load with
 `set -a; . .testenv; set +a` (never echo the values).
-- **cc-ci:** `ssh cc-ci` (root) tunnels through the persistent userspace-tailscaled SOCKS proxy on
-`127.0.0.1:1055` (`cc-ci-tailscaled.service`). If down: `sudo systemctl restart cc-ci-tailscaled`.
+- **cc-ci:** `ssh cc-ci` (root) directly — the orchestrator VM is a direct tailnet peer (`100.90.116.4`).
+No proxy. Key: `~/.ssh/cc-ci-root-ed25519`. If unreachable, check `tailscale status`.
 - **Incus/VM fallback:** mTLS certs at `/srv/incus-terraform-nix-vm-creator/terraform-secrets/`;
 b1 is on the same tailnet (reach via the same proxy). See kickoff "Fallback".
 - **Full credential map + how to use each:** `plan.md` §1.5.
@@ -1,10 +1,13 @@
-# Reboot log — cc-ci orchestrator Pi
+# Reboot log — cc-ci orchestrator VM

-One line per genuine reboot of the orchestrator Pi (`raspberrypi`), appended automatically by
+**Note:** the orchestrator Pi (`raspberrypi`) was decommissioned 2026-05-31. All agents now run on
+the `cc-ci-orchestrator` NixOS VM (tailnet `100.116.55.106`). The three Pi reboot entries below are
+historical. Entries from 2026-05-31 onward are VM reboots.
+
+One line per genuine reboot of the orchestrator host, appended automatically by
 `reboot-log.sh` (ExecStartPre of `cc-ci-loops.service`, boot_id-gated so manual service restarts are
-NOT counted). The Pi hosts the Builder + Adversary loops + watchdog; a reboot drops the tmux sessions
-(and this orchestrator session), and `cc-ci-loops.service` restarts the loops on boot. Count the
-lines below to see how often it's happening.
+NOT counted). A reboot drops the tmux sessions (and the orchestrator session); `cc-ci-loops.service`
+restarts the loops on boot. Count the lines below to see how often it's happening.

 ## Reboots

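The REBOOTS.md header asks the reader to count the lines under `## Reboots`. A minimal shell sketch of that count follows. It assumes each reboot is recorded as one non-blank line after the heading; the helper name `count_reboots` is hypothetical and not part of the repo.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count reboot entries in a REBOOTS.md-style file.
# Assumption: every non-blank line after the "## Reboots" heading is one entry.
count_reboots() {
  local file="$1"
  awk '
    /^## Reboots *$/     { in_section = 1; next }  # start counting after the heading
    in_section && /^## / { in_section = 0 }        # stop at any later level-2 heading
    in_section && NF > 0 { n++ }                   # count non-blank lines only
    END                  { print n + 0 }           # print 0 when nothing matched
  ' "$file"
}
```

Usage, from the directory that holds `cc-ci-plan/`: `count_reboots cc-ci-plan/REBOOTS.md`.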
@@ -70,8 +70,8 @@ account (`claude auth status`); set `REMOTE_CONTROL=0` to skip the remote surfac
 `claude` process, after which the watchdog brings it back (a fresh remote-control session) — and
 when `STATUS.md` shows `## DONE`, it kills the loops and exits.

-Prerequisites the sessions inherit from your shell: SSH (root) to `cc-ci` via the Tailscale proxy
-(§1.5), Gitea bot creds, and `git.autonomic.zone` access. Plus **preconfigured** operator inputs the
+Prerequisites the sessions inherit from your shell: SSH (root) to `cc-ci` directly (the orchestrator
+VM is a direct tailnet peer — no proxy; §1.5), Gitea bot creds, and `git.autonomic.zone` access. Plus **preconfigured** operator inputs the
 loop depends on (plan §4.0/§4.4): the wildcard `*.ci.commoninternet.net` DNS record pointing at a
 gateway that TLS-passthroughs to cc-ci, and the **pre-issued wildcard cert** at
 `/var/lib/ci-certs/live/` on cc-ci. The operator owns the DNS record + gateway + cert
@@ -79,8 +79,7 @@ issuance/renewal; the agent builds Traefik (file provider → that cert) + routi
 **no ACME**. If any prerequisite is absent, the Builder parks at `STATUS.md ## Blocked` (plan §1/§9)
 rather than improvise.

-> Host deps: `launch.sh` needs **tmux** (and `claude`) — tmux is installed on this sandbox host
-> (3.5a). On a fresh host: `sudo apt-get install -y tmux`. The script's `*_DIR`
+> Host deps: `launch.sh` needs **tmux** (and `claude`) — both are installed on the VM (NixOS). The script's `*_DIR`
 > defaults now point at `/srv/cc-ci/...` (Builder clone `/srv/cc-ci/cc-ci`, Adversary
 > `/srv/cc-ci/cc-ci-adv`); override the `*_DIR` env vars only if your layout differs.

@@ -112,13 +111,12 @@ re-bootstrap.
 (`100.117.251.31:8443`, Incus project `terraform-ci`). Skill + Terraform live at
 `/srv/incus-terraform-nix-vm-creator/` (`skills/incus-terraform/SKILL.md`); read that for full usage.

-- **Access:** b1 is on the *same* cc-ci tailnet, so reach the Incus API through the existing
-`cc-ci-tailscaled` SOCKS proxy (`127.0.0.1:1055`) with the mTLS certs in that repo's
-`terraform-secrets/` — no second tailscaled needed. Quick check:
+- **Access:** b1 (`100.117.251.31`) is reachable directly from the orchestrator VM (same tailnet).
+Use the mTLS certs at `terraform-secrets/` — no proxy. Quick check:
 ```bash
 CRT=/srv/incus-terraform-nix-vm-creator/terraform-secrets/terraform.crt
 KEY=/srv/incus-terraform-nix-vm-creator/terraform-secrets/terraform.key
-curl --proxy socks5h://localhost:1055 --cert "$CRT" --key "$KEY" -k -s \
+curl --cert "$CRT" --key "$KEY" -k -s \
 https://100.117.251.31:8443/1.0/instances/cc-nix-test/state?project=terraform-ci
 ```
 - **Soft restart (keeps the disk — preferred):** `POST .../1.0/instances/cc-nix-test/state?project=terraform-ci`
@@ -10,7 +10,7 @@ relocating this orchestrator session there too.
 (claude CLI, proxy, loop supervisor) as systemd services that come back on boot — turning a reboot
 into a non-event. It also consolidates the orchestrator next to the infra it manages.

-**Status:** IN PROGRESS (operator go-ahead 2026-05-30 — the Pi is OOM-thrashing/slow).
+**Status:** COMPLETE (2026-05-31). All agents run on the VM; Pi fully decommissioned. Kept as a historical record.

 **Phase A ✅ COMPLETE (2026-05-30):** VM `cc-ci-orchestrator` (**2 GB / 2 vCPU / 30 GB**,
 `incus-base-vm`, NixOS 24.11) created via the Incus API + booted; **on the tailnet at
@@ -124,8 +124,8 @@ When C1–C7 hold and are Adversary-verified, write `## DONE` to Phase-1c `STATU

 The loops normally only `ssh cc-ci`. For 1c they MAY drive Incus on **b1** (resize `cc-nix-test`;
 create/destroy ONE throwaway VM in `terraform-ci`), using the mTLS certs at
-`/srv/incus-terraform-nix-vm-creator/terraform-secrets/` through the existing SOCKS proxy
-(`127.0.0.1:1055`) — see the incus skill (`/srv/incus-terraform-nix-vm-creator/skills/incus-terraform/SKILL.md`)
+`/srv/incus-terraform-nix-vm-creator/terraform-secrets/` (b1 is reachable directly from the VM —
+direct tailnet peer, no proxy) — see the incus skill (`/srv/incus-terraform-nix-vm-creator/skills/incus-terraform/SKILL.md`)
 and [[cc-ci-vm-incus]]. Guardrails: only `terraform-ci`; keep total running RAM within the **~12 GB
 guideline** (doc-only — terraform-ci has no enforced `limits.memory`; b1 is 16 GB physical) — hence
 `cc-nix-test`→4 GB + throwaway 4 GB + lichen-staging 4 GB = 12 GB; **destroy the throwaway VM when
@@ -33,9 +33,9 @@ Do these in order. Each step is idempotent; re-running is safe.

 1. **Verify access.** (Full credential map + how each is used is in **§1.5** — read it first.)
 - `ssh cc-ci 'hostname && whoami'` — you log in as **root** on cc-ci (NixOS), so there is no
-separate sudo step. `ssh cc-ci` is preconfigured to tunnel through the userspace-tailscaled
-SOCKS proxy (§1.5); if it fails, the proxy/daemon is probably down — restart it (§1.5) before
-declaring blocked.
+separate sudo step. `ssh cc-ci` reaches cc-ci directly (the orchestrator VM is a direct tailnet
+peer — no proxy; key `~/.ssh/cc-ci-root-ed25519`). If it fails, check `tailscale status`
+before declaring blocked.
 - `ssh cc-ci 'nixos-version'` — confirm NixOS.
 - Confirm you can reach the Gitea API with the bot creds from `.testenv` (§1.5):
 `curl -s https://$GITEA_URL/api/v1/version`. The bot authenticates with
@@ -72,45 +72,35 @@ Do these in order. Each step is idempotent; re-running is safe.

 ## 1.5 Credentials & access — where everything lives and how to use it

-The loops run **on the sandbox host** (not on cc-ci) and reach cc-ci over Tailscale. This section
-is the authoritative map of what credentials exist, where, and how to use them. **Never copy any
-secret value into the repo, a commit, a log, or the dashboard** (§9) — reference locations only.
+The loops run **on the cc-ci-orchestrator VM** (`100.116.55.106`, NixOS, `loops` user) and reach
+cc-ci directly over Tailscale (direct tailnet peer — no proxy). This section is the authoritative
+map of what credentials exist, where, and how to use them. **Never copy any secret value into the
+repo, a commit, a log, or the dashboard** (§9) — reference locations only.

 ### Provided credentials (already in place)

 | What | Where | How to use |
 |---|---|---|
-| **Tailscale auth key** (joins cc-ci's tailnet `taila4a0bf.ts.net`) | `/srv/cc-ci/.testenv` → `TS_AUTH_KEY` (Tailscale SaaS key, keyID ends `CNTRL`) | Used to bring up the userspace tailscaled (below). It's reusable; re-run `tailscale up` with it if the node drops. |
-| **cc-ci SSH (root)** | private key `~/.ssh/cc-ci-root-ed25519`; config `Host cc-ci` in `~/.ssh/config` | Just run `ssh cc-ci` (logs in as **root**). The pubkey is already in cc-ci's `/root/.ssh/authorized_keys`. |
+| **cc-ci SSH (root)** | private key `~/.ssh/cc-ci-root-ed25519`; `Host cc-ci` in `~/.ssh/config` (HostName `100.90.116.4`, no ProxyCommand) | Just run `ssh cc-ci` (logs in as **root**). The orchestrator VM is a direct tailnet peer — direct route, no proxy. Pubkey already in cc-ci's `/root/.ssh/authorized_keys`. |
 | **Gitea bot account** | `/srv/cc-ci/.testenv` → `GITEA_USERNAME` (`autonomic-bot`), `GITEA_PASSWORD`, `GITEA_URL` (`git.autonomic.zone`) | Basic-auth to the Gitea API, or mint a scoped token: `POST https://$GITEA_URL/api/v1/users/$GITEA_USERNAME/tokens`. Used to push the `cc-ci` project repo, read recipe repos, comment on PRs, and poll for `!testme` (read-level; the bot does not register webhooks). |

 Load them in a shell with: `set -a; . /srv/cc-ci/.testenv; set +a` (don't echo the values).

-### The Tailscale connection (how `ssh cc-ci` and the proxy work)
+### The Tailscale connection (how `ssh cc-ci` works)

-cc-ci (`cc-nix-test`, **100.90.116.4**) is on a *different* tailnet than the sandbox host's default
-one, so it is reached via a **second, userspace tailscaled** — this keeps the host's own tailnet
-untouched. State lives in `~/.cc-ci-ts/`; it exposes a **SOCKS5/HTTP proxy on `127.0.0.1:1055`**,
-which is the only route to that tailnet (userspace networking ⇒ the host OS can't route the tailnet
-IPs directly).
+cc-ci (`cc-nix-test`, **100.90.116.4**) is on the same tailnet as the orchestrator VM
+(`taila4a0bf.ts.net`), so it is reached **directly** — no SOCKS proxy, no userspace tailscaled.
+The VM's system tailscaled is on that tailnet; `ssh cc-ci` routes straight to cc-ci.

-It runs as a **persistent systemd service** (`cc-ci-tailscaled.service`, enabled, `Restart=always`,
-starts on boot; unit at `/etc/systemd/system/cc-ci-tailscaled.service`, runs as user `notplants`).
-It reuses the already-authenticated state in `~/.cc-ci-ts/`, so it reconnects across reboots/crashes
-without the auth key.
-
-- `ssh cc-ci` works out of the box (its `ProxyCommand` uses the proxy; logs in as root).
-- For HTTP(S) to cc-ci / `*.ci.commoninternet.net` from the sandbox, go through the proxy, e.g.
-`curl --proxy socks5h://localhost:1055 https://<app>.ci.commoninternet.net`.
-- **If connectivity is down:** `sudo systemctl restart cc-ci-tailscaled` (diagnose with
-`systemctl status cc-ci-tailscaled` / `journalctl -u cc-ci-tailscaled`). A dead proxy is an access
-failure to recover, not a `## Blocked`-and-stop condition — *unless* the auth key itself is
-rejected (then re-auth with `tailscale --socket=$HOME/.cc-ci-ts/tailscaled.sock up
---auth-key="$TS_AUTH_KEY" --hostname=cc-ci-claude-sandbox --accept-routes --accept-dns=false`, and
-if that fails the key is a class-A1 blocker).
-- **DNS gotcha:** this host's `/etc/resolv.conf` lists only Tailscale resolvers, so direct
-`dig @1.1.1.1 …` queries get no answer and look falsely empty. Use `getent hosts <name>` to
-resolve from the sandbox. `commoninternet.net` itself is a normal public zone hosted at **Gandi**.
+- `ssh cc-ci` works out of the box (`~/.ssh/config` has `Host cc-ci` pinned to `100.90.116.4`,
+no ProxyCommand; key `~/.ssh/cc-ci-root-ed25519`; logs in as root).
+- For HTTP(S) to cc-ci / `*.ci.commoninternet.net` from the VM, use plain `curl` — no proxy flag
+needed. The VM uses public DNS resolvers (`1.1.1.1`/`8.8.8.8`) so `*.ci.commoninternet.net`
+resolves normally.
+- **If `ssh cc-ci` fails:** run `tailscale status` (as loops or root) to confirm the VM is still
+on the tailnet and cc-ci is listed; check `systemctl status tailscaled`. A connectivity failure
+is recoverable, not an immediate `## Blocked`-and-stop, unless the VM has lost tailnet membership
+entirely (then that IS a class-A1 blocker).

 ### Credentials the loop GENERATES itself (do not wait on a human for these)

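The recovery order in the new §1.5 text (try `ssh cc-ci`, then check `tailscale status`, and treat only lost tailnet membership as a class-A1 blocker) can be sketched as a small triage function. This is an illustration, not a script from the repo: the name `triage_cc_ci` and its three output labels are hypothetical, and it assumes that `tailscale status` exits non-zero when the VM is no longer on the tailnet.

```shell
#!/usr/bin/env bash
# Hypothetical triage helper following the recovery order described in §1.5.
# Prints one of: ok | recoverable | blocked-a1
triage_cc_ci() {
  # 1) Direct SSH is the normal path (no proxy on the VM).
  if ssh -o BatchMode=yes -o ConnectTimeout=10 cc-ci true 2>/dev/null; then
    echo "ok"
    return 0
  fi
  # 2) SSH failed: is the VM itself still on the tailnet?
  #    Assumption: a non-zero exit here means tailnet membership is lost.
  if ! tailscale status >/dev/null 2>&1; then
    echo "blocked-a1"
    return 2
  fi
  # 3) Tailnet is up but cc-ci did not answer: a recoverable access failure.
  echo "recoverable"
  return 1
}
```

A caller would write `## Blocked` only on `blocked-a1`; on `recoverable` it would keep diagnosing, for example with `systemctl status tailscaled` as the section suggests.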
@@ -7,7 +7,7 @@ LIVENESS PROTOCOL (the watchdog ENFORCES this — see plan.md §7):
 - **Declare every wait.** Immediately before going idle, your FINAL output line MUST be exactly `WAITING-UNTIL: <ISO-8601 UTC>` — the time you will resume (≤10 min out, matching your ScheduleWakeup). Compute it from the clock (`date -u -d '+10 min' +%FT%TZ`). If the watchdog sees you idle ≥5 min with no current marker as your last line, OR idle past the time it names, it kills + reboots you (you resume cleanly from git + your REVIEW/STATUS files).
 - **Compact proactively.** If context usage climbs high (≳80%), run `/compact` before continuing — your loop state is in git + REVIEW/STATUS, so compaction is lossless and prevents wedging at the context limit.

-Credentials/access: §1.5 is the authoritative map. Provided creds are in /srv/cc-ci/.testenv and ~/.ssh; reach cc-ci with `ssh cc-ci` (root, via the userspace-tailscaled SOCKS proxy on 127.0.0.1:1055), and hit the dashboard / *.ci.commoninternet.net through that proxy (`curl --proxy socks5h://localhost:1055 ...`). If the proxy is down, restart it per §1.5. Verify from a COLD START but you may rely on this shared access path.
+Credentials/access: §1.5 is the authoritative map. Provided creds are in /srv/cc-ci/.testenv and ~/.ssh; reach cc-ci with `ssh cc-ci` (root, direct tailnet peer — no proxy). Hit the dashboard / *.ci.commoninternet.net via regular HTTP(S) from the VM (uses public DNS, no proxy needed). Verify from a COLD START but you may rely on this shared access path.

 You run as a SEPARATE process and coordinate ONLY through the git repo per §6.1:
 - Keep your OWN clone at /srv/cc-ci/cc-ci-adv. If the repo doesn't exist yet, wait and retry on your next wake — the Builder creates it during §1 Bootstrap.
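The liveness protocol quoted in this hunk requires a final `WAITING-UNTIL: <ISO-8601 UTC>` line at most 10 minutes out. A minimal sketch of producing that line, built on the same `date` invocation the prompt quotes; the function name `waiting_until` is hypothetical, and the `-d` option assumes GNU date:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the liveness marker for a wait of N minutes.
# Default is 10, the maximum the protocol allows. Assumes GNU date (-d).
waiting_until() {
  local minutes="${1:-10}"
  if [ "$minutes" -gt 10 ]; then
    echo "wait must be <= 10 minutes" >&2
    return 1
  fi
  printf 'WAITING-UNTIL: %s\n' "$(date -u -d "+${minutes} min" +%FT%TZ)"
}
```

Calling `waiting_until 5` as the last command before going idle emits a marker five minutes ahead in UTC.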
@@ -27,7 +27,7 @@ Overriding rules:
 - Never weaken, skip, or delete a test to make a run pass. A red test is information.
 - Only cc-ci is yours to reconfigure. Never push code to recipe repos; never touch production servers/domains. Keep server state Nix-declared and reversible.
 - 3rd identical failure → stop, record dead-end in DECISIONS.md, change approach or mark blocked.
-- Credentials: §1.5 is the authoritative map. Provided creds are in /srv/cc-ci/.testenv (TS_AUTH_KEY, GITEA_USERNAME/PASSWORD/URL) and ~/.ssh (cc-ci-root-ed25519). Reach cc-ci with `ssh cc-ci` (root, via the userspace-tailscaled SOCKS proxy on 127.0.0.1:1055); if it fails, restart the proxy per §1.5 before declaring blocked. There is NO ready-made $GITEA_TOKEN — mint one from the bot creds if you want a token.
+- Credentials: §1.5 is the authoritative map. Provided creds are in /srv/cc-ci/.testenv (GITEA_USERNAME/PASSWORD/URL) and ~/.ssh (cc-ci-root-ed25519). Reach cc-ci with `ssh cc-ci` (root, direct tailnet peer — no proxy). There is NO ready-made $GITEA_TOKEN — mint one from the bot creds if you want a token.
 - Secret classes (§4.4), handled differently:
 • Class A1 EXTERNAL infra inputs (cc-ci SSH/root access, TS auth key, Gitea bot creds, the pre-issued wildcard TLS cert at /var/lib/ci-certs/live/, registry creds; plus the preconfigured DNS/gateway facts): if missing/invalid → STATUS.md ## Blocked and stop. Do NOT improvise/invent. NEVER attempt ACME/DNS-01 for commoninternet.net — the cert is pre-provided and renewed out-of-band; point Traefik's file provider at /var/lib/ci-certs/live/{fullchain.pem,privkey.pem}.
 • Class A2 INTERNAL infra secrets (Drone RPC, webhook HMAC, Gitea OAuth app, host age key): you GENERATE these yourself — never block on them.
@@ -1,22 +1,23 @@
 [Unit]
-# Canonical, version-controlled copy of the unit installed at /etc/systemd/system/cc-ci-loops.service.
+# Canonical, version-controlled copy of the unit for the cc-ci-orchestrator VM.
 # Install: sudo install -m0644 cc-ci-plan/systemd/cc-ci-loops.service /etc/systemd/system/ \
 # && sudo systemctl daemon-reload && sudo systemctl enable cc-ci-loops.service
-# Brings the WHOLE rig back after a reboot of the orchestrator Pi: loops + watchdog (launch.sh) AND
+# NOTE: the VM's actual reboot-resilience service is declared in nix/configuration.nix (systemd.services.cc-ci-loops).
+# This file is the repo reference copy — keep both in sync when making changes.
+# Brings the WHOLE rig back after a reboot of the cc-ci-orchestrator VM: loops + watchdog (launch.sh) AND
 # the orchestrator supervisory session (launch-orchestrator.sh), plus a reboot record (reboot-log.sh).
 Description=cc-ci autonomous loops + watchdog + orchestrator (reboot-resilient)
 Documentation=file:///srv/cc-ci/cc-ci-plan/plan.md
-After=network-online.target cc-ci-tailscaled.service
+After=network-online.target tailscaled.service
 Wants=network-online.target
-Requires=cc-ci-tailscaled.service

 [Service]
 Type=oneshot
 RemainAfterExit=yes
-User=notplants
-Group=notplants
-Environment=HOME=/home/notplants
-Environment=PATH=/home/notplants/.local/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin
+User=loops
+Group=loops
+Environment=HOME=/home/loops
+Environment=PATH=/home/loops/.local/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin
 # RESUME_PHASE=1 so a reboot resumes the SAVED phase (e.g. phase 2), never restarts from phase 0/1c.
 Environment=RESUME_PHASE=1
 # 1) record the reboot (boot_id-gated); 2) start loops + watchdog; 3) resume the orchestrator session.
@@ -37,12 +37,13 @@ it. **Do this only after C4/C5 PASS** and after the rebuilt VM's full stack
 regardless of the name change.)
 2. **Rename the rebuilt throwaway → `cc-nix-test`.** Re-derive its current tailscale IP (throwaways
 get a fresh IP each rebuild): pick the ONLINE throwaway node from
-`tailscale --socket=$HOME/.cc-ci-ts/tailscaled.sock status | grep -i throwaway`, then:
+`tailscale status | grep -i throwaway`, then:
 ```
 ssh -i /srv/incus-terraform-nix-vm-creator/terraform-secrets/vm_ssh_key \
--o ProxyCommand='nc -X 5 -x 127.0.0.1:1055 %h %p' root@<throwaway-ip> \
+root@<throwaway-ip> \
 'tailscale set --hostname=cc-nix-test'
 ```
+(The orchestrator VM is a direct tailnet peer — no ProxyCommand needed.)

 **Heads-up — tailnet-wide effect:** after the swap, `cc-nix-test.taila4a0bf.ts.net` resolves to the
 rebuilt VM for *everyone* on the tailnet, so any of your own tooling that targets cc-nix-test **by
@@ -51,7 +52,7 @@ the original). Account for that when you point `!testme`/deploys.

 **Verify the swap took (P1+P2) before starting the e2e** — must pass:
 ```
-tailscale --socket=$HOME/.cc-ci-ts/tailscaled.sock status | grep cc-nix-test # → the throwaway's IP
+tailscale status | grep cc-nix-test # → the throwaway's IP
 curl -sS -o /dev/null -w '%{http_code} ssl_verify=%{ssl_verify_result}\n' https://ci.commoninternet.net/
 # expect: 200 ssl_verify=0 (real public path now served by the rebuilt VM, valid cert)
 ```
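Step 2 of the acceptance test says to pick the ONLINE throwaway node from `tailscale status | grep -i throwaway`. A minimal sketch of that selection as a filter over the command's output; the function name `online_throwaway_ip` is hypothetical. It assumes the usual column layout of `tailscale status` (IP first, hostname second) and that unreachable peers carry the word `offline` on their line.

```shell
#!/usr/bin/env bash
# Hypothetical filter: read `tailscale status` output on stdin and print the
# IP of the first throwaway node that is not marked offline.
# Assumptions: column 1 is the tailnet IP, column 2 is the hostname, and
# offline peers have "offline" somewhere on their line.
online_throwaway_ip() {
  awk 'tolower($2) ~ /throwaway/ && $0 !~ /offline/ { print $1; exit }'
}
```

Usage: `ip=$(tailscale status | online_throwaway_ip)`, then substitute `$ip` for `<throwaway-ip>` in the `ssh` command. An empty result means no online throwaway was found and the rename should not proceed.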