Add cc-ci-upgrader agent: observable one-shot weekly upgrade-run agent

The weekly upgrade run now executes inside a dedicated, remote-control agent
(cc-ci-upgrader) — viewable/steerable at claude.ai/code like the Builder — rather
than buried in headless cron output.

- launch-upgrader.sh: spins up the cc-ci-upgrader tmux session under
  --remote-control with a kickoff that runs /upgrade-all (DEFAULT mode) to
  completion. On finish the agent STOPS and stays idle (does NOT self-terminate)
  so the run + summary stay reviewable in the web UI. `start` = use-or-create:
  leaves an in-flight (busy) run alone, else clears a finished/idle/wedged
  session and runs fresh; `fresh` always restarts. UPGRADER_ARGS passes flags
  (e.g. --dry-run); never --with-tests.
- launch.sh: orchestrator_alive() now also skips the cc-ci-upgrader
  remote-control name, so the upgrader job isn't mistaken for the orchestrator.
- upgrade-all skill: documents it runs as the cc-ci-upgrader agent; the weekly
  cron invokes `launch-upgrader.sh start` (not /upgrade-all inline).
- Phase 5: V8a verifies the agent lifecycle (launch → run to completion → stay
  idle/viewable → next start clears it); V9 stops the verification session.
- cron memory: weekly task = launch-upgrader.sh start at 0 3 * * 6 UTC.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This commit is contained in:
2026-05-29 21:12:47 +01:00
parent 4f74676c72
commit bf71420106
4 changed files with 151 additions and 7 deletions

View File

@ -12,6 +12,14 @@ recipe, and writes one summary of every PR to review. It **never pushes upstream
Drives cc-ci over `ssh cc-ci`. Logs/summary go to `/srv/cc-ci/.cc-ci-logs/upgrades/`. Drives cc-ci over `ssh cc-ci`. Logs/summary go to `/srv/cc-ci/.cc-ci-logs/upgrades/`.
**Runs as the `cc-ci-upgrader` agent.** This skill is normally executed by a dedicated, observable
**one-shot job agent**`cc-ci-upgrader` — spun up under remote-control (viewable/steerable at
claude.ai/code, like the Builder) by `cc-ci-plan/launch-upgrader.sh`. That agent runs this skill to
completion, then **stops and stays idle** so the run + summary remain reviewable in the web UI (it does
NOT self-terminate). The weekly cron just invokes `launch-upgrader.sh start`; the next week's run
clears the idle session and starts fresh. You can also run `/upgrade-all` inline in any `/srv/cc-ci`
session, but the agent is the intended path so the weekly run isn't buried in headless output.
## Arguments (optional `$ARGUMENTS`) ## Arguments (optional `$ARGUMENTS`)
- A space-separated list of recipe names → only those (else all enrolled recipes). - A space-separated list of recipe names → only those (else all enrolled recipes).
- `--dry-run` → survey + print what WOULD upgrade; spawn nothing. - `--dry-run` → survey + print what WOULD upgrade; spawn nothing.
@ -104,16 +112,18 @@ End with the report path and a reminder that **nothing was merged**.
- **Never merges**; failures/ skips are surfaced and retried next week — safe to re-run anytime. - **Never merges**; failures/ skips are surfaced and retried next week — safe to re-run anytime.
## Cron ## Cron
Designed for a weekly Claude Code scheduled task that invokes `/upgrade-all` in `/srv/cc-ci`. Designed for a weekly Claude Code scheduled task that runs **`cc-ci-plan/launch-upgrader.sh start`** —
which spins up the `cc-ci-upgrader` remote-control agent to run this skill to completion (the agent
then stays idle/viewable). The cron does NOT invoke `/upgrade-all` inline.
**Agreed schedule:** **Saturday 03:00 UTC** (`0 3 * * 6`) — low-traffic weekend window, PRs waiting by **Agreed schedule:** **Saturday 03:00 UTC** (`0 3 * * 6`) — low-traffic weekend window, PRs waiting by
Monday. Monday.
**Activation trigger (operator, 2026-05-29):** do NOT activate while the build loops are still **Activation trigger (operator, 2026-05-29):** do NOT activate while the build loops are still
constructing cc-ci — it would contend with them for the shared host. **Activate this weekly cron only constructing cc-ci — it would contend with them for the shared host. **Activate this weekly cron only
once the cc-ci build is complete (loops finished / cc-ci stable.)** Until then it's run manually / once the cc-ci build is complete (loops finished / cc-ci stable.)** Until then run it manually /
on-demand. When activating, create a scheduled task that runs `/upgrade-all` in `/srv/cc-ci` at on-demand via `launch-upgrader.sh start` (or `fresh`). When activating, create a scheduled task that
`0 3 * * 6` UTC. runs `/srv/cc-ci/cc-ci-plan/launch-upgrader.sh start` at `0 3 * * 6` UTC.
Re-running is idempotent: already-current recipes report `SKIPPED — up-to-date`; recipes with an open Re-running is idempotent: already-current recipes report `SKIPPED — up-to-date`; recipes with an open
PR for the same branch report the existing PR rather than duplicating it. PR for the same branch report the existing PR rather than duplicating it.

125
cc-ci-plan/launch-upgrader.sh Executable file
View File

@ -0,0 +1,125 @@
#!/usr/bin/env bash
#
# launch-upgrader.sh — spin up the cc-ci UPGRADER agent in tmux under remote-control.
#
# The Upgrader is a ONE-SHOT job agent (not a perpetual loop like the Builder/Adversary): it runs the
# weekly recipe-upgrade sequence — the /upgrade-all skill in DEFAULT mode — to completion, then STOPS
# and stays idle (it does NOT self-terminate) so the run + summary remain viewable/steerable at
# claude.ai/code exactly like the Builder, instead of being buried in headless cron output. The next
# weekly run starts a fresh session: `start` leaves an in-flight run alone but clears a finished/idle
# (or wedged) session and starts clean. The weekly cron (Sat 03:00 UTC, once cc-ci is built — see
# [[cc-ci-upgrade-all-cron]]) invokes `launch-upgrader.sh start`.
#
# Naming: tmux session AND remote-control name are both "cc-ci-upgrader" (matching
# cc-ci-builder / cc-ci-adv / cc-ci-watchdog / cc-ci-orchestrator).
#
# Usage:
# ./launch-upgrader.sh start # use-or-create: if a run is actively in flight leave it,
# # else (no session / idle-stale) kill any stale + start fresh
# ./launch-upgrader.sh fresh # always kill any existing + start a fresh run
# ./launch-upgrader.sh status | attach | stop
#
# Env:
# UPGRADER_ARGS="" passthrough args to /upgrade-all (e.g. "--dry-run", "ghost n8n"); default none
# = full default fleet run. NEVER pass --with-tests here (the cron must not
# auto-edit tests; that's the operator's per-recipe opt-in).
set -euo pipefail
SESSION="${UPGRADER_SESSION:-cc-ci-upgrader}" # tmux session name == remote-control name
WORKDIR="${UPGRADER_DIR:-/srv/cc-ci}" # cwd: where .claude/skills/ + .testenv live
CLAUDE_BIN="${CLAUDE_BIN:-claude}"
CLAUDE_FLAGS="${CLAUDE_FLAGS:---dangerously-skip-permissions}"
REMOTE_CONTROL="${REMOTE_CONTROL:-1}" # 1 => --remote-control (viewable at claude.ai/code)
LOG_DIR="${LOG_DIR:-/srv/cc-ci/.cc-ci-logs}"
UPGRADER_ARGS="${UPGRADER_ARGS:-}"
log() { printf '[upgrader %(%H:%M:%S)T] %s\n' -1 "$*"; }
die() { log "ERROR: $*"; exit 1; }
session_alive() { tmux has-session -t "$SESSION" 2>/dev/null; }
# "actively working" = the TUI shows the interrupt hint (a turn in flight). Absent => idle/finished/wedged.
session_busy() { tmux capture-pane -pt "$SESSION" 2>/dev/null | grep -q 'esc to interrupt'; }
preflight() {
command -v tmux >/dev/null 2>&1 || die "missing dependency: tmux"
command -v "$CLAUDE_BIN" >/dev/null 2>&1 || die "claude CLI not found (set CLAUDE_BIN)"
[[ -d "$WORKDIR" ]] || die "workdir not found: $WORKDIR"
[[ -d "$WORKDIR/.claude/skills/upgrade-all" ]] || die "upgrade-all skill not found under $WORKDIR/.claude/skills"
mkdir -p "$LOG_DIR"
}
write_kickoff() {
local kf="$LOG_DIR/.kickoff-$SESSION.txt"
cat > "$kf" <<KICK
*** cc-ci UPGRADER — weekly recipe-upgrade job ***
You are the cc-ci Upgrader: a ONE-SHOT job agent, NOT a perpetual loop. Run the recipe-upgrade
sequence to completion, then STOP. Your cwd is ${WORKDIR}; reach the CI server with \`ssh cc-ci\`;
creds are in ${WORKDIR}/.testenv; the skills live in ${WORKDIR}/.claude/skills/.
DO THIS:
1. Invoke the **/upgrade-all** skill in DEFAULT mode${UPGRADER_ARGS:+ with arguments: ${UPGRADER_ARGS}}
(read ${WORKDIR}/.claude/skills/upgrade-all/SKILL.md for the full procedure). It surveys every
enrolled recipe and, for each upgradeable one, runs /recipe-upgrade in DEFAULT mode — recipe PR
only, verified by posting \`!testme\` on the PR (results visible in the PR, iterate up to 3x). A
genuinely stale test gets an explanatory PR COMMENT, never a test edit.
2. Process recipes via per-recipe SUBAGENTS (as the skill specifies) so your own context stays light.
If your context usage climbs (~80%), run /compact before continuing.
3. Write + push the weekly summary (the PR list is the actionable output for the operator).
4. WHEN THE RUN IS COMPLETE: STOP. Print the final summary (lead with the PR list) and an
\`UPGRADE RUN COMPLETE\` line, then go idle. Do NOT loop, do NOT re-run, and do NOT kill your own
session — leave it up so the operator can review your output + the summary in the web UI
(claude.ai/code). Next week's run starts a fresh session (the launcher clears this idle one).
GUARDRAILS: NEVER merge any PR. NEVER weaken a test. DEFAULT mode only — do NOT pass --with-tests
(updating cc-ci tests is the operator's per-recipe opt-in). Single-writer: dedicated branches +
separate clones, never push main, never touch the build loops' /cc-ci /cc-ci-adv clones. The shared
Swarm is stateful — go sequentially and tear down what you deploy.
KICK
echo "$kf"
}
start() {
local mode="${1:-use-or-create}"
preflight
if session_alive; then
if [[ "$mode" == "use-or-create" ]] && session_busy; then
log "$SESSION already running a job (busy) — leaving it"; return 0
fi
log "$SESSION exists but idle/stale (or fresh requested) — killing it first"
tmux kill-session -t "$SESSION" 2>/dev/null || true; sleep 1
fi
local kf rc=""
kf="$(write_kickoff)"
[[ "$REMOTE_CONTROL" == "1" ]] && rc="--remote-control '$SESSION'"
log "starting $SESSION (cwd=$WORKDIR, rc=$REMOTE_CONTROL, args='${UPGRADER_ARGS:-<none>}')"
tmux new-session -d -s "$SESSION" -c "$WORKDIR" \
"$CLAUDE_BIN $rc $CLAUDE_FLAGS \"\$(cat '$kf')\""
tmux pipe-pane -o -t "$SESSION" "cat >> '$LOG_DIR/$SESSION.log'"
log "started. status: $0 status | attach: tmux attach -t $SESSION | log: $LOG_DIR/$SESSION.log"
}
case "${1:-start}" in
start) start use-or-create ;;
fresh) start fresh ;;
stop) if session_alive; then log "killing $SESSION"; tmux kill-session -t "$SESSION" || true; else log "$SESSION not running"; fi ;;
status)
if session_alive; then
log "$SESSION: RUNNING $(session_busy && echo '(busy)' || echo '(idle/finishing)')"
ps -eo pid,etime,args | grep "[r]emote-control $SESSION" || true
else log "$SESSION: stopped"; fi ;;
attach) exec tmux attach -t "$SESSION" ;;
*)
cat <<EOF
cc-ci upgrader launcher — one-shot weekly recipe-upgrade job agent (remote-control)
$0 start use-or-create: leave an in-flight run alone, else (re)start fresh (DEFAULT; what the cron calls)
$0 fresh always kill any existing + start a fresh run
$0 status show tmux + remote-control state
$0 attach tmux attach to the session
$0 stop kill the session
Env: SESSION=$SESSION WORKDIR=$WORKDIR REMOTE_CONTROL=$REMOTE_CONTROL UPGRADER_ARGS='${UPGRADER_ARGS:-<none>}'
The agent runs /upgrade-all (DEFAULT mode) to completion, then STOPS and stays idle (viewable in the
web UI). It does NOT self-terminate; the next weekly `start` clears the idle session and runs fresh.
EOF
;;
esac

View File

@ -247,8 +247,9 @@ orchestrator_alive() {
local pid args local pid args
for pid in $(pgrep -x claude 2>/dev/null); do for pid in $(pgrep -x claude 2>/dev/null); do
args="$(tr '\0' ' ' < "/proc/$pid/cmdline" 2>/dev/null || true)" args="$(tr '\0' ' ' < "/proc/$pid/cmdline" 2>/dev/null || true)"
# skip the two loops (matched by their remote-control session NAME, not a stray path mention) # skip the loops + the one-shot upgrader job (matched by remote-control session NAME, not a
printf '%s' "$args" | grep -qE -- "--remote-control +'?cc-ci-(builder|adv)'?" && continue # stray path mention) — none of these is the orchestrator.
printf '%s' "$args" | grep -qE -- "--remote-control +'?cc-ci-(builder|adv|upgrader)'?" && continue
return 0 # a non-loop claude process => orchestrator (or operator) is alive return 0 # a non-loop claude process => orchestrator (or operator) is alive
done done
tmux has-session -t "$ORCH_SESSION" 2>/dev/null && return 0 tmux has-session -t "$ORCH_SESSION" 2>/dev/null && return 0

View File

@ -60,8 +60,16 @@ and **close every PR opened during verification afterward** — do not pollute r
surfaces "tests look stale" PRs in its own summary section, runs **sequentially with teardown** surfaces "tests look stale" PRs in its own summary section, runs **sequentially with teardown**
between recipes, and the summary leads with the PR list. Confirms the weekly cron never between recipes, and the summary leads with the PR list. Confirms the weekly cron never
auto-edits tests. auto-edits tests.
- [ ] **V8a — the `cc-ci-upgrader` agent.** `cc-ci-plan/launch-upgrader.sh start` spins up a
remote-control `cc-ci-upgrader` session (viewable at claude.ai/code) that runs `/upgrade-all`
(DEFAULT) **to completion, then STOPS and stays idle** (does NOT self-terminate) with the
summary visible. A second `start` while a run is **in flight (busy)** leaves it alone; a `start`
against a **finished/idle (or wedged)** session **kills it and runs fresh**. (Use
`UPGRADER_ARGS=--dry-run` for a cheap check, then a real small-set run.) This is exactly what the
weekly cron invokes — verify the cron-equivalent path end-to-end.
- [ ] **V9 — cleanup.** Every PR opened during verification (recipe + any cc-ci test PR) is **closed** - [ ] **V9 — cleanup.** Every PR opened during verification (recipe + any cc-ci test PR) is **closed**
and any sandbox deploy is **torn down**; the box is left clean. and any sandbox deploy is **torn down**; the verification `cc-ci-upgrader` session is stopped
(`launch-upgrader.sh stop`); the box is left clean.
## 2. Method / notes ## 2. Method / notes
- **Sandbox first.** Prefer `custom-html-tiny` / a throwaway recipe for V3V8 so real recipe mirrors - **Sandbox first.** Prefer `custom-html-tiny` / a throwaway recipe for V3V8 so real recipe mirrors