upgrade-all: simplify to a rolling pool, alphabetical (drop waves + heavy/light)
Per operator: just work through recipes alphabetically keeping CAP (= DRONE_RUNNER_CAPACITY=2) subagents running at once, starting the next the moment one finishes (rolling pool via run_in_background). Removes the wave-barrier and the heavy/light classification entirely — simpler and no slot ever idles.
This commit is contained in:
@ -1,6 +1,6 @@
|
||||
---
|
||||
name: upgrade-all
|
||||
description: Weekly autonomous upgrade run for the cc-ci CI server. Surveys every enrolled recipe for available upstream upgrades, then runs /recipe-upgrade on each upgradeable one via a subagent — plan, implement, verify green on cc-ci, open a recipe PR (and, only if a cc-ci test went stale, a verified cc-ci test PR). Collects results into one summary listing every PR to review. Concurrency-bounded by default — runs up to DRONE_RUNNER_CAPACITY (the drone runner's slots, currently 2) recipe subagents at a time; --sequential for one-at-a-time, --capacity N to override, --parallel to fan out all, --dry-run to preview. NEVER merges. Built to run once weekly on a cron. Invoke as /upgrade-all.
|
||||
description: Weekly autonomous upgrade run for the cc-ci CI server. Surveys every enrolled recipe for available upstream upgrades, then runs /recipe-upgrade on each upgradeable one via a subagent — plan, implement, verify green on cc-ci, open a recipe PR (and, only if a cc-ci test went stale, a verified cc-ci test PR). Collects results into one summary listing every PR to review. Rolling pool by default — works through recipes ALPHABETICALLY keeping DRONE_RUNNER_CAPACITY (the drone runner's slots, currently 2) subagents running at once, starting the next as each finishes; --sequential for one-at-a-time, --capacity N to override the pool size, --parallel to start all at once, --dry-run to preview. NEVER merges. Built to run once weekly on a cron. Invoke as /upgrade-all.
|
||||
---
|
||||
|
||||
# upgrade-all
|
||||
@ -23,8 +23,9 @@ session, but the agent is the intended path so the weekly run isn't buried in he
|
||||
## Arguments (optional `$ARGUMENTS`)
|
||||
- A space-separated list of recipe names → only those (else all enrolled recipes).
|
||||
- `--dry-run` → survey + print what WOULD upgrade; spawn nothing.
|
||||
- **Default = concurrency-bounded:** run up to **`DRONE_RUNNER_CAPACITY`** recipe subagents at a
|
||||
time (the drone runner's slots — currently `2`). The 2026-06-10 concurrency restructure
|
||||
- **Default = rolling pool, alphabetical:** work through the recipes in alphabetical order keeping
|
||||
**`DRONE_RUNNER_CAPACITY`** subagents running at once (the drone runner's slots — currently `2`),
|
||||
starting the next recipe the moment one finishes. The 2026-06-10 concurrency restructure
|
||||
(`docs/concurrency.md`) makes concurrent recipe runs SAFE (per-run recipe trees + app-domain
|
||||
locks + isolation), and the capacity knob is the operator's resource-tuned ceiling — so matching
|
||||
the subagent pool to it uses all available concurrency without oversubscribing.
|
||||
@ -126,7 +127,7 @@ CAP=${CAP:-2} # fallback to the documented default if the query fails
|
||||
```
|
||||
`--sequential` → `CAP=1`; `--capacity N` → `CAP=N`; `--parallel` → `CAP=∞` (all at once).
|
||||
Then print a table — Recipe | Status (will upgrade / skipped:reason) | Available upgrade(s) — plus
|
||||
the mode (`Concurrency-bounded (N=<CAP>)` default / `Sequential` / `Parallel`). If `--dry-run`,
|
||||
the mode (`Rolling pool (N=<CAP>, alphabetical)` default / `Sequential` / `Parallel`). If `--dry-run`,
|
||||
**stop here**.
|
||||
|
||||
## 3. Upgrade each recipe via a subagent
|
||||
@ -147,31 +148,20 @@ test change deserves a human decision. So the cron opens recipe PRs only; where
|
||||
test looks stale, the operator sees the explanation in the PR comment and can re-run that one recipe
|
||||
with `/recipe-upgrade <recipe> --with-tests` to also get a verified test-update PR.
|
||||
|
||||
- **Concurrency-bounded (default), `CAP` at a time:** process `RECIPES_TO_UPGRADE` in waves of
|
||||
`CAP`. For each wave, emit `CAP` Agent calls **in one message** (they run concurrently — no
|
||||
`run_in_background`, you need their `RESULT:` lines); wait for all `CAP` to return; collect their
|
||||
results; then start the next wave. This keeps the drone runner's `CAP` slots busy without
|
||||
oversubscribing. `CAP=1` degenerates to one-at-a-time (sequential). An Agent tool-call error →
|
||||
record `FAILED — agent tool error` and continue; one failure never aborts the run or its wave.
|
||||
- **Parallel (`--parallel`):** `CAP=∞` — emit ALL Agent calls in one message. Heaviest load;
|
||||
failures still isolated per recipe.
|
||||
|
||||
**Wave composition — alternate heavy and light (do NOT just go heaviest-first).** The binding limit
|
||||
is host memory, so never put two HEAVY recipes in the same wave: pair each heavy with a light one so
|
||||
peak memory stays bounded while both slots stay busy. Classify from the survey's image set:
|
||||
- **HEAVY** (multi-GB images / many services / slow deploy): discourse, immich, matrix-synapse,
|
||||
lasuite-drive, mattermost-lts, ghost.
|
||||
- **LIGHT / moderate** (one small image or a few light services): custom-html*, cryptpad, n8n,
|
||||
uptime-kuma, lasuite-docs, lasuite-meet, keycloak, mailu, plausible.
|
||||
Build the wave order by drawing one HEAVY + one LIGHT per wave (capacity 2) until a bucket empties,
|
||||
then pair up the remainder. **Always fill all `CAP` slots** — once only heavies are left, run two (or
|
||||
`CAP`) heavies per wave rather than leaving a slot idle; the box is tuned for `CAP` concurrent builds.
|
||||
The heavy/light alternation is only to *spread* heavies across waves while a light is available, not a
|
||||
hard cap. For `CAP>2`, spread heavies across waves where possible but still fill the slots.
|
||||
|
||||
Note (wave barrier): a wave waits for its slowest recipe before the next starts, so the light slot
|
||||
idles while its heavy wave-mate finishes — accepted for the weekly run. Do NOT maintain a rolling
|
||||
pool with `run_in_background` (you'd lose the `RESULT:` lines).
|
||||
**Rolling pool (default), `CAP` running at once, recipes in ALPHABETICAL order.** Keep exactly `CAP`
|
||||
recipe subagents in flight at all times — as soon as one finishes, immediately start the next
|
||||
alphabetical recipe. No waves, no heavy/light classification — just two (=`CAP`) always running.
|
||||
How:
|
||||
1. Sort `RECIPES_TO_UPGRADE` alphabetically.
|
||||
2. Start the first `CAP` as background Agents (`run_in_background: true`) — you are notified when
|
||||
each completes and can read its final output for the `RESULT:` line.
|
||||
3. On each completion: read that agent's `RESULT:` and record it; if recipes remain, immediately
|
||||
start the next alphabetical one as a background Agent (keeping `CAP` in flight).
|
||||
4. Repeat until every recipe has been started AND every subagent has completed, then go to §4.
|
||||
`CAP=1` → strictly one-at-a-time (sequential). An Agent that dies/errors → record `FAILED — agent
|
||||
tool error` and still start its replacement; one failure never aborts the run.
|
||||
- **Parallel (`--parallel`):** no pool cap — start ALL recipes as background Agents at once. Heaviest
|
||||
load (every step-2b chaos deploy concurrent); only when the box is otherwise idle.
|
||||
|
||||
## 4. Collect results
|
||||
Parse each final `RESULT:` line into SUCCESS / SUCCESS-PENDING-TESTS / FAILED / SKIPPED (default mode
|
||||
|
||||
Reference in New Issue
Block a user