cc-ci-orchestrator

Author	SHA1	Message	Date
autonomic-bot	65ee741869	plan: queue prevb (dynamic upgrade base + previous/ config, opus) + regall (all-recipe regression, sonnet) Operator 2026-06-16. Replaces the static UPGRADE_BASE_VERSION + leaky single compose.ccci.yml overlay model: dynamic base = last-green(warm canonical) -> main fallback -> skip; optional minimal per-recipe previous/ folder for base-only version repairs (ignored for head, version-guarded, removable when stale). Validated on discourse PR #4 (official-image switch the current overlay masks). regall then sweeps all recipes for regressions on sonnet.	2026-06-16 23:55:28 +00:00
autonomic-bot	3c1d48e0b8	journal: gtea phase DONE — gitea fully-tested + LFS PR #1 verified; SEQUENCE-COMPLETE	2026-06-15 22:58:30 +00:00
autonomic-bot	24b3a25ce6	plan: queue gtea — enroll gitea as a fully-tested recipe + verify LFS PR #1 Operator-requested 2026-06-15. gitea is currently only a dep provider for drone; this phase builds the full recipe-under-test suite (install/upgrade/backup/restore/custom + lint + screenshot), ports the upstream parity corpus (health_check, git_push), and verifies recipe-maintainers/gitea PR #1 ('feat: support Git LFS on plain gitea', branch lfs-plain-gitea) via an LFS round-trip + JWT-stability capstone test — red/skipped on main, green on the PR. Central constraint: do NOT break drone's gitea-dep path (shared recipe_meta.py). builder+adversary on sonnet. The agents.toml diff also records the already-live aoeng/aotest/porepo/poe2e queue head (agent-orchestrator workstream; their plan files remain owned by that effort).	2026-06-15 19:32:14 +00:00
autonomic-bot	2f2225e466	recipe-upgrade: defer version bump to operator's final 'abra recipe release' §2 no longer bumps the coop-cloud version label in the recipe PR (no --dry-run compute, no tag). It records the recommended 'abra recipe release <recipe> -<x|y|z>' (no --dry-run) in the PR body (§3) as the operator's final publish step — run after the upstream PR merges, it bumps the label + tags + pushes in one go. Bumps recipe-maintainer submodule to 9daddac (same change across its /recipe-upstream, -upgrade-apply, -upgrade-plan, -new-tag).	2026-06-15 18:38:27 +00:00
autonomic-bot	6acad7b35b	recipe-upgrade: abra recipe release for version bump + upstream release-notes links in PR body cc-ci recipe-upgrade skill now computes the version via 'abra recipe release --dry-run' (not a hand-edit) and requires the PR body to link upstream release notes per service. Bumps the recipe-maintainer submodule pointer to the matching change.	2026-06-15 18:09:41 +00:00
autonomic-bot	6a2464469f	upgrade-all: skip 'external' recipes (uptime-kuma) + add used-recipes.md inventory Operator: uptime-kuma is maintained elsewhere — drop it from the weekly upgrade but keep it in the used-recipes inventory. New cc-ci-plan/used-recipes.md is the canonical list of every recipe cc-ci deploys/tests, with a weekly|external tier; upgrade-all §1 now excludes 'external' rows from the candidate list (explicit --args still override). uptime-kuma = external; all others weekly.	2026-06-15 17:00:28 +00:00
autonomic-bot	489f6670da	journal: pxgate cold-boot proof passed (real reboot, deploy-proxy active before dashboard)	2026-06-13 13:52:56 +00:00
autonomic-bot	6005a212d6	memory+journal: cc-ci host rebuild procedure; pxgate M2 deployed + verified on live host	2026-06-13 13:46:19 +00:00
autonomic-bot	1aee81b4f3	plan: queue pxgate — fix deploy-proxy/dashboard health-gate circular dependency (D8) Re-target the traefik health gate off ci.commoninternet.net (the dashboard, which is After=deploy-proxy) onto a traefik-self endpoint, breaking the fresh-boot deadlock while keeping health-gated rollback. M1 controlled repro by the loops; M2 from-scratch cold-boot proof owned by the orchestrator.	2026-06-13 12:38:40 +00:00
autonomic-bot	97303abc25	watchdog: suppress scheduled wakes once the build sequence is complete The unified agents.py watchdog kept firing the hourly orchestrator supervision ping even after SEQUENCE-COMPLETE (the old launch.py watchdog exited on completion, which stopped them). Gate the wake loop on the SEQUENCE-COMPLETE marker so a finished build stays fully at rest — no pings. Resumes automatically when new work is queued (that clears the marker, line 631).	2026-06-13 12:04:49 +00:00
autonomic-bot	84e13a7f23	fix(pvcheck/A2): update upgrade-all SKILL.md guard description The durable /16 proxy fix landed in phase pvfix (2026-06-13). Update the guard description from "safety net until that lands" to "belt-and-suspenders even after the /16 fix" — guard logic unchanged, description now accurate. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-13 05:58:27 +00:00
autonomic-bot	1e9337ce89	agents.toml: re-add cf48 (opus cfold review) dropped during the launch-system migration The unification transcribed phases from .phases-spec before cf48 was added, so the operator's just-requested opus 4.8 cfold review got dropped. Re-append it after ghost (system is past cf55/on pvfix, so can't insert before pvfix without shifting the live phase index). agents.py re-reads config each tick.	2026-06-13 05:32:47 +00:00
autonomic-bot	b4a6aaea7e	plan: queue cf48 — Opus 4.8 post-cfold coverage-loss review (cross-check of cf55 GPT-5.5) Second independent review of the cfold custom-folder collapse, by Opus 4.8 instead of GPT-5.5, inserted after cf55 (queue ...cfold;cf55;cf48;pvfix;...). Per-phase overrides .loop-model[-adv]-cf48=claude-opus-4-8 on the claude backend.	2026-06-13 05:15:05 +00:00
autonomic-bot	2c64cd69f0	fix(watchdog): detect idle opencode turns	2026-06-12 21:47:06 +00:00
autonomic-bot	85498931d1	plan: add gpt55 cfold review phase	2026-06-12 16:07:48 +00:00
autonomic-bot	dea6359bcd	plan: queue proxy and ghost follow-up phases	2026-06-12 15:56:03 +00:00
autonomic-bot	a186f23b37	orchestrator: restore opencode web launcher	2026-06-12 15:45:09 +00:00
autonomic-bot	03343ed3cf	plan(ghost-pr): fold in upgrader's diagnosis — mysql 8.0->8.4 data-dir upgrade race (update_config.monitor too tight); PR#4 open	2026-06-12 13:45:52 +00:00
autonomic-bot	5141335fb3	upstream(mattermost-lts): update ESR notes — 10.11 ESR (Aug 2026), 10.12 expired, 11.7 next ESR; backup format notes	2026-06-12 04:14:04 +00:00
autonomic-bot	e89884beba	upstream(n8n): update standing notes for 2.26.3 upgrade	2026-06-12 04:12:32 +00:00
autonomic-bot	c7fae6cbee	upstream(matrix-synapse): update notes for 7.3.0+v1.154.0 PR#2	2026-06-12 04:00:59 +00:00
autonomic-bot	4f31abc0c7	upstream(mailu): update Redis standing note — operator approved 8.8.x jump	2026-06-12 03:48:52 +00:00
autonomic-bot	d3a9455eb3	upstream(lasuite-meet): document livekit v1.13.1 TURN-auth note + redis 8.8.0	2026-06-12 03:43:29 +00:00
autonomic-bot	ca02a0dd6f	upgrade-all: proxy VIP-exhaustion guard in Step 0; runbooks for proxy /16 enlarge + ghost PR debug Root-caused (empirically, dockerd logs) the discourse/ghost deploy wedges: the shared proxy overlay (/24=254 VIPs) exhausts as concurrent stack rm leaks endpoints over many days -> tasks stuck in Swarm 'New'. Add a per-run safety net to Step 0 (network prune + docker restart when VIP-allocation failures are logged). Plans + memory for the durable fix (enlarge proxy to /16 in swarm.nix, maintenance window) and for debugging/fixing the ghost PR afterward.	2026-06-12 03:30:00 +00:00
autonomic-bot	7ce898e0e4	upstream(immich): document concurrent app+db restart update_config fix	2026-06-12 03:26:37 +00:00
autonomic-bot	28b9431035	upstream(immich): note pgvectors0.3.0 bump in PR #2 + new digest (2026-06-12)	2026-06-12 02:50:45 +00:00
autonomic-bot	2c5e08f78c	upgrade-all: simplify to a rolling pool, alphabetical (drop waves + heavy/light) Per operator: just work through recipes alphabetically keeping CAP (= DRONE_RUNNER_CAPACITY=2) subagents running at once, starting the next the moment one finishes (rolling pool via run_in_background). Removes the wave-barrier and the heavy/light classification entirely — simpler and no slot ever idles.	2026-06-12 01:58:22 +00:00
autonomic-bot	894d829313	upgrade-all: at the tail, fill slots with two heavies rather than serialize Per operator: always fill all CAP slots. Heavy/light alternation only spreads heavies across waves while a light is available; once only heavies remain, run two-per-wave (capacity is the tuned ceiling) instead of one-per-wave.	2026-06-12 01:55:29 +00:00
autonomic-bot	f744c79e2d	upgrade-all: alternate heavy/light per wave (not heaviest-first) Host memory is the binding limit, so never schedule two HEAVY recipes in the same capacity wave — pair each heavy (discourse/immich/matrix-synapse/lasuite-drive/mattermost-lts/ghost) with a light one to bound peak memory while keeping both slots busy. Heaviest-first could co-schedule two heavies and OOM/wedge the box (the disc-50cc8a 'New'-state wedge). For CAP>2 cap heavies at ~CAP/2; if only heavies remain, run one-per-wave.	2026-06-12 01:47:22 +00:00
autonomic-bot	a45517b432	upgrade-all: default to concurrency-bounded (DRONE_RUNNER_CAPACITY) subagents Now that the 2026-06-10 concurrency restructure makes concurrent recipe runs safe (per-run trees, app-domain locks, isolation), default /upgrade-all to run up to DRONE_RUNNER_CAPACITY (the drone runner's slots, currently 2) recipe subagents at a time instead of strictly sequential — using all available concurrency without oversubscribing. Query the live capacity from 'systemctl show drone-runner-exec' (fallback 2); process recipes in waves of CAP (emit CAP Agent calls per message, await, next wave). Flags: --capacity N, --sequential (CAP=1, old default — use when the build loops share the box), --parallel (unbounded). Applies to the NEXT run; the in-flight run is unaffected.	2026-06-12 01:39:44 +00:00
autonomic-bot	1eb720e95a	journal: unstuck weekly upgrade wedged on discourse Swarm scheduling hiccup	2026-06-12 00:31:29 +00:00
autonomic-bot	a1cceef3d4	ops: pause cfold until /upgrade-all finishes (serialize — they conflict on CI); journal+memory	2026-06-11 22:56:27 +00:00
autonomic-bot	af2b2e8156	plan: phase 'cfold' — collapse functional/+playwright/ into custom/ + full !testme recipe sweep (queued after drone) The functional/playwright split is purely organizational (discovery globs both with no branching; same custom tier -> L4 rung, same fixtures, same failure semantics). Migrate all custom tests to one custom/ folder; M1 proves coverage identical before/after (no silent drops), M2 is a full real-CI !testme sweep across all recipes confirming levels unchanged. cfold becomes the last phase so the queued /upgrade-all fires after it (folder change verified before upgrade).	2026-06-11 22:52:45 +00:00
autonomic-bot	79134a94e8	memory: drop drone P0 host-deploy note — /etc/timezone present on cc-ci, prerequisite satisfied (drone phase deploying gitea fine)	2026-06-11 21:55:16 +00:00
autonomic-bot	34fc68d4b8	journal: coordination files moved to machine-docs/; memory committed	2026-06-11 20:57:57 +00:00
autonomic-bot	c33b21fe8d	memory: commit session notes (drone P0, weekly-upgrade-queued, mailu/index updates) Per AGENTS.md 'Agent memory lives in memory/ (in this repo)' — memory notes must be committed + pushed like any repo change, not left only in the local ~/.claude symlink target.	2026-06-11 20:56:24 +00:00
autonomic-bot	e144354668	loops: mandate machine-docs/ for ALL coordination files (kickoff/prompts/plan/AGENTS) Recent phases wrote STATUS/BACKLOG/REVIEW/JOURNAL to the repo ROOT because build_kickoff + plan.md's tree used bare filenames, even though the loops' AGENTS.md + INBOX/DECISIONS/DEFERRED conventions already said machine-docs/. Make machine-docs/ the single mandated home everywhere: build_kickoff now emits machine-docs/ paths + an explicit FILE-LOCATION RULE; both loop prompts and plan.md (tree + seed step) updated; orchestrator AGENTS.md documents + enforces it. resolve_state/INBOX handoff already read machine-docs/ first.	2026-06-11 20:56:24 +00:00
autonomic-bot	23b5fc4753	journal: weekly upgrade skipped tonight, queued after phase queue via watchdog hook	2026-06-11 20:50:25 +00:00
autonomic-bot	3fa3178546	watchdog: one-shot /upgrade-all trigger on phase-sequence completion When LOG_DIR/.run-upgrade-on-complete exists, the watchdog launches launch-upgrader.py start the moment the last phase reaches ## DONE (then consumes the flag). Lets the operator replace a scheduled weekly cron run with 'run as soon as the current phase queue finishes' — used tonight: the cc-ci-upgrade-all.timer was stopped (stamp forwarded past tonight's slot) and this flag set instead.	2026-06-11 20:49:54 +00:00
autonomic-bot	0005ce81af	journal: mailu false-completion incident + fix + re-queue	2026-06-11 18:20:54 +00:00
autonomic-bot	4275adc4a5	watchdog: phase_done ignores placeholder '## DONE' sections (skipped mailu) A Builder scaffolded 'STATUS-mailu.md' with a '## DONE / Not yet. Written here only when ...' placeholder section; phase_done's startswith('## DONE') matched it and auto-advanced past mailu without any of its work being done (no recipe PR, no claim, no review). Harden phase_done: a '## DONE' heading counts only when its first non-empty body line is not a placeholder/negation (Not yet / pending / TBD / when all / <...> etc). Verified against all shipped STATUS files (real DONEs still detected; mailu placeholder rejected).	2026-06-11 18:20:21 +00:00
autonomic-bot	211b4e231c	launch: per-phase model override (.loop-model[-adv]-<pid>) Lets a single phase pin a different model, read fresh each role_model call so a phase transition flips it automatically with no watchdog bounce. Operator wants builder on opus for the complex dstamp phase, reverting to sonnet from mailu on: .loop-model-dstamp=opus while base .loop-model stays sonnet.	2026-06-11 16:15:18 +00:00
autonomic-bot	5c260d225c	launch-orchestrator: persisted .orch-model file (ORCH_MODEL > LOOP_MODEL > file) Operator switching models near weekly limits: loops -> sonnet, orchestrator -> opus. Dotfiles updated (.loop-model/.loop-model-adv=sonnet, .orch-model=opus) so watchdog restarts keep the choice.	2026-06-11 16:03:29 +00:00
autonomic-bot	327b9f4efe	plan: phases dstamp, mailu, kuma, drone (queued after bsky) + journal - dstamp: attribute + fix the discourse abra-stamp drift (env change 06-05→06-10, harness-neutral, currently pinning discourse at L1); blast-radius sweep; HC1 keeps its teeth - mailu: backupbot v2 labels recipe PR, restore proven on real seeded mail, backup rung earned instead of skipped (operator approved re-entry) - kuma: uptime-kuma first-run wizard + create-a-monitor functional test (Socket.IO or Playwright, real probe evidence, flake-checked) - drone: gitea-dep enrollment, maximal subset per Phase-2 scoping; P0 /etc/timezone host deploy is orchestrator-owned (3bde76f committed)	2026-06-11 11:43:03 +00:00
autonomic-bot	f395247da4	docs(bsky): upstream registry for bluesky-pds — :0.4 is a moving tag now tracking main (0.5.1/node24/index.ts); exact tags through 0.4.219 keep classic index.js layout	2026-06-11 11:35:56 +00:00
autonomic-bot	c89cd6366b	plan: phase 'bsky' — fix bluesky-pds recipe + its screenshot (queued after lvl5) Root-cause the upstream image breakage (Cannot find module /app/index.js, Node v24 under the pinned tag — proven harness/ref-neutral in rcust M2), research upstream releases (persist to cc-ci-plan/upstream/bluesky-pds.md), fix via recipe-mirror PR (NEVER merge — operator does), prove full lifecycle green incl. the new L5 lint rung via !testme at PR head, then verify a real credential-free screenshot on those runs (hook only if needed). Close both DEFERRED bluesky entries; crisp operator handoff in STATUS-bsky.md.	2026-06-11 11:30:49 +00:00
autonomic-bot	0900c439d4	wake prompt: remove temporary limit-system night-watch line (condition met) The 2026-06-11 night watch is over: the limit-wait system was verified end-to-end on a real monthly-spend-limit window (hit -> hold without reboots -> flat probes -> prompt resume on lift), and the three bugs it surfaced are fixed (`5ea17fc`, `969eb60`). Standing supervision continues without the extra check.	2026-06-11 06:55:08 +00:00
autonomic-bot	969eb60df1	watchdog: probe-resumed tick returns True — don't evaluate stale pane after resume The tick whose probe resumed a session was continuing into stall logic with its pre-resume pane capture; a 4h-old WAITING-UNTIL in that stale data got the freshly-resumed adversary kill+rebooted (05:52). Treat probe-resume as handled-this-tick; the next 30s tick sees the live session.	2026-06-11 05:53:44 +00:00
autonomic-bot	5ea17fca21	watchdog: fix limit-probe self-match + scrollback dedupe wedge; plan(lvl5): badge shows level only Night-watch findings (monthly-spend-limit window, ~01:49-04:45): - probe text said 'usage limit' which matches LIMIT_RE, so a submitted probe kept limited_now true forever -> reworded to 'quota window' with a CAUTION note (nudge text must never match LIMIT_RE) - dedupe scanned all 40 captured lines, so once a probe scrolled into the conversation no further probe ever fired (builder/adv frozen at nudges=1, orchestrator probes degraded to hourly riding the wake scroll) -> dedupe now only checks the bottom 8 lines (input area) Core invariant HELD: zero kill+reboots during the limit window. plan(lvl5): operator addition - the top-corner level badge (card, dashboard pill, badge SVG) shows only the level number+color, zero capping info; the inline per-rung table keeps intentional-skip/unverified detail.	2026-06-11 05:52:26 +00:00
autonomic-bot	76aa104dbd	plan(lvl5): N/A split — intentional skip climbs, unintentional (unverified) blocks Operator refinement: only declared/structural skips (not backup-capable, no previous version) let the climb continue; a rung that should have run but didn't (infra error, abra missing, tier abort, timeout) blocks the level at the last verified rung. Every N/A source in derive_rungs gets an explicit classification (DECISIONS.md, adversary-reviewed); unclassifiable defaults to unverified. Unit tests + one synthesized tier-abort run prove the rule.	2026-06-11 01:47:26 +00:00
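
The N/A split described in commit 76aa104dbd can be illustrated with a small sketch. This is not the repository's derive_rungs code: the names Rung, classify_na and derive_level, and the reason strings, are hypothetical and chosen only for illustration. It assumes an intentional skip counts as climbed (per the commit subject "intentional skip climbs") and that any failed or unverified rung stops the climb.

```python
from enum import Enum


class Rung(Enum):
    PASS = "pass"
    FAIL = "fail"
    SKIP_INTENTIONAL = "intentional-skip"
    UNVERIFIED = "unverified"


# Hypothetical reason strings for declared/structural skips.
INTENTIONAL_REASONS = {"not-backup-capable", "no-previous-version"}


def classify_na(reason: str) -> Rung:
    """Classify an N/A rung; anything unclassifiable defaults to unverified."""
    if reason in INTENTIONAL_REASONS:
        return Rung.SKIP_INTENTIONAL
    return Rung.UNVERIFIED


def derive_level(rungs: list[Rung]) -> int:
    """Return the highest rung reached before the first blocking rung."""
    level = 0
    for index, rung in enumerate(rungs, start=1):
        if rung in (Rung.PASS, Rung.SKIP_INTENTIONAL):
            level = index  # assumption: an intentional skip still climbs
        else:
            break  # FAIL or UNVERIFIED blocks at the last verified rung
    return level
```

Under these assumptions, a run of pass, intentional skip, pass reaches level 3, while pass, unverified, pass stays at level 1.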

251 Commits
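
The placeholder check described in commit 4275adc4a5 can be sketched roughly as follows. This is a minimal illustration, not the watchdog's actual phase_done: the regular expression and the handling of an empty section are assumptions. In particular, it assumes a '## DONE' heading with no body at all does not count as done.

```python
import re

# Assumed placeholder/negation patterns, based on the examples in the commit
# message (Not yet / pending / TBD / when all / <...>).
PLACEHOLDER_RE = re.compile(
    r"^\s*(not yet|pending|tbd|when all|<[^>]*>)", re.IGNORECASE
)


def phase_done(status_text: str) -> bool:
    """True only if a '## DONE' section has a real (non-placeholder) first body line."""
    lines = status_text.splitlines()
    for i, line in enumerate(lines):
        if not line.startswith("## DONE"):
            continue
        for body in lines[i + 1:]:
            if body.startswith("#"):
                break  # next heading reached: section had no body
            if body.strip():
                if not PLACEHOLDER_RE.match(body):
                    return True
                break  # first body line is a placeholder: keep scanning
    return False
```

With this sketch, a scaffolded section whose first body line is "Not yet. Written here only when ..." is rejected, while a section that opens with an actual completion note is accepted.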