cc-ci-orchestrator

Author	SHA1	Message	Date
autonomic-bot	a22ae8deed	journal: redfix DONE — all 6 canon-sweep failures fixed + verified (4 recipe PRs, 2 harness); SEQUENCE-COMPLETE	2026-06-18 14:54:28 +00:00
autonomic-bot	359f3f0978	journal: SEQUENCE-COMPLETE — regall/samever/canon/dash/settings/nixenv all DONE, host healthy	2026-06-17 22:36:13 +00:00
autonomic-bot	3c1d48e0b8	journal: gtea phase DONE — gitea fully-tested + LFS PR #1 verified; SEQUENCE-COMPLETE	2026-06-15 22:58:30 +00:00
autonomic-bot	6acad7b35b	recipe-upgrade: abra recipe release for version bump + upstream release-notes links in PR body cc-ci recipe-upgrade skill now computes the version via 'abra recipe release --dry-run' (not a hand-edit) and requires the PR body to link upstream release notes per service. Bumps the recipe-maintainer submodule pointer to the matching change.	2026-06-15 18:09:41 +00:00
autonomic-bot	489f6670da	journal: pxgate cold-boot proof passed (real reboot, deploy-proxy active before dashboard)	2026-06-13 13:52:56 +00:00
autonomic-bot	6005a212d6	memory+journal: cc-ci host rebuild procedure; pxgate M2 deployed + verified on live host	2026-06-13 13:46:19 +00:00
autonomic-bot	1aee81b4f3	plan: queue pxgate — fix deploy-proxy/dashboard health-gate circular dependency (D8) Re-target the traefik health gate off ci.commoninternet.net (the dashboard, which is After=deploy-proxy) onto a traefik-self endpoint, breaking the fresh-boot deadlock while keeping health-gated rollback. M1 controlled repro by the loops; M2 from-scratch cold-boot proof owned by the orchestrator.	2026-06-13 12:38:40 +00:00
autonomic-bot	1e9337ce89	agents.toml: re-add cf48 (opus cfold review) dropped during the launch-system migration The unification transcribed phases from .phases-spec before cf48 was added, so the operator's just-requested opus 4.8 cfold review got dropped. Re-append it after ghost (system is past cf55/on pvfix, so can't insert before pvfix without shifting the live phase index). agents.py re-reads config each tick.	2026-06-13 05:32:47 +00:00
autonomic-bot	b4a6aaea7e	plan: queue cf48 — Opus 4.8 post-cfold coverage-loss review (cross-check of cf55 GPT-5.5) Second independent review of the cfold custom-folder collapse, by Opus 4.8 instead of GPT-5.5, inserted after cf55 (queue ...cfold;cf55;cf48;pvfix;...). Per-phase overrides .loop-model[-adv]-cf48=claude-opus-4-8 on the claude backend.	2026-06-13 05:15:05 +00:00
autonomic-bot	2c64cd69f0	fix(watchdog): detect idle opencode turns	2026-06-12 21:47:06 +00:00
autonomic-bot	85498931d1	plan: add gpt55 cfold review phase	2026-06-12 16:07:48 +00:00
autonomic-bot	dea6359bcd	plan: queue proxy and ghost follow-up phases	2026-06-12 15:56:03 +00:00
autonomic-bot	a186f23b37	orchestrator: restore opencode web launcher	2026-06-12 15:45:09 +00:00
autonomic-bot	ca02a0dd6f	upgrade-all: proxy VIP-exhaustion guard in Step 0; runbooks for proxy /16 enlarge + ghost PR debug Root-caused (empirically, dockerd logs) the discourse/ghost deploy wedges: the shared proxy overlay (/24=254 VIPs) exhausts as concurrent stack rm leaks endpoints over many days -> tasks stuck in Swarm 'New'. Add a per-run safety net to Step 0 (network prune + docker restart when VIP-allocation failures are logged). Plans + memory for the durable fix (enlarge proxy to /16 in swarm.nix, maintenance window) and for debugging/fixing the ghost PR afterward.	2026-06-12 03:30:00 +00:00
autonomic-bot	1eb720e95a	journal: unstuck weekly upgrade wedged on discourse Swarm scheduling hiccup	2026-06-12 00:31:29 +00:00
autonomic-bot	a1cceef3d4	ops: pause cfold until /upgrade-all finishes (serialize — they conflict on CI); journal+memory	2026-06-11 22:56:27 +00:00
autonomic-bot	af2b2e8156	plan: phase 'cfold' — collapse functional/+playwright/ into custom/ + full !testme recipe sweep (queued after drone) The functional/playwright split is purely organizational (discovery globs both with no branching; same custom tier -> L4 rung, same fixtures, same failure semantics). Migrate all custom tests to one custom/ folder; M1 proves coverage identical before/after (no silent drops), M2 is a full real-CI !testme sweep across all recipes confirming levels unchanged. cfold becomes the last phase so the queued /upgrade-all fires after it (folder change verified before upgrade).	2026-06-11 22:52:45 +00:00
autonomic-bot	34fc68d4b8	journal: coordination files moved to machine-docs/; memory committed	2026-06-11 20:57:57 +00:00
autonomic-bot	23b5fc4753	journal: weekly upgrade skipped tonight, queued after phase queue via watchdog hook	2026-06-11 20:50:25 +00:00
autonomic-bot	0005ce81af	journal: mailu false-completion incident + fix + re-queue	2026-06-11 18:20:54 +00:00
autonomic-bot	327b9f4efe	plan: phases dstamp, mailu, kuma, drone (queued after bsky) + journal - dstamp: attribute + fix the discourse abra-stamp drift (env change 06-05→06-10, harness-neutral, currently pinning discourse at L1); blast-radius sweep; HC1 keeps its teeth - mailu: backupbot v2 labels recipe PR, restore proven on real seeded mail, backup rung earned instead of skipped (operator approved re-entry) - kuma: uptime-kuma first-run wizard + create-a-monitor functional test (Socket.IO or Playwright, real probe evidence, flake-checked) - drone: gitea-dep enrollment, maximal subset per Phase-2 scoping; P0 /etc/timezone host deploy is orchestrator-owned (3bde76f committed)	2026-06-11 11:43:03 +00:00
autonomic-bot	c89cd6366b	plan: phase 'bsky' — fix bluesky-pds recipe + its screenshot (queued after lvl5) Root-cause the upstream image breakage (Cannot find module /app/index.js, Node v24 under the pinned tag — proven harness/ref-neutral in rcust M2), research upstream releases (persist to cc-ci-plan/upstream/bluesky-pds.md), fix via recipe-mirror PR (NEVER merge — operator does), prove full lifecycle green incl. the new L5 lint rung via !testme at PR head, then verify a real credential-free screenshot on those runs (hook only if needed). Close both DEFERRED bluesky entries; crisp operator handoff in STATUS-bsky.md.	2026-06-11 11:30:49 +00:00
autonomic-bot	5ea17fca21	watchdog: fix limit-probe self-match + scrollback dedupe wedge; plan(lvl5): badge shows level only Night-watch findings (monthly-spend-limit window, ~01:49-04:45): - probe text said 'usage limit' which matches LIMIT_RE, so a submitted probe kept limited_now true forever -> reworded to 'quota window' with a CAUTION note (nudge text must never match LIMIT_RE) - dedupe scanned all 40 captured lines, so once a probe scrolled into the conversation no further probe ever fired (builder/adv frozen at nudges=1, orchestrator probes degraded to hourly riding the wake scroll) -> dedupe now only checks the bottom 8 lines (input area) Core invariant HELD: zero kill+reboots during the limit window. plan(lvl5): operator addition - the top-corner level badge (card, dashboard pill, badge SVG) shows only the level number+color, zero capping info; the inline per-rung table keeps intentional-skip/unverified detail.	2026-06-11 05:52:26 +00:00
autonomic-bot	76aa104dbd	plan(lvl5): N/A split — intentional skip climbs, unintentional (unverified) blocks Operator refinement: only declared/structural skips (not backup-capable, no previous version) let the climb continue; a rung that should have run but didn't (infra error, abra missing, tier abort, timeout) blocks the level at the last verified rung. Every N/A source in derive_rungs gets an explicit classification (DECISIONS.md, adversary-reviewed); unclassifiable defaults to unverified. Unit tests + one synthesized tier-abort run prove the rule.	2026-06-11 01:47:26 +00:00
autonomic-bot	1f7fc7eb39	plan(lvl5): fold in de-capping — level = highest passed rung, N/A skips, fail blocks Operator decision (explicit Q&A 2026-06-11): remove cap/cap_reason/capped entirely. New formula: level = max i with rung_i==pass and all j<i in {pass,na}. N/A no longer stops the climb (the confusing part — e.g. non-backup-capable recipes were stuck at L2); a real FAIL still blocks. Per-rung table + verdict carry the completeness story. Added: de-cap implementation reqs, both-schema rendering, before/after level table for all recipes, N/A-skip proof run, bad-canary designed-levels re-derivation under the new formula.	2026-06-11 01:45:54 +00:00
autonomic-bot	0aab78d3a2	plan: phase 'lvl5' — L5 level rung: abra recipe lint passes on the PR (queued after shot) New top rung after install/upgrade/backup-restore/functional: lint the exact recipe ref under test; gap-caps per ladder semantics; verdict-neutral and time-bounded; mirror-origin R014 plumbing must not pollute recipe lint results (abra.py:109-114); all consumers (results/card/dashboard/badge/docs/tests) updated; old artifacts still render. M1 = adversary-cold-verified implementation pre-merge; M2 = real-CI proof incl. a genuine L5, a genuine lint-capped L4, and 2 drone-path runs. Recipe lint failures -> mirror PRs or DEFERRED, never merged.	2026-06-11 01:39:27 +00:00
autonomic-bot	7c042c2f2a	plan: phase 'shot' — recipe screenshot audit & repair (queued after rcust) Audit every enrolled recipe's CI badge/card screenshot, diagnose defects (plausible null-every-run; ~4.8KB blank-frame SPAs: immich/lasuite-meet/cryptpad/flaky n8n), fix via harness default-wait improvement first, per-recipe SCREENSHOT hooks second; M1 audit matrix + M2 visually-verified PNGs on fresh real-CI runs (>=2 !testme). Cosmetics-never-block and secret-safety guardrails binding. Also: temporary hourly-wake instruction to verify the new limit-wait system tonight; journal entry.	2026-06-11 01:17:32 +00:00
autonomic-bot	a6e177e286	journal: phase conc DONE — concurrency restructure landed, M1+M2 adversary-verified	2026-06-10 13:58:25 +00:00
autonomic-bot	335ea1d7c1	journal: session wrap — concurrent CI fixed, immich (245) + plausible (247) both GREEN	2026-06-09 23:18:32 +00:00
autonomic-bot	926e4553b7	journal: immich PR #2 GREEN (build 245, level=4); cc-ci PR #9 merged; plausible unblocked	2026-06-09 23:13:56 +00:00
autonomic-bot	e3e0a9ee80	journal: two harness convergence fixes (UpdateStatus settle + paused-is-settled); immich build 245 in flight	2026-06-09 23:08:59 +00:00
autonomic-bot	1580738c97	journal: concurrent-CI fixes landed on cc-ci main (build 236 green)	2026-06-09 22:02:08 +00:00
autonomic-bot	ec3e0c35dd	journal: orchestrator handover — concurrent-CI fixes + immich/plausible drive	2026-06-09 19:45:21 +00:00
autonomic-bot	f20a066f5c	journal: recipe-report v2 newspaper front page (CVE-led editorial) live Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-02 23:20:48 +00:00
autonomic-bot	856df8cb37	journal: /recipe-report + report.ci.commoninternet.net shipped; first opus report live Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-02 23:06:22 +00:00
autonomic-bot	d38f80048a	journal: bridge one-comment-per-!testme deployed; note cc-ci deploy-path gap Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-02 17:30:13 +00:00
autonomic-bot	bfe3a97301	journal: overnight /upgrade-all complete — 10 GREEN, 2 stale-test, 2 failed, 4 skipped Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-02 11:41:47 +00:00
autonomic-bot	cdbc5bb42f	journal: mirror+regression phases DONE (build sequence complete); overnight /upgrade-all running Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-02 03:43:46 +00:00
autonomic-bot	d219b0972c	journal: BUILD COMPLETE + weekly-upgrade cron cutover to NixOS timer (Sun 02:00 UTC) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-01 23:26:59 +00:00
autonomic-bot	d8f558e987	journal: backend reverted to claude, waker folded into watchdog, boot service fixed Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-01 21:48:09 +00:00
autonomic-bot	2235110e29	journal: phase-5 progress-monitor events (19:04, 19:08) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-01 21:46:29 +00:00
autonomic-bot	8f7265e948	feat(orchestrator): wake the live monitor session	2026-06-01 18:51:05 +00:00
autonomic-bot	9fe9d49cac	journal: record Hetzner rescue recovery for cc-ci	2026-06-01 13:55:15 +00:00
autonomic-bot	8093a95184	journal: session 2026-06-01 03:34 UTC handoff (opencode gpt-5.4 visible)	2026-06-01 13:03:51 +00:00
autonomic-bot	2aa3fbda8d	journal: session 2026-05-31 18:30 UTC handoff (opencode/deepseek running, phase 5)	2026-05-31 18:27:17 +00:00
autonomic-bot	bca51071bd	refactor: rewrite launchers as Python; add orchestrator JOURNAL.md Bash scripts are now one-liner wrappers: exec python3 <script>.py "$@" All logic lives in the Python scripts (pure stdlib, no deps). launch.py — loops + watchdog: Full port of launch.sh: phase sequencing, start/stop/status/logs/watchdog, handoff signalling, stall detection, heal_session, heal_orchestrator. Cleaner structure: config block → helpers → phase/kickoff/agent/healing/handoff/watchdog/main. LOOP_BACKEND + LOOP_MODEL switches throughout. launch-orchestrator.py — orchestrator session: claude path: --resume <id> preserved (conversation survives reboots). opencode path: run --attach --title (no --resume; STARTUP_PROMPT orients the new session; reads JOURNAL.md for context). STARTUP_PROMPT updated to reference JOURNAL.md on startup. launch-upgrader.py — one-shot upgrade job: LOOP_BACKEND / LOOP_MODEL take precedence over UPGRADER_BACKEND / UPGRADER_MODEL. Both claude and opencode paths supported. cc-ci-plan/JOURNAL.md — new orchestrator handoff file: Persistent across conversation resets. Documents the handoff format and carries the current session's summary: migration complete, phase 5 in progress (V3/V7 PASS), phase 4 deferred, open items for next session. AGENTS.md: step 1 on startup = read JOURNAL.md; step 5 = append on handoff. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-31 17:50:09 +00:00

46 Commits
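
The level rule stated in commits 1f7fc7eb39 and 76aa104dbd (level = highest passed rung whose lower rungs are all pass or N/A; a failed or unverified rung blocks the climb) can be sketched as below. This is a minimal illustration, not the harness code: the function name derive_level and the status strings are assumptions made for the example, and the rung order install, upgrade, backup-restore, functional, lint is taken from commit 0aab78d3a2.

```python
from typing import Sequence

# Hypothetical status labels; the real harness may use different names.
PASS, FAIL, NA, UNVERIFIED = "pass", "fail", "na", "unverified"


def derive_level(rungs: Sequence[str]) -> int:
    """Return the highest 1-based rung index i with rungs[i-1] == PASS
    and every lower rung in {PASS, NA}. Returns 0 if no rung qualifies."""
    level = 0
    for i, status in enumerate(rungs, start=1):
        if status == PASS:
            level = i      # climb to this rung
        elif status == NA:
            continue       # intentional skip: keep climbing, level unchanged
        else:
            break          # fail or unverified: stop at the last passed rung
    return level


# Rungs in ladder order: install, upgrade, backup-restore, functional, lint.
no_backup = [PASS, PASS, NA, PASS, PASS]             # recipe without backup support
tier_abort = [PASS, PASS, UNVERIFIED, PASS, PASS]    # rung should have run but did not
print(derive_level(no_backup), derive_level(tier_abort))
```

Under these assumptions a recipe that is not backup-capable is no longer held at L2, while an unverified backup rung still blocks at the last passed rung below it.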