Operator 2026-06-17: all recipes WARM_CANONICAL (watch warm-volume disk),
weekly timer, and an explicit M2 requirement to prove the sweep<->samever
interaction — skip uses COMMIT equality, samever uses VERSION equality; M2
must demonstrate all 3 sweep paths (skip / version-bump upgrade / same-version-
label step-back) and the commit-vs-version boundary.
Operator 2026-06-17. The nightly-sweep timer fires green but is a no-op: only
custom-html is WARM_CANONICAL and zero canonical.json records exist -> no
canonical has ever been promoted end-to-end. canon makes it real + proven:
fix/prove the promote path, broaden enrollment, add upstream mirror-sync +
skip-when-unchanged, verify end-to-end (incl. run-twice no-op). Schedule is
incidental; correctness is the deliverable. Replaces the hollow sweep. opus.
The nightly cold-on-latest run promotes canonical->LATEST, so every subsequent
nightly (until a new version ships) finds canonical==latest==version-under-test
-> base==head -> same-version no-op. This is the common path, not a rare edge.
M2 now proves the fix on that nightly scenario (canonical already==latest ->
step back to previous published).
Operator 2026-06-17. Closes the prevb resolver gap: when the last-green
warm-canonical base version equals the PR head version, step back to the
newest published version strictly older than head (design A) instead of a
same-version no-op or a skip. Design B (canonical history for a green-verified
older base) saved to IDEAS. Auto-runs after regall (watchdog advances + switches
to opus).
Operator 2026-06-16. Replaces the static UPGRADE_BASE_VERSION + leaky single
compose.ccci.yml overlay model: dynamic base = last-green(warm canonical) ->
main fallback -> skip; optional minimal per-recipe previous/ folder for
base-only version repairs (ignored for head, version-guarded, removable when
stale). Validated on discourse PR #4 (official-image switch the current overlay
masks). regall then sweeps all recipes for regressions on sonnet.
Operator-requested 2026-06-15. gitea is currently only a dep provider for
drone; this phase builds the full recipe-under-test suite (install/upgrade/
backup/restore/custom + lint + screenshot), ports the upstream parity corpus
(health_check, git_push), and verifies recipe-maintainers/gitea PR #1
('feat: support Git LFS on plain gitea', branch lfs-plain-gitea) via an LFS
round-trip + JWT-stability capstone test — red/skipped on main, green on the
PR. Central constraint: do NOT break drone's gitea-dep path (shared
recipe_meta.py). builder+adversary on sonnet.
The agents.toml diff also records the already-live aoeng/aotest/porepo/poe2e
queue head (agent-orchestrator workstream; their plan files remain owned by
that effort).
cc-ci recipe-upgrade skill now computes the version via 'abra recipe release --dry-run'
(not a hand-edit) and requires the PR body to link upstream release notes per service.
Bumps the recipe-maintainer submodule pointer to the matching change.
Operator: uptime-kuma is maintained elsewhere — drop it from the weekly upgrade
but keep it in the used-recipes inventory. New cc-ci-plan/used-recipes.md is the
canonical list of every recipe cc-ci deploys/tests, with a weekly|external tier;
upgrade-all §1 now excludes 'external' rows from the candidate list (explicit
--args still override). uptime-kuma = external; all others weekly.
Re-target the traefik health gate off ci.commoninternet.net (the dashboard,
which is After=deploy-proxy) onto a traefik-self endpoint, breaking the
fresh-boot deadlock while keeping health-gated rollback. M1 controlled repro by
the loops; M2 from-scratch cold-boot proof owned by the orchestrator.
The unified agents.py watchdog kept firing the hourly orchestrator supervision
ping even after SEQUENCE-COMPLETE (the old launch.py watchdog exited on
completion, which stopped them). Gate the wake loop on the SEQUENCE-COMPLETE
marker so a finished build stays fully at rest — no pings. Resumes
automatically when new work is queued (that clears the marker, line 631).
The unification transcribed phases from .phases-spec before cf48 was added,
so the operator's just-requested opus 4.8 cfold review got dropped. Re-append
it after ghost (system is past cf55/on pvfix, so can't insert before pvfix
without shifting the live phase index). agents.py re-reads config each tick.
Second independent review of the cfold custom-folder collapse, by Opus 4.8
instead of GPT-5.5, inserted after cf55 (queue ...cfold;cf55;cf48;pvfix;...).
Per-phase overrides .loop-model[-adv]-cf48=claude-opus-4-8 on the claude backend.
Root-caused (empirically, dockerd logs) the discourse/ghost deploy wedges:
the shared proxy overlay (/24=254 VIPs) exhausts as concurrent stack rm leaks
endpoints over many days -> tasks stuck in Swarm 'New'. Add a per-run safety
net to Step 0 (network prune + docker restart when VIP-allocation failures are
logged). Plans + memory for the durable fix (enlarge proxy to /16 in swarm.nix,
maintenance window) and for debugging/fixing the ghost PR afterward.
The functional/playwright split is purely organizational (discovery globs both
with no branching; same custom tier -> L4 rung, same fixtures, same failure
semantics). Migrate all custom tests to one custom/ folder; M1 proves coverage
identical before/after (no silent drops), M2 is a full real-CI !testme sweep
across all recipes confirming levels unchanged. cfold becomes the last phase so
the queued /upgrade-all fires after it (folder change verified before upgrade).
Recent phases wrote STATUS/BACKLOG/REVIEW/JOURNAL to the repo ROOT because
build_kickoff + plan.md's tree used bare filenames, even though the loops'
AGENTS.md + INBOX/DECISIONS/DEFERRED conventions already said machine-docs/.
Make machine-docs/ the single mandated home everywhere: build_kickoff now
emits machine-docs/ paths + an explicit FILE-LOCATION RULE; both loop prompts
and plan.md (tree + seed step) updated; orchestrator AGENTS.md documents +
enforces it. resolve_state/INBOX handoff already read machine-docs/ first.
When LOG_DIR/.run-upgrade-on-complete exists, the watchdog launches
launch-upgrader.py start the moment the last phase reaches ## DONE (then
consumes the flag). Lets the operator replace a scheduled weekly cron run with
'run as soon as the current phase queue finishes' — used tonight: the
cc-ci-upgrade-all.timer was stopped (stamp forwarded past tonight's slot) and
this flag set instead.
A Builder scaffolded 'STATUS-mailu.md' with a '## DONE / Not yet. Written
here only when ...' placeholder section; phase_done's startswith('## DONE')
matched it and auto-advanced past mailu without any of its work being done
(no recipe PR, no claim, no review). Harden phase_done: a '## DONE' heading
counts only when its first non-empty body line is not a placeholder/negation
(Not yet / pending / TBD / when all / <...> etc). Verified against all shipped
STATUS files (real DONEs still detected; mailu placeholder rejected).
Lets a single phase pin a different model, read fresh each role_model call so
a phase transition flips it automatically with no watchdog bounce. Operator
wants builder on opus for the complex dstamp phase, reverting to sonnet from
mailu on: .loop-model-dstamp=opus while base .loop-model stays sonnet.
Root-cause the upstream image breakage (Cannot find module /app/index.js,
Node v24 under the pinned tag — proven harness/ref-neutral in rcust M2),
research upstream releases (persist to cc-ci-plan/upstream/bluesky-pds.md),
fix via recipe-mirror PR (NEVER merge — operator does), prove full lifecycle
green incl. the new L5 lint rung via !testme at PR head, then verify a real
credential-free screenshot on those runs (hook only if needed). Close both
DEFERRED bluesky entries; crisp operator handoff in STATUS-bsky.md.
The 2026-06-11 night watch is over: the limit-wait system was verified
end-to-end on a real monthly-spend-limit window (hit -> hold without reboots
-> flat probes -> prompt resume on lift), and the three bugs it surfaced are
fixed (5ea17fc, 969eb60). Standing supervision continues without the extra
check.
The tick whose probe resumed a session was continuing into stall logic with
its pre-resume pane capture; a 4h-old WAITING-UNTIL in that stale data got
the freshly-resumed adversary kill+rebooted (05:52). Treat probe-resume as
handled-this-tick; the next 30s tick sees the live session.
Night-watch findings (monthly-spend-limit window, ~01:49-04:45):
- probe text said 'usage limit' which matches LIMIT_RE, so a submitted probe
kept limited_now true forever -> reworded to 'quota window' with a CAUTION
note (nudge text must never match LIMIT_RE)
- dedupe scanned all 40 captured lines, so once a probe scrolled into the
conversation no further probe ever fired (builder/adv frozen at nudges=1,
orchestrator probes degraded to hourly riding the wake scroll) -> dedupe
now only checks the bottom 8 lines (input area)
Core invariant HELD: zero kill+reboots during the limit window.
plan(lvl5): operator addition - the top-corner level badge (card, dashboard
pill, badge SVG) shows only the level number+color, zero capping info; the
inline per-rung table keeps intentional-skip/unverified detail.
Operator refinement: only declared/structural skips (not backup-capable, no
previous version) let the climb continue; a rung that should have run but
didn't (infra error, abra missing, tier abort, timeout) blocks the level at
the last verified rung. Every N/A source in derive_rungs gets an explicit
classification (DECISIONS.md, adversary-reviewed); unclassifiable defaults to
unverified. Unit tests + one synthesized tier-abort run prove the rule.
Operator decision (explicit Q&A 2026-06-11): remove cap/cap_reason/capped
entirely. New formula: level = max i with rung_i==pass and all j<i in
{pass,na}. N/A no longer stops the climb (the confusing part — e.g.
non-backup-capable recipes were stuck at L2); a real FAIL still blocks.
Per-rung table + verdict carry the completeness story. Added: de-cap
implementation reqs, both-schema rendering, before/after level table for all
recipes, N/A-skip proof run, bad-canary designed-levels re-derivation under
the new formula.
New top rung after install/upgrade/backup-restore/functional: lint the exact
recipe ref under test; gap-caps per ladder semantics; verdict-neutral and
time-bounded; mirror-origin R014 plumbing must not pollute recipe lint results
(abra.py:109-114); all consumers (results/card/dashboard/badge/docs/tests)
updated; old artifacts still render. M1 = adversary-cold-verified implementation
pre-merge; M2 = real-CI proof incl. a genuine L5, a genuine lint-capped L4, and
2 drone-path runs. Recipe lint failures -> mirror PRs or DEFERRED, never merged.
Operator request: the hourly supervision prompt should land regardless of
limit state, as a fallback that keeps things on track if the limit-state
machinery ever breaks. If the limit is genuinely still in force the wake is
harmless (the banner just re-prints and limit_tick re-arms); once it lifts,
the queued wake doubles as a resume nudge.