Commit Graph

153 Commits

SHA1 Message Date
f0716764db feat(recipe-upgrade): upstream release-notes registry + recipe-README read (recipe-maintainer parity)
Close the two gaps vs recipe-maintainer's recipe-upgrade-plan:
- Per-recipe release-notes registry at cc-ci-plan/upstream/<recipe>.md (discover the source repo +
  releases/changelog URL for each image once, persist+commit, reuse) — fetch release notes FROM those
  URLs instead of rediscovering ad-hoc each run. Format doc + cryptpad seed included.
- Explicitly read the recipe's README for shipped upgrade/migration notes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 01:28:27 +00:00
f4b1befbdd chore(nix): weekly timer = Thu 22:00 America/New_York (Boston 10pm, DST-aware)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 01:21:41 +00:00
0338dc23fd chore(nix): move weekly upgrade timer to Thursdays 22:00 UTC (was Sun 02:00)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 01:18:20 +00:00
d8ad5a2805 feat(recipe-report): link recipe names in all story sections (security/needs/routine), not just the lead
_stories() now auto-links whole-word recipe mentions in story titles + bodies to their mirror
repos (same single-pass linkify as the lead); explicit PR/build links are untouched.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-04 02:21:31 +00:00
a6efcec720 feat(recipe-report): link recipe names in the lead to their mirror repos; 3-para concise lead
render() auto-links whole-word recipe mentions in the editorial lead to
git.autonomic.zone/recipe-maintainers/<recipe> (single regex pass, longest-name-first,
no href corruption). Skill: lead is ~3 short paragraphs (~150-180 words) incl. an
'anything strange worth looking into' paragraph. example-spec.json lead updated to the
concise target.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-04 02:17:19 +00:00
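The single-pass, longest-name-first linking described in a6efcec720 can be sketched roughly as follows. This is an illustration and not the code in recipe-report.py: the function name `linkify`, the `https://` scheme, and the rule of skipping text already inside an `<a>` element are assumptions made for the example.

```python
import re

# Mirror path shape comes from the commit message; the https:// scheme is assumed.
MIRROR_BASE = "https://git.autonomic.zone/recipe-maintainers"


def linkify(text, recipes):
    """Wrap whole-word recipe mentions in links to their mirror repos."""
    if not recipes:
        return text
    # Longest name first, so a longer recipe name is tried before any
    # shorter name that happens to be its prefix.
    names = sorted(recipes, key=len, reverse=True)
    alternation = "|".join(re.escape(name) for name in names)
    # First alternative swallows existing anchors whole so they stay untouched;
    # the second matches a recipe name not glued to other word chars or hyphens.
    pattern = re.compile(
        r"<a\b[^>]*>.*?</a>|(?<![\w-])(" + alternation + r")(?![\w-])",
        re.S,
    )

    def replace(match):
        name = match.group(1)
        if name is None:  # an existing <a>...</a>: leave as-is
            return match.group(0)
        return f'<a href="{MIRROR_BASE}/{name}">{name}</a>'

    # One re.sub pass: inserted hrefs are never rescanned, so a recipe name
    # inside a freshly written URL cannot be linked a second time.
    return pattern.sub(replace, text)
```

Because everything happens in one substitution pass, the recipe name that appears inside each generated href is never re-matched, which is one way to get the "no href corruption" property the commit mentions.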
ea2d8c8210 feat(recipe-report): use approved 2026-06-02 report as the style template; tighter lead for future runs
Save the operator-approved 2026-06-02 spec as example-spec.json (gold standard
for voice/structure/specificity). Skill now tells the agent to match it, with
one deliberate change: keep the editorial lead TIGHT (~2 short paragraphs,
~120 words). The live 2026-06-02 page stays as the reference.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-04 02:06:45 +00:00
f20a066f5c journal: recipe-report v2 newspaper front page (CVE-led editorial) live
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 23:20:48 +00:00
6cf59130db feat(recipe-report): newspaper front-page layout — editorial lead + CVE security bulletin first
Masthead + opus 'lead' editorial (overall fleet state + what to focus on), a Security Bulletin of
critical-CVE upgrades up top (mined from per-recipe upgrade_notes_md), then needs-attention/routine,
and the comprehensive table as 'the full wire' at the end. survey now includes each recipe's
upgrade_notes_md (breaking-change/CVE analysis) so opus can lead with security.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 23:13:40 +00:00
856df8cb37 journal: /recipe-report + report.ci.commoninternet.net shipped; first opus report live
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 23:06:22 +00:00
c7301a9e39 feat(recipe-report): /recipe-report skill + helper + launcher (default opus); wire into upgrade-all
- recipe-report.py: survey (run + per-recipe PRs + CI verdicts) / render (spec->HTML) / publish
  (copy to cc-ci:/var/lib/cc-ci-reports + regen index).
- skill .claude/skills/recipe-report: review the weekly run, classify needs-attention vs routine,
  publish one public HTML page per week + index at report.ci.commoninternet.net. Read-only.
- launch-report.py: one-shot cc-ci-report agent, REPORT_MODEL default opus (separate from the
  sonnet upgrader), REPORT_BACKEND default claude.
- upgrade-all SKILL: closing step launches the report agent.
Serving (nix/modules/reports.nix) already deployed + live.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 23:02:22 +00:00
73aa20e8ab plan(recipe-report): separate configurable report model (default opus); link CI results, no embedded images
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 22:52:27 +00:00
81984c84da plan: /recipe-report skill + report.ci.commoninternet.net weekly report
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 22:37:56 +00:00
d38f80048a journal: bridge one-comment-per-!testme deployed; note cc-ci deploy-path gap
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 17:30:13 +00:00
bfe3a97301 journal: overnight /upgrade-all complete — 10 GREEN, 2 stale-test, 2 failed, 4 skipped
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 11:41:47 +00:00
9e88927e5b ideas: Co-op Cloud NixOS modules — mkCcApp factory + health-gated rollback
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-02 05:06:30 +00:00
5c691cdb66 fix(upgrade skills): real abra-auth fix — embed git.autonomic.zone creds in origin (go-git)
The actual 'abra auth error' that skipped 8 recipes was go-git failing to
fetch tags from the PRIVATE git.autonomic.zone mirror ('authentication
required: Unauthorized'), NOT the TTY issue. abra/go-git reads
remote.origin.url literally and IGNORES git url.insteadOf + credential
helpers (confirmed: insteadOf left immich Unauthorized; literal embedded URL
fixed it). Skill now bakes $GITEA_USERNAME:$GITEA_PASSWORD into origin for
git.autonomic.zone recipes before the version check, and stashes the
untracked cc-ci overlay so it isn't mis-counted as dirty-worktree.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 04:40:59 +00:00
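As a rough illustration of the 5c691cdb66 fix, the sketch below builds an origin URL with credentials embedded literally, since the commit reports that abra/go-git ignores `url.insteadOf` and credential helpers. The helper name `embed_credentials` and the example repository path are hypothetical; the skill's actual implementation is not shown in this log.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def embed_credentials(url, username, password, host="git.autonomic.zone"):
    """Return url with user:password embedded, but only for the given host."""
    parts = urlsplit(url)
    if parts.hostname != host:
        return url  # other remotes are left untouched
    # Percent-encode both fields so characters like '@' or '/' cannot
    # break the URL structure.
    netloc = f"{quote(username, safe='')}:{quote(password, safe='')}@{parts.hostname}"
    if parts.port:
        netloc += f":{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```

The result would then be written back with `git remote set-url origin <url>`, using the values of `$GITEA_USERNAME` and `$GITEA_PASSWORD`, before the version check runs.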
c0852d2302 feat(logs): readable greppable per-agent transcript logs (agent-log.py)
The raw 'tmux pipe-pane' logs are TUI-escape soup (the 191MB builder log).
agent-log.py renders Claude's own JSONL transcript into a clean one-event-
per-line <agent>.clean.log — read-only on a file the agent writes anyway, so
zero agent slowdown and zero extra tokens. Resolves each agent's transcript
(disambiguating the shared project dir by kickoff signature; tracks restarts).
'follow-all' runs as the cc-ci-cleanlogs session, wired into launch.py start
so it comes up with the loops. render/tail subcommands for ad-hoc use.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 04:35:17 +00:00
027fdbd161 fix(upgrade skills): run abra over a pseudo-TTY (fixes FATA inappropriate ioctl)
abra over plain 'ssh cc-ci abra ...' has no TTY -> FATA 'inappropriate ioctl
for device' (the abra error). The working harness (runner/harness/abra.py)
wraps abra in util-linux 'script' for a pseudo-TTY + passes -n. Apply the
same in the recipe-upgrade and upgrade-all skills: every abra call becomes
ssh cc-ci 'script -qec "abra <args> -n" /dev/null'. Confirmed: abra server
ls FATAs plain, works pty-wrapped.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 04:06:38 +00:00
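The wrapping described in 027fdbd161 can be sketched as a small command builder. This is not runner/harness/abra.py, whose contents are not shown here; the function name `pty_wrapped_abra` is made up for the example, and only the `script -qec "<cmd>" /dev/null` shape and the trailing `-n` come from the commit message.

```python
import shlex


def pty_wrapped_abra(args, host="cc-ci"):
    """Build an ssh argv that runs abra under util-linux `script` for a pseudo-TTY."""
    # -n is appended to every abra call, as in the commit message.
    inner = shlex.join(["abra", *args, "-n"])
    # script -q (quiet) -e (return child's exit code) -c <command>, with the
    # typescript discarded to /dev/null.
    remote = shlex.join(["script", "-qec", inner, "/dev/null"])
    # ssh hands the remote string to the remote shell, so it stays one argument.
    return ["ssh", host, remote]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Building the remote command with `shlex.join` keeps abra arguments that contain spaces or quotes intact through both shells.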
cdbc5bb42f journal: mirror+regression phases DONE (build sequence complete); overnight /upgrade-all running
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 03:43:46 +00:00
04fdefcd39 plan: overnight run — after assistant, run /upgrade-all + morning report
Bash runner (cheap polling, no claude budget) that gates on the assistant's
PR-consolidation done-marker, waits past the usage-limit reset (~03:30 UTC)
and for the loops to idle, runs the weekly /upgrade-all (DEFAULT, never
merges), then writes overnight-report-<date>.md and pings the orchestrator
to notify. One-off; the Sunday 02:00 timer is unchanged.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 02:10:13 +00:00
7789e44252 task: assistant — consolidate open recipe PRs to one per recipe
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 02:02:00 +00:00
ad7ba8375a fix(recipe-upgrade): extend open upgrade PRs by commit-on-top, no force-push
Instead of force-pushing HEAD onto the existing PR branch (history rewrite),
add a commit ON TOP of the branch tip (fast-forward) when it already exists,
so the PR's history is preserved and it re-tests. Fresh branches still push
normally. The only remaining force-push is the mirror-main->upstream sync
(intentional mirroring), never a PR branch.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 01:58:10 +00:00
5f814307ad fix(recipe-upgrade): default to extending an existing open upgrade PR, not a parallel one
When an open upgrade PR already exists for a recipe (branch upgrade-*), push
the new work onto ITS branch and update+re-test that PR — one evolving
upgrade PR per recipe instead of spawning a second parallel PR. Only open a
fresh upgrade-<version> PR when none exists. Unrelated open PRs (e.g. backup
fixes) are still never touched; merged-upstream PRs still close.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 01:54:58 +00:00
35f83a4b74 docs: orchestration.md as the root agent map; wake prompt + AGENTS.md point to it
One root doc maps every agent (Builder, Adversary, Orchestrator, Assistant,
Upgrader) -> its prompt + plan, with the watchdog and git coordination
protocol as the subtlety beneath. Fold the orchestrator supervision routine
into it (remove orchestrator-supervision.md). The hourly wake prompt and
AGENTS.md now just point at orchestration.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 01:42:49 +00:00
37a422bc31 refactor(wake): thin wake prompt -> points at orchestrator-supervision.md
The hourly wake prompt was hardcoding phase 5 / STATUS-5.md and going stale
as the build advanced. Make it a one-line pointer to a maintained doc
(orchestrator-supervision.md) that looks the CURRENT phase up live via
launch.py status — so the wake prompt never needs editing as phases change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 01:37:32 +00:00
7bdeb74449 plan(regression): add per-tier RED canaries (install/upgrade/backup/restore)
One deliberately-broken custom-html-tiny fixture per lifecycle tier so the
suite proves the server reports RED at EVERY tier (not just one) — each
asserts RED at the intended tier with prior tiers PASS, so it's 'catches a
failure at this tier', not 'fails somewhere'. Fast (simplest recipe); the
fast subset of the suite vs the slow good canaries.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 01:28:23 +00:00
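The per-tier assertion in 7bdeb74449 ("RED at the intended tier with prior tiers PASS") might look something like the check below. The result shape (a dict from tier name to status string) and the function name are assumptions for illustration; the plan itself does not specify them.

```python
# Lifecycle tiers in order, as listed in the commit subject.
TIERS = ["install", "upgrade", "backup", "restore"]


def red_at_intended_tier(results, tier):
    """True only if every tier before `tier` is PASS and `tier` itself is RED.

    `results` is a hypothetical mapping such as {"install": "PASS", "upgrade": "RED"}.
    """
    index = TIERS.index(tier)
    prior_pass = all(results.get(name) == "PASS" for name in TIERS[:index])
    return prior_pass and results.get(tier) == "RED"
```

Requiring the earlier tiers to pass is what separates "catches a failure at this tier" from "fails somewhere": a canary intended for the backup tier that already goes RED at install would not count.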
2f9d7df78f ideas: package cc-ci itself as a Co-op Cloud recipe (parked, not implementing)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 00:43:44 +00:00
ad2ade842c plan(mirror): remove the operator deploy gate — loops deploy+verify autonomously
The gate existed because a wrong-target nixos-rebuild #cc-ci once dropped
the cc-ci server into emergency mode. That footgun is fixed (be4f451 maps
#cc-ci -> the Hetzner host config), and deploying cc-ci is the loops'
normal operation, so Phase 4 now runs autonomously with verify + rollback
as the safety net.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 00:38:59 +00:00
fd86baea2a plan: regression canaries are milestone-cadence (polish/review/release), not per-commit
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 00:30:09 +00:00
947e7f55b9 plan: server regression canaries (codified E2E good+bad self-tests)
E2E pytest canaries proving the server confirms a healthy app healthy
(semantic per-tier assertions, not just exit codes) AND catches a broken
one (false-green guard). Good canaries: custom-html-tiny + lasuite-docs;
known-bad fixture must report RED. Queued as the loops' next phase after
mirror-enroll.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 00:29:01 +00:00
2b617ba19f feat(launch): persist PHASES_SPEC to .phases-spec (status/watchdog/reboot agree)
Mirror the .loop-backend pattern: env wins, else the persisted file, else
the default build sequence. Without this, a custom single-phase run was
invisible to bare 'launch.py status' and would NOT survive a reboot (the
service has no PHASES_SPEC env). Now the current phase set is durable.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 00:17:34 +00:00
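The precedence in 2b617ba19f (env wins, else the persisted file, else the default) can be sketched as below. The function names and the placeholder default value are assumptions; launch.py's real default build sequence is not reproduced in this log.

```python
import os
from pathlib import Path

# Placeholder only: the actual default build sequence is not shown in the log.
DEFAULT_PHASES_SPEC = "default"


def resolve_phases_spec(env=None, spec_file=".phases-spec", default=DEFAULT_PHASES_SPEC):
    """Resolve the phase spec: environment first, then persisted file, then default."""
    env = os.environ if env is None else env
    value = env.get("PHASES_SPEC", "").strip()
    if value:
        return value
    try:
        persisted = Path(spec_file).read_text().strip()
    except FileNotFoundError:
        persisted = ""
    return persisted or default


def persist_phases_spec(value, spec_file=".phases-spec"):
    """Write the spec so a later status call or a reboot sees the same phases."""
    Path(spec_file).write_text(value + "\n")
```

With this shape, a service started without `PHASES_SPEC` in its environment still picks up a custom single-phase run from the file, which is the durability the commit is after.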
d349656c3b feat(launch): forward PHASES_SPEC/backend to watchdog; mark plan Phase 4 as operator gate
The watchdog is spawned into the existing tmux server and didn't reliably
inherit a custom PHASES_SPEC — it would fall back to the default 11-phase
spec and mis-detect completion. Forward PHASES_SPEC/PHASE_IDX_FILE/
LOOP_BACKEND/LOOP_MODEL explicitly in the watchdog command so custom
single-phase runs (like the mirror-enroll plan) work end-to-end. Also make
the mirror-enroll plan's live-host-deploy step an explicit claim-and-wait
operator gate for the loops.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 00:15:42 +00:00
8007053d94 plan: mirror + enroll ALL recipes before resuming per-recipe debugging
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 00:13:00 +00:00
e2551f3d79 chore(nix): infra polish — bake cc-ci IP, mark stale Incus config, park nginx vhost
- SSH config: replace REPLACE_WITH_CC_CI_HETZNER_TAILNET_IP placeholder with
  the real tailnet IP 100.95.31.88 (so a fresh re-provision is correct).
- nix/configuration.nix + nix/README.md: mark HISTORICAL/dead (old Incus VM,
  superseded by the Hetzner host) to prevent a wrong-host deploy.
- nginx oc.commoninternet.net vhost: note it's PARKED alongside opencode-web
  (kept for one-step re-enable, not deleted).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 00:07:05 +00:00
19fda8d2b8 fix(recipe-upgrade): stop auto-closing superseded/unrelated open PRs
Per operator: opening a new upgrade PR should stack ON TOP of any other
still-open PRs, not close them. Only PRs already merged into upstream
main are closed (merging them is a no-op). This prevents the phase-7
incident where an unrelated open ghost PR was auto-closed as 'superseded'.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 00:07:05 +00:00
2304628375 chore(nix): park opencode-web (wantedBy=[]) — loops are on claude now
Keep the unit definition in the flake for easy re-enable; just stop it
auto-starting. Restore wantedBy = [ "multi-user.target" ] to bring it back.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-01 23:32:41 +00:00
d219b0972c journal: BUILD COMPLETE + weekly-upgrade cron cutover to NixOS timer (Sun 02:00 UTC)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-01 23:26:59 +00:00
ee58027c3e feat(nix): weekly /upgrade-all as a reboot-safe systemd timer (Sun 02:00 UTC)
Replace the boot-fragile busybox-crond-in-tmux (phase 5 §4) with a
systemd service+timer. Service is timer-triggered only (not wantedBy
multi-user.target) so it never runs on boot/activation; mirrors the
cc-ci-loops env fix (CLAUDE_BIN + /home/loops/.local/bin on PATH).
Timer fires Sundays 02:00 UTC, Persistent=true so a missed run (box
down) fires once on next boot. Runs launch-upgrader.py start ->
cc-ci-upgrader agent -> /upgrade-all DEFAULT (opens recipe PRs, never
merges). Activate via nixos-rebuild + retire the old Monday crond after
the phase-5 T0-fire verification completes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-01 22:54:52 +00:00
d8f558e987 journal: backend reverted to claude, waker folded into watchdog, boot service fixed
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-01 21:48:09 +00:00
2235110e29 journal: phase-5 progress-monitor events (19:04, 19:08)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-01 21:46:29 +00:00
1f96eba577 fix(ci-test-review): resolve PR ref to commit sha in verify-pr.sh
Resolve the recipe branch/ref to its head commit sha via the Gitea API
before invoking the cold full-suite run, so the upgrade tier deploys the
exact PR head. From the phase-5 upgrade-flow verification.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-01 21:46:29 +00:00
ed849096a6 fix(nix): put claude on the cc-ci-loops service PATH so loops start on boot
The service path lacked /home/loops/.local/bin, so launch.py preflight's
which(claude) failed on every boot and the loops never auto-started
(they were restarted by hand). Set CLAUDE_BIN to the standalone CLI's
absolute path and prepend the dir to PATH so the tmux server every agent
session inherits resolves bare claude.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-01 21:46:29 +00:00
ca6e68c08d feat(orchestrator): fold hourly supervision wake into the watchdog
The standalone ai-progress-monitor.sh waker pinged a hardcoded
orchestrator session every 15m. Move that into the watchdog loop:
ORCH_WAKE_INTERVAL (default 3600s) types the supervision prompt into
the live orchestrator session, retrying each tick until it lands so a
busy or briefly-absent orchestrator is never interrupted and no hour is
skipped. Delete the now-redundant waker script; the prompt file is now
driven by the watchdog. Reboot-safe by inheritance (the watchdog is
started by cc-ci-loops.service).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-01 21:46:20 +00:00
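The "retry each tick until it lands" behaviour in ca6e68c08d can be illustrated with a small scheduler. This is a sketch under assumptions, not the watchdog's code: the class name, the `try_deliver` callback (standing in for typing the prompt into the orchestrator's tmux session), and the choice to schedule the next wake from the delivery time are all hypothetical.

```python
class WakeScheduler:
    """Track when the next supervision wake is due and retry until delivered."""

    def __init__(self, interval=3600.0, start=0.0):
        self.interval = interval  # corresponds to ORCH_WAKE_INTERVAL (seconds)
        self.next_due = start + interval

    def tick(self, now, try_deliver):
        """Call once per watchdog tick; returns True when a wake was delivered."""
        if now < self.next_due:
            return False
        # try_deliver() should return False when the orchestrator is busy or
        # absent. The wake then stays due and is retried on the next tick
        # instead of being dropped.
        if not try_deliver():
            return False
        self.next_due = now + self.interval
        return True
```

A failed delivery leaves `next_due` unchanged, so the prompt is attempted again on every subsequent tick and lands as soon as the session is idle.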
8f7265e948 feat(orchestrator): wake the live monitor session
2026-06-01 18:51:05 +00:00
9fe9d49cac journal: record Hetzner rescue recovery for cc-ci
2026-06-01 13:55:15 +00:00
9574972f1d feat(skill): add Hetzner server recovery playbook
2026-06-01 13:48:23 +00:00
8093a95184 journal: session 2026-06-01 03:34 UTC handoff (opencode gpt-5.4 visible)
2026-06-01 13:03:51 +00:00
837fed17d2 fix(orchestrator): attach opencode session from orchestrator repo
2026-06-01 13:03:51 +00:00
a896ee9476 fix(testme-on-pr): wait for a fresh cc-ci status update
2026-06-01 13:03:41 +00:00
2486b7c368 fix(ci-test-review): resolve remote cc-ci worktree
2026-06-01 13:03:41 +00:00