Commit Graph

183 Commits

Author SHA1 Message Date
330378d30d ideas: fail-fast on crash-looping deploy + don't let one wedged run starve the CI queue
After a live incident: plausible build 220 (ClickHouse exit-1 crash-loop) held the
single serial runner for its full 1200s DEPLOY_TIMEOUT, starving immich PR-2's
queued builds for ~12min until manually torn down. Logs the two fixes (fail-fast
on crash-loop; head-of-line blocking on the serial runner) + the interim
mitigations (step-2b dev loop for debugging; SIGINT to free a wedged run).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 16:29:30 +00:00
a2c1cb550a upstream(immich): release-notes sources + DB-pin & VectorChord backup/restore notes 2026-06-09 15:49:20 +00:00
c60fc6d056 change(cleanup): reap dev deploys at start+end of /upgrade-all instead of a timer
Per operator: drop the hourly cc-ci-reap-dev-deploys systemd timer; instead run the
dev-* reaper at the START (Step 0, alongside the orphan sweep) and END (new step 4b)
of each /upgrade-all run, with THRESHOLD=0 (the run is quiescent then, so clear all
dev-* unconditionally). The reaper keeps its safe default (4h) for ad-hoc use.
Step-2b mandatory teardown is unchanged (primary mechanism); this is the backstop.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 15:47:16 +00:00
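
The age-gated selection described in the two cleanup commits here (only idle dev-* stacks, a 4h default, THRESHOLD=0 during a quiescent /upgrade-all run) can be sketched roughly as follows. This is an illustration only, not the contents of reap-dev-deploys.sh: the function name, the `last_active` mapping, and the way idleness is measured are assumptions.

```python
# Hypothetical sketch of an age-gated dev-* reaper selection.
# Not the real reap-dev-deploys.sh; names and data shapes are assumed.

DEFAULT_THRESHOLD_S = 4 * 3600  # the "safe default (4h)" from the commit message


def stacks_to_reap(last_active, now, threshold_s=DEFAULT_THRESHOLD_S):
    """Return dev-* stack names that have been idle for at least threshold_s.

    last_active maps stack name -> last-activity time in epoch seconds
    (an assumed shape). Anything not prefixed dev-* (CI per-run stacks,
    warm-*, infra) is never selected.
    """
    return sorted(
        name
        for name, seen in last_active.items()
        if name.startswith("dev-") and now - seen >= threshold_s
    )


now = 100_000
stacks = {
    "dev-immich": now - 5 * 3600,     # idle for 5h
    "dev-ghost": now - 60,            # active dev loop
    "warm-keycloak": now - 9 * 3600,  # warm canonical, always spared
    "traefik": 0,                     # infra, always spared
}
print(stacks_to_reap(stacks, now))                 # ['dev-immich']
print(stacks_to_reap(stacks, now, threshold_s=0))  # ['dev-ghost', 'dev-immich']
```

With the default threshold an active dev loop survives; with a threshold of 0 every dev-* stack is selected, which matches the start/end-of-run behaviour the commit describes.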
23bba98be4 feat(cleanup): guarantee step-2b dev deploys get reaped
- /recipe-upgrade step 2b: teardown is now MANDATORY on every exit path (finally),
  with a verify-no-leak check; tear down even on failure before reporting.
- reap-dev-deploys.sh: safe, age-gated backstop that removes only idle dev-* stacks
  (never CI per-run stacks, warm-*, infra; an active dev loop stays fresh).
- orchestrator: hourly cc-ci-reap-dev-deploys systemd timer runs it against cc-ci,
  bounding any leaked dev deploy from a crashed/abandoned loop.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 15:42:23 +00:00
77ba7ee075 guardrail: upgrader never modifies cc-ci tests/harness unless --with-tests
Absolute, mode-gated rule reinforced in /recipe-upgrade (Guardrails + the new
step-2b direct-deploy loop where the upgrader has cc-ci host access) and noted as
the interim safeguard in IDEAS.md until the deploy loop moves to isolated infra.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 15:32:50 +00:00
98276124e5 ideas: isolate the upgrader's direct deploy onto separate infra (can't tamper with tests)
The step-2b direct deploy-and-inspect runs on the cc-ci server's own swarm today, so
the upgrader holds write access to the host that owns the tests + CI verdict — a
trust hole (could hack the tests). Parked idea: a dedicated throwaway test server
with scoped creds, so the upgrader can deploy+inspect but not modify the gate.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 15:31:20 +00:00
de0baa00b1 feat(upgrade): add direct deploy-and-inspect dev loop (recipe-maintainer style) before CI
The upgrader now deploys the WIP recipe directly on cc-ci (abra app deploy --chaos
under a dev-<recipe> domain on the local swarm) and inspects live logs
(docker service logs) to SEE what the upgrade does, before/alongside the !testme
CI gate. ADDITIONAL to — not a replacement for — the 3-attempt !testme verification;
it front-loads diagnosis so fewer CI attempts are spent on basics. Always torn down
(orphan-sweep is the backstop). /upgrade-all dispatch references the new step 2b.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 15:28:06 +00:00
f1c63f1ca0 feat(survey): don't skip recipes on abra tag+digest FATA — check upstream directly
abra hard-FATAs on image refs with both a tag and a digest (immich:
postgres:14-vectorchord...@sha256:..., valkey:9@sha256:...), aborting the whole
recipe survey so immich was silently dropped. Per operator: don't normalize the
recipe; catch the failure and check the upstream registry directly.

- /upgrade-all box item 4: a tag+digest parse FATA does NOT mean not-fetchable. Use abra
  for the images it parses; for the rest, list upstream tags (Docker Hub / ghcr /
  buildx imagetools) and judge availability (match the variant the app supports,
  not blindly the max). Upgradeable if abra OR the direct check finds a newer tag.
- /recipe-upgrade implement: hand-bump tag+digest pins (abra can't), and re-resolve
  + re-append the digest for the new tag so the pin is preserved (never drop it).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 15:21:36 +00:00
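
As a rough illustration of the tag+digest handling in the commit above, the sketch below splits an image reference into name, tag, and digest, and rebuilds a pin for a new tag with its digest re-appended. The helper names are hypothetical, the digest strings are placeholders, and resolving the new digest from the registry is left out; this is not how abra or the skill actually parses references.

```python
# Hypothetical helpers for image refs that carry both a tag and a digest.
# Digest values below are placeholders, not real digests.


def split_image_ref(ref):
    """Split 'name[:tag][@digest]' into (name, tag, digest); missing parts are None."""
    name_tag, _, digest = ref.partition("@")
    last_slash = name_tag.rfind("/")
    colon = name_tag.rfind(":")
    # A colon before the last slash belongs to a registry port, not a tag.
    if colon > last_slash:
        name, tag = name_tag[:colon], name_tag[colon + 1:]
    else:
        name, tag = name_tag, None
    return name, tag, (digest or None)


def has_tag_and_digest(ref):
    """True for refs of the kind the commit says abra cannot parse."""
    _, tag, digest = split_image_ref(ref)
    return tag is not None and digest is not None


def bump_pin(ref, new_tag, new_digest):
    """Rebuild the ref with a new tag and keep it pinned by re-appending a digest."""
    name, _, _ = split_image_ref(ref)
    return f"{name}:{new_tag}@{new_digest}"


print(split_image_ref("valkey:9@sha256:aaa"))
# ('valkey', '9', 'sha256:aaa')
print(bump_pin("valkey:9@sha256:aaa", "10", "sha256:bbb"))
# valkey:10@sha256:bbb
```

The point of `bump_pin` is that the pin is preserved: the new reference always carries a digest, so a hand-bump never silently drops it.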
f687174b53 feat(recipe-report): TESTS rename + live binary STATUS column
Rename the table's Status column -> TESTS (the CI/test verdict, unchanged
content). Add a new STATUS column showing the PR's LIVE state, fetched
client-side: 'open' vs a ✓ for any not-open state (merged or closed). The cell
is a JS hook (data-repo/data-pr) derived from existing recipe+pr fields; an
inline, dependency-free, CSP-safe script GETs the same-origin /pr/<recipe>/<n>
proxy (cc-ci nix/modules/reports.nix) on load and every 30s, and degrades to a
muted '?' if the proxy/repo is unreachable. Blank cell when a row has no PR.
Doc + SKILL updated.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 13:15:02 +00:00
eb1439324e plan: finalize report PR-STATUS column (binary open/✓; proxy in reports.nix; decisions locked)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 12:59:41 +00:00
a76aca80e2 plan: Recipe Report TESTS rename + live PR-STATUS column (public mirrors + same-origin proxy)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 12:52:59 +00:00
1f52795534 skill(ci-dev-workflow): capture the cc-ci feature-dev flow + adversary plan template
Documents the end-to-end workflow used to land the intentional-skips/4-rung-ladder
feature: explore harness → branch a local cc-ci clone → implement + unit-verify
cold on cc-ci → live full-stage check → open PR (never push main) → independent
adversary verdict → squash-merge on PASS → deploy via /root/builder-clone rebuild.
Includes the adversary-verify-pr6.md plan as a reusable template.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 03:16:47 +00:00
dbafcddb62 feat(upgrade-all): sweep orphans from previous runs at the start of each weekly run
Adds sweep-orphans.sh (safe-by-allowlist: removes orphan test stacks, standalone
debug containers >30m old, leaked dangling volumes, and reparented docker-run
wrappers; spares infra + warm-* canonicals and their retained volumes) and wires
it as Step 0 of /upgrade-all so a prior run's leaked stack/container/process can't
contend for the shared Swarm or skew the survey. Idempotent; no-op when clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 02:39:43 +00:00
d31378b180 feat(recipe-report): restructure page — priority-sorted wire table w/ CVE column, addendum, per-recipe changes
New page order: short lead -> the full wire table (sorted by priority-to-address,
CVE recipes first, new CVEs count column) -> Addendum (bullets of real special
issues, omitted if clean) -> Security Bulletin -> per-recipe "What changed".

- recipe-report.py: _table() gains a CVEs column + recipe-name linking; new
  _changes() helper; render() reordered; docstring SPEC SHAPE updated
  (cve/addendum/changes added, needs_attention/routine removed).
- recipe-report/SKILL.md + example-spec.json: new procedure, spec shape, and
  gold-standard template (2026-06-05, new format).
- launch-report.py: kickoff text reflects the new priority-ordered structure.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 17:06:43 +00:00
49491fcb90 fix(recipe-report): weekly trigger uses launch-report.py 'fresh'; start kills idle/leftover session
A stale cc-ci-report session (left idle from a prior week's run) made this week's
launch-report.py 'start' (use-or-create) leave it in place, so a fresh report never ran.
Fix: upgrade-all step 6 now calls 'fresh', and start only leaves a session that's
actively busy producing a report — an idle/leftover session is killed + restarted.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 12:06:46 +00:00
f397968f47 upstream(uptime-kuma): release-notes sources 2026-06-05 06:18:33 +00:00
a22dc4fc93 upstream(plausible): release-notes sources 2026-06-05 04:35:44 +00:00
4a2af99147 upstream(n8n): release-notes sources 2026-06-05 04:28:33 +00:00
509b36b242 upstream(mattermost-lts): release-notes sources
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-05 04:25:02 +00:00
538c41810b upstream(matrix-synapse): release-notes sources 2026-06-05 03:13:27 +00:00
9bd5a2baf0 upstream(mailu): release-notes sources 2026-06-05 03:11:09 +00:00
44e396c3fd upstream(lasuite-meet): release-notes sources 2026-06-05 02:59:46 +00:00
b63edbbd7f upstream(lasuite-drive): release-notes sources 2026-06-05 02:49:50 +00:00
a3740e1fdf upstream(lasuite-docs): release-notes sources 2026-06-05 02:43:03 +00:00
f5da8ac3ff upstream(keycloak): release-notes sources 2026-06-05 02:27:00 +00:00
287fb51d91 upstream(ghost): release-notes sources 2026-06-05 02:20:11 +00:00
d24feb0671 upstream(discourse): release-notes sources 2026-06-05 02:02:19 +00:00
85065880a5 upstream(custom-html-tiny): release-notes sources
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-05 01:58:29 +00:00
65a96453fc fix(recipe-upgrade): reconcile mirror from TRUE coopcloud upstream, not from the mirror itself
The reconcile that's supposed to make the mirror main == upstream main was fetching origin/main —
but origin is the cc-ci MIRROR, so it synced the mirror to itself (a no-op) and never pulled real
upstream. Fix: fetch coopcloud explicitly (git.coopcloud.tech/coop-cloud/<recipe>, default branch
main OR master) via an 'upstream' remote and force-sync the mirror main + tags from it. Every recipe
has a coopcloud correspondent; none are forked. Also reorder the skill so the reconcile runs BEFORE
the upgrade check, so the check sees the real current recipe. Verified by divergence test (diverged a
mirror, reconcile snapped it back to coopcloud HEAD).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 01:44:45 +00:00
167ce2d881 upstream(custom-html): release-notes sources 2026-06-05 01:32:13 +00:00
f0716764db feat(recipe-upgrade): upstream release-notes registry + recipe-README read (recipe-maintainer parity)
Close the two gaps vs recipe-maintainer's recipe-upgrade-plan:
- Per-recipe release-notes registry at cc-ci-plan/upstream/<recipe>.md (discover the source repo +
  releases/changelog URL for each image once, persist+commit, reuse) — fetch release notes FROM those
  URLs instead of rediscovering ad-hoc each run. Format doc + cryptpad seed included.
- Explicitly read the recipe's README for shipped upgrade/migration notes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 01:28:27 +00:00
f4b1befbdd chore(nix): weekly timer = Thu 22:00 America/New_York (Boston 10pm, DST-aware)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 01:21:41 +00:00
0338dc23fd chore(nix): move weekly upgrade timer to Thursdays 22:00 UTC (was Sun 02:00)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 01:18:20 +00:00
d8ad5a2805 feat(recipe-report): link recipe names in all story sections (security/needs/routine), not just the lead
_stories() now auto-links whole-word recipe mentions in story titles + bodies to their mirror
repos (same single-pass linkify as the lead); explicit PR/build links are untouched.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-04 02:21:31 +00:00
a6efcec720 feat(recipe-report): link recipe names in the lead to their mirror repos; 3-para concise lead
render() auto-links whole-word recipe mentions in the editorial lead to
git.autonomic.zone/recipe-maintainers/<recipe> (single regex pass, longest-name-first,
no href corruption). Skill: lead is ~3 short paragraphs (~150-180 words) incl. an
'anything strange worth looking into' paragraph. example-spec.json lead updated to the
concise target.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-04 02:17:19 +00:00
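
The linkify behaviour in the commit above (single regex pass, longest-name-first, whole-word matches, no href corruption) might look something like the sketch below. It is not the code in recipe-report.py: the function name, the https scheme on the mirror URL, and the word-boundary rule are assumptions, and it does not handle text that already contains explicit links.

```python
# Hypothetical single-pass recipe-name linkifier; not the real render() code.
import re

# Host/path from the commit message; the https scheme is an assumption.
MIRROR_BASE = "https://git.autonomic.zone/recipe-maintainers"


def linkify_recipes(text, recipes):
    """Wrap whole-word recipe mentions in links to their mirror repos."""
    if not recipes:
        return text
    # Longest names first so 'custom-html-tiny' wins over 'custom-html'.
    names = sorted(recipes, key=len, reverse=True)
    alternation = "|".join(re.escape(n) for n in names)
    # Treat '-' as part of a word so partial hyphenated names do not match.
    pattern = re.compile(r"(?<![\w-])(" + alternation + r")(?![\w-])")

    def to_link(match):
        name = match.group(1)
        return f'<a href="{MIRROR_BASE}/{name}">{name}</a>'

    # One pass: inserted hrefs are never rescanned, so they cannot be re-linked.
    return pattern.sub(to_link, text)


print(linkify_recipes("custom-html-tiny and ghost changed",
                      ["custom-html", "custom-html-tiny", "ghost"]))
```

Because `re.sub` scans the input once and never revisits its own replacements, a short name that also appears inside a generated URL is left alone.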
ea2d8c8210 feat(recipe-report): use approved 2026-06-02 report as the style template; tighter lead for future runs
Save the operator-approved 2026-06-02 spec as example-spec.json (gold standard
for voice/structure/specificity). Skill now tells the agent to match it, with
one deliberate change: keep the editorial lead TIGHT (~2 short paragraphs,
~120 words). The live 2026-06-02 page stays as the reference.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-04 02:06:45 +00:00
f20a066f5c journal: recipe-report v2 newspaper front page (CVE-led editorial) live
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 23:20:48 +00:00
6cf59130db feat(recipe-report): newspaper front-page layout — editorial lead + CVE security bulletin first
Masthead + opus 'lead' editorial (overall fleet state + what to focus on), a Security Bulletin of
critical-CVE upgrades up top (mined from per-recipe upgrade_notes_md), then needs-attention/routine,
and the comprehensive table as 'the full wire' at the end. survey now includes each recipe's
upgrade_notes_md (breaking-change/CVE analysis) so opus can lead with security.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 23:13:40 +00:00
856df8cb37 journal: /recipe-report + report.ci.commoninternet.net shipped; first opus report live
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 23:06:22 +00:00
c7301a9e39 feat(recipe-report): /recipe-report skill + helper + launcher (default opus); wire into upgrade-all
- recipe-report.py: survey (run + per-recipe PRs + CI verdicts) / render (spec->HTML) / publish
  (copy to cc-ci:/var/lib/cc-ci-reports + regen index).
- skill .claude/skills/recipe-report: review the weekly run, classify needs-attention vs routine,
  publish one public HTML page per week + index at report.ci.commoninternet.net. Read-only.
- launch-report.py: one-shot cc-ci-report agent, REPORT_MODEL default opus (separate from the
  sonnet upgrader), REPORT_BACKEND default claude.
- upgrade-all SKILL: closing step launches the report agent.
Serving (nix/modules/reports.nix) already deployed + live.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 23:02:22 +00:00
73aa20e8ab plan(recipe-report): separate configurable report model (default opus); link CI results, no embedded images
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 22:52:27 +00:00
81984c84da plan: /recipe-report skill + report.ci.commoninternet.net weekly report
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 22:37:56 +00:00
d38f80048a journal: bridge one-comment-per-!testme deployed; note cc-ci deploy-path gap
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 17:30:13 +00:00
bfe3a97301 journal: overnight /upgrade-all complete — 10 GREEN, 2 stale-test, 2 failed, 4 skipped
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 11:41:47 +00:00
9e88927e5b ideas: Co-op Cloud NixOS modules — mkCcApp factory + health-gated rollback
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-02 05:06:30 +00:00
5c691cdb66 fix(upgrade skills): real abra-auth fix — embed git.autonomic.zone creds in origin (go-git)
The actual 'abra auth error' that skipped 8 recipes was go-git failing to
fetch tags from the PRIVATE git.autonomic.zone mirror ('authentication
required: Unauthorized'), NOT the TTY issue. abra/go-git reads
remote.origin.url literally and IGNORES git url.insteadOf + credential
helpers (confirmed: insteadOf left immich Unauthorized; literal embedded URL
fixed it). Skill now bakes $GITEA_USERNAME:$GITEA_PASSWORD into origin for
git.autonomic.zone recipes before the version check, and stashes the
untracked cc-ci overlay so it isn't mis-counted as dirty-worktree.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 04:40:59 +00:00
c0852d2302 feat(logs): readable greppable per-agent transcript logs (agent-log.py)
The raw 'tmux pipe-pane' logs are TUI-escape soup (the 191MB builder log).
agent-log.py renders Claude's own JSONL transcript into a clean one-event-
per-line <agent>.clean.log — read-only on a file the agent writes anyway, so
zero agent slowdown and zero extra tokens. Resolves each agent's transcript
(disambiguating the shared project dir by kickoff signature; tracks restarts).
'follow-all' runs as the cc-ci-cleanlogs session, wired into launch.py start
so it comes up with the loops. render/tail subcommands for ad-hoc use.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 04:35:17 +00:00
027fdbd161 fix(upgrade skills): run abra over a pseudo-TTY (fixes FATA inappropriate ioctl)
abra over plain 'ssh cc-ci abra ...' has no TTY -> FATA 'inappropriate ioctl
for device' (the abra error). The working harness (runner/harness/abra.py)
wraps abra in util-linux 'script' for a pseudo-TTY + passes -n. Apply the
same in the recipe-upgrade and upgrade-all skills: every abra call becomes
ssh cc-ci 'script -qec "abra <args> -n" /dev/null'. Confirmed: 'abra server
ls' FATAs over plain ssh and works when pty-wrapped.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 04:06:38 +00:00
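
For illustration, the pseudo-TTY wrapping from the commit above can be built programmatically along these lines. The function name is hypothetical and this is not the code in runner/harness/abra.py; it only constructs the argv for the `ssh cc-ci 'script -qec "abra <args> -n" /dev/null'` form quoted in the message and does not run anything.

```python
# Hypothetical builder for the pty-wrapped abra invocation; it only
# constructs the command line and does not execute it.
import shlex


def pty_wrapped_abra(args, host="cc-ci"):
    """Return an ssh argv that runs 'abra <args> -n' under util-linux 'script'."""
    # -n is the flag the commit says is passed alongside the pty wrapper.
    abra_cmd = shlex.join(["abra", *args, "-n"])
    # 'script -qec CMD /dev/null' runs CMD on a pseudo-TTY, discards the typescript.
    remote = shlex.join(["script", "-qec", abra_cmd, "/dev/null"])
    return ["ssh", host, remote]


print(pty_wrapped_abra(["server", "ls"]))
# ['ssh', 'cc-ci', "script -qec 'abra server ls -n' /dev/null"]
```

The resulting list can be handed to `subprocess.run` as-is; quoting via `shlex.join` keeps arguments containing spaces intact through both the local and the remote shell layer.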
cdbc5bb42f journal: mirror+regression phases DONE (build sequence complete); overnight /upgrade-all running
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 03:43:46 +00:00
04fdefcd39 plan: overnight run — after assistant, run /upgrade-all + morning report
Bash runner (cheap polling, no claude budget) that gates on the assistant's
PR-consolidation done-marker, waits past the usage-limit reset (~03:30 UTC)
and for the loops to idle, runs the weekly /upgrade-all (DEFAULT, never
merges), then writes overnight-report-<date>.md and pings the orchestrator
to notify. One-off; the Sunday 02:00 timer is unchanged.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 02:10:13 +00:00