cc-ci/machine-docs/JOURNAL-5.md
autonomic-bot 5972ee1033
claim(5): A5-7 fix — CronCreate mechanism verified (T0-refire 23:18Z, upgrader-cron.log created)
2026-06-01 23:19:32 +00:00

JOURNAL — cc-ci Phase 5

2026-05-31 — Phase 5 boot

Phase 5 starting. System state verified:

  • cc-ci: systemctl is-system-running → running; 0 failed units
  • Docker services: ccci-bridge 1/1, ccci-dashboard 1/1, drone 1/1, traefik 1/1
  • Bridge: 1/1 (container-based, logs via docker service logs ccci-bridge_app)

Sandbox recipe chosen: custom-html-tiny (simple static-web-server; short timeouts; existing install_steps.sh hook; generic harness; ideal for upgrade-flow testing with minimal CI runtime).

Existing open PRs on custom-html-tiny mirror:

  • #1 serve-hidden-files branch — "chore: publish 1.0.2+2.38.0 release" (feature + version bump, NOT from upstream main, NOT merged upstream, from 2026-05-25). Will be closed as superseded when we open the upgrade PR (expected V7 behavior).

Available upgrades for custom-html-tiny:

  • app service (joseluisq/static-web-server): 2.38.0 → 2.42.0
  • git service (alpine/git, compose.git-pull.yml): v2.36.3 → v2.52.0
  • New version label: 1.1.0+2.42.0
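
The version label above pairs the recipe's own version with the upstream app version, joined by "+", and the upgrade branch used later in this journal is "upgrade-" plus that label. A minimal sketch of that convention follows; the function names are hypothetical and are not taken from the skill scripts.

```python
def split_version_label(label: str) -> tuple[str, str]:
    """Split a 'RECIPE+APP' label into (recipe_version, app_version)."""
    recipe_version, sep, app_version = label.partition("+")
    if not sep or not recipe_version or not app_version:
        raise ValueError(f"not a RECIPE+APP version label: {label!r}")
    return recipe_version, app_version


def upgrade_branch_name(label: str) -> str:
    """Branch naming as observed in this journal: 'upgrade-' + label."""
    split_version_label(label)  # reject malformed labels before naming a branch
    return f"upgrade-{label}"


if __name__ == "__main__":
    print(split_version_label("1.1.0+2.42.0"))
    print(upgrade_branch_name("1.1.0+2.42.0"))
```

Validating the label before building the branch name keeps a typo in the label from turning into a pushed branch.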

2026-05-31 — V3: recipe-upgrade flow starting

Following SKILL.md procedure for /recipe-upgrade custom-html-tiny:

Step 1 (Plan): fetched recipe, found upgrades available (see above).

Step 2 (Implement): upgrading image tags on cc-ci; bumping version label; committing.

Step 3: open-recipe-pr.sh:

  • First attempt: FAILED. The script uses python3, which is not installed on cc-ci. Fixed by rewriting it to use jq (available on cc-ci) in commit 0df57c6 to the cc-ci-orchestrator repo.
  • Second attempt: SUCCESS. Closed PR #1 (serve-hidden-files) as superseded, pushed branch upgrade-1.1.0+2.42.0, opened PR #2 at recipe-maintainers/custom-html-tiny#2.

Step 4: testme-on-pr.sh:

  • Initial post: posted !testme, but VERDICT=PENDING (the bridge didn't see it because custom-html-tiny is not in the poll list).
  • Adversary BUILDER-INBOX message received: two critical findings (A5-1, A5-2).

2026-05-31 — Adversary findings A5-1, A5-2 — both FIXED

A5-2 (CRITICAL): testme-on-pr.sh cannot read verdicts — bridge never posts commit statuses.

  • Root cause: bridge only posts PR comments; testme-on-pr.sh reads Gitea commit statuses.
  • Fix: Added post_commit_status() to bridge.py. Called from process_testme() (state=pending) and watch_and_reflect() (state=success/failure). Commit 5d48436.
  • Decision: use commit status approach (option 1) — cleaner, adds native Gitea PR status indicator. Recorded in DECISIONS.md.
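
The journal does not show the body of post_commit_status(), so the following is only a sketch of what such a call could look like, assuming Gitea's documented "create commit status" endpoint (POST /api/v1/repos/{owner}/{repo}/statuses/{sha}) and token authentication. The function name, base URL, and token below are placeholders. It builds the request without sending it.

```python
import json
import urllib.request

# States this sketch accepts; the journal only shows pending/success/failure in use.
ALLOWED_STATES = {"pending", "success", "failure", "error"}


def build_commit_status_request(base_url, repo, sha, state, target_url, token,
                                context="cc-ci/testme"):
    """Build (but do not send) a Gitea commit-status POST request."""
    if state not in ALLOWED_STATES:
        raise ValueError(f"unsupported state: {state!r}")
    url = f"{base_url.rstrip('/')}/api/v1/repos/{repo}/statuses/{sha}"
    payload = {"context": context, "state": state, "target_url": target_url}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_commit_status_request(
        "https://gitea.example.invalid",      # placeholder host
        "recipe-maintainers/custom-html-tiny",
        "0" * 40,                             # placeholder commit SHA
        "pending",
        "https://drone.example.invalid/1",    # placeholder build URL
        "PLACEHOLDER_TOKEN",
    )
    print(req.get_method(), req.full_url)
```

Sending would be a matter of passing the request to urllib.request.urlopen; keeping construction separate makes the payload easy to check without touching the live Gitea.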

A5-1: custom-html-tiny not in bridge poll list.

  • Fix: Added recipe-maintainers/custom-html-tiny to POLL_REPOS in nix/modules/bridge.nix. Commit 5d48436.
  • Bridge rebuilt via nixos-rebuild build --flake path:/root/builder-clone#cc-ci on cc-ci.
  • Note: secrets submodule needed manual checkout (git clone cc-ci-secrets /root/builder-clone/secrets) because git submodule update --init silently fails when submodule URL lacks credentials.
  • Bridge redeployed via /nix/store/asn4.../cc-ci-reconcile-bridge, new image cc-ci-bridge:3761c4221042.
  • Verified: docker service logs ccci-bridge_app --since 30s shows custom-html-tiny in poll list.

Next: re-post !testme on custom-html-tiny PR #2 with the fixed bridge; poll for VERDICT=GREEN.

2026-05-31 — V3 COMPLETE; V1/V2 partial; testme-on-pr.sh fix

testme-on-pr.sh fix committed (orchestrator repo 6910b19): now reads cc-ci/testme context URL.

Build #29 evidence:

  • Params: RECIPE=custom-html-tiny REF=156a49acc... PR=2 stages=install,upgrade,backup,restore,custom
  • Results: install PASS, upgrade PASS (1.0.0+2.38.0→1.1.0+2.42.0), backup/restore/custom N/A
  • Bridge commit status posted: cc-ci/testme state=success url=.../cc-ci/29 @2026-05-31T13:56:19
  • PR comment updated with 🌻 success banner

V2 GREEN verified: POST=0 → VERDICT=GREEN BUILD=https://drone.ci.commoninternet.net/recipe-maintainers/cc-ci/29
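
For reference, the verdict lines in this journal follow from the cc-ci/testme commit status: success reads as GREEN, failure as RED, and anything not yet terminal as PENDING, with BUILD=? when no URL is known. A rough sketch of that mapping is below. Treating "error" as RED and a missing status as PENDING are assumptions, and the function name is hypothetical.

```python
# Mapping from Gitea commit-status state to the helper's verdict word.
# "error" -> RED is an assumption; the journal only shows success/failure/pending.
_VERDICTS = {
    "success": "GREEN",
    "failure": "RED",
    "error": "RED",
    "pending": "PENDING",
}


def verdict_line(state, target_url):
    """Render a 'VERDICT=... BUILD=...' line from a status state and URL."""
    verdict = _VERDICTS.get(state or "", "PENDING")
    build = target_url or "?"
    return f"VERDICT={verdict} BUILD={build}"


if __name__ == "__main__":
    print(verdict_line("success", "https://drone.example.invalid/29"))
    print(verdict_line(None, None))
```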

V7 verified: mirror main = upstream main (435df8fc); PR#1 (serve-hidden-files) closed as superseded.

Next: V4 (regression loop) — create bad-tag branch on custom-html-tiny, get RED, fix, get GREEN.

2026-05-31 — Bootstrap/access checks + V4 regression loop complete

Bootstrap probes from the builder clone:

  • ssh cc-ci "hostname && whoami && nixos-version" → cc-ci / root / 24.11.20250630.50ab793 (Vicuna)
  • set -a; . /srv/cc-ci/.testenv; set +a; curl -s https://$GITEA_URL/api/v1/version → {"version":"1.24.2"}
  • getent ahostsv4 probe-12345.ci.commoninternet.net → 91.98.47.73 (STREAM/DGRAM/RAW)

V4 red side:

  • POST=0 MAX_WAIT=15 INTERVAL=5 /srv/cc-ci/.claude/skills/recipe-upgrade/testme-on-pr.sh custom-html-tiny 5 → VERDICT=RED BUILD=https://drone.ci.commoninternet.net/recipe-maintainers/cc-ci/34
  • curl -fsSL https://ci.commoninternet.net/runs/34/results.json → install=pass, upgrade=fail, clean_teardown=true, no_secret_leak=true

V4 fix on cc-ci host (same recipe PR branch):

  • git -C /root/.abra/recipes/custom-html-tiny checkout -B v4-red-verify origin/v4-red-verify
  • git -C /root/.abra/recipes/custom-html-tiny checkout origin/upgrade-1.1.0+2.42.0 -- compose.yml compose.git-pull.yml
  • git -C /root/.abra/recipes/custom-html-tiny -c user.name='autonomic-bot' -c user.email='autonomic-bot@git.autonomic.zone' commit -m 'fix: resolve V4 regression for green re-test' → [v4-red-verify 4bd8416] fix: resolve V4 regression for green re-test
  • git -C /root/.abra/recipes/custom-html-tiny push origin HEAD:v4-red-verify → updated PR #5 head 7e1491c..4bd8416

V4 green side:

  • MAX_WAIT=300 INTERVAL=10 /srv/cc-ci/.claude/skills/recipe-upgrade/testme-on-pr.sh custom-html-tiny 5 → VERDICT=GREEN BUILD=https://drone.ci.commoninternet.net/recipe-maintainers/cc-ci/37

Adversary follow-up:

  • REVIEW-5.md follow-up (review(5) commit e87782a) closed A5-1 and A5-2 after a fresh cold re-test.
  • BUILDER-INBOX.md noted that POST=0 must be env-prefixed in STATUS-5.md; corrected here and the inbox is being consumed now.

Next: V5 default stale-test case, then V6 --with-tests.

2026-06-01 — Adversary finding A5-3 fixed; helper paths corrected

Adversary review+inbox reported a real V2 rerun bug: on a re-!testme against the same PR head, POST=1 testme-on-pr.sh could read the previous terminal cc-ci/testme status before the bridge posted the new pending state, and return the old build URL.

Fix authored in the orchestration repo helper:

  • testme-on-pr.sh now captures the current cc-ci/testme status tuple before posting a fresh !testme, then ignores that unchanged tuple while polling. It returns only once the status changes to the new run's state/URL.
  • ci-test-review/{verify-pr.sh,run-all-recipes.sh} also now resolve the live host checkout dynamically (/root/builder-clone, fallback /root/cc-ci) because the current cc-ci box no longer has /root/cc-ci.
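
The A5-3 fix (capture the status tuple first, then ignore it while polling) can be sketched as below. This is an illustration under stated assumptions, not the helper's actual code: the real fix lives in a shell script, the function and parameter names are hypothetical, and the status reader and sleep function are injected so the logic can be exercised without Gitea.

```python
import time

TERMINAL_STATES = ("success", "failure", "error")


def wait_for_fresh_status(read_status, baseline, max_wait, interval,
                          sleep=time.sleep):
    """Poll until the status differs from `baseline` and is terminal.

    read_status() returns a (state, target_url) tuple or None.
    `baseline` is the tuple captured BEFORE posting the new !testme.
    Returns the fresh terminal tuple, or on timeout the latest fresh
    (possibly pending) tuple, or None if nothing new was ever seen.
    """
    waited = 0.0
    latest_fresh = None
    while True:
        current = read_status()
        if current is not None and current != baseline:
            latest_fresh = current
            if current[0] in TERMINAL_STATES:
                return current
        if waited >= max_wait:
            return latest_fresh
        sleep(interval)
        waited += interval


if __name__ == "__main__":
    old = ("success", "https://drone.example.invalid/42")
    new_url = "https://drone.example.invalid/43"
    feed = iter([old, old, ("pending", new_url), ("success", new_url)])
    print(wait_for_fresh_status(lambda: next(feed), old, 60, 5,
                                sleep=lambda _s: None))
```

One limitation of comparing tuples: a rerun that somehow produced the same state and the same URL would be indistinguishable from the stale status. In the runs recorded here each rerun got a new build URL, so the tuple did change.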

Verification:

  • bash -n /srv/cc-ci-orch/.claude/skills/recipe-upgrade/testme-on-pr.sh && bash -n /srv/cc-ci-orch/.claude/skills/ci-test-review/verify-pr.sh && bash -n /srv/cc-ci-orch/.claude/skills/ci-test-review/run-all-recipes.sh → exit 0
  • cmp -s /srv/cc-ci-orch/.claude/skills/recipe-upgrade/testme-on-pr.sh /srv/cc-ci/.claude/skills/recipe-upgrade/testme-on-pr.sh && echo same → same
  • BEFORE=$(...) ; POST=1 MAX_WAIT=80 INTERVAL=5 /srv/cc-ci/.claude/skills/recipe-upgrade/testme-on-pr.sh custom-html-tiny 5 ; RC=$? ; AFTER=$(...) ; printf 'RC=%s\nBEFORE=%s\nAFTER=%s\n' "$RC" "$BEFORE" "$AFTER" → VERDICT=GREEN BUILD=https://drone.ci.commoninternet.net/recipe-maintainers/cc-ci/43 RC=0 BEFORE=4 AFTER=5

Next: consume BUILDER-INBOX.md in git, then continue with V5 stale-test candidate selection.

2026-06-01 — Adversary re-test PASS; V5/V6 helpers added; n8n live probe

Adversary review update:

  • REVIEW-5.md 2026-06-01T03:31:30Z closed A5-3 after a cold re-test. The rerun helper now returns the fresh build URL on same-head re-!testme.

V5/V6 automation gap closed in the orchestration repo (new files only; did not rewrite the already-dirty helper scripts):

  • /srv/cc-ci-orch/.claude/skills/recipe-upgrade/post-pr-comment.sh
  • /srv/cc-ci-orch/.claude/skills/ci-test-review/open-cc-ci-pr.sh
  • Verification: bash -n on both new scripts exited 0 after chmod +x.

Live stale-test candidate exploration:

  • ssh cc-ci "export PATH=/run/current-system/sw/bin:$PATH; abra recipe upgrade n8n -m -n" showed a real available upgrade: app 2.20.6 -> 2.23.1, db 17-alpine -> 18-alpine.
  • On cc-ci ~/.abra/recipes/n8n, created a scratch upgrade commit:
    • compose.yml: n8nio/n8n:2.20.6 -> 2.23.1
    • compose.yml: version label 3.2.0+2.20.6 -> 3.3.0+2.23.1
    • compose.postgres.yml: pgautoupgrade/pgautoupgrade:17-alpine -> 18-alpine
  • Opened mirror PR via open-recipe-pr.sh:
    • PR_URL=https://git.autonomic.zone/recipe-maintainers/n8n/pulls/2
    • branch upgrade-3.3.0+2.23.1, head c8d27a2
  • Triggered real cc-ci gate:
    • POST=1 MAX_WAIT=90 INTERVAL=5 /srv/cc-ci-orch/.claude/skills/recipe-upgrade/testme-on-pr.sh n8n 2 -> VERDICT=PENDING -> BUILD=https://drone.ci.commoninternet.net/recipe-maintainers/cc-ci/47
    • POST=0 MAX_WAIT=300 INTERVAL=10 /srv/cc-ci-orch/.claude/skills/recipe-upgrade/testme-on-pr.sh n8n 2 -> VERDICT=GREEN -> BUILD=https://drone.ci.commoninternet.net/recipe-maintainers/cc-ci/47

Conclusion:

  • n8n remains the best V5/V6 sandbox candidate because its tests have real version-shape assertions, but the natural upgrade path did NOT yield a stale-test failure. Per Phase 5 §2, the next move is to seed a stale-test case explicitly on a sandbox/scratch branch and then run the DEFAULT comment-only and --with-tests paths against that seeded case.

2026-06-01 — Resume loop: cryptpad green, lasuite-meet not enrolled

Pulled the latest Adversary review (REVIEW-5.md 2026-06-01T03:50:00Z): V2 poll-only on n8n PR #2 still PASSes cold (VERDICT=GREEN, build #47). No new finding to fix.

Live cryptpad probe:

  • Registry check showed a real app upgrade beyond the current recipe head: cryptpad/cryptpad:version-2026.2.0 -> version-2026.5.1 (plus nginx 1.29 -> 1.31).
  • On cc-ci ~/.abra/recipes/cryptpad, created branch phase5-v5-cryptpad-2026-5-1, updated compose.yml, and committed:
    • cryptpad/cryptpad:version-2026.2.0 -> version-2026.5.1
    • nginx:1.29 -> 1.31
    • recipe version label 0.5.4+v2026.2.0 -> 0.5.5+v2026.5.1
    • commit: 9db61d3 feat: upgrade to 0.5.5+v2026.5.1
  • Opened mirror PR via open-recipe-pr.sh:
    • PR_URL=https://git.autonomic.zone/recipe-maintainers/cryptpad/pulls/3
    • branch upgrade-0.5.5+v2026.5.1
  • Real cc-ci verdict:
    • POST=1 MAX_WAIT=90 INTERVAL=5 /srv/cc-ci-orch/.claude/skills/recipe-upgrade/testme-on-pr.sh cryptpad 3 -> VERDICT=PENDING -> BUILD=https://drone.ci.commoninternet.net/recipe-maintainers/cc-ci/50
    • POST=0 MAX_WAIT=300 INTERVAL=10 /srv/cc-ci-orch/.claude/skills/recipe-upgrade/testme-on-pr.sh cryptpad 3 -> VERDICT=GREEN -> BUILD=https://drone.ci.commoninternet.net/recipe-maintainers/cc-ci/50
  • Conclusion: cryptpad does NOT provide the V5 stale-test branch either; its live upgrade stayed green.

Live lasuite-meet probe:

  • ssh cc-ci "export PATH=/run/current-system/sw/bin:$PATH; abra recipe upgrade lasuite-meet -m -n" showed a real app upgrade: frontend/backend/celery v1.16.0 -> v1.17.0, redis 8.6.3 -> 8.8.0.
  • On cc-ci ~/.abra/recipes/lasuite-meet, created branch phase5-v5-lasuite-meet-v1-17-0, updated compose.yml, and committed:
    • frontend/backend/celery v1.16.0 -> v1.17.0
    • redis:8.6.3 -> 8.8.0
    • recipe version label 0.3.0+v1.16.0 -> 0.3.1+v1.17.0
    • commit: 2d0c707 feat: upgrade to 0.3.1+v1.17.0
  • Opened mirror PR via open-recipe-pr.sh:
    • PR_URL=https://git.autonomic.zone/recipe-maintainers/lasuite-meet/pulls/2
    • branch upgrade-0.3.1+v1.17.0
  • Real trigger attempts:
    • POST=1 MAX_WAIT=90 INTERVAL=5 /srv/cc-ci-orch/.claude/skills/recipe-upgrade/testme-on-pr.sh lasuite-meet 2 -> VERDICT=PENDING -> BUILD=?
    • POST=0 MAX_WAIT=300 INTERVAL=10 /srv/cc-ci-orch/.claude/skills/recipe-upgrade/testme-on-pr.sh lasuite-meet 2 -> VERDICT=PENDING -> BUILD=?
    • after an extra 60s delay, POST=0 MAX_WAIT=240 INTERVAL=10 ... still returned VERDICT=PENDING BUILD=?
  • Conclusion: this is not a stale-test case yet; recipe-maintainers/lasuite-meet is not enrolled in the bridge poll set, so !testme never entered the real CI path. Keep V5/V6 search on already-enrolled recipes.

2026-06-01 — Operator steer: enroll lasuite-meet; activation left host offline

Re-oriented from the current Phase 5 SSOT and the phase ledgers. There is no separate plan-phase6-* file in /srv/cc-ci/cc-ci-plan; the operator steer maps to Phase 5 V5/V6.

Minimal code change:

  • nix/modules/bridge.nix: added recipe-maintainers/lasuite-meet to POLL_REPOS
  • committed + pushed as f28a2a3 fix(bridge): enroll lasuite-meet for !testme

Host rollout attempts:

  • ssh cc-ci "test -d /root/builder-clone && git -C /root/builder-clone pull --rebase" -> fast-forwarded host clone to f28a2a3
  • ssh cc-ci "nixos-rebuild build --flake path:/root/builder-clone#cc-ci" -> build completed (new system store path created)
  • ssh cc-ci "nixos-rebuild switch --flake path:/root/builder-clone#cc-ci" -> activation reached the known bootloader failure ("efiSysMountPoint = '/boot' is not a mounted partition", "Failed to install bootloader") and did not roll the bridge task
  • ssh cc-ci "systemctl show -P ExecStart deploy-bridge.service" showed the old active helper path, and the running swarm task still used cc-ci-bridge:3761c4221042
  • ssh cc-ci "nixos-rebuild test --flake path:/root/builder-clone#cc-ci" was used to activate the updated config without touching the bootloader; it restarted multiple units, including deploy-bridge.service, and then the SSH session dropped with: Timeout, server 100.95.31.88 not responding.

Post-activation reachability probes from the orchestrator:

  • ssh cc-ci "systemctl status deploy-bridge.service --no-pager" -> connect to host 100.95.31.88 port 22: Connection timed out
  • tailscale status -> 100.95.31.88 cc-ci ... active; relay "nue"; offline
  • tailscale ping -c 3 cc-ci -> no reply
  • after a 2-minute warm poll: SSH still timed out

Current state:

  • The repo-side enrollment fix is durable on origin/main.
  • Live verification that the bridge poller now watches recipe-maintainers/lasuite-meet is blocked on host reachability returning.

2026-06-01 — Host recovered; lasuite-meet enrolled and green

Recovery point:

  • ssh cc-ci "hostname && systemctl is-system-running" -> nixos -> running

Bridge rollout verification after recovery:

  • Initial live check still showed the old poll set in the running task logs, even though the host source and built stack contained recipe-maintainers/lasuite-meet.
  • Located the updated built artifacts on the host:
    • stack with lasuite-meet: /nix/store/377c59lcpjj8bgs0dlq7l1z128y53016-cc-ci-bridge-stack.yml
    • corresponding reconcile helper: /nix/store/rk9vwyfvdryp4zln0ywlg6q2vyjmwfw4-cc-ci-reconcile-bridge/bin/cc-ci-reconcile-bridge
  • Ran that helper directly on cc-ci; service spec then showed:
    • POLL_REPOS=...recipe-maintainers/lasuite-docs,recipe-maintainers/lasuite-meet,recipe-maintainers/n8n...
  • Waited for the new task banner:
    • docker service logs ccci-bridge_app --since 20s -> poller (primary) watching ['recipe-maintainers/cc-ci', 'recipe-maintainers/custom-html', 'recipe-maintainers/custom-html-tiny', 'recipe-maintainers/keycloak', 'recipe-maintainers/cryptpad', 'recipe-maintainers/matrix-synapse', 'recipe-maintainers/lasuite-docs', 'recipe-maintainers/lasuite-meet', 'recipe-maintainers/n8n', 'recipe-maintainers/hedgedoc'] every 30s
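
Both enrollment gaps in this journal surfaced as a silent VERDICT=PENDING BUILD=?, so a pre-flight check against the poll set is worth having before posting !testme. A minimal sketch follows, assuming POLL_REPOS is the comma-separated list shown in the service spec above; the function names are hypothetical.

```python
def parse_poll_repos(value: str) -> list[str]:
    """Split a comma-separated POLL_REPOS value, dropping blanks."""
    return [item.strip() for item in value.split(",") if item.strip()]


def is_enrolled(repo: str, poll_repos_value: str) -> bool:
    """True if `repo` (owner/name) is in the bridge poll set."""
    return repo in parse_poll_repos(poll_repos_value)


if __name__ == "__main__":
    spec = "recipe-maintainers/lasuite-docs,recipe-maintainers/lasuite-meet,recipe-maintainers/n8n"
    print(is_enrolled("recipe-maintainers/lasuite-meet", spec))
    print(is_enrolled("recipe-maintainers/uptime-kuma", spec))
```

Matching on the full owner/name string avoids false positives from prefixes such as custom-html versus custom-html-tiny.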

Real lasuite-meet trigger after enrollment:

  • POST=1 MAX_WAIT=90 INTERVAL=5 /srv/cc-ci-orch/.claude/skills/recipe-upgrade/testme-on-pr.sh lasuite-meet 2 -> VERDICT=RED -> BUILD=https://drone.ci.commoninternet.net/recipe-maintainers/cc-ci/55

Authenticated Drone build inspection from cc-ci:

  • curl -H "Authorization: Bearer $(cat /run/secrets/bridge_drone_token)" https://drone.ci.commoninternet.net/api/repos/recipe-maintainers/cc-ci/builds/55 showed a real run failure, not a trigger issue.
  • Step-log fetch (.../builds/55/logs/1/2) showed the root cause:
    • tests/lasuite-meet/install_steps.sh failed at abra app secret insert oidc_rpcs@v2
    • exact error: FATA unable to fetch tags in /root/.abra/recipes/lasuite-meet: authentication required: Unauthorized
  • Classification: NOT a stale-test case; this was a harness/install-hook issue.

Harness fix:

  • Patched the La Suite OIDC secret-insert hooks to use offline/current-checkout mode (-C -o), matching the rest of the harness and avoiding private-origin tag fetches:
    • tests/lasuite-meet/install_steps.sh
    • tests/lasuite-drive/install_steps.sh
    • tests/lasuite-docs/setup_custom_tests.sh
  • Verified syntax:
    • bash -n on all three scripts -> exit 0
  • Committed + pushed:
    • 7225138 fix(tests): keep La Suite OIDC secret inserts offline

Re-test on the real path:

  • POST=1 MAX_WAIT=90 INTERVAL=5 /srv/cc-ci-orch/.claude/skills/recipe-upgrade/testme-on-pr.sh lasuite-meet 2 -> VERDICT=PENDING -> BUILD=https://drone.ci.commoninternet.net/recipe-maintainers/cc-ci/58
  • POST=0 MAX_WAIT=360 INTERVAL=10 /srv/cc-ci-orch/.claude/skills/recipe-upgrade/testme-on-pr.sh lasuite-meet 2 -> VERDICT=GREEN -> BUILD=https://drone.ci.commoninternet.net/recipe-maintainers/cc-ci/58

Conclusion:

  • lasuite-meet is now fully enrolled in the live bridge poll path.
  • The RED after enrollment was a real harness bug, now fixed.
  • After the fix, the actual recipe upgrade PR is GREEN, so lasuite-meet still does NOT provide the V5 stale-test branch.

2026-06-01 — V5 candidate: matrix-synapse default-mode stale-test comment

Investigated the already-open enrolled live upgrade PR:

  • PR: https://git.autonomic.zone/recipe-maintainers/matrix-synapse/pulls/1
  • head: 21e5d84430bdc52f8fa8aa9a40fa5bda8adf06c0
  • recipe branch: upgrade-7.2.0+v1.153.0

Authenticated Drone inspection from cc-ci:

  • curl -H "Authorization: Bearer $(cat /run/secrets/bridge_drone_token)" https://drone.ci.commoninternet.net/api/repos/recipe-maintainers/cc-ci/builds/53 -> build #53, status failure, params RECIPE=matrix-synapse PR=1 REF=21e5d844...
  • curl -H "Authorization: Bearer $(cat /run/secrets/bridge_drone_token)" https://drone.ci.commoninternet.net/api/repos/recipe-maintainers/cc-ci/builds/53/logs/1/2 -> RUN SUMMARY:
    • install : pass
    • upgrade : fail
    • backup : pass
    • restore : pass
    • custom : pass
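
To pick the failing stage out of a summary like the one above without reading the whole step log, something along these lines could be used. The "stage : status" line shape is assumed from how the summary is quoted here, and the real log may differ. The stage names come from the stages= parameter recorded for build #29, and the function names are hypothetical.

```python
# Stage names as listed in the build parameters earlier in this journal.
STAGES = ("install", "upgrade", "backup", "restore", "custom")


def parse_run_summary(text: str) -> dict[str, str]:
    """Extract {stage: status} from 'stage : status' lines, ignoring other lines."""
    results = {}
    for line in text.splitlines():
        stage, sep, status = line.partition(":")
        stage = stage.strip().lower()
        if sep and stage in STAGES and status.strip():
            results[stage] = status.strip().lower()
    return results


def failing_stages(results: dict[str, str]) -> list[str]:
    """Return failed stages in pipeline order."""
    return [stage for stage in STAGES if results.get(stage) == "fail"]


if __name__ == "__main__":
    summary = "install : pass\nupgrade : fail\nbackup : pass\nrestore : pass\ncustom : pass"
    print(failing_stages(parse_run_summary(summary)))
```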

The only failing assertion was:

  • tests/matrix-synapse/test_upgrade.py::test_upgrade_preserves_data
  • exact failure: ERROR: relation "ci_marker" does not exist

Why this appears to be the V5 stale-test branch rather than an obvious recipe regression:

  • the failing upgrade assertion checks a synthetic cc-ci-only postgres table ci_marker (tests/matrix-synapse/ops.py seeds it; tests/matrix-synapse/test_upgrade.py reads it back)
  • install, generic upgrade reconverge, backup, restore, and all real Matrix functional tests passed
  • the failure is isolated to the synthetic DB marker surviving the DB upgrade path, not to a real Matrix user/room/message data path

Default-mode Phase-5 action taken:

  • posted explanatory no-test-edit comment on the recipe PR via helper:
    • command: BODY_FILE=<tmp> /srv/cc-ci-orch/.claude/skills/recipe-upgrade/post-pr-comment.sh recipe-maintainers/matrix-synapse 1
    • result: COMMENT_URL=https://git.autonomic.zone/recipe-maintainers/matrix-synapse/pulls/1#issuecomment-13877
  • comment states that the upgrade looks correct, identifies the failing stale test, explains why the synthetic ci_marker check is the mismatch, makes no test edit, and tells the operator to re-run /recipe-upgrade matrix-synapse --with-tests to get a verified cc-ci test PR.

Next: treat matrix-synapse as the V6 candidate and prepare the dedicated cc-ci test-branch fix.

2026-06-01 — A5-4 cleared; matrix-synapse V6 branch invalidated

Adversary finding A5-4 was real and caused by timing around the temporary old bridge image during the host-recovery rollout, not by the current live bridge behavior.

Live re-test on the current bridge:

  • POST=1 MAX_WAIT=90 INTERVAL=5 /srv/cc-ci-orch/.claude/skills/recipe-upgrade/testme-on-pr.sh matrix-synapse 1 -> VERDICT=PENDING -> BUILD=https://drone.ci.commoninternet.net/recipe-maintainers/cc-ci/63
  • POST=0 MAX_WAIT=360 INTERVAL=10 /srv/cc-ci-orch/.claude/skills/recipe-upgrade/testme-on-pr.sh matrix-synapse 1 -> VERDICT=RED -> BUILD=https://drone.ci.commoninternet.net/recipe-maintainers/cc-ci/63
  • GET /repos/recipe-maintainers/matrix-synapse/commits/21e5d84430bdc52f8fa8aa9a40fa5bda8adf06c0/status now shows context cc-ci/testme state=failure target_url=.../63.

Conclusion for A5-4:

  • cleared on current live behavior; the helper can again read the verdict back from the PR via commit status on this stale-test/default-path candidate.

V6 branch-checkout work on matrix-synapse:

  • Created dedicated clone /tmp/opencode/cc-ci-v6, branch v6-matrix-synapse-real-upgrade-state.
  • Implemented a real app-data upgrade assertion there:
    • tests/matrix-synapse/ops.py now seeds two Matrix users, a room, and a message before upgrade and persists only {user_b,password,room_id,marker} to /data/ccci-upgrade-state.json.
    • tests/matrix-synapse/test_upgrade.py now logs back in after upgrade and asserts the pre-upgrade message is still readable from the same room.
  • Branch commit: 5edcf8d fix(tests): use real matrix data for upgrade state
  • Pushed remote branch: origin/v6-matrix-synapse-real-upgrade-state

While verifying that branch I found and fixed a helper bug in the V6 path itself:

  • ci-test-review/verify-pr.sh previously passed a branch name like upgrade-7.2.0+v1.153.0 straight through as REF, but the generic upgrade assertion expects the PR head COMMIT SHA there (same shape !testme uses). That made branch-checkout verification falsely RED at HC1 with head_ref='upgrade-7.2...' vs chaos-version='21e5d844'.
  • Patched verify-pr.sh to resolve non-SHA refs to their branch head commit via the Gitea API before invoking runner/run_recipe_ci.py.
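
The shape of that verify-pr.sh fix, sketched in Python for clarity: pass a full commit SHA through unchanged, and resolve anything else as a branch name. The branch lookup is injected rather than implemented, since the exact Gitea call the script makes is not recorded here. Treating only 40-character hex strings as SHAs is an assumption, and the names are hypothetical.

```python
import re

_FULL_SHA = re.compile(r"[0-9a-f]{40}")


def resolve_ref(ref: str, lookup_branch_head) -> str:
    """Return a full commit SHA for `ref`.

    `lookup_branch_head(branch)` must return the branch head SHA,
    for example by querying the forge API; it is only called for non-SHA refs.
    """
    if _FULL_SHA.fullmatch(ref):
        return ref
    sha = lookup_branch_head(ref)
    if not isinstance(sha, str) or not _FULL_SHA.fullmatch(sha):
        raise ValueError(f"could not resolve {ref!r} to a commit SHA")
    return sha


if __name__ == "__main__":
    heads = {"upgrade-7.2.0+v1.153.0": "21e5d84430bdc52f8fa8aa9a40fa5bda8adf06c0"}
    print(resolve_ref("upgrade-7.2.0+v1.153.0", heads.get))
```

Failing loudly on an unresolvable ref is deliberate: a wrong REF is what produced the HC1 false RED in the first place.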

Dedicated host checkout for verification:

  • materialized /root/cc-ci-v6-verify on cc-ci from the dedicated branch clone
  • marked it safe for git on the host:
    • git config --global --add safe.directory /root/cc-ci-v6-verify

Verification results:

  • First branch-verify run (before the helper fix) hit the HC1 false-red and also showed the new overlay login failure.
  • Second branch-verify run (after the helper fix):
    • REMOTE_ROOT=/root/cc-ci-v6-verify RECIPE=matrix-synapse REF=upgrade-7.2.0+v1.153.0 /srv/cc-ci-orch/.claude/skills/ci-test-review/verify-pr.sh
    • helper now resolves REF_SHA=21e5d84430bdc52f8fa8aa9a40fa5bda8adf06c0
    • generic upgrade tier PASSed
    • but the new real-data overlay still FAILED: login upgradeb53398657 HTTP 403: {'errcode': 'M_FORBIDDEN', 'error': 'Invalid username or password'}

Conclusion:

  • matrix-synapse is NOT a V6 stale-test branch after all.
  • Once the synthetic marker was replaced with a real Matrix data-survival assertion, the upgrade still failed. This points to a true recipe upgrade regression, not a stale cc-ci test.

Next: move to the next enrolled V5/V6 candidate (n8n, then lasuite-docs, then keycloak).

2026-06-01 — Operator-directed seeded stale-test case: custom-html

Per operator direction, I stopped searching for a naturally occurring stale-test recipe and switched to a deliberately seeded sandbox case.

Seeded recipe PR used:

  • https://git.autonomic.zone/recipe-maintainers/custom-html/pulls/3
  • branch v5-stale-docroot

I first inspected the pre-existing PR state and found the earlier docroot-move attempt was too broad: it broke backup/restore/custom for real, so it was not a clean stale-test simulation.

Re-seeded the same sandbox PR into a narrower stale-test case on the host recipe checkout:

  • kept the real upgrade crossover (1.10.0+1.28.0 -> 1.11.2+1.29.0)
  • reverted the volume/docroot move
  • added a specific nginx location override for *.txt:
    • keep .html as normal text/html
    • force .txt to application/octet-stream
  • final seed commit on the recipe PR branch:
    • 71e7326 fix: force octet-stream for seeded txt files

DEFAULT / V5 real-path evidence:

  • Trigger:
    • POST=1 MAX_WAIT=90 INTERVAL=5 /srv/cc-ci-orch/.claude/skills/recipe-upgrade/testme-on-pr.sh custom-html 3 -> VERDICT=RED -> BUILD=https://drone.ci.commoninternet.net/recipe-maintainers/cc-ci/75
  • Poll-only re-check:
    • POST=0 MAX_WAIT=20 INTERVAL=5 /srv/cc-ci-orch/.claude/skills/recipe-upgrade/testme-on-pr.sh custom-html 3 -> VERDICT=RED -> BUILD=https://drone.ci.commoninternet.net/recipe-maintainers/cc-ci/75
  • Authenticated Drone log inspection for build #75:
    • install PASS
    • upgrade PASS
    • backup PASS
    • restore PASS
    • custom FAIL only
    • exact failing assertion: tests/custom-html/functional/test_content_type_header.py expected .txt Content-Type to start with text/plain, got application/octet-stream
  • DEFAULT-mode explanatory recipe PR comment posted with NO cc-ci test edit:
    • https://git.autonomic.zone/recipe-maintainers/custom-html/pulls/3#issuecomment-13883
    • comment explains the seeded sandbox MIME change and tells the operator to re-run /recipe-upgrade custom-html --with-tests

--with-tests / V6 real-path evidence:

  • Created a fresh dedicated cc-ci clone:
    • /tmp/opencode/cc-ci-v6-custom-mime
  • Created the minimal paired branch:
    • branch: v6-custom-html-mime
    • commit: 826daec fix(tests): accept seeded custom-html txt mime
    • remote branch: origin/v6-custom-html-mime
  • Scope of the test PR branch:
    • only tests/custom-html/functional/test_content_type_header.py changed
    • .txt now expects application/octet-stream for the seeded sandbox case
  • Opened paired cc-ci PR:
    • https://git.autonomic.zone/recipe-maintainers/cc-ci/pulls/3
  • Materialized isolated host checkout:
    • /root/cc-ci-v6-custom-mime
  • Cold branch-checkout verification on cc-ci:
    • REMOTE_ROOT=/root/cc-ci-v6-custom-mime RECIPE=custom-html REF=v5-stale-docroot /srv/cc-ci-orch/.claude/skills/ci-test-review/verify-pr.sh
    • result: VERDICT: GREEN — custom-html PR (REF=v5-stale-docroot) passed cold full-suite x1. Ready for operator merge (NOT merged).
    • host log: cc-ci:/root/cc-ci-review-logs/verify-custom-html-20260601T200544Z.1.log

Pairing notes posted:

  • recipe PR note: https://git.autonomic.zone/recipe-maintainers/custom-html/pulls/3#issuecomment-13894
  • cc-ci PR note: https://git.autonomic.zone/recipe-maintainers/cc-ci/pulls/3#issuecomment-13896

Conclusion:

  • The operator-directed seeded stale-test case is now fully exercised:
    • DEFAULT mode leaves an explanatory recipe-PR comment and makes no cc-ci test edit
    • --with-tests opens a paired cc-ci test PR and the branch-checkout verification is GREEN
  • Next phase work is V8 /upgrade-all, V8a cc-ci-upgrader, then V9 cleanup/closeout.

2026-06-01 — V9 cleanup + cron install + gate M5 CLAIMED

V8 result confirmed:

  • Build #91: uptime-kuma@72861889, install PASS, upgrade PASS (2.2.1→2.4.0, mariadb 11.8→12.2)
  • Bridge reflected: success, PR comment #13904: 🌻 cc-ci — uptime-kuma @ 72861889 ✅ passed
  • Upgrader output: "UPGRADE RUN COMPLETE" after 7m 7s
  • Summary log written: /srv/cc-ci/.cc-ci-logs/upgrades/upgrade-all-2026-06-01.md

V8a self-termination noted:

  • After build #91 completed, cc-ci-upgrader session self-terminated (Claude exits → tmux closes)
  • launch-upgrader.py status returned "stopped" at 22:06Z
  • Adversary noted gap (plan says "stays idle") but accepted as V8a PASS (weekly cron still works)
  • Recorded in DECISIONS.md

Adversary BUILDER-INBOX received (22:09Z):

  • V1-V8a all PASS confirmed; V9 + §4 cron remaining
  • Additional PRs to close: n8n #3; cryptpad #3; lasuite-meet #2

V9 cleanup executed:

  • custom-html-tiny PR#2,#5: closed 22:02Z
  • custom-html PR#3: closed 22:03Z
  • cc-ci PR#3: closed 22:03Z
  • uptime-kuma PR#1: closed 22:03Z
  • n8n PR#3: closed 22:10Z
  • cryptpad PR#3: closed 22:10Z
  • lasuite-meet PR#2: closed 22:10Z
  • warm-keycloak stack: docker stack rm warm-keycloak_ci_commoninternet_net
  • upgrader session: launch-upgrader.py stop at 22:03Z ✓
  • Box stacks: 5 legit cc-ci services only ✓

§4 cron installed:

  • Mechanism: busybox crond in tmux session cc-ci-crond
  • Crontab: /home/loops/.cc-ci-crontabs/loops, entry: 4 23 * * 1 ... launch-upgrader.py start
  • T0 = 2026-06-01T23:04Z (first fire in ~55min at time of install)
  • Pre-check: python3 launch-upgrader.py status with cron-equivalent env → "stopped" (working) ✓
  • Boot-persistence gap noted in DECISIONS.md (busybox crond not in NixOS system config)

Gate M5 CLAIMED — all V1-V9 evidence in STATUS-5.md; awaiting Adversary cold-verify.

2026-06-01 — A5-6 fix: enroll uptime-kuma; upgrader restarted

Adversary finding A5-6 (via BUILDER-INBOX.md): uptime-kuma not in bridge POLL_REPOS. Also claimed no tests/ dir — but tests/uptime-kuma/ EXISTS (Phase 2, commit 1aaf3bd).

Fix:

  • nix/modules/bridge.nix: added recipe-maintainers/uptime-kuma to POLL_REPOS
  • Commit 51ba205 fix(bridge): enroll uptime-kuma for !testme (A5-6)
  • git -C /root/builder-clone pull --rebase on cc-ci → fast-forward to 51ba205
  • nixos-rebuild build --flake path:/root/builder-clone#cc-ci → build OK
  • nixos-rebuild test --flake path:/root/builder-clone#cc-ci → bridge restarted
  • New bridge task poll list confirmed: recipe-maintainers/uptime-kuma now in POLL_REPOS ✓

Upgrader lifecycle:

  • Previous upgrader session (uptime-kuma run) killed (was stuck at VERDICT=PENDING)
  • Bridge first poll marked existing comment #13902 (!testme) as seen (no re-trigger)
  • Upgrader restarted: UPGRADER_ARGS=uptime-kuma python3 launch-upgrader.py start at 21:54:25Z
  • New upgrader session running /upgrade-all uptime-kuma (live run)

V5 and V3 PASS confirmed by Adversary at 21:52Z (full — no caveats).

2026-06-01 — A5-5 fix; V8/V8a started

A5-5 fix:

  • Ran the full /recipe-upgrade custom-html DEFAULT skill against seeded PR#3 (head 71e7326a)
  • Fresh POST=1 testme-on-pr.sh custom-html 3 → build #81
  • Build #81: install PASS, upgrade PASS, backup PASS, restore PASS, custom FAIL (MIME type only)
    • exact: test_content_type_html_and_txt AssertionError: Content-Type='application/octet-stream', expected text/plain
  • Accurate explanatory comment posted: https://git.autonomic.zone/recipe-maintainers/custom-html/pulls/3#issuecomment-13900 (references build #81, MIME-type root cause, no docroot-path confusion)
  • RESULT log written: /srv/cc-ci/.cc-ci-logs/upgrades/custom-html-upgrade-2026-06-01.md Last line: RESULT: SUCCESS-PENDING-TESTS — custom-html 1.10.0+1.28.0 → 1.11.2+1.29.0, recipe PR: .../custom-html/pulls/3; !testme RED on a stale test (commented; re-run --with-tests to update tests)

abra recipe upgrade auth fix:

  • Root cause: recipes that went through the Phase 5 flow had their origin changed from https://git.coopcloud.tech/coop-cloud/<recipe>.git (public, anonymous) to https://autonomic-bot:...@git.autonomic.zone/recipe-maintainers/<recipe>.git (private, embedded creds). The go-git library abra uses internally cannot handle URL-embedded credentials.
  • Fix: restored all affected recipe origin remotes to git.coopcloud.tech on cc-ci. The gitea remote (used by open-recipe-pr.sh) is a separate remote and was not affected. Recipes fixed: custom-html, custom-html-tiny, n8n, cryptpad, lasuite-meet, matrix-synapse.
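
A quick way to audit recipe checkouts for this condition is to test each remote URL for embedded userinfo. The sketch below uses only the standard library. The function names and the example URLs with credentials are placeholders, not values from cc-ci.

```python
from urllib.parse import urlsplit, urlunsplit


def has_embedded_credentials(remote_url: str) -> bool:
    """True if the URL carries user[:password]@ userinfo."""
    parts = urlsplit(remote_url)
    return parts.username is not None or parts.password is not None


def strip_credentials(remote_url: str) -> str:
    """Return the URL with any userinfo removed (safe to log)."""
    parts = urlsplit(remote_url)
    host = parts.hostname or ""
    if parts.port:
        host = f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


if __name__ == "__main__":
    url = "https://some-user:some-secret@git.example.invalid/org/recipe.git"
    print(has_embedded_credentials(url))
    print(strip_credentials(url))
```

Feeding it the output of git remote get-url origin for each recipe directory would flag the checkouts whose origin still needs restoring.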
  • Verified: abra recipe upgrade n8n -m -n now returns JSON with upgrade info (was FATA auth error before).

V8a lifecycle tests:

  • Dry-run already completed earlier (session was idle/finishing):
    • Dry-run report: /srv/cc-ci/.cc-ci-logs/upgrades/upgrade-all-2026-06-01.md
    • 9 candidates identified, 9 skipped (details in dry-run report)
  • V8a test 1 — "start against idle → kills and runs fresh":
    • UPGRADER_ARGS=uptime-kuma launch-upgrader.py start
    • Log: cc-ci-upgrader exists but idle/stale (or fresh requested) — killing it first
    • New session started with args uptime-kuma, immediately RUNNING (busy)
  • V8a test 2 — "start while busy → leaves it alone":
    • Immediately after, called UPGRADER_ARGS=something-different launch-upgrader.py start
    • Log: cc-ci-upgrader already running a job (busy) — leaving it
    • Session remained RUNNING (busy) with original args ✓
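
The two lifecycle tests above pin down a small decision table for launch-upgrader.py start. The sketch below is my reading of the log messages, not the launcher's code. In particular, letting a "fresh requested" flag override a busy session is inferred from the wording of the first log line, and the names are hypothetical.

```python
def start_action(session_exists: bool, busy: bool,
                 fresh_requested: bool = False) -> str:
    """Decide what 'start' should do given the current session state.

    Returns one of: 'start', 'leave', 'kill-then-start'.
    """
    if not session_exists:
        return "start"              # nothing running: just launch
    if busy and not fresh_requested:
        return "leave"              # a job is in flight: do not disturb it
    return "kill-then-start"        # idle/stale session, or fresh run requested


if __name__ == "__main__":
    print(start_action(session_exists=True, busy=False))  # V8a test 1
    print(start_action(session_exists=True, busy=True))   # V8a test 2
```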

V8 live upgrade started:

  • cc-ci-upgrader agent now running /upgrade-all uptime-kuma (DEFAULT mode)
  • Agent is in the survey phase (abra recipe upgrade uptime-kuma -m -n)
  • Polling for completion (uptime-kuma: app 2.2.1 → 2.4.0, mariadb 11.8 → 12.2)

§4 T0-refire: CronCreate mechanism verified — 2026-06-01T23:18Z

busybox crond T0 miss (23:04Z) diagnosed as A5-7: crond silently skips all jobs when non-root (setgid/setuid fail with EPERM). Fix: switched to CronCreate (Claude scheduled task).

CronCreate one-shot test fire (ID 566f5fe6) scheduled for 23:17Z. It fired into the session turn queue and was processed at 23:18Z. Command executed:

HOME=/home/loops PATH=/home/loops/.local/bin:/run/current-system/sw/bin UPGRADER_ARGS=--dry-run \
  python3 /srv/cc-ci/cc-ci-plan/launch-upgrader.py start >> /srv/cc-ci/.cc-ci-logs/upgrader-cron.log 2>&1

Result:

  • upgrader-cron.log created with content:
    • [upgrader 23:18:21] starting cc-ci-upgrader (backend=claude, model=sonnet, args='--dry-run')
    • [upgrader 23:18:21] started. attach: tmux attach -t cc-ci-upgrader log: .../cc-ci-upgrader.log
  • launch-upgrader.py status → RUNNING (busy)
  • cc-ci-upgrader tmux session created Mon Jun 1 23:18:21 2026 ✓

Weekly recurring job ID 8dd9aed3 installed: 4 23 * * 1 (Monday 23:04 UTC). The job is session-persistent only: durable=true did not write scheduled_tasks.json in this environment, so the job lives only as long as the Builder session.
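
As a sanity check on the schedule, the next fire time of a fixed weekly entry like 4 23 * * 1 can be computed directly. This is a small sketch for that one shape of cron expression (fixed minute, hour, and day of week), not a general cron parser, and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def next_weekly_fire(now: datetime, cron_dow: int = 1,
                     hour: int = 23, minute: int = 4) -> datetime:
    """Next time strictly after `now` matching 'minute hour * * cron_dow'.

    Cron counts Sunday as 0 (or 7) and Monday as 1;
    Python's weekday() counts Monday as 0.
    """
    target_weekday = (cron_dow - 1) % 7
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    candidate += timedelta(days=(target_weekday - now.weekday()) % 7)
    if now >= candidate:
        candidate += timedelta(days=7)  # this week's slot already passed
    return candidate


if __name__ == "__main__":
    after_refire = datetime(2026, 6, 1, 23, 18, tzinfo=timezone.utc)
    print(next_weekly_fire(after_refire).isoformat())
```

Evaluated just after the 23:18Z refire, the next scheduled run falls on the following Monday, 2026-06-08 at 23:04 UTC, which only happens if the Builder session holding the job is still alive then.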

busybox crond session (cc-ci-crond) and crontab dir cleaned up. /home/loops/.cc-ci-crontabs/loops still contains the original entry as documentation but is no longer active.