cc-ci/machine-docs/JOURNAL-5.md
autonomic-bot a147e0772d
status(5): record lasuite-meet enrollment rollout block
2026-06-01 13:00:34 +00:00

JOURNAL — cc-ci Phase 5

2026-05-31 — Phase 5 boot

Phase 5 starting. System state verified:

  • cc-ci: systemctl is-system-running → running; 0 failed units
  • Docker services: ccci-bridge 1/1, ccci-dashboard 1/1, drone 1/1, traefik 1/1
  • Bridge: 1/1 (container-based, logs via docker service logs ccci-bridge_app)

Sandbox recipe chosen: custom-html-tiny (simple static-web-server; short timeouts; existing install_steps.sh hook; generic harness; ideal for upgrade-flow testing with minimal CI runtime).

Existing open PRs on custom-html-tiny mirror:

  • #1 serve-hidden-files branch — "chore: publish 1.0.2+2.38.0 release" (feature + version bump, NOT from upstream main, NOT merged upstream, from 2026-05-25). Will be closed as superseded when we open the upgrade PR (expected V7 behavior).

Available upgrades for custom-html-tiny:

  • app service (joseluisq/static-web-server): 2.38.0 → 2.42.0
  • git service (alpine/git, compose.git-pull.yml): v2.36.3 → v2.52.0
  • New version label: 1.1.0+2.42.0
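
The version labels in this journal appear to follow a `<recipe version>+<upstream app version>` shape, and the upgrade branches recorded in later entries are named `upgrade-<label>`. A minimal sketch of splitting and reusing such a label follows; the function names are hypothetical and are not taken from open-recipe-pr.sh.

```shell
# Hypothetical helpers: only the label shape comes from the journal.
recipe_part()    { printf '%s\n' "${1%%+*}"; }   # text before the first '+'
app_part()       { printf '%s\n' "${1#*+}"; }    # text after the first '+'
upgrade_branch() { printf 'upgrade-%s\n' "$1"; } # branch naming seen in the journal

label='1.1.0+2.42.0'
echo "recipe version: $(recipe_part "$label")"
echo "app version:    $(app_part "$label")"
echo "branch:         $(upgrade_branch "$label")"
```

Plain parameter expansion is enough here, so the sketch depends on neither python3 nor jq being present on the host.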

2026-05-31 — V3: recipe-upgrade flow starting

Following the SKILL.md procedure for /recipe-upgrade custom-html-tiny:

Step 1 (Plan): fetched recipe, found upgrades available (see above).

Step 2 (Implement): upgrading image tags on cc-ci; bumping version label; committing.

Step 3: open-recipe-pr.sh:

  • First attempt: FAILED. The script uses python3, which is not installed on cc-ci. Fixed by rewriting it to use jq (available on cc-ci) in commit 0df57c6 to the cc-ci-orchestrator repo.
  • Second attempt: SUCCESS. Closed PR #1 (serve-hidden-files) as superseded, pushed branch upgrade-1.1.0+2.42.0, opened PR #2 at recipe-maintainers/custom-html-tiny#2.

Step 4: testme-on-pr.sh:

  • Initial post: posted !testme, but VERDICT=PENDING (bridge didn't see it: custom-html-tiny was not in the poll list).
  • Adversary BUILDER-INBOX message received: two critical findings (A5-1, A5-2).

2026-05-31 — Adversary findings A5-1, A5-2 — both FIXED

A5-2 (CRITICAL): testme-on-pr.sh cannot read verdicts — bridge never posts commit statuses.

  • Root cause: bridge only posts PR comments; testme-on-pr.sh reads Gitea commit statuses.
  • Fix: Added post_commit_status() to bridge.py. Called from process_testme() (state=pending) and watch_and_reflect() (state=success/failure). Commit 5d48436.
  • Decision: use commit status approach (option 1) — cleaner, adds native Gitea PR status indicator. Recorded in DECISIONS.md.
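
Since the helper now depends on commit statuses, the translation from a status state to the verdict words used in this journal can be sketched as below. This is an illustration only: the function name is hypothetical, and treating `error` as RED and anything unrecognised as PENDING are assumptions, not behavior confirmed from testme-on-pr.sh.

```shell
# Hypothetical mapping from a commit-status state to a journal verdict.
status_to_verdict() {
  case "$1" in
    success)       echo GREEN ;;
    failure|error) echo RED ;;      # 'error' as RED is an assumption
    *)             echo PENDING ;;  # pending, empty, or unknown state
  esac
}

for state in pending success failure ''; do
  printf 'state=%-8s -> VERDICT=%s\n' "${state:-<none>}" "$(status_to_verdict "$state")"
done
```

Defaulting to PENDING keeps a missing status (the situation A5-2 describes) from being misread as a pass or a fail.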

A5-1: custom-html-tiny not in bridge poll list.

  • Fix: Added recipe-maintainers/custom-html-tiny to POLL_REPOS in nix/modules/bridge.nix. Commit 5d48436.
  • Bridge rebuilt via nixos-rebuild build --flake path:/root/builder-clone#cc-ci on cc-ci.
  • Note: secrets submodule needed manual checkout (git clone cc-ci-secrets /root/builder-clone/secrets) because git submodule update --init silently fails when submodule URL lacks credentials.
  • Bridge redeployed via /nix/store/asn4.../cc-ci-reconcile-bridge, new image cc-ci-bridge:3761c4221042.
  • Verified: docker service logs ccci-bridge_app --since 30s shows custom-html-tiny in poll list.
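
A pre-flight enrollment check could avoid posting !testme against a repo the bridge never polls. The sketch below assumes the poll list is available as a comma-separated string; the actual format of POLL_REPOS in bridge.nix is not shown in this journal, and both the function name and the sample list are hypothetical.

```shell
# Hypothetical check: is "owner/name" an exact member of a comma-separated list?
is_enrolled() {
  # $1 = repo (owner/name), $2 = comma-separated poll list (assumed format)
  case ",$2," in
    *",$1,"*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Sample list for illustration only.
POLL_LIST='recipe-maintainers/custom-html-tiny,recipe-maintainers/example-recipe'

for repo in recipe-maintainers/custom-html-tiny recipe-maintainers/custom-html; do
  if is_enrolled "$repo" "$POLL_LIST"; then
    echo "$repo: enrolled"
  else
    echo "$repo: NOT enrolled"
  fi
done
```

Wrapping both sides in commas forces a whole-entry match, so a prefix such as `recipe-maintainers/custom-html` is not mistaken for `recipe-maintainers/custom-html-tiny`.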

Next: re-post !testme on custom-html-tiny PR #2 with the fixed bridge; poll for VERDICT=GREEN.

2026-05-31 — V3 COMPLETE; V1/V2 partial; testme-on-pr.sh fix

testme-on-pr.sh fix committed (orchestrator repo 6910b19): now reads cc-ci/testme context URL.

Build #29 evidence:

  • Params: RECIPE=custom-html-tiny REF=156a49acc... PR=2 stages=install,upgrade,backup,restore,custom
  • Results: install PASS, upgrade PASS (1.0.0+2.38.0→1.1.0+2.42.0), backup/restore/custom N/A
  • Bridge commit status posted: cc-ci/testme state=success url=.../cc-ci/29 @2026-05-31T13:56:19
  • PR comment updated with 🌻 success banner

V2 GREEN verified: POST=0 → VERDICT=GREEN BUILD=https://drone.ci.commoninternet.net/recipe-maintainers/cc-ci/29

V7 verified: mirror main = upstream main (435df8fc); PR#1 (serve-hidden-files) closed as superseded.

Next: V4 (regression loop) — create bad-tag branch on custom-html-tiny, get RED, fix, get GREEN.

2026-05-31 — Bootstrap/access checks + V4 regression loop complete

Bootstrap probes from the builder clone:

  • ssh cc-ci "hostname && whoami && nixos-version" → cc-ci / root / 24.11.20250630.50ab793 (Vicuna)
  • set -a; . /srv/cc-ci/.testenv; set +a; curl -s https://$GITEA_URL/api/v1/version → {"version":"1.24.2"}
  • getent ahostsv4 probe-12345.ci.commoninternet.net → 91.98.47.73 (STREAM/DGRAM/RAW)
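
The `set -a; . file; set +a` idiom in the second probe exports every variable assigned while the file is sourced, which is what lets child processes see them. A small offline sketch of that behavior, using a throwaway file and a placeholder value rather than the real .testenv contents:

```shell
# Throwaway env file with a placeholder value (not the real .testenv).
envfile=$(mktemp)
printf 'GITEA_URL=git.example.invalid\n' > "$envfile"

set -a          # auto-export every variable assigned from here on
. "$envfile"
set +a          # stop auto-exporting

# A child process sees GITEA_URL only because it was exported.
child_view=$(sh -c 'printf "%s" "$GITEA_URL"')
echo "child sees: $child_view"

rm -f "$envfile"
```

Without `set -a`, sourcing the file would set the variable only in the current shell, and a child process would see an empty value.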

V4 red side:

  • POST=0 MAX_WAIT=15 INTERVAL=5 /srv/cc-ci/.claude/skills/recipe-upgrade/testme-on-pr.sh custom-html-tiny 5 → VERDICT=RED BUILD=https://drone.ci.commoninternet.net/recipe-maintainers/cc-ci/34
  • curl -fsSL https://ci.commoninternet.net/runs/34/results.json → install=pass, upgrade=fail, clean_teardown=true, no_secret_leak=true

V4 fix on cc-ci host (same recipe PR branch):

  • git -C /root/.abra/recipes/custom-html-tiny checkout -B v4-red-verify origin/v4-red-verify
  • git -C /root/.abra/recipes/custom-html-tiny checkout origin/upgrade-1.1.0+2.42.0 -- compose.yml compose.git-pull.yml
  • git -C /root/.abra/recipes/custom-html-tiny -c user.name='autonomic-bot' -c user.email='autonomic-bot@git.autonomic.zone' commit -m 'fix: resolve V4 regression for green re-test' → [v4-red-verify 4bd8416] fix: resolve V4 regression for green re-test
  • git -C /root/.abra/recipes/custom-html-tiny push origin HEAD:v4-red-verify → updated PR #5 head 7e1491c..4bd8416

V4 green side:

  • MAX_WAIT=300 INTERVAL=10 /srv/cc-ci/.claude/skills/recipe-upgrade/testme-on-pr.sh custom-html-tiny 5 → VERDICT=GREEN BUILD=https://drone.ci.commoninternet.net/recipe-maintainers/cc-ci/37

Adversary follow-up:

  • REVIEW-5.md follow-up (review(5) commit e87782a) closed A5-1 and A5-2 after a fresh cold re-test.
  • BUILDER-INBOX.md noted that POST=0 must be env-prefixed in STATUS-5.md; corrected here and the inbox is being consumed now.
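
The env-prefix point is ordinary shell behavior: an assignment written before the command is placed in that command's environment, while the same text written after the command name is only a positional argument. A short sketch, with `sh -c` standing in for the helper script:

```shell
unset POST

# Env-prefixed: POST is placed in the environment of this one command.
prefixed=$(POST=0 sh -c 'printf "%s" "${POST:-unset}"')

# Written as an argument: the child receives the text "POST=0" as $1,
# and its POST variable stays unset.
as_arg=$(sh -c 'printf "%s" "${POST:-unset}"' sh POST=0)

echo "prefixed=$prefixed as_arg=$as_arg"
```

The prefixed form also leaves the calling shell untouched, so a later invocation without the prefix falls back to whatever default the helper uses.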

Next: V5 default stale-test case, then V6 --with-tests.

2026-06-01 — Adversary finding A5-3 fixed; helper paths corrected

The Adversary review and inbox reported a real V2 rerun bug: on a re-!testme against the same PR head, testme-on-pr.sh run with POST=1 could read the previous terminal cc-ci/testme status before the bridge posted the new pending state, and so return the old build URL.

Fix authored in the orchestration repo helper:

  • testme-on-pr.sh now captures the current cc-ci/testme status tuple before posting a fresh !testme, then ignores that unchanged tuple while polling. It returns only once the status changes to the new run's state/URL.
  • ci-test-review/{verify-pr.sh,run-all-recipes.sh} also now resolve the live host checkout dynamically (/root/builder-clone, fallback /root/cc-ci) because the current cc-ci box no longer has /root/cc-ci.
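
The snapshot-then-ignore approach described in the first bullet can be sketched roughly as follows. This is not the code in testme-on-pr.sh: `fetch_status` is a stand-in fed from a simulated sequence, the tuple format `<state> <url>` and the URLs are placeholders, and the timeout behavior is an assumption.

```shell
# Simulated status feed: one "<state> <url>" tuple per poll (placeholder URLs).
SEQ=$(mktemp)
printf '%s\n' 'success build/42' 'success build/42' \
              'pending build/43' 'success build/43' > "$SEQ"

# Stand-in for the real status lookup: pops one line per call and keeps
# repeating the last line once the feed is exhausted.
fetch_status() {
  head -n 1 "$SEQ"
  if [ "$(( $(wc -l < "$SEQ") ))" -gt 1 ]; then
    tail -n +2 "$SEQ" > "$SEQ.next" && mv "$SEQ.next" "$SEQ"
  fi
}

# Poll until a terminal tuple that differs from the baseline appears.
poll_fresh_status() {
  baseline=$1; max_wait=$2; interval=$3
  waited=0; last='pending ?'
  while [ "$waited" -le "$max_wait" ]; do
    current=$(fetch_status)
    if [ "$current" != "$baseline" ]; then
      last=$current
      case "$current" in
        success\ *|failure\ *) printf '%s\n' "$current"; return 0 ;;
      esac
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
  printf '%s\n' "$last"   # timed out: report the newest non-stale tuple, if any
  return 1
}

baseline=$(fetch_status)   # snapshot taken before posting a fresh !testme
# (the real helper would post !testme at this point)
result=$(poll_fresh_status "$baseline" 10 1)
echo "baseline=[$baseline] result=[$result]"
rm -f "$SEQ"
```

In the simulated run the first poll still sees the old `success build/42` tuple and skips it, which is the window in which the A5-3 bug returned the stale build URL.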

Verification:

  • bash -n /srv/cc-ci-orch/.claude/skills/recipe-upgrade/testme-on-pr.sh && bash -n /srv/cc-ci-orch/.claude/skills/ci-test-review/verify-pr.sh && bash -n /srv/cc-ci-orch/.claude/skills/ci-test-review/run-all-recipes.sh → exit 0
  • cmp -s /srv/cc-ci-orch/.claude/skills/recipe-upgrade/testme-on-pr.sh /srv/cc-ci/.claude/skills/recipe-upgrade/testme-on-pr.sh && echo same → same
  • BEFORE=$(...) ; POST=1 MAX_WAIT=80 INTERVAL=5 /srv/cc-ci/.claude/skills/recipe-upgrade/testme-on-pr.sh custom-html-tiny 5 ; RC=$? ; AFTER=$(...) ; printf 'RC=%s\nBEFORE=%s\nAFTER=%s\n' "$RC" "$BEFORE" "$AFTER" → VERDICT=GREEN BUILD=https://drone.ci.commoninternet.net/recipe-maintainers/cc-ci/43 RC=0 BEFORE=4 AFTER=5

Next: consume BUILDER-INBOX.md in git, then continue with V5 stale-test candidate selection.

2026-06-01 — Adversary re-test PASS; V5/V6 helpers added; n8n live probe

Adversary review update:

  • REVIEW-5.md 2026-06-01T03:31:30Z closed A5-3 after a cold re-test. The rerun helper now returns the fresh build URL on same-head re-!testme.

V5/V6 automation gap closed in the orchestration repo (new files only; did not rewrite the already-dirty helper scripts):

  • /srv/cc-ci-orch/.claude/skills/recipe-upgrade/post-pr-comment.sh
  • /srv/cc-ci-orch/.claude/skills/ci-test-review/open-cc-ci-pr.sh
  • Verification: bash -n on both new scripts exited 0 after chmod +x.

Live stale-test candidate exploration:

  • ssh cc-ci "export PATH=/run/current-system/sw/bin:$PATH; abra recipe upgrade n8n -m -n" showed a real available upgrade: app 2.20.6 -> 2.23.1, db 17-alpine -> 18-alpine.
  • On cc-ci ~/.abra/recipes/n8n, created a scratch upgrade commit:
    • compose.yml: n8nio/n8n:2.20.6 -> 2.23.1
    • compose.yml: version label 3.2.0+2.20.6 -> 3.3.0+2.23.1
    • compose.postgres.yml: pgautoupgrade/pgautoupgrade:17-alpine -> 18-alpine
  • Opened mirror PR via open-recipe-pr.sh:
    • PR_URL=https://git.autonomic.zone/recipe-maintainers/n8n/pulls/2
    • branch upgrade-3.3.0+2.23.1, head c8d27a2
  • Triggered real cc-ci gate:
    • POST=1 MAX_WAIT=90 INTERVAL=5 /srv/cc-ci-orch/.claude/skills/recipe-upgrade/testme-on-pr.sh n8n 2 -> VERDICT=PENDING -> BUILD=https://drone.ci.commoninternet.net/recipe-maintainers/cc-ci/47
    • POST=0 MAX_WAIT=300 INTERVAL=10 /srv/cc-ci-orch/.claude/skills/recipe-upgrade/testme-on-pr.sh n8n 2 -> VERDICT=GREEN -> BUILD=https://drone.ci.commoninternet.net/recipe-maintainers/cc-ci/47
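
One detail about the ssh probe above: because the remote command is in double quotes, `$PATH` is expanded by the local shell before ssh sends the string, so it is the orchestrator's PATH that gets appended. The command evidently worked as recorded; the sketch below only illustrates the quoting difference, with `sh -c` standing in for the remote shell and DEMO as a hypothetical variable.

```shell
DEMO=local-value
export DEMO

# Double quotes: the local shell substitutes $DEMO before the child starts.
double=$(DEMO=remote-value sh -c "printf '%s' \"$DEMO\"")

# Single quotes: the text reaches the child unexpanded, so the child's own
# value of DEMO is used.
single=$(DEMO=remote-value sh -c 'printf "%s" "$DEMO"')

echo "double=$double single=$single"
```

If the remote host's own PATH is what should be extended, single-quoting the remote command (or escaping the dollar sign) would defer the expansion to cc-ci.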

Conclusion:

  • n8n remains the best V5/V6 sandbox candidate because its tests have real version-shape assertions, but the natural upgrade path did NOT yield a stale-test failure. Per Phase 5 §2, the next move is to seed a stale-test case explicitly on a sandbox/scratch branch and then run the DEFAULT comment-only and --with-tests paths against that seeded case.

2026-06-01 — Resume loop: cryptpad green, lasuite-meet not enrolled

Pulled the latest Adversary review (REVIEW-5.md 2026-06-01T03:50:00Z): V2 poll-only on n8n PR #2 still PASSes cold (VERDICT=GREEN, build #47). No new finding to fix.

Live cryptpad probe:

  • Registry check showed a real app upgrade beyond the current recipe head: cryptpad/cryptpad:version-2026.2.0 -> version-2026.5.1 (plus nginx 1.29 -> 1.31).
  • On cc-ci ~/.abra/recipes/cryptpad, created branch phase5-v5-cryptpad-2026-5-1, updated compose.yml, and committed:
    • cryptpad/cryptpad:version-2026.2.0 -> version-2026.5.1
    • nginx:1.29 -> 1.31
    • recipe version label 0.5.4+v2026.2.0 -> 0.5.5+v2026.5.1
    • commit: 9db61d3 feat: upgrade to 0.5.5+v2026.5.1
  • Opened mirror PR via open-recipe-pr.sh:
    • PR_URL=https://git.autonomic.zone/recipe-maintainers/cryptpad/pulls/3
    • branch upgrade-0.5.5+v2026.5.1
  • Real cc-ci verdict:
    • POST=1 MAX_WAIT=90 INTERVAL=5 /srv/cc-ci-orch/.claude/skills/recipe-upgrade/testme-on-pr.sh cryptpad 3 -> VERDICT=PENDING -> BUILD=https://drone.ci.commoninternet.net/recipe-maintainers/cc-ci/50
    • POST=0 MAX_WAIT=300 INTERVAL=10 /srv/cc-ci-orch/.claude/skills/recipe-upgrade/testme-on-pr.sh cryptpad 3 -> VERDICT=GREEN -> BUILD=https://drone.ci.commoninternet.net/recipe-maintainers/cc-ci/50
  • Conclusion: cryptpad does NOT provide the V5 stale-test branch either; its live upgrade stayed green.

Live lasuite-meet probe:

  • ssh cc-ci "export PATH=/run/current-system/sw/bin:$PATH; abra recipe upgrade lasuite-meet -m -n" showed a real app upgrade: frontend/backend/celery v1.16.0 -> v1.17.0, redis 8.6.3 -> 8.8.0.
  • On cc-ci ~/.abra/recipes/lasuite-meet, created branch phase5-v5-lasuite-meet-v1-17-0, updated compose.yml, and committed:
    • frontend/backend/celery v1.16.0 -> v1.17.0
    • redis:8.6.3 -> 8.8.0
    • recipe version label 0.3.0+v1.16.0 -> 0.3.1+v1.17.0
    • commit: 2d0c707 feat: upgrade to 0.3.1+v1.17.0
  • Opened mirror PR via open-recipe-pr.sh:
    • PR_URL=https://git.autonomic.zone/recipe-maintainers/lasuite-meet/pulls/2
    • branch upgrade-0.3.1+v1.17.0
  • Real trigger attempts:
    • POST=1 MAX_WAIT=90 INTERVAL=5 /srv/cc-ci-orch/.claude/skills/recipe-upgrade/testme-on-pr.sh lasuite-meet 2 -> VERDICT=PENDING -> BUILD=?
    • POST=0 MAX_WAIT=300 INTERVAL=10 /srv/cc-ci-orch/.claude/skills/recipe-upgrade/testme-on-pr.sh lasuite-meet 2 -> VERDICT=PENDING -> BUILD=?
    • after an extra 60s delay, POST=0 MAX_WAIT=240 INTERVAL=10 ... still returned VERDICT=PENDING BUILD=?
  • Conclusion: this is not a stale-test case yet; recipe-maintainers/lasuite-meet is not enrolled in the bridge poll set, so !testme never entered the real CI path. Keep V5/V6 search on already-enrolled recipes.

2026-06-01 — Operator steer: enroll lasuite-meet; activation left host offline

Re-oriented from the current Phase 5 SSOT and the phase ledgers. There is no separate plan-phase6-* file in /srv/cc-ci/cc-ci-plan; the operator steer maps to Phase 5 V5/V6.

Minimal code change:

  • nix/modules/bridge.nix: added recipe-maintainers/lasuite-meet to POLL_REPOS
  • committed + pushed as f28a2a3 fix(bridge): enroll lasuite-meet for !testme

Host rollout attempts:

  • ssh cc-ci "test -d /root/builder-clone && git -C /root/builder-clone pull --rebase" -> fast-forwarded host clone to f28a2a3
  • ssh cc-ci "nixos-rebuild build --flake path:/root/builder-clone#cc-ci" -> build completed (new system store path created)
  • ssh cc-ci "nixos-rebuild switch --flake path:/root/builder-clone#cc-ci" -> activation reached the known bootloader failure ("efiSysMountPoint = '/boot' is not a mounted partition", "Failed to install bootloader") but did not roll the bridge task
  • ssh cc-ci "systemctl show -P ExecStart deploy-bridge.service" showed the old active helper path, and the running swarm task still used cc-ci-bridge:3761c4221042
  • ssh cc-ci "nixos-rebuild test --flake path:/root/builder-clone#cc-ci" was used to activate the updated config without touching the bootloader; it restarted multiple units, including deploy-bridge.service, and then the SSH session dropped with: Timeout, server 100.95.31.88 not responding.

Post-activation reachability probes from the orchestrator:

  • ssh cc-ci "systemctl status deploy-bridge.service --no-pager" -> connect to host 100.95.31.88 port 22: Connection timed out
  • tailscale status -> 100.95.31.88 cc-ci ... active; relay "nue"; offline
  • tailscale ping -c 3 cc-ci -> no reply
  • after a 2-minute warm poll: SSH still timed out
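
A bounded reachability poll like the 2-minute one above might look like the following sketch. The function name and its arguments are hypothetical; the probe is passed in as a command so the loop can be exercised offline with `true` and `false`.

```shell
# Hypothetical helper: retry a probe command until it succeeds or time runs out.
wait_for_host() {
  # $1 = total seconds to wait, $2 = seconds between probes, rest = probe command
  deadline=$1; interval=$2; shift 2
  waited=0
  while :; do
    if "$@"; then
      echo "reachable after ${waited}s"
      return 0
    fi
    [ "$waited" -ge "$deadline" ] && break
    sleep "$interval"
    waited=$((waited + interval))
  done
  echo "still unreachable after ${waited}s"
  return 1
}

# Possible real use (assumption):
#   wait_for_host 120 10 ssh -o BatchMode=yes -o ConnectTimeout=5 cc-ci true

# Offline demo with probes that always fail / always succeed.
down=$(wait_for_host 2 1 false) || true
up=$(wait_for_host 2 1 true)
echo "$down"
echo "$up"
```

Passing the probe as arguments keeps the loop independent of ssh, so the same function could wrap `tailscale ping` or any other check.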

Current state:

  • The repo-side enrollment fix is durable on origin/main.
  • Live verification that the bridge poller now watches recipe-maintainers/lasuite-meet is blocked on host reachability returning.