# JOURNAL — cc-ci Phase 5

## 2026-05-31 — Phase 5 boot

Phase 5 starting. System state verified:
- cc-ci: `systemctl is-system-running` → running; 0 failed units
- Docker services: ccci-bridge 1/1, ccci-dashboard 1/1, drone 1/1, traefik 1/1
- Bridge: 1/1 (container-based, logs via `docker service logs ccci-bridge_app`)

**Sandbox recipe chosen:** `custom-html-tiny` (simple static-web-server; short timeouts; existing
install_steps.sh hook; generic harness; ideal for upgrade-flow testing with minimal CI runtime).

**Existing open PRs on custom-html-tiny mirror:**
- #1 `serve-hidden-files` branch — "chore: publish 1.0.2+2.38.0 release" (feature + version bump,
  NOT from upstream main, NOT merged upstream, from 2026-05-25). Will be closed as superseded when
  we open the upgrade PR (expected V7 behavior).

**Available upgrades for custom-html-tiny:**
- `app` service (joseluisq/static-web-server): 2.38.0 → 2.42.0
- `git` service (alpine/git, compose.git-pull.yml): v2.36.3 → v2.52.0
- New version label: 1.1.0+2.42.0
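
A minimal sketch of how a version label such as `1.1.0+2.42.0` splits into its recipe part and its app part, and how the `upgrade-<label>` branch name used later in this journal follows from it. The function names are hypothetical and are not part of any helper script.

```shell
# Split a version label of the form <recipe-version>+<app-version>.
# Hypothetical helpers for illustration; only the label shape comes from the journal.

# Text before the first '+', e.g. 1.1.0
recipe_version() { printf '%s\n' "${1%%+*}"; }

# Text after the first '+', e.g. 2.42.0
app_version() { printf '%s\n' "${1#*+}"; }

# Branch name following the upgrade-<label> pattern
upgrade_branch() { printf 'upgrade-%s\n' "$1"; }
```

For example, `recipe_version 1.1.0+2.42.0` prints `1.1.0` and `upgrade_branch 1.1.0+2.42.0` prints `upgrade-1.1.0+2.42.0`.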

## 2026-05-31 — V3: recipe-upgrade flow starting

Following SKILL.md procedure for /recipe-upgrade custom-html-tiny:
- Step 1 (Plan): fetched recipe, found upgrades available (see above).
- Step 2 (Implement): upgrading image tags on cc-ci; bumping version label; committing.
- Step 3: open-recipe-pr.sh:
  - First attempt: FAILED. The script used python3, which is not installed on cc-ci. Fixed by
    rewriting it to use `jq` (available on cc-ci) in commit `0df57c6` to the cc-ci-orchestrator repo.
  - Second attempt: SUCCESS. Closed PR #1 (`serve-hidden-files`) as superseded, pushed branch
    `upgrade-1.1.0+2.42.0`, opened PR #2 at https://git.autonomic.zone/recipe-maintainers/custom-html-tiny/pulls/2
- Step 4: testme-on-pr.sh:
  - Initial post: posted !testme, but VERDICT=PENDING (the bridge did not see it because
    custom-html-tiny was not in the poll list).
  - Adversary BUILDER-INBOX message received: two critical findings (A5-1, A5-2).
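
The image tag bump in Step 2 could be scripted roughly as below. This is only a sketch: `bump_image_tag` is a hypothetical helper, it assumes unquoted `image: name:tag` lines, and the journal does not record how the edit was actually made.

```shell
# Rewrite one image tag in a compose file, leaving every other line untouched.
# Hypothetical helper. Assumes lines shaped like "image: <name>:<tag>" with no quotes.
# Note: dots in the old tag are treated as regex wildcards, which is loose but harmless here.
bump_image_tag() {
  # $1=compose file  $2=image name  $3=old tag  $4=new tag
  tmp=$(mktemp) || return 1
  if sed "s|\(image:[[:space:]]*$2:\)$3\$|\1$4|" "$1" > "$tmp"; then
    # Write back through the original file so its permissions are kept
    cat "$tmp" > "$1"
  fi
  rm -f "$tmp"
}
```

A call such as `bump_image_tag compose.yml joseluisq/static-web-server 2.38.0 2.42.0` would cover the `app` service change listed earlier.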

## 2026-05-31 — Adversary findings A5-1, A5-2 — both FIXED

A5-2 (CRITICAL): testme-on-pr.sh cannot read verdicts because the bridge never posts commit statuses.
- Root cause: bridge only posts PR comments; testme-on-pr.sh reads Gitea commit statuses.
- Fix: Added `post_commit_status()` to bridge.py. Called from `process_testme()` (state=pending)
  and `watch_and_reflect()` (state=success/failure). Commit `5d48436`.
- Decision: use the commit status approach (option 1), which is cleaner and adds a native Gitea PR
  status indicator. Recorded in DECISIONS.md.
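
For reference, a shell sketch of the HTTP call behind a Gitea commit status. It is not the bridge.py implementation. It assumes `GITEA_URL` is a bare host name as in `.testenv`, that a token is available in a variable named `GITEA_TOKEN` (an assumption), and that the values contain no quotes or backslashes, since no JSON escaping is done.

```shell
# Build the JSON body for Gitea's "create commit status" endpoint.
# No JSON escaping is performed: values must not contain quotes or backslashes.
status_payload() {
  # $1=state (pending|success|failure|error|warning)  $2=target URL  $3=context
  printf '{"state":"%s","target_url":"%s","context":"%s"}\n' "$1" "$2" "$3"
}

# POST the status for one commit. GITEA_URL and GITEA_TOKEN are assumed to be set.
post_status_sketch() {
  # $1=owner/repo  $2=commit sha  $3=state  $4=target URL
  curl -fsS -X POST \
    -H "Authorization: token ${GITEA_TOKEN}" \
    -H "Content-Type: application/json" \
    -d "$(status_payload "$3" "$4" "cc-ci/testme")" \
    "https://${GITEA_URL}/api/v1/repos/$1/statuses/$2"
}
```

Using one fixed context (`cc-ci/testme`) lets a reader of the statuses pick out the bridge's verdict from any other status on the same commit.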

A5-1: custom-html-tiny not in bridge poll list.
- Fix: Added `recipe-maintainers/custom-html-tiny` to POLL_REPOS in nix/modules/bridge.nix.
  Commit `5d48436`.
- Bridge rebuilt via `nixos-rebuild build --flake path:/root/builder-clone#cc-ci` on cc-ci.
- Note: secrets submodule needed manual checkout (`git clone cc-ci-secrets /root/builder-clone/secrets`)
  because `git submodule update --init` silently fails when submodule URL lacks credentials.
- Bridge redeployed via `/nix/store/asn4.../cc-ci-reconcile-bridge`, new image `cc-ci-bridge:3761c4221042`.
- Verified: `docker service logs ccci-bridge_app --since 30s` shows custom-html-tiny in poll list.
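
Because the submodule update failed silently, a guard that checks the result rather than the exit code would have surfaced the problem earlier. A minimal sketch, with hypothetical function names and the `secrets` path taken from the note above:

```shell
# Return 0 only if the directory exists and holds something besides a .git entry.
submodule_populated() {
  [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null | grep -v '^\.git$')" ]
}

# Try the submodule update, then verify the checkout instead of trusting the exit code.
ensure_secrets() {
  # $1 = repository root
  git -C "$1" submodule update --init >/dev/null 2>&1 || true
  if ! submodule_populated "$1/secrets"; then
    echo "secrets submodule is empty; a manual checkout is needed" >&2
    return 1
  fi
}
```

Running `ensure_secrets /root/builder-clone` before `nixos-rebuild build` would turn the silent failure into an explicit one.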

Next: re-post !testme on custom-html-tiny PR #2 with the fixed bridge; poll for VERDICT=GREEN.

## 2026-05-31 — V3 COMPLETE; V1/V2 partial; testme-on-pr.sh fix

testme-on-pr.sh fix committed (orchestrator repo 6910b19): now reads cc-ci/testme context URL.

Build #29 evidence:
- Params: RECIPE=custom-html-tiny REF=156a49acc... PR=2 stages=install,upgrade,backup,restore,custom
- Results: install PASS, upgrade PASS (1.0.0+2.38.0→1.1.0+2.42.0), backup/restore/custom N/A
- Bridge commit status posted: cc-ci/testme state=success url=.../cc-ci/29 @2026-05-31T13:56:19
- PR comment updated with 🌻 success banner

V2 GREEN verified: POST=0 → VERDICT=GREEN BUILD=https://drone.ci.commoninternet.net/recipe-maintainers/cc-ci/29

V7 verified: mirror main = upstream main (435df8fc); PR#1 (serve-hidden-files) closed as superseded.

Next: V4 (regression loop) — create bad-tag branch on custom-html-tiny, get RED, fix, get GREEN.

## 2026-05-31 — Bootstrap/access checks + V4 regression loop complete

Bootstrap probes from the builder clone:
- `ssh cc-ci "hostname && whoami && nixos-version"` → `cc-ci` / `root` / `24.11.20250630.50ab793 (Vicuna)`
- `set -a; . /srv/cc-ci/.testenv; set +a; curl -s https://$GITEA_URL/api/v1/version` → `{"version":"1.24.2"}`
- `getent ahostsv4 probe-12345.ci.commoninternet.net` → `91.98.47.73` (STREAM/DGRAM/RAW)

V4 red side:
- `POST=0 MAX_WAIT=15 INTERVAL=5 /srv/cc-ci/.claude/skills/recipe-upgrade/testme-on-pr.sh custom-html-tiny 5`
  → `VERDICT=RED`
  → `BUILD=https://drone.ci.commoninternet.net/recipe-maintainers/cc-ci/34`
- `curl -fsSL https://ci.commoninternet.net/runs/34/results.json` → install=`pass`, upgrade=`fail`, clean_teardown=`true`, no_secret_leak=`true`

V4 fix on cc-ci host (same recipe PR branch):
- `git -C /root/.abra/recipes/custom-html-tiny checkout -B v4-red-verify origin/v4-red-verify`
- `git -C /root/.abra/recipes/custom-html-tiny checkout origin/upgrade-1.1.0+2.42.0 -- compose.yml compose.git-pull.yml`
- `git -C /root/.abra/recipes/custom-html-tiny -c user.name='autonomic-bot' -c user.email='autonomic-bot@git.autonomic.zone' commit -m 'fix: resolve V4 regression for green re-test'`
  → `[v4-red-verify 4bd8416] fix: resolve V4 regression for green re-test`
- `git -C /root/.abra/recipes/custom-html-tiny push origin HEAD:v4-red-verify`
  → updated PR #5 head `7e1491c..4bd8416`

V4 green side:
- `MAX_WAIT=300 INTERVAL=10 /srv/cc-ci/.claude/skills/recipe-upgrade/testme-on-pr.sh custom-html-tiny 5`
  → `VERDICT=GREEN`
  → `BUILD=https://drone.ci.commoninternet.net/recipe-maintainers/cc-ci/37`
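
The `VERDICT=` and `BUILD=` lines printed by the red and green runs could be consumed by a wrapper along these lines. This is a sketch: it assumes the helper prints one `KEY=VALUE` pair per line, and the exit-code mapping is a hypothetical convention, not the helper's documented behaviour.

```shell
# Print the value of the first KEY=VALUE line on stdin whose key is $1.
field() { sed -n "s/^$1=//p" | head -n 1; }

# Map captured helper output to an exit code (hypothetical convention):
# 0 for GREEN, 1 for RED, 2 for anything else such as PENDING.
verdict_rc() {
  case "$(printf '%s\n' "$1" | field VERDICT)" in
    GREEN) return 0 ;;
    RED)   return 1 ;;
    *)     return 2 ;;
  esac
}
```

With `out=$(POST=0 testme-on-pr.sh custom-html-tiny 5)`, the regression loop reduces to expecting `verdict_rc "$out"` to return 1 before the fix and 0 after it.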

Adversary follow-up:
- `REVIEW-5.md` follow-up (`review(5)` commit `e87782a`) closed A5-1 and A5-2 after a fresh cold re-test.
- `BUILDER-INBOX.md` noted that `POST=0` must be env-prefixed in `STATUS-5.md`; corrected here and the inbox is being consumed now.

Next: V5 default stale-test case, then V6 `--with-tests`.

## 2026-06-01 — Adversary finding A5-3 fixed; helper paths corrected

Adversary review+inbox reported a real V2 rerun bug: on a re-`!testme` against the same PR head,
`POST=1 testme-on-pr.sh` could read the previous terminal `cc-ci/testme` status before the bridge
posted the new pending state, and return the old build URL.

Fix authored in the orchestration repo helper:
- `testme-on-pr.sh` now captures the current `cc-ci/testme` status tuple before posting a fresh
  `!testme`, then ignores that unchanged tuple while polling. It returns only once the status changes
  to the new run's state/URL.
- `ci-test-review/{verify-pr.sh,run-all-recipes.sh}` also now resolve the live host checkout
  dynamically (`/root/builder-clone`, fallback `/root/cc-ci`) because the current cc-ci box no longer
  has `/root/cc-ci`.
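
The capture-then-ignore idea behind the rerun fix can be sketched as a small polling loop. This is an illustration under assumptions, not the code in `testme-on-pr.sh`: the status lookup is passed in as a command that prints a `state url` tuple, and `wait_for_new_status` is a hypothetical name.

```shell
# Poll until the status tuple differs from the baseline captured before posting.
# The status command stands in for the real Gitea lookup and must print "state url".
wait_for_new_status() {
  # $1=baseline tuple  $2=max wait (s)  $3=interval (s)  $4...=status command
  baseline=$1; max_wait=$2; interval=$3
  shift 3
  waited=0
  while [ "$waited" -le "$max_wait" ]; do
    current=$("$@")
    # An empty or unchanged tuple still belongs to the previous run: keep waiting
    if [ -n "$current" ] && [ "$current" != "$baseline" ]; then
      printf '%s\n' "$current"
      return 0
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
  return 1
}
```

The baseline has to be read before the `!testme` comment is posted. Reading it afterwards reintroduces the race, because the bridge may already have replaced the old status.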

Verification:
- `bash -n /srv/cc-ci-orch/.claude/skills/recipe-upgrade/testme-on-pr.sh && bash -n /srv/cc-ci-orch/.claude/skills/ci-test-review/verify-pr.sh && bash -n /srv/cc-ci-orch/.claude/skills/ci-test-review/run-all-recipes.sh`
  → exit 0
- `cmp -s /srv/cc-ci-orch/.claude/skills/recipe-upgrade/testme-on-pr.sh /srv/cc-ci/.claude/skills/recipe-upgrade/testme-on-pr.sh && echo same`
  → `same`
- `BEFORE=$(...) ; POST=1 MAX_WAIT=80 INTERVAL=5 /srv/cc-ci/.claude/skills/recipe-upgrade/testme-on-pr.sh custom-html-tiny 5 ; RC=$? ; AFTER=$(...) ; printf 'RC=%s\nBEFORE=%s\nAFTER=%s\n' "$RC" "$BEFORE" "$AFTER"`
  → `VERDICT=GREEN`
  → `BUILD=https://drone.ci.commoninternet.net/recipe-maintainers/cc-ci/43`
  → `RC=0`
  → `BEFORE=4`
  → `AFTER=5`

Next: consume `BUILDER-INBOX.md` in git, then continue with V5 stale-test candidate selection.

## 2026-06-01 — Adversary re-test PASS; V5/V6 helpers added; n8n live probe

Adversary review update:
- `REVIEW-5.md` 2026-06-01T03:31:30Z closed A5-3 after a cold re-test. The rerun helper now returns the
  fresh build URL on same-head re-`!testme`.

V5/V6 automation gap closed in the orchestration repo (new files only; did not rewrite the already-dirty
helper scripts):
- `/srv/cc-ci-orch/.claude/skills/recipe-upgrade/post-pr-comment.sh`
- `/srv/cc-ci-orch/.claude/skills/ci-test-review/open-cc-ci-pr.sh`
- Verification: `bash -n` on both new scripts exited 0 after `chmod +x`.

Live stale-test candidate exploration:
- `ssh cc-ci "export PATH=/run/current-system/sw/bin:$PATH; abra recipe upgrade n8n -m -n"`
  showed a real available upgrade: app `2.20.6 -> 2.23.1`, db `17-alpine -> 18-alpine`.
- On cc-ci `~/.abra/recipes/n8n`, created a scratch upgrade commit:
  - `compose.yml`: `n8nio/n8n:2.20.6 -> 2.23.1`
  - `compose.yml`: version label `3.2.0+2.20.6 -> 3.3.0+2.23.1`
  - `compose.postgres.yml`: `pgautoupgrade/pgautoupgrade:17-alpine -> 18-alpine`
- Opened mirror PR via `open-recipe-pr.sh`:
  - `PR_URL=https://git.autonomic.zone/recipe-maintainers/n8n/pulls/2`
  - branch `upgrade-3.3.0+2.23.1`, head `c8d27a2`
- Triggered real cc-ci gate:
  - `POST=1 MAX_WAIT=90 INTERVAL=5 /srv/cc-ci-orch/.claude/skills/recipe-upgrade/testme-on-pr.sh n8n 2`
    -> `VERDICT=PENDING`
    -> `BUILD=https://drone.ci.commoninternet.net/recipe-maintainers/cc-ci/47`
  - `POST=0 MAX_WAIT=300 INTERVAL=10 /srv/cc-ci-orch/.claude/skills/recipe-upgrade/testme-on-pr.sh n8n 2`
    -> `VERDICT=GREEN`
    -> `BUILD=https://drone.ci.commoninternet.net/recipe-maintainers/cc-ci/47`

Conclusion:
- `n8n` remains the best V5/V6 sandbox candidate because its tests have real version-shape assertions,
  but the natural upgrade path did NOT yield a stale-test failure. Per Phase 5 §2, the next move is to
  seed a stale-test case explicitly on a sandbox/scratch branch and then run the DEFAULT comment-only and
  `--with-tests` paths against that seeded case.

## 2026-06-01 — Resume loop: cryptpad green, lasuite-meet not enrolled

Pulled the latest Adversary review (`REVIEW-5.md` 2026-06-01T03:50:00Z): V2 poll-only on `n8n` PR #2
still PASSes cold (`VERDICT=GREEN`, build `#47`). No new finding to fix.

Live cryptpad probe:
- Registry check showed a real app upgrade beyond the current recipe head:
  `cryptpad/cryptpad:version-2026.2.0 -> version-2026.5.1` (plus `nginx 1.29 -> 1.31`).
- On cc-ci `~/.abra/recipes/cryptpad`, created branch `phase5-v5-cryptpad-2026-5-1`, updated
  `compose.yml`, and committed:
  - `cryptpad/cryptpad:version-2026.2.0 -> version-2026.5.1`
  - `nginx:1.29 -> 1.31`
  - recipe version label `0.5.4+v2026.2.0 -> 0.5.5+v2026.5.1`
  - commit: `9db61d3 feat: upgrade to 0.5.5+v2026.5.1`
- Opened mirror PR via `open-recipe-pr.sh`:
  - `PR_URL=https://git.autonomic.zone/recipe-maintainers/cryptpad/pulls/3`
  - branch `upgrade-0.5.5+v2026.5.1`
- Real cc-ci verdict:
  - `POST=1 MAX_WAIT=90 INTERVAL=5 /srv/cc-ci-orch/.claude/skills/recipe-upgrade/testme-on-pr.sh cryptpad 3`
    -> `VERDICT=PENDING`
    -> `BUILD=https://drone.ci.commoninternet.net/recipe-maintainers/cc-ci/50`
  - `POST=0 MAX_WAIT=300 INTERVAL=10 /srv/cc-ci-orch/.claude/skills/recipe-upgrade/testme-on-pr.sh cryptpad 3`
    -> `VERDICT=GREEN`
    -> `BUILD=https://drone.ci.commoninternet.net/recipe-maintainers/cc-ci/50`
- Conclusion: cryptpad does NOT provide the V5 stale-test branch either; its live upgrade stayed green.

Live lasuite-meet probe:
- `ssh cc-ci "export PATH=/run/current-system/sw/bin:$PATH; abra recipe upgrade lasuite-meet -m -n"`
  showed a real app upgrade: frontend/backend/celery `v1.16.0 -> v1.17.0`, redis `8.6.3 -> 8.8.0`.
- On cc-ci `~/.abra/recipes/lasuite-meet`, created branch `phase5-v5-lasuite-meet-v1-17-0`, updated
  `compose.yml`, and committed:
  - frontend/backend/celery `v1.16.0 -> v1.17.0`
  - `redis:8.6.3 -> 8.8.0`
  - recipe version label `0.3.0+v1.16.0 -> 0.3.1+v1.17.0`
  - commit: `2d0c707 feat: upgrade to 0.3.1+v1.17.0`
- Opened mirror PR via `open-recipe-pr.sh`:
  - `PR_URL=https://git.autonomic.zone/recipe-maintainers/lasuite-meet/pulls/2`
  - branch `upgrade-0.3.1+v1.17.0`
- Real trigger attempts:
  - `POST=1 MAX_WAIT=90 INTERVAL=5 /srv/cc-ci-orch/.claude/skills/recipe-upgrade/testme-on-pr.sh lasuite-meet 2`
    -> `VERDICT=PENDING`
    -> `BUILD=?`
  - `POST=0 MAX_WAIT=300 INTERVAL=10 /srv/cc-ci-orch/.claude/skills/recipe-upgrade/testme-on-pr.sh lasuite-meet 2`
    -> `VERDICT=PENDING`
    -> `BUILD=?`
  - after an extra 60s delay, `POST=0 MAX_WAIT=240 INTERVAL=10 ...` still returned `VERDICT=PENDING BUILD=?`
- Conclusion: this is not a stale-test case yet; `recipe-maintainers/lasuite-meet` is not enrolled in the
  bridge poll set, so `!testme` never entered the real CI path. Keep V5/V6 search on already-enrolled
  recipes.
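
A pre-flight enrollment check would have avoided several minutes of polling against a PR the bridge never sees. A minimal sketch, assuming the poll set is available as a comma-separated string (the actual format of `POLL_REPOS` in `bridge.nix` is not shown in this journal) and using a hypothetical function name:

```shell
# Return 0 if repository $1 is an exact member of the comma-separated list $2.
# The comma-separated format is an assumption about how POLL_REPOS is expressed.
is_enrolled() {
  case ",$2," in
    *",$1,"*) return 0 ;;
    *)        return 1 ;;
  esac
}
```

Wrapping both sides in commas makes the match exact, so `recipe-maintainers/n8` does not match an entry for `recipe-maintainers/n8n`.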

## 2026-06-01 — Operator steer: enroll lasuite-meet; activation left host offline

Re-oriented from the current Phase 5 SSOT and the phase ledgers. There is no separate `plan-phase6-*`
file in `/srv/cc-ci/cc-ci-plan`; the operator steer maps to Phase 5 V5/V6.

Minimal code change:
- `nix/modules/bridge.nix`: added `recipe-maintainers/lasuite-meet` to `POLL_REPOS`
- committed + pushed as `f28a2a3 fix(bridge): enroll lasuite-meet for !testme`

Host rollout attempts:
- `ssh cc-ci "test -d /root/builder-clone && git -C /root/builder-clone pull --rebase"`
  -> fast-forwarded host clone to `f28a2a3`
- `ssh cc-ci "nixos-rebuild build --flake path:/root/builder-clone#cc-ci"`
  -> build completed (new system store path created)
- `ssh cc-ci "nixos-rebuild switch --flake path:/root/builder-clone#cc-ci"`
  -> activation reached the known bootloader failure:
  `efiSysMountPoint = '/boot' is not a mounted partition`
  `Failed to install bootloader`
  but did not roll the bridge task
- `ssh cc-ci "systemctl show -P ExecStart deploy-bridge.service"`
  showed the old active helper path, and the running swarm task still used `cc-ci-bridge:3761c4221042`
- `ssh cc-ci "nixos-rebuild test --flake path:/root/builder-clone#cc-ci"`
  was used to activate the updated config without touching the bootloader; it restarted multiple units,
  including `deploy-bridge.service`, and then the SSH session dropped with:
  `Timeout, server 100.95.31.88 not responding.`

Post-activation reachability probes from the orchestrator:
- `ssh cc-ci "systemctl status deploy-bridge.service --no-pager"`
  -> `connect to host 100.95.31.88 port 22: Connection timed out`
- `tailscale status`
  -> `100.95.31.88 cc-ci ... active; relay "nue"; offline`
- `tailscale ping -c 3 cc-ci`
  -> `no reply`
- after a 2-minute warm poll: SSH still timed out
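
The warm poll can be expressed as a bounded retry loop around any probe command. This is a sketch with a hypothetical helper name. The probe is supplied by the caller, so it works with `ssh`, `tailscale ping`, or anything else that signals reachability through its exit code.

```shell
# Retry a probe command until it succeeds or the attempts run out.
wait_until() {
  # $1=attempts  $2=delay between attempts (s)  $3...=probe command
  attempts=$1; delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    # Sleep between attempts, but not after the last one
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$((i + 1))
  done
  return 1
}
```

For example, `wait_until 12 10 ssh -o ConnectTimeout=5 -o BatchMode=yes cc-ci true` bounds the wait at roughly two to three minutes and returns non-zero if the host never answers.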

Current state:
- The repo-side enrollment fix is durable on origin/main.
- Live verification that the bridge poller now watches `recipe-maintainers/lasuite-meet` is blocked on
  host reachability returning.