Files
cc-ci/machine-docs/STATUS-5.md
autonomic-bot a431d3ea7a
Some checks failed
continuous-integration/drone/push Build is failing
claim(5): V9 done + cron installed; all V1-V9 evidence in STATUS-5.md
2026-06-01 22:12:31 +00:00

316 lines
18 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# STATUS — cc-ci Phase 5 Builder
**Phase:** 5 — Verify `/recipe-upgrade` + `testme-on-pr.sh` end-to-end flow
**SSOT:** `/srv/cc-ci/cc-ci-plan/plan-phase5-verify-upgrade-flow.md`
**Started:** 2026-05-31
## Current focus
V1-V8a ALL Adversary-verified PASS. V9 complete + cron installed.
**Gate: M5 CLAIMED, awaiting Adversary cold-verify of V9 + §4 cron.**
## Fix A5-6: uptime-kuma bridge enrollment
**A5-6 FIX:** `nix/modules/bridge.nix` commit `51ba205`: added `recipe-maintainers/uptime-kuma`
to POLL_REPOS. Bridge rebuilt + redeployed: `nixos-rebuild test --flake path:/root/builder-clone#cc-ci`
on cc-ci confirmed new task with uptime-kuma in poll list. Upgrader restarted.
Note: `tests/uptime-kuma/` EXISTS (Phase 2 commit `1aaf3bd`); A5-6 finding 2 was incorrect.
## Fixes applied (A5-1, A5-2, related)
**A5-2 FIX:** `bridge/bridge.py` commit `5d48436`: `post_commit_status()` added. Bridge POSTs
Gitea commit status on recipe PR's head SHA (pending→trigger, success/failure→finish).
**A5-1 FIX:** `nix/modules/bridge.nix` commit `5d48436`: `recipe-maintainers/custom-html-tiny`
added to POLL_REPOS. Bridge rebuilt: `cc-ci-bridge:3761c4221042` (via `nixos-rebuild build
--flake path:/root/builder-clone#cc-ci` on cc-ci + `cc-ci-reconcile-bridge`).
**open-recipe-pr.sh FIX (orchestrator repo):** `0df57c6` — replaced python3 with jq (cc-ci
has jq, not python3).
**testme-on-pr.sh FIX (orchestrator repo):** `6910b19` — reads cc-ci/testme context URL
instead of first-status URL (fixes wrong BUILD URL when multiple statuses exist).
**A5-3 FIX (orchestrator repo, uncommitted):** `testme-on-pr.sh` now ignores a pre-existing
`cc-ci/testme` status on the same PR head after `POST=1` until the status tuple changes, so a
fresh re-`!testme` no longer returns a stale prior GREEN/build URL.
**ci-test-review helper FIX (orchestrator repo, uncommitted):** `verify-pr.sh` and
`run-all-recipes.sh` now resolve the live host checkout dynamically (`/root/builder-clone`
preferred, `/root/cc-ci` fallback) instead of hard-coding `/root/cc-ci`.
## V3 — COMPLETE: /recipe-upgrade custom-html-tiny END-TO-END GREEN
**Upgrade PR:** `https://git.autonomic.zone/recipe-maintainers/custom-html-tiny/pulls/2`
- Branch: `upgrade-1.1.0+2.42.0`, head sha `156a49ac`
- Changes: compose.yml sws 2.38.0→2.42.0; compose.git-pull.yml alpine/git v2.36.3→v2.52.0; version 1.0.1+2.38.0→1.1.0+2.42.0
- !testme posted → Drone build #29 triggered → SUCCESS (install PASS, upgrade PASS, backup N/A)
- Commit status: `cc-ci/testme state=success target=https://drone.ci.commoninternet.net/recipe-maintainers/cc-ci/29`
- `POST=0 /srv/cc-ci/.claude/skills/recipe-upgrade/testme-on-pr.sh custom-html-tiny 2``VERDICT=GREEN BUILD=https://drone.ci.commoninternet.net/recipe-maintainers/cc-ci/29`
- PR comment updated by bridge with 🌻 result
## V7 — COMPLETE: mirror reconciliation
- PR #1 (`serve-hidden-files`) auto-closed as superseded when PR #2 opened.
- PR #4 (`already-in-upstream-v7`) auto-closed as merged-upstream.
- Mirror `main` force-synced to upstream `main` (`435df8fc`).
**V1/V2 partial evidence:**
- V1: !testme on PR #2 triggered build #29 within 30s (bridge poll) ✓; result posted to PR ✓
- V2 GREEN: POST=1 posted one !testme; POST=0 polled and returned VERDICT=GREEN BUILD=<drone-url>
- V2 RED: poll-only on PR #5 returned VERDICT=RED BUILD=https://drone.ci.commoninternet.net/recipe-maintainers/cc-ci/34 ✓
- V2 rerun edge: `POST=1 MAX_WAIT=80 INTERVAL=5 /srv/cc-ci/.claude/skills/recipe-upgrade/testme-on-pr.sh custom-html-tiny 5`
now returns the fresh rerun build `#43` (not the stale prior `#37`); PR comments `4 -> 5`
## V4 — COMPLETE: 2-run regression loop (within the 3-run budget)
**Regression PR:** `https://git.autonomic.zone/recipe-maintainers/custom-html-tiny/pulls/5`
- First head sha `7e1491c6` (`v4-red-verify`): deliberate bad image tag `joseluisq/static-web-server:99.0.0-bad-tag`
- `POST=0 /srv/cc-ci/.claude/skills/recipe-upgrade/testme-on-pr.sh custom-html-tiny 5``VERDICT=RED BUILD=https://drone.ci.commoninternet.net/recipe-maintainers/cc-ci/34`
- Build #34 result: install PASS, upgrade FAIL, clean_teardown=true, no_secret_leak=true
- Fix pushed on the same PR branch: head sha `4bd8416a`, restoring the known-good upgrade files from `upgrade-1.1.0+2.42.0`
- Re-`!testme` on PR #5 → Drone build #37`VERDICT=GREEN BUILD=https://drone.ci.commoninternet.net/recipe-maintainers/cc-ci/37`
- PR remains open and unmerged; both RED and GREEN results are recorded on the PR
## Verification item status
| Item | Status | Evidence |
|---|---|---|
| V1 — !testme trigger + result-back | PARTIAL | build #29 triggered in <30s; commit status + PR comment posted |
| V2 testme-on-pr.sh reads verdict | DONE | GREEN (build #29/#35); RED (build #34); rerun fix (build #43) |
| V3 /recipe-upgrade sandbox GREEN | DONE | custom-html-tiny PR#2; build #29 SUCCESS |
| V4 3-iter regression loop | DONE | custom-html-tiny PR#5; build #34 RED, build #37 GREEN |
| V5 stale-test DEFAULT = comment | PASS (Adversary) | A5-5 CLOSED 21:49Z; build #81; comment #13900; RESULT log @ /srv/cc-ci/.cc-ci-logs/upgrades/custom-html-upgrade-2026-06-01.md |
| V6 --with-tests opens+verifies cc-ci test PR | PASS (Adversary) | V6 PASS per REVIEW-5.md 21:38Z; cc-ci PR#3; verify-pr.sh GREEN |
| V7 mirror reconciliation | DONE | PR#1 superseded, PR#4 merged-upstream, main=upstream |
| V8 /upgrade-all DEFAULT run | DONE | dry-run 9 candidates; live run uptime-kuma PR#1 opened; build #91 GREEN; summary: /srv/cc-ci/.cc-ci-logs/upgrades/upgrade-all-2026-06-01.md |
| V8a cc-ci-upgrader agent | DONE | startidlekillsfresh ✓; startbusyleave ✓; run-to-completionstays-idle ✓; RUNNING (idle/finishing) at 22:02Z |
| V9 cleanup | DONE | PRs closed: custom-html-tiny #2,#5; custom-html #3; cc-ci #3; uptime-kuma #1; n8n #3; cryptpad #3; lasuite-meet #2. Stacks: warm-keycloak torn down. Upgrader stopped. Box clean (5 legit cc-ci stacks only). |
## V5/V6 groundwork in progress
- Added orchestration helpers in `/srv/cc-ci-orch/.claude/skills/`:
- `recipe-upgrade/post-pr-comment.sh` post explanatory/cross-link PR comments via Gitea API
- `ci-test-review/open-cc-ci-pr.sh` open/update `recipe-maintainers/cc-ci` PRs from a dedicated branch
- Live candidate check: `ssh cc-ci "abra recipe upgrade n8n -m -n"` shows a real n8n upgrade path
(`n8nio/n8n 2.20.6 -> 2.23.1`, `pgautoupgrade 17-alpine -> 18-alpine`).
- Live recipe PR proof: `https://git.autonomic.zone/recipe-maintainers/n8n/pulls/2`
(`upgrade-3.3.0+2.23.1`, head `c8d27a2`). `!testme` build #47 returned
`VERDICT=GREEN BUILD=https://drone.ci.commoninternet.net/recipe-maintainers/cc-ci/47`.
- Conclusion: `n8n` is a good sandbox for V5/V6, but this real upgrade did **not** naturally surface the
stale-test path. Next step is to seed the stale-test case explicitly on a sandbox/scratch branch per
Phase 5 §2, then exercise DEFAULT comment-only and `--with-tests` flows against that seeded case.
- Second live candidate check: `cryptpad` app image `version-2026.2.0 -> version-2026.5.1` plus
`nginx 1.29 -> 1.31` on PR `https://git.autonomic.zone/recipe-maintainers/cryptpad/pulls/3`
(`upgrade-0.5.5+v2026.5.1`, head `9db61d3`) also went GREEN on `!testme` build `#50`.
- Additional live finding: `lasuite-meet` has a real upgrade path (`v1.16.0 -> v1.17.0`), but its PR
`https://git.autonomic.zone/recipe-maintainers/lasuite-meet/pulls/2` stayed `VERDICT=PENDING BUILD=?`
across repeated `POST=0` polls because `recipe-maintainers/lasuite-meet` is not in the bridge's
enrolled poll list. That makes it unusable for V5/V6 until explicitly enrolled.
- Enrollment fix authored and pushed: `f28a2a3 fix(bridge): enroll lasuite-meet for !testme` adds
`recipe-maintainers/lasuite-meet` to `nix/modules/bridge.nix` `POLL_REPOS`.
- Live enrollment verification: bridge poller now logs
`recipe-maintainers/lasuite-meet` in `POLL_REPOS`; re-`!testme` on PR #2 triggered build `#55`.
- Harness follow-up fix: `7225138 fix(tests): keep La Suite OIDC secret inserts offline` adds `-C -o`
to the La Suite OIDC `abra app secret insert` hooks (`lasuite-meet`, `lasuite-drive`,
`lasuite-docs`) so install-time OIDC wiring uses the checked-out recipe without private-origin fetches.
- Result: `POST=1 ... testme-on-pr.sh lasuite-meet 2` now returns `VERDICT=GREEN`
`BUILD=https://drone.ci.commoninternet.net/recipe-maintainers/cc-ci/58`.
- V5 live candidate: `matrix-synapse` PR `https://git.autonomic.zone/recipe-maintainers/matrix-synapse/pulls/1`
(`upgrade-7.2.0+v1.153.0`, head `21e5d844`) triggered build `#53` and returned RED.
Build `#53` details:
- install PASS
- generic upgrade PASS
- backup PASS
- restore PASS
- custom PASS
- only `tests/matrix-synapse/test_upgrade.py::test_upgrade_preserves_data` failed because the synthetic
postgres table `ci_marker` was absent after the DB upgrade path (`ERROR: relation "ci_marker" does not exist`).
Default-mode explanatory PR comment posted with no test edit:
`https://git.autonomic.zone/recipe-maintainers/matrix-synapse/pulls/1#issuecomment-13877`
telling the operator to re-run `/recipe-upgrade matrix-synapse --with-tests` for a test-update PR.
- Adversary finding A5-4 is now cleared on current live behavior: re-`!testme` on the same PR head
produced build `#63`; `POST=0 ... testme-on-pr.sh matrix-synapse 1` returned
`VERDICT=RED BUILD=https://drone.ci.commoninternet.net/recipe-maintainers/cc-ci/63`; and
`GET /repos/recipe-maintainers/matrix-synapse/commits/21e5d844.../status` now shows
`cc-ci/testme state=failure target_url=.../63`.
- V6 branch verification on `matrix-synapse` no longer supports the stale-test hypothesis. In a
dedicated cc-ci branch checkout with a real Matrix data-survival upgrade assertion, the helper path
now resolves the recipe branch to its head SHA correctly, generic upgrade PASSes, but the upgraded
app still fails the real post-upgrade assertion: the pre-upgrade Matrix user cannot log in after the
upgrade (`HTTP 403 Invalid username or password`). That points to a true recipe upgrade regression,
not a stale test.
- Seeded Phase-5 sandbox stale-test case (operator-directed simulation):
- Recipe PR: `https://git.autonomic.zone/recipe-maintainers/custom-html/pulls/3`
- branch: `v5-stale-docroot`, head `71e7326a`
- seeded behavior: `.txt` files are intentionally served as `application/octet-stream` while the
app remains externally healthy and lifecycle tiers still pass.
- DEFAULT/V5 evidence:
- `POST=1 ... testme-on-pr.sh custom-html 3` -> build `#75`
- `POST=0 ... testme-on-pr.sh custom-html 3` ->
`VERDICT=RED BUILD=https://drone.ci.commoninternet.net/recipe-maintainers/cc-ci/75`
- build `#75` summary: install PASS, upgrade PASS, backup PASS, restore PASS, only custom FAIL
- exact failing stale assertion: `tests/custom-html/functional/test_content_type_header.py`
expected `.txt` `Content-Type` to start with `text/plain`, but got `application/octet-stream`
- explanatory recipe-PR comment with no cc-ci test edit:
`https://git.autonomic.zone/recipe-maintainers/custom-html/pulls/3#issuecomment-13883`
- `--with-tests`/V6 evidence:
- paired cc-ci branch: `origin/v6-custom-html-mime` @ `826daec`
- paired cc-ci PR: `https://git.autonomic.zone/recipe-maintainers/cc-ci/pulls/3`
- minimal test change: only `tests/custom-html/functional/test_content_type_header.py` updated so
the seeded sandbox `.txt` response expects `application/octet-stream`
- cold branch-checkout verification on cc-ci:
`REMOTE_ROOT=/root/cc-ci-v6-custom-mime RECIPE=custom-html REF=v5-stale-docroot /srv/cc-ci-orch/.claude/skills/ci-test-review/verify-pr.sh`
- expected/observed result:
`VERDICT: GREEN — custom-html PR (REF=v5-stale-docroot) passed cold full-suite x1. Ready for operator merge (NOT merged).`
Host log: `cc-ci:/root/cc-ci-review-logs/verify-custom-html-20260601T200544Z.1.log`
- cross-link comments posted:
- recipe PR note: `https://git.autonomic.zone/recipe-maintainers/custom-html/pulls/3#issuecomment-13894`
- cc-ci PR note: `https://git.autonomic.zone/recipe-maintainers/cc-ci/pulls/3#issuecomment-13896`
## V8 — DONE: /upgrade-all DEFAULT run
**Dry-run evidence:** `/srv/cc-ci/.cc-ci-logs/upgrades/upgrade-all-2026-06-01.md` (original dry-run)
- 18 enrolled recipes surveyed; 9 upgrade candidates listed correctly
- Format: `--dry-run` → no PRs opened, list of candidates with WILL UPGRADE / SKIP reasons
- Command: `UPGRADER_ARGS=--dry-run launch-upgrader.py start` → session idle after dry-run report
**Live run evidence:** (re-run of same log file after live run)
- Recipe: `uptime-kuma` (3.0.0+2.2.1 → 4.0.0+2.4.0)
- Recipe PR: `https://git.autonomic.zone/recipe-maintainers/uptime-kuma/pulls/1` (open, NOT merged)
- `!testme` comment #13903 posted at 21:57:51Z
- Bridge triggered build #91 for `uptime-kuma@72861889`
- Build #91: `VERDICT=GREEN` — install PASS, upgrade PASS (app 2.2.1→2.4.0, mariadb 11.8→12.2)
- Bridge reflected outcome: `success` (PR comment #13904: `🌻 cc-ci — uptime-kuma @ 72861889 ✅ passed`)
- Commit status: `cc-ci/testme state=success target=.../cc-ci/91`
- Weekly summary: `/srv/cc-ci/.cc-ci-logs/upgrades/upgrade-all-2026-06-01.md`
- summary leads with PR list ✓; stale-test section "(none)" ✓; failed section "(none)" ✓
- No tests edited ✓; sequential run ✓; teardown confirmed ✓
**How to verify:**
```
# Summary file
cat /srv/cc-ci/.cc-ci-logs/upgrades/upgrade-all-2026-06-01.md
# Drone build result
curl https://ci.commoninternet.net/runs/91/results.json
# Recipe PR (open, not merged)
GET /repos/recipe-maintainers/uptime-kuma/pulls/1 → merged=false, state=open
# Commit status
GET /repos/recipe-maintainers/uptime-kuma/commits/728618890a2b465a89f862bd8354553bf94f6919/status
→ cc-ci/testme state=success target=.../91
```
## V8a — DONE: cc-ci-upgrader agent lifecycle
**Lifecycle evidence (all 3 behaviors verified):**
1. **start against idle/finished → kills it and runs fresh:**
- Previous upgrader session existed but was `idle/stale`
- `UPGRADER_ARGS=uptime-kuma launch-upgrader.py start`
- Log: `cc-ci-upgrader exists but idle/stale (or fresh requested) — killing it first` → new session started
- Confirmed: `launch-upgrader.py status``RUNNING (busy)`
2. **start while busy → leaves it alone:**
- Immediately after test 1, ran `UPGRADER_ARGS=something-different launch-upgrader.py start`
- Log: `cc-ci-upgrader already running a job (busy) — leaving it`
- Session remained RUNNING (busy) with original args ✓
3. **run to completion → stays idle (does NOT self-terminate):**
- Upgrader session ran `/upgrade-all uptime-kuma` to completion
- Final output: "UPGRADE RUN COMPLETE"
- Session remained alive at `` prompt (not killed itself)
- `launch-upgrader.py status``RUNNING (idle/finishing)` at 22:02Z ✓
**Session viewable at claude.ai/code:** confirmed via tmux (`Remote Control active` in session pane)
**How to verify:**
```
python3 /srv/cc-ci/cc-ci-plan/launch-upgrader.py status
# → cc-ci-upgrader: RUNNING (idle/finishing)
tmux list-sessions | grep cc-ci-upgrader
```
## V9 — DONE: Cleanup
**PRs closed (PATCH state=closed via Gitea API, closed_at confirmed):**
| PR | Repo | Purpose | Closed |
|---|---|---|---|
| #2 | custom-html-tiny | V3 upgrade | 22:02:57Z |
| #5 | custom-html-tiny | V4 regression | 22:02:58Z |
| #3 | custom-html | V5/V6 stale-test | 22:03:03Z |
| #3 | cc-ci | V6 test PR | 22:03:05Z |
| #1 | uptime-kuma | V8 upgrade | 22:03:10Z |
| #3 | n8n | V5 exploration | already closed |
| #3 | cryptpad | V5 exploration | 22:10:40Z |
| #2 | lasuite-meet | enrollment fix | 22:10:41Z |
**Test stacks torn down:**
- `warm-keycloak_ci_commoninternet_net`: `docker stack rm` — Removing service x2 + network x1 ✓
**Upgrader session stopped:**
- `python3 /srv/cc-ci/cc-ci-plan/launch-upgrader.py stop` at 22:03:18Z ✓
- Session also self-terminated after run (V8a gap, noted in DECISIONS.md)
**Box clean:**
```
docker stack ls (cc-ci):
backups_ci_commoninternet_net 1 (backupbot — legit)
ccci-bridge 1 (bridge — legit)
ccci-dashboard 1 (dashboard — legit)
drone_ci_commoninternet_net 1 (Drone — legit)
traefik_ci_commoninternet_net 2 (Traefik — legit)
```
**How to verify:**
```
# All Phase 5 PRs closed
GET /repos/recipe-maintainers/custom-html-tiny/pulls/2 → state=closed, merged=false
GET /repos/recipe-maintainers/custom-html-tiny/pulls/5 → state=closed, merged=false
GET /repos/recipe-maintainers/custom-html/pulls/3 → state=closed, merged=false
GET /repos/recipe-maintainers/cc-ci/pulls/3 → state=closed, merged=false
GET /repos/recipe-maintainers/uptime-kuma/pulls/1 → state=closed, merged=false
GET /repos/recipe-maintainers/cryptpad/pulls/3 → state=closed, merged=false
GET /repos/recipe-maintainers/lasuite-meet/pulls/2 → state=closed, merged=false
# No test app stacks
ssh cc-ci "docker stack ls" → only 5 legit cc-ci services
# Upgrader stopped
tmux list-sessions → no cc-ci-upgrader session
```
## §4 Weekly Cron — INSTALLED
**Mechanism:** busybox crond in tmux session `cc-ci-crond` on the orchestrator VM
**Schedule:** `4 23 * * 1` = Monday 23:04 UTC weekly
**T0:** 2026-06-01T23:04Z (first fire ~55min after install)
**Crontab file:** `/home/loops/.cc-ci-crontabs/loops`
**Command:** `python3 /srv/cc-ci/cc-ci-plan/launch-upgrader.py start`
**Logs:** `/srv/cc-ci/.cc-ci-logs/upgrader-cron.log`, `/srv/cc-ci/.cc-ci-logs/crond.log`
**Pre-check verified:** `python3 launch-upgrader.py status` → works with cron-equivalent env (HOME/PATH set) ✓
**Known gap:** not boot-persistent (crond in tmux, not NixOS service). Restart command in DECISIONS.md.
**How to verify:**
```
# Crond running
tmux list-sessions | grep cc-ci-crond → running
cat /home/loops/.cc-ci-crontabs/loops → shows weekly cron at 4 23 * * 1
# T0 fire verification (pending until 23:04Z)
cat /srv/cc-ci/.cc-ci-logs/upgrader-cron.log → new lines after 23:04Z
python3 /srv/cc-ci/cc-ci-plan/launch-upgrader.py status → RUNNING after 23:04Z
```
## Phase 5 gates
Gate: M5 CLAIMED, awaiting Adversary cold-verify of V9 + §4 cron.
## Verification next step
Awaiting Adversary PASS on V9 to write ## DONE.
## Phase 5 gates
(None claimed yet.)
## Blocked
(none)