claim(cfold): claim M2 full sweep green
Some checks failed
continuous-integration/drone/push Build is failing
@@ -405,3 +405,83 @@ comment-bridge listening on 0.0.0.0:8080 (poll primary + optional webhook)

This fix addresses the replay hole exposed during cfold's Ghost retrigger. It does not change the cfold
bottom line: Ghost's upgrade tier remains the lone M2 blocker, while custom discovery continues to pass.

## 2026-06-13 — Ghost upgrade blocker fixed in cc-ci; same-ref real CI rerun now green

I stayed on the Ghost blocker until I had a same-ref real-`!testme` proof, since M2 could not be claimed
while Ghost remained the only non-green recipe in the sweep.

Focused investigation sequence:

- Preserved-current-code repros showed the old failure mode honestly: during the base->head crossover, the
  new Ghost app task could start before the replacement mysql service was usable, exiting on
  `ENOTFOUND` / `ECONNREFUSED` against `${STACK_NAME}_db`, which made swarm pause the update before the
  head spec settled.
- My first attempt (`restart_policy.delay`) was insufficient because swarm paused the update on the first
  failed new task before any retry delay could matter.
- My second attempt (wrapping Ghost in `command: sh -ec ...`) proved the DB wait idea but regressed the
  base install: it bypassed Ghost's normal docker-entrypoint first-boot path, so the default `source`
  theme was never seeded and `/` stayed 500 (`The currently active theme "source" is missing`).
- Final fix: move the DB wait into the app `entrypoint`, then exec the normal
  `/abra-entrypoint.sh node current/index.js` path. That preserved both the first-boot seeding behavior
  and the upgrade crossover guard.

The finished overlay in `tests/ghost/compose.ccci.yml` now does three things and nothing more:

1. keep the existing 15m app healthcheck grace,
2. keep the existing 15m db healthcheck grace,
3. wait for the DB TCP socket before entering the normal Ghost entrypoint on the base->head crossover.
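
The third item can be pictured as a small retry loop placed in front of the normal entrypoint. What follows is only a sketch under assumptions, not the contents of `tests/ghost/compose.ccci.yml`: the helper name `wait_until`, the `nc -z` probe, port `3306`, and the 900-second budget are all hypothetical choices made for illustration.

```bash
# wait_until BUDGET_SECONDS CMD [ARGS...]
# Hypothetical helper: re-run CMD about once per second until it succeeds
# or roughly BUDGET_SECONDS have been spent sleeping. POSIX sh compatible.
wait_until() {
  budget=$1
  shift
  elapsed=0
  while [ "$elapsed" -lt "$budget" ]; do
    # Probe output is discarded; only the exit status matters.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 1
}

# Assumed use as an entrypoint wrapper (not executed here; requires an `nc`
# that supports -z, and the port number is a guess):
#   wait_until 900 nc -z "${STACK_NAME}_db" 3306 \
#     && exec /abra-entrypoint.sh node current/index.js
```

Ending the wrapper with `exec` hands the process over to the normal entrypoint, so the first-boot path runs unchanged. The budget counts sleep iterations rather than wall-clock time, so a slow probe can stretch the real wait.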

Verification:

```bash
$ ssh cc-ci 'jq -r ".results, .stages" /var/lib/cc-ci-runs/ghost-repro-cfold-3/results.json'
{
  "install": "pass",
  "upgrade": "pass"
}
[
  {"name":"install","status":"pass",...},
  {"name":"upgrade","status":"pass",...},
  {"name":"lint","status":"pass",...}
]

$ ssh cc-ci 'tok=$(cat /run/secrets/bridge_drone_token); curl -fsS -H "Authorization: Bearer $tok" https://drone.ci.commoninternet.net/api/repos/recipe-maintainers/cc-ci/builds/585 | jq -r "[.number,.status,.after,.params.RECIPE,.params.PR,.params.REF] | @tsv"'
585 success d44f799de945d0775933aad58726d46509154a64 ghost 5 d42d0f7c7cf9946077a583ffa3f7c96abfe94a77

$ ssh cc-ci 'jq -r "{level,recipe,ref,results,stages:(.stages|map({name,status}))}" /var/lib/cc-ci-runs/585/results.json'
{
  "level": 5,
  "recipe": "ghost",
  "ref": "d42d0f7c7cf9",
  "results": {
    "backup": "pass",
    "custom": "pass",
    "install": "pass",
    "restore": "pass",
    "upgrade": "pass"
  },
  "stages": [
    {"name":"install","status":"pass"},
    {"name":"upgrade","status":"pass"},
    {"name":"backup","status":"pass"},
    {"name":"restore","status":"pass"},
    {"name":"custom","status":"pass"},
    {"name":"lint","status":"pass"}
  ]
}

$ ssh cc-ci 'printf "ghost custom junit="; ls /var/lib/cc-ci-runs/585/junit/custom__cc-ci__*.xml | wc -l; printf " ghost upgrade junit="; ls /var/lib/cc-ci-runs/585/junit/upgrade*.xml | wc -l'
ghost custom junit=4
ghost upgrade junit=2

$ ssh cc-ci 'printf "live_pr_apps="; docker stack ls --format "{{.Name}}" | grep -c -- "-pr" || true'
live_pr_apps=0
```
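The manual inspection above could also be collapsed into a single pass/fail check. This is a minimal sketch, assuming `results.json` keeps the `level`, `results`, and `stages[].status` shape shown in the output above; the function name `run_is_green` is hypothetical and not part of cc-ci.

```bash
# run_is_green FILE WANT_LEVEL
# Hypothetical helper: exit 0 only if FILE reports WANT_LEVEL, every value
# under .results is "pass", and every stage has status "pass". Requires jq.
run_is_green() {
  jq -e --argjson want "$2" '
    .level == $want
    and ([.results[]] | length > 0 and all(. == "pass"))
    and (.stages | length > 0 and all(.status == "pass"))
  ' "$1" >/dev/null
}
```

Run on the CI host, `run_is_green /var/lib/cc-ci-runs/585/results.json 5` would be expected to succeed for the build shown above. The `length > 0` guards keep an empty `results` or `stages` from counting as green.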

Outcome:

- Ghost is no longer the M2 blocker.
- The real PR-triggered build (`585`) on the same Ghost ref that previously failed (`d42d0f7c`) is now L5.
- The custom tier remained intact throughout: still 4 canonical custom JUnit files on the green run.
- With Ghost green and teardown clean, the cfold phase is ready for a formal M2 claim.

@@ -53,16 +53,18 @@ Adversary verdict:

---

## M2 — IN PROGRESS
## M2 — CLAIMED, awaiting Adversary

Gate: M2 — CLAIMED, awaiting Adversary

Current work item:
- full real-CI `!testme` sweep evidence is mostly assembled; one recipe (`ghost`) remains non-green for
  a cfold-neutral upgrade regression on the recipe/environment side
- fresh follow-up probes now show the Ghost upgrade failure is not confined to PR #4 / PR #5: a reopened
  PR #3 at ref `720faa0b` also re-failed twice post-cfold (`568`, `569`) with the same shape
- the Ghost duplicate-trigger side issue is now root-caused in the bridge source: reopened PRs can replay
  old pre-bridge-start `!testme` comments that were never seen during startup because the PR was closed
  at that time; the bridge fix is now pushed and live on `cc-ci` (image tag `eb32876581d9`)
- full real-CI `!testme` sweep is now green across the enrolled recipe set, including the formerly-blocking
  Ghost PR head
- Ghost's upgrade blocker was fixed in cc-ci via the `tests/ghost/compose.ccci.yml` overlay: the app now
  waits in its entrypoint for the replacement DB socket before starting during the base->head crossover,
  while preserving Ghost's normal `/abra-entrypoint.sh node current/index.js` boot path
- bridge replay-guard fix remains live on `cc-ci` (image tag `eb32876581d9`); the Ghost duplicate-trigger
  side issue is separately closed and no longer affects the cfold sweep result

### M2 baseline matrix (built from live PR heads + fresh post-cfold evidence)

@@ -74,7 +76,7 @@ Current work item:
| custom-html-tiny | PR #7 `526502ba` | 5 | 1 | build `510` -> L5 |
| discourse | PR #2 `b7d8a244` | 5 | 3 | build `521` -> L5 |
| drone | PR #1 `049438e1` | 5 | 1 | build `506` -> L5 |
| ghost | PR #3 `720faa0b` | 5 | 4 | build `568` -> L1 (upgrade fail) |
| ghost | PR #5 `d42d0f7c` | 5 | 4 | build `585` -> L5 |
| hedgedoc | PR #1 `441c411c` | 5 | 2 | build `555` -> L5 |
| immich | PR #2 `17f1649c` | 5 | 3 | build `522` -> L5 |
| keycloak | PR #3 `bfe0d16f` | 5 | 3 | build `553` -> L5 |
@@ -89,29 +91,25 @@ Current work item:
| plausible | PR #3 `709a294d` | 5 | 2 | build `530` -> L5 |
| uptime-kuma | PR #3 `b0ce7942` | 5 | 4 | build `531` -> L5 |

### Ghost deviation (blocking a formal M2 claim)
### Ghost closure

`ghost` is the only recipe still preventing an M2 claim.
`ghost` was the final M2 blocker and is now green on the real `!testme` path.

- Current upgrade PR heads and fresh post-cfold outcomes are all red with the same stage shape:
  - PR #3 `720faa0b`: builds `568` and `569` -> L1; install/backup/restore/custom/lint pass, upgrade fail
  - PR #4 `d88f5801`: build `557` -> L1; install/backup/restore/custom pass, upgrade fail
  - PR #5 `d42d0f7c`: build `559` -> L1; install/backup/restore/custom/lint pass, upgrade fail
- Focused artifact audit still confirms the strongest same-ref comparison explicitly:
  historical build `185` (`d42d0f7c7cf9`) had `upgrade=pass`, while fresh build `559` on that same ref
  has `upgrade=fail` with the canonical `custom` stage still green.
- The fresh PR #3 rerun adds a second previously-green Ghost upgrade head that now fails the same way,
  so the blocker is broader than a single Ghost branch and still points away from cfold itself.
- Side observation from the PR #3 retrigger: a single `!testme` comment at `2026-06-13T00:07:50Z` spawned
  three new Ghost runs (`568`, `569`, `570`). All three are now red with the same upgrade-only
  failure.
- Root cause of the triple-trigger: bridge logs show those three runs were tied to three distinct comment
  ids on the reopened PR (`14029`, `14032`, `14497`), not one comment processed three times. The poller
  replayed two historical `!testme` comments that predated the current bridge process because PR #3 was
  closed during bridge startup and only became visible to the poller after reopen.
- Conclusion so far: Ghost's current failure is not caused by the `custom/` folder migration; the custom
  tier still discovers and passes all 4 canonical custom tests, and the regression reproduces across
  multiple Ghost PR heads as an upgrade convergence failure.
- Historical failing same-ref comparison remains the strongest pre-fix proof:
  - build `559` on `d42d0f7c7cf9` -> L1; install/backup/restore/custom/lint pass, upgrade fail
  - build `585` on `d42d0f7c7cf9` -> L5; install/upgrade/backup/restore/custom/lint pass
- Root cause of the upgrade failure: during the base->head crossover, Ghost's app task started before the
  replacement DB service was accepting connections, so the new task exited on `ENOTFOUND`/`ECONNREFUSED`
  against `${STACK_NAME}_db` and swarm paused the update before the head spec could settle.
- Fix landed in `cc-ci` commit `d44f799` (`fix(cfold): wait for ghost db in entrypoint`):
  `tests/ghost/compose.ccci.yml` now keeps the existing 15m app/db healthcheck grace and wraps the app
  `entrypoint` with a tiny TCP wait that execs the normal `/abra-entrypoint.sh node current/index.js`
  path only after the DB socket is reachable.
- Focused same-code-path repro after the fix:
  - `/var/lib/cc-ci-runs/ghost-repro-cfold-3/results.json` -> `install=pass`, `upgrade=pass`
  - log `/root/ghost-repro-cfold-3.log` includes
    `upgrade-converged: ghos-ce3c44_ci_commoninternet_net_app swarm UpdateStatus=completed`
    and `upgrade->PR-head: head_ref=d42d0f7c chaos-version=d42d0f7c+U version=1.2.0+6.21.2-alpine->1.4.0+6.44.0-alpine`

### Fresh Adversary state

@@ -119,6 +117,31 @@ Current work item:
- `REVIEW-cfold.md` 2026-06-13T00:23:55Z: cold M2 artifact/teardown audit only, no new finding, no M2
  claim pending; zero leaked live `-pr` stacks confirmed.

WHAT:
- M2 is now met: the full real-CI `!testme` recipe sweep is green, the formerly-blocking Ghost recipe is
  green again on the same PR head that previously failed, custom-tier coverage remains intact, and there
  are zero leaked live `-pr` stacks.

HOW:
- `ssh cc-ci 'tok=$(cat /run/secrets/bridge_drone_token); curl -fsS -H "Authorization: Bearer $tok" https://drone.ci.commoninternet.net/api/repos/recipe-maintainers/cc-ci/builds/585 | jq -r "[.number,.status,.after,.params.RECIPE,.params.PR,.params.REF] | @tsv"'`
- `ssh cc-ci 'jq -r "{level,recipe,ref,results,stages:(.stages|map({name,status}))}" /var/lib/cc-ci-runs/585/results.json'`
- `ssh cc-ci 'printf "ghost custom junit="; ls /var/lib/cc-ci-runs/585/junit/custom__cc-ci__*.xml | wc -l; printf " ghost upgrade junit="; ls /var/lib/cc-ci-runs/585/junit/upgrade*.xml | wc -l'`
- `ssh cc-ci 'printf "live_pr_apps="; docker stack ls --format "{{.Name}}" | grep -c -- "-pr" || true'`
- `ssh cc-ci 'jq -r ".results, .stages" /var/lib/cc-ci-runs/ghost-repro-cfold-3/results.json'`

EXPECTED:
- Drone build query returns build `585`, status `success`, `after=d44f799de945d0775933aad58726d46509154a64`, recipe `ghost`, PR `5`, ref `d42d0f7c7cf9946077a583ffa3f7c96abfe94a77`
- `results.json` for build `585` shows `level: 5` and `results.install=pass`, `results.upgrade=pass`, `results.backup=pass`, `results.restore=pass`, `results.custom=pass`; stages include `install`, `upgrade`, `backup`, `restore`, `custom`, `lint` all `pass`
- JUnit counts for build `585`: `ghost custom junit=4`, `ghost upgrade junit=2`
- Teardown check returns `live_pr_apps=0`
- Focused repro `ghost-repro-cfold-3` shows `install=pass`, `upgrade=pass`

WHERE:
- Fix commit: `d44f799` (`fix(cfold): wait for ghost db in entrypoint`)
- Ghost overlay: `tests/ghost/compose.ccci.yml`
- Real CI proof: `/var/lib/cc-ci-runs/585/results.json`, `/var/lib/cc-ci-runs/585/junit/`
- Focused repro proof: `/var/lib/cc-ci-runs/ghost-repro-cfold-3/results.json`, `/root/ghost-repro-cfold-3.log`

---

## Baseline (pre-cfold) — custom test count per recipe