cc-ci/machine-docs/STATUS-cfold.md

# STATUS — phase cfold (custom-folder collapse)

**Phase:** cfold — collapse `functional/`+`playwright/` into `custom/`
**Builder:** autonomic-bot
**Updated:** 2026-06-12

---

## M1 — PASS

Gate result: `REVIEW-cfold.md` 2026-06-12T16:20Z -> **M1 PASS**

Inputs for verification:
- Implementation commit: `44e0242` (`feat(cfold): canonicalize custom test layout`)

Completed in this checkpoint:
- `discovery.py`: `custom/` is canonical; deprecated aliases are still discovered, with warnings
- `git mv` of all 64 custom tests (60 functional + 4 playwright) across 20 recipes
- helper modules moved alongside their tests into `custom/`
- `sys.path` refs updated in mailu lifecycle overlays
- docs updated (`README.md`, `recipe-customization.md`, `testing.md`, `enroll-recipe.md`)
- unit tests updated (`test_discovery.py`, `test_discovery_phase2.py`, `test_manifest.py`)
- `manifest.py` now reports canonical `custom` counts

WHAT:
- M1 implementation is complete. Custom-test discovery is canonicalized to `custom/`, and deprecated
  aliases warn loudly instead of silently dropping coverage. All cc-ci custom tests and helpers now
  live under `tests/<recipe>/custom/`, manifest counts are canonicalized, and the placement-rule docs
  and unit tests are updated.
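
A minimal sketch of the discovery behaviour described above, assuming a per-recipe directory that holds `custom/` and possibly the legacy `functional/` and `playwright/` folders. The function name `discover_custom_tests` and the warning text are hypothetical and are not taken from `runner/harness/discovery.py`; the point is only that legacy folders keep contributing tests while emitting a warning.

```python
import warnings
from pathlib import Path

CANONICAL_DIR = "custom"
# Legacy folder names mentioned in this document; treated here as deprecated aliases.
DEPRECATED_ALIASES = ("functional", "playwright")


def discover_custom_tests(recipe_dir: Path) -> list[Path]:
    """Return test_*.py files from custom/ plus any deprecated alias folders.

    Hypothetical helper: alias folders are still discovered so coverage is
    not silently dropped, but each non-empty alias triggers a warning.
    """
    found: list[Path] = []
    canonical = recipe_dir / CANONICAL_DIR
    if canonical.is_dir():
        found.extend(sorted(canonical.glob("test_*.py")))
    for alias in DEPRECATED_ALIASES:
        legacy_dir = recipe_dir / alias
        if not legacy_dir.is_dir():
            continue
        legacy = sorted(legacy_dir.glob("test_*.py"))
        if legacy:
            warnings.warn(
                f"{recipe_dir.name}: '{alias}/' is deprecated; "
                f"move tests to '{CANONICAL_DIR}/'",
                DeprecationWarning,
                stacklevel=2,
            )
            found.extend(legacy)
    return found
```

Returning the legacy tests together with the warning is the design choice that matters here: a recipe that has not migrated yet keeps its coverage and is visibly flagged.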

HOW:
- `git ls-files "tests/*/custom/test_*.py" | wc -l`
- `git ls-files "tests/*/functional/*" "tests/*/playwright/*"`
- `for recipe in bluesky-pds cryptpad custom-html custom-html-tiny discourse drone ghost hedgedoc immich keycloak lasuite-docs lasuite-drive lasuite-meet mailu matrix-synapse mattermost-lts mumble n8n plausible uptime-kuma; do count=$(git ls-files "tests/$recipe/custom/test_*.py" | wc -l); printf "%s %s\n" "$recipe" "$count"; done`
- `nix shell nixpkgs#python311Packages.pytest -c pytest tests/unit/test_discovery.py tests/unit/test_discovery_phase2.py tests/unit/test_manifest.py -q`

EXPECTED:
- Total canonical custom tests: `64`
- Old tracked trees: no output for `functional/*` or `playwright/*`
- Per-recipe counts exactly match the baseline table below
- Focused unit suite: `18 passed`
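
The per-recipe count check above can also be sketched in Python. This is an illustrative helper, not part of the harness: it assumes the input is the list of paths printed by `git ls-files`, counts only files laid out exactly as `tests/<recipe>/custom/test_*.py`, and compares the result with the baseline table from this document. The names `count_custom_tests` and `diff_against_baseline` are hypothetical.

```python
from collections import Counter

# Per-recipe baseline copied from the baseline table in this document.
BASELINE = {
    "bluesky-pds": 4, "cryptpad": 4, "custom-html": 4, "custom-html-tiny": 1,
    "discourse": 3, "drone": 1, "ghost": 4, "hedgedoc": 2, "immich": 3,
    "keycloak": 3, "lasuite-docs": 5, "lasuite-drive": 3, "lasuite-meet": 3,
    "mailu": 3, "matrix-synapse": 3, "mattermost-lts": 3, "mumble": 5,
    "n8n": 4, "plausible": 2, "uptime-kuma": 4,
}


def count_custom_tests(tracked_paths):
    """Count tests/<recipe>/custom/test_*.py entries per recipe."""
    counts = Counter()
    for path in tracked_paths:
        parts = path.strip().split("/")
        if (
            len(parts) == 4
            and parts[0] == "tests"
            and parts[2] == "custom"
            and parts[3].startswith("test_")
            and parts[3].endswith(".py")
        ):
            counts[parts[1]] += 1
    return dict(counts)


def diff_against_baseline(counts, baseline=BASELINE):
    """Return {recipe: (expected, actual)} for every recipe whose count differs."""
    recipes = sorted(set(baseline) | set(counts))
    return {
        r: (baseline.get(r, 0), counts.get(r, 0))
        for r in recipes
        if baseline.get(r, 0) != counts.get(r, 0)
    }
```

An empty dict from `diff_against_baseline` corresponds to the "per-recipe counts exactly match" expectation; any entry names the recipe and the expected versus actual count.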

WHERE:
- Discovery + alias warnings: `runner/harness/discovery.py`
- Canonical manifest counts: `runner/harness/manifest.py`
- Migrated custom tests/helpers: `tests/*/custom/`
- Focused unit coverage: `tests/unit/test_discovery.py`, `tests/unit/test_discovery_phase2.py`, `tests/unit/test_manifest.py`
- Placement-rule docs: `docs/recipe-customization.md`, `docs/testing.md`, `docs/enroll-recipe.md`, `README.md`

Adversary verdict:
- `machine-docs/REVIEW-cfold.md` lines 52-77
- PASS facts include: 64 canonical custom tests, zero old tracked custom trees, focused unit suite `18 passed`, deprecated-alias warning probe green, normalized `(recipe, filename)` coverage set preserved exactly (`missing []`, `extra []`).
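
The normalized `(recipe, filename)` coverage comparison mentioned in the verdict can be illustrated as follows. This is a sketch of the idea only, assuming path lists captured before and after the move; it is not the Adversary's actual probe, and the names `coverage_set` and `compare_coverage` are hypothetical.

```python
from pathlib import PurePosixPath

# Folder names that are normalized away so a move between them is not a change.
LAYOUT_DIRS = {"custom", "functional", "playwright"}


def coverage_set(tracked_paths):
    """Map tests/<recipe>/<layout>/test_*.py paths to (recipe, filename) pairs."""
    keys = set()
    for raw in tracked_paths:
        parts = PurePosixPath(raw.strip()).parts
        if (
            len(parts) == 4
            and parts[0] == "tests"
            and parts[2] in LAYOUT_DIRS
            and parts[3].startswith("test_")
            and parts[3].endswith(".py")
        ):
            keys.add((parts[1], parts[3]))
    return keys


def compare_coverage(before_paths, after_paths):
    """Return (missing, extra) as sorted lists of (recipe, filename) pairs."""
    before = coverage_set(before_paths)
    after = coverage_set(after_paths)
    return sorted(before - after), sorted(after - before)
```

A result of `([], [])` matches the `missing []`, `extra []` facts quoted above: every test that existed before the move is still present, and none were added.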

---

## M2 — IN PROGRESS

Current work item:
- Evidence from the full real-CI `!testme` sweep is mostly assembled: 19 of 20 recipes are green at L5.
  One recipe (`ghost`) remains non-green because of a cfold-neutral upgrade regression on the
  recipe/environment side.
- Fresh follow-up probes show the Ghost upgrade failure is not confined to PR #4 / PR #5: a reopened
  PR #3 at ref `720faa0b` also failed twice post-cfold (builds `568`, `569`) with the same failure shape.
- The Ghost duplicate-trigger side issue is root-caused in the bridge source: a reopened PR can replay
  old `!testme` comments that predate bridge start, because the PR was closed at startup and the bridge
  never saw those comments. The bridge fix is pushed and live on `cc-ci` (image tag `eb32876581d9`).

### M2 baseline matrix (built from live PR heads + fresh post-cfold evidence)

| Recipe | PR / ref | Expected level | Custom tests | Fresh evidence |
|---|---|---:|---:|---|
| bluesky-pds | PR #2 `f7b6c8df` | 5 | 4 | build `556` -> L5 |
| cryptpad | PR #5 `9c18c176` | 5 | 4 | build `554` -> L5 |
| custom-html | PR #2 `db9a9502` | 5 | 4 | build `541` -> L5 |
| custom-html-tiny | PR #7 `526502ba` | 5 | 1 | build `510` -> L5 |
| discourse | PR #2 `b7d8a244` | 5 | 3 | build `521` -> L5 |
| drone | PR #1 `049438e1` | 5 | 1 | build `506` -> L5 |
| ghost | PR #3 `720faa0b` | 5 | 4 | build `568` -> L1 (upgrade fail) |
| hedgedoc | PR #1 `441c411c` | 5 | 2 | build `555` -> L5 |
| immich | PR #2 `17f1649c` | 5 | 3 | build `522` -> L5 |
| keycloak | PR #3 `bfe0d16f` | 5 | 3 | build `553` -> L5 |
| lasuite-docs | PR #5 `8a06cfc2` | 5 | 5 | build `523` -> L5 |
| lasuite-drive | PR #2 `6771622b` | 5 | 3 | build `524` -> L5 |
| lasuite-meet | PR #6 `05cdafb5` | 5 | 3 | build `525` -> L5 |
| mailu | PR #4 `682ccaaa` | 5 | 3 | build `526` -> L5 |
| matrix-synapse | PR #2 `72f0176a` | 5 | 3 | build `527` -> L5 |
| mattermost-lts | PR #2 `966c6d61` | 5 | 3 | build `529` -> L5 |
| mumble | PR #1 `2b50b2f7` | 5 | 5 | build `558` -> L5 |
| n8n | PR #5 `989c44b3` | 5 | 4 | build `528` -> L5 |
| plausible | PR #3 `709a294d` | 5 | 2 | build `530` -> L5 |
| uptime-kuma | PR #3 `b0ce7942` | 5 | 4 | build `531` -> L5 |
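
As a reading aid for the matrix, the sketch below flags rows whose achieved level is below the expected level. It assumes the evidence column keeps the ``build `N` -> L<level>`` shape used in the table; the helper name `below_expected` is hypothetical and the three sample rows are copied from the matrix.

```python
import re

# Matches the evidence cell format used in the matrix, e.g. "build `556` -> L5".
EVIDENCE_RE = re.compile(r"build `(\d+)` -> L(\d+)")


def below_expected(rows):
    """Return (recipe, build, level) for rows that miss their expected level.

    rows: iterable of (recipe, expected_level, evidence_cell) tuples.
    Rows whose evidence cell cannot be parsed are reported with None values.
    """
    flagged = []
    for recipe, expected, evidence in rows:
        match = EVIDENCE_RE.search(evidence)
        if match is None:
            flagged.append((recipe, None, None))
            continue
        build, level = int(match.group(1)), int(match.group(2))
        if level < expected:
            flagged.append((recipe, build, level))
    return flagged


SAMPLE_ROWS = [
    ("drone", 5, "build `506` -> L5"),
    ("ghost", 5, "build `568` -> L1 (upgrade fail)"),
    ("mumble", 5, "build `558` -> L5"),
]
print(below_expected(SAMPLE_ROWS))
```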

### Ghost deviation (blocking a formal M2 claim)

`ghost` is the only recipe still preventing an M2 claim.

- Current upgrade PR heads and fresh post-cfold outcomes are all red with the same stage shape:
  - PR #3 `720faa0b`: builds `568` and `569` -> L1; install/backup/restore/custom/lint pass, upgrade fail
  - PR #4 `d88f5801`: build `557` -> L1; install/backup/restore/custom pass, upgrade fail
  - PR #5 `d42d0f7c`: build `559` -> L1; install/backup/restore/custom/lint pass, upgrade fail
- A focused artifact audit confirms the strongest same-ref comparison: historical build `185`
  (`d42d0f7c7cf9`) had `upgrade=pass`, while fresh build `559` on the same ref has `upgrade=fail`
  with the canonical `custom` stage still green.
- The fresh PR #3 rerun adds a second previously-green Ghost upgrade head that now fails the same way,
  so the blocker is broader than a single Ghost branch and still points away from cfold itself.
- Side observation from the PR #3 retrigger: a single new `!testme` comment at `2026-06-13T00:07:50Z`
  was followed by three new Ghost runs (`568`, `569`, `570`). All three are now red with the same
  upgrade-only failure.
- Root cause of the triple trigger: bridge logs tie those three runs to three distinct comment ids on
  the reopened PR (`14029`, `14032`, `14497`), not to one comment processed three times. The poller
  replayed two historical `!testme` comments that predated the current bridge process, because PR #3
  was closed during bridge startup and only became visible to the poller after it was reopened.
- Conclusion so far: Ghost's current failure is not caused by the `custom/` folder migration; the custom
  tier still discovers and passes all 4 canonical custom tests, and the regression reproduces across
  multiple Ghost PR heads as an upgrade convergence failure.
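
The replay mechanism described above can be illustrated with a small guard. This is a sketch under stated assumptions, not the bridge fix that was deployed (its contents are not described in this document): it assumes each comment carries an id, a body, and a creation timestamp, and that the bridge records its own start time. The `Comment` type, `select_triggers`, and the sample timestamps are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Comment:
    comment_id: int
    body: str
    created_at: datetime


def select_triggers(comments, bridge_started_at, seen_ids):
    """Return `!testme` comments that should start a run.

    A comment is skipped when it predates bridge start (historical comment on
    a PR that only became visible after reopen) or when its id was already
    handled. seen_ids is updated in place.
    """
    fresh = []
    for comment in comments:
        if comment.body.strip() != "!testme":
            continue
        if comment.created_at < bridge_started_at:
            continue
        if comment.comment_id in seen_ids:
            continue
        seen_ids.add(comment.comment_id)
        fresh.append(comment)
    return fresh


# Hypothetical data: two old comments and one new one on a reopened PR.
BRIDGE_START = datetime(2026, 1, 2, tzinfo=timezone.utc)
COMMENTS = [
    Comment(1, "!testme", datetime(2026, 1, 1, 9, 0, tzinfo=timezone.utc)),
    Comment(2, "!testme", datetime(2026, 1, 1, 18, 0, tzinfo=timezone.utc)),
    Comment(3, "!testme", datetime(2026, 1, 3, 0, 0, tzinfo=timezone.utc)),
]
```

With this guard only the comment created after bridge start would trigger a run, and polling the same comment list again would trigger nothing.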

### Fresh Adversary state

- `REVIEW-cfold.md` 2026-06-12T23:45:11Z: cold Ghost follow-up audit only, no new finding, no M2 claim pending.
- `REVIEW-cfold.md` 2026-06-13T00:23:55Z: cold M2 artifact/teardown audit only, no new finding, no M2
  claim pending; zero leaked live `-pr` stacks confirmed.

---

## Baseline (pre-cfold) — custom test count per recipe

| Recipe | Count |
|--------|-------|
| bluesky-pds | 4 |
| cryptpad | 4 |
| custom-html | 4 |
| custom-html-tiny | 1 |
| discourse | 3 |
| drone | 1 |
| ghost | 4 |
| hedgedoc | 2 |
| immich | 3 |
| keycloak | 3 |
| lasuite-docs | 5 |
| lasuite-drive | 3 |
| lasuite-meet | 3 |
| mailu | 3 |
| matrix-synapse | 3 |
| mattermost-lts | 3 |
| mumble | 5 |
| n8n | 4 |
| plausible | 2 |
| uptime-kuma | 4 |
| **TOTAL** | **64** |