plan: phase 'cfold' — collapse functional/+playwright/ into custom/ + full !testme recipe sweep (queued after drone)

The functional/playwright split is purely organizational (discovery globs both
with no branching; same custom tier -> L4 rung, same fixtures, same failure
semantics). Migrate all custom tests to one custom/ folder; M1 proves coverage
identical before/after (no silent drops), M2 is a full real-CI !testme sweep
across all recipes confirming levels unchanged. cfold becomes the last phase so
the queued /upgrade-all fires after it (folder change verified before upgrade).
autonomic-bot
2026-06-11 22:52:45 +00:00
parent 79134a94e8
commit af2b2e8156
2 changed files with 101 additions and 0 deletions

@@ -498,3 +498,13 @@ session cc-ci-orchestrator-stale can be killed; recipe-mirrors org still private
kickoff → machine-docs/STATUS-mailu.md); watchdog bounced. resolve_state/INBOX already
read machine-docs/ first so phase_done unaffected.
- Memory notes committed+pushed (cc-ci-orch c33b21f) per AGENTS.md 'memory lives in repo'.
## 2026-06-11 ~22:05 — phase `cfold` queued after drone (+ recipe CI sweep)
- Operator: collapse custom-test folders functional/ + playwright/ → one custom/ folder
(the split is purely organizational — verified: discovery.py globs both with no
branching, same tier/rung/fixtures/failure semantics). Plan:
plan-phase-cfold-custom-folder.md. M2 = full !testme recipe sweep proving no recipe's
custom tests silently dropped + levels unchanged (the operator-required sweep).
- .phases-spec now …drone;cfold (10 phases). cfold is the new LAST phase, so the
.run-upgrade-on-complete hook fires /upgrade-all AFTER cfold — correct order (folder
change swept-green before the weekly upgrade runs). Watchdog bounced to load it.

@@ -0,0 +1,91 @@
# Phase `cfold` — collapse custom-test folders into one `custom/` + full recipe CI sweep
**Mission (operator-specified):** custom recipe tests currently live in TWO folders —
`tests/<recipe>/functional/` and `tests/<recipe>/playwright/` — a split that is purely
organizational (the harness treats both identically). Collapse them into a single
`tests/<recipe>/custom/` folder, then prove the change with a full real-CI recipe sweep
confirming every recipe's `!testme` still works and no custom test was silently dropped.
State files (machine-docs/, per the file-location rule): `machine-docs/STATUS-cfold.md`,
`BACKLOG-cfold.md`, `REVIEW-cfold.md`, `JOURNAL-cfold.md`. DECISIONS.md shared.
## 1. Why this is safe (investigation already done, 2026-06-11)
The split carries ZERO semantic weight — verified:
- `runner/harness/discovery.py:103 custom_tests()` globs `subdirs = ("functional",
"playwright")` with NO branching on which folder.
- `runner/run_recipe_ci.py:579 run_custom()` runs both in the same `custom` tier with the
same pytest command.
- Same fixtures for both (`recipe`/`meta`/`live_app`/`op_state`/`deps`); playwright tests
just `from playwright.sync_api import sync_playwright` directly — no special fixture.
- Both map to the `functional` rung (L4); folder name does NOT affect tier/rung/level.
- Failure semantics identical. So merging loses nothing.
The ONE distinction that DOES matter and MUST be preserved: a **top-level**
`test_<op>.py` is a *lifecycle overlay*, NOT a custom test (top-level non-lifecycle files
are not discovered). `custom/` is still a subdir, so that distinction survives.
## 2. Implementation (P1)
1. **Discovery:** `discovery.custom_tests()` → canonical subdir is `custom/`. To prevent
SILENT coverage loss, do NOT do a blind cutover: either (RECOMMENDED) keep recognizing
`functional/`/`playwright/` as deprecated aliases AND emit a loud one-line warning when
a test is found in a deprecated folder, OR have discovery raise/log loudly if a
non-empty `functional/`/`playwright/` remains after migration. The end-state canonical
home is `custom/`; nothing may be dropped without a loud signal. Decide + record in
DECISIONS.md; the Adversary reviews the choice.
2. **Migrate cc-ci's own tests:** `git mv tests/<recipe>/{functional,playwright}/test_*.py`
→ `tests/<recipe>/custom/` for EVERY recipe. Preserve any per-recipe `conftest.py` /
helper modules those tests import (move/adjust imports as needed — mechanical only).
3. **Repo-local (HC2-gated) tests:** recipes' OWN repo `tests/` may still use the old
folder names. Keep discovery recognizing them (deprecated-alias path) OR document the
rename requirement — do NOT silently stop discovering them. State the decision.
4. **Docs:** update the placement rule everywhere — `docs/recipe-customization.md` §3
+ §5.3 + the tree, `docs/testing.md` §4, `docs/enroll-recipe.md` worked examples.
Regenerate anything generated.
5. **Unit test:** `tests/unit/` coverage for `custom_tests()`: it finds `custom/`, ignores
top-level lifecycle overlays, covers alias behavior (if aliases are kept), and returns
tests in a deterministic order.
6. **Nothing else may key off the names:** grep the whole repo for `functional/` and
`playwright/` string literals (harness, bridge, dashboard, results, screenshot, drone
pipeline) and fix every consumer. The screenshot `SCREENSHOT` hook + manifest must be
unaffected.
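
The RECOMMENDED option in step 1 (canonical `custom/` plus loud deprecated aliases) could look roughly like the sketch below. This is an illustration under assumptions, not the actual `discovery.custom_tests()`: the signature, the constant names, and the use of `warnings.warn` as the "loud signal" are all hypothetical.

```python
import warnings
from pathlib import Path

# Hypothetical constants; the real discovery module may name these differently.
CANONICAL_SUBDIR = "custom"
DEPRECATED_SUBDIRS = ("functional", "playwright")  # aliases kept only during migration


def custom_tests(recipe_dir: Path) -> list[Path]:
    """Return one recipe's custom test files in a deterministic order.

    Only subdirectories are scanned, so a top-level test_<op>.py
    (a lifecycle overlay) is never returned as a custom test.
    """
    found: list[Path] = []
    for subdir in (CANONICAL_SUBDIR, *DEPRECATED_SUBDIRS):
        folder = recipe_dir / subdir
        if not folder.is_dir():
            continue
        for path in folder.glob("test_*.py"):
            if subdir in DEPRECATED_SUBDIRS:
                # Loud signal: the test still runs, but its stale location is flagged.
                warnings.warn(
                    f"{path}: '{subdir}/' is deprecated; move this test to '{CANONICAL_SUBDIR}/'",
                    UserWarning,
                    stacklevel=2,
                )
            found.append(path)
    return sorted(found)
```

Sorting the combined list keeps ordering stable across filesystems, which is what the unit test in step 5 would pin down. A test left in a deprecated folder is still returned, so nothing is dropped silently.
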
## 3. Gates
**M1 — Migration complete + coverage-preserving (pre-sweep).** All recipes' custom tests
relocated to `custom/`; discovery + docs + unit tests updated; full-repo grep shows no
stale consumer. **Coverage-diff proof (cardinal, mirrors rcust M1):** the SET of
discovered + executed custom tests per recipe is IDENTICAL before and after — same files,
same count, just relocated; NONE dropped, NONE newly skipped. Adversary cold-verifies the
diff from a clean checkout and confirms no consumer still keys off the old folder names
and no test assertion was weakened.
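
One way to mechanize the M1 coverage diff is to compare discovered tests by (recipe, file name) while ignoring the folder they sit in. The sketch below assumes paths of the form `tests/<recipe>/<folder>/test_*.py`; the function names and the extra name-collision check are hypothetical additions, and it covers only the discovered set (executed and skipped counts would have to come from the pytest output).

```python
from collections import Counter
from pathlib import PurePosixPath


def coverage_key(test_path: str) -> tuple[str, str]:
    """Map 'tests/<recipe>/<folder>/test_x.py' to (recipe, file name)."""
    parts = PurePosixPath(test_path).parts
    return parts[-3], parts[-1]


def coverage_diff(before: list[str], after: list[str]) -> dict[str, list[tuple[str, str]]]:
    """Compare discovered custom tests before and after the relocation."""
    before_counts = Counter(coverage_key(p) for p in before)
    after_counts = Counter(coverage_key(p) for p in after)
    return {
        # Present before the move, absent after it: a silent drop.
        "dropped": sorted(before_counts.keys() - after_counts.keys()),
        # Present only after the move: unexpected for a pure relocation.
        "added": sorted(after_counts.keys() - before_counts.keys()),
        # Same file name in both old folders: would collide inside custom/.
        "name_collisions": sorted(k for k, n in before_counts.items() if n > 1),
    }
```

For M1 to hold under this sketch, all three lists would be empty for every recipe.
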
**M2 — Full recipe CI sweep (the operator-required proof).** Run a real-CI sweep across
ALL enrolled recipes via the **drone `!testme` path** confirming every recipe's custom
tier still discovers + runs + passes its tests at the same level as its pre-cfold
baseline. Build the baseline matrix (recipe → expected level + custom-test set) BEFORE
the change. Then sweep: every recipe's `!testme` green, custom tests present in the run
output / manifest, levels unchanged, zero leaked apps. Max 2-3 concurrent live deploys;
canary suite green. Any deviation must be explained as cfold-neutral or fixed. Fresh
Adversary PASS → Builder writes `## DONE`.
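
The baseline matrix comparison for M2 can be sketched as a plain data check. The `RecipeResult` shape, the level strings, and `sweep_deviations` are hypothetical; how levels and custom-test names are extracted from the run output or manifest is not shown and would depend on the real harness.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecipeResult:
    """One row of the matrix: recipe level plus its custom-test file names."""
    level: str
    custom_tests: frozenset[str]


def sweep_deviations(
    baseline: dict[str, RecipeResult], sweep: dict[str, RecipeResult]
) -> list[str]:
    """List every difference between the pre-cfold baseline and the sweep."""
    deviations: list[str] = []
    for recipe in sorted(baseline.keys() | sweep.keys()):
        expected, actual = baseline.get(recipe), sweep.get(recipe)
        if expected is None:
            deviations.append(f"{recipe}: swept but absent from baseline")
        elif actual is None:
            deviations.append(f"{recipe}: in baseline but missing from sweep")
        else:
            if actual.level != expected.level:
                deviations.append(f"{recipe}: level {expected.level} -> {actual.level}")
            for name in sorted(expected.custom_tests - actual.custom_tests):
                deviations.append(f"{recipe}: custom test {name} not in run output")
    return deviations
```

An empty list corresponds to "levels unchanged, custom tests present"; each entry is a deviation that must be explained as cfold-neutral or fixed. Leaked apps and canary status are outside this check.
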
## 4. Guardrails (binding)
- **No silent coverage loss** — the whole point of M1's coverage diff. A custom test that
stops being discovered without a loud signal is an automatic FAIL.
- **No test weakening** — this is a pure relocation; assertions are untouchable. The only
content changes allowed are import-path adjustments forced by the move (mechanical,
Adversary-checked line-by-line).
- **File-location rule** still applies to loop-state files (machine-docs/).
- Real-CI etiquette: at most 2-3 concurrent deploys, teardown every dev deploy on every exit
path, never git-checkout `~/.abra/recipes/<recipe>` mid-build, no secrets in logs.
- Recipe mirrors: PR only, never merge. Commit author `autonomic-bot
<autonomic-bot@noreply.git.autonomic.zone>`; push every commit. CI host: no python3 on
default PATH (use `cc-ci-run`).
## 5. Definition of Done
All custom tests live under `tests/<recipe>/custom/` (functional/playwright collapsed),
discovery + docs + unit tests updated, no consumer keys off the old names, coverage proven
identical before/after, and a full `!testme` recipe sweep is green with unchanged levels
and zero leaks. M1 and M2 each have a fresh Adversary PASS.