The functional/playwright split is purely organizational (discovery globs both with no branching; same custom tier -> L4 rung, same fixtures, same failure semantics). Migrate all custom tests to one custom/ folder; M1 proves coverage identical before/after (no silent drops), M2 is a full real-CI !testme sweep across all recipes confirming levels unchanged. cfold becomes the last phase so the queued /upgrade-all fires after it (folder change verified before upgrade).
Phase cfold — collapse custom-test folders into one custom/ + full recipe CI sweep
Mission (operator-specified): custom recipe tests currently live in TWO folders —
tests/<recipe>/functional/ and tests/<recipe>/playwright/ — a split that is purely
organizational (the harness treats both identically). Collapse them into a single
tests/<recipe>/custom/ folder, then prove the change with a full real-CI recipe sweep
confirming every recipe's !testme still works and no custom test was silently dropped.
State files (machine-docs/, per the file-location rule): machine-docs/STATUS-cfold.md,
BACKLOG-cfold.md, REVIEW-cfold.md, JOURNAL-cfold.md. DECISIONS.md shared.
1. Why this is safe (investigation already done, 2026-06-11)
The split carries ZERO semantic weight — verified:
- `runner/harness/discovery.py:103` `custom_tests()` globs `subdirs = ("functional", "playwright")` with NO branching on which folder.
- `runner/run_recipe_ci.py:579` `run_custom()` runs both in the same `custom` tier with the same pytest command.
- Same fixtures for both (`recipe`/`meta`/`live_app`/`op_state`/`deps`); playwright tests just `from playwright.sync_api import sync_playwright` directly, with no special fixture.
- Both map to the `functional` rung (L4); folder name does NOT affect tier/rung/level.
- Failure semantics identical.
So merging loses nothing.
The ONE distinction that DOES matter and MUST be preserved: a top-level
test_<op>.py is a lifecycle overlay, NOT a custom test (top-level non-lifecycle files
are not discovered). custom/ is still a subdir, so that distinction survives.
2. Implementation (P1)
- Discovery: `discovery.custom_tests()` → canonical subdir is `custom/`. To prevent SILENT coverage loss, do NOT do a blind cutover: either (RECOMMENDED) keep recognizing `functional/` / `playwright/` as deprecated aliases AND emit a loud one-line warning when a test is found in a deprecated folder, OR have discovery raise/log loudly if a non-empty `functional/` / `playwright/` remains after migration. The end-state canonical home is `custom/`; nothing may be dropped without a loud signal. Decide + record in DECISIONS.md; the Adversary reviews the choice.
- Migrate cc-ci's own tests: `git mv tests/<recipe>/{functional,playwright}/test_*.py` → `tests/<recipe>/custom/` for EVERY recipe. Preserve any per-recipe `conftest.py` / helper modules those tests import (move/adjust imports as needed; mechanical only).
- repo-local (HC2-gated) tests: recipes' OWN repo `tests/` may still use the old folder names. Keep discovery recognizing them (deprecated-alias path) OR document the rename requirement; do NOT silently stop discovering them. State the decision.
- Docs: update the placement rule everywhere: `docs/recipe-customization.md` §3 + §5.3 + the tree, `docs/testing.md` §4, `docs/enroll-recipe.md` worked examples. Regenerate anything generated.
- Unit test: `tests/unit/` coverage for `custom_tests()`: finds `custom/`, ignores top-level lifecycle overlays, (alias behavior if kept), deterministic ordering.
- Nothing else may key off the names: grep the whole repo for `functional/` and `playwright/` string literals (harness, bridge, dashboard, results, screenshot, drone pipeline) and fix every consumer. The screenshot `SCREENSHOT` hook + manifest must be unaffected.
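The recommended discovery option (canonical `custom/` plus loud deprecated aliases) could look roughly like the sketch below. It is an illustration only: the signature, the `DeprecatedTestFolderWarning` class, and the warning wording are assumptions, not the actual code in `runner/harness/discovery.py`.

```python
import warnings
from pathlib import Path

CANONICAL_DIR = "custom"
DEPRECATED_DIRS = ("functional", "playwright")  # old names, kept as aliases


class DeprecatedTestFolderWarning(UserWarning):
    """Hypothetical warning category; UserWarning subclasses are shown by default."""


def custom_tests(recipe_dir: Path) -> list[Path]:
    """Return custom test files for one recipe in deterministic order.

    Only the three subdirectories are globbed, so top-level lifecycle
    overlays such as test_<op>.py are never picked up here.
    """
    found: list[Path] = []
    for subdir in (CANONICAL_DIR, *DEPRECATED_DIRS):
        folder = recipe_dir / subdir
        if not folder.is_dir():
            continue
        for path in folder.glob("test_*.py"):
            if subdir in DEPRECATED_DIRS:
                # One loud line per test that still lives in an old folder.
                warnings.warn(
                    f"DEPRECATED custom-test folder '{subdir}/': "
                    f"move {path} to '{CANONICAL_DIR}/'",
                    DeprecatedTestFolderWarning,
                    stacklevel=2,
                )
            found.append(path)
    return sorted(found)
```

A unit test for this shape would create one file in each of the three folders plus a top-level `test_<op>.py`, then assert that exactly the three subdirectory files come back in sorted order with one warning per deprecated location.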
3. Gates
M1 — Migration complete + coverage-preserving (pre-sweep). All recipes' custom tests
relocated to custom/; discovery + docs + unit tests updated; full-repo grep shows no
stale consumer. Coverage-diff proof (cardinal, mirrors rcust M1): the SET of
discovered + executed custom tests per recipe is IDENTICAL before and after — same files,
same count, just relocated; NONE dropped, NONE newly skipped. Adversary cold-verifies the
diff from a clean checkout and confirms no consumer still keys off the old folder names
and no test assertion was weakened.
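The M1 coverage diff can be sketched as a comparison of two lists of pytest node IDs, one collected before the move and one after. The node-ID layout (`tests/<recipe>/<folder>/test_x.py::test_y`) and the helper names below are assumptions for illustration; the harness may record discovered tests differently.

```python
from collections import Counter

OLD_FOLDERS = {"functional", "playwright"}


def normalize(node_id: str) -> str:
    """Rewrite an old-folder path segment to 'custom' so relocated tests compare equal."""
    path, sep, test_name = node_id.partition("::")
    parts = ["custom" if part in OLD_FOLDERS else part for part in path.split("/")]
    return "/".join(parts) + sep + test_name


def coverage_diff(before: list[str], after: list[str]) -> dict[str, list[str]]:
    """Return tests dropped or added by the migration; both lists empty means identical.

    Counters (not sets) are used so that two same-named files from different
    old folders colliding into one custom/ file shows up as a dropped test.
    """
    old = Counter(normalize(n) for n in before)
    new = Counter(normalize(n) for n in after)
    return {
        "dropped": sorted((old - new).elements()),
        "added": sorted((new - old).elements()),
    }
```

The gate passes only when both `dropped` and `added` are empty for every recipe. Newly skipped tests are not visible in collected IDs alone, so the executed/skipped status would need a separate comparison on run results.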
M2 — Full recipe CI sweep (the operator-required proof). Run a real-CI sweep across
ALL enrolled recipes via the drone !testme path confirming every recipe's custom
tier still discovers + runs + passes its tests at the same level as its pre-cfold
baseline. Build the baseline matrix (recipe → expected level + custom-test set) BEFORE
the change. Then sweep: every recipe's !testme green, custom tests present in the run
output / manifest, levels unchanged, zero leaked apps. Max 2-3 concurrent live deploys;
canary suite green. Any deviation must be explained as cfold-neutral or fixed. Fresh
Adversary PASS → Builder writes ## DONE.
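The baseline-matrix comparison for M2 might be expressed as follows. This is a minimal sketch: the `RecipeResult` record, its fields, and the integer level encoding are hypothetical stand-ins for whatever the sweep output and manifest actually contain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecipeResult:
    """Hypothetical per-recipe record: achieved level, custom tests seen, leaked apps."""
    level: int
    custom_tests: frozenset[str]
    leaked_apps: int = 0


def sweep_deviations(
    baseline: dict[str, RecipeResult], sweep: dict[str, RecipeResult]
) -> list[str]:
    """List every way the sweep differs from the pre-cfold baseline."""
    problems: list[str] = []
    for recipe, expected in sorted(baseline.items()):
        actual = sweep.get(recipe)
        if actual is None:
            problems.append(f"{recipe}: missing from sweep")
            continue
        if actual.level != expected.level:
            problems.append(f"{recipe}: level {expected.level} -> {actual.level}")
        missing = expected.custom_tests - actual.custom_tests
        if missing:
            problems.append(f"{recipe}: custom tests not run: {sorted(missing)}")
        if actual.leaked_apps:
            problems.append(f"{recipe}: {actual.leaked_apps} leaked app(s)")
    return problems
```

An empty list corresponds to "levels unchanged, custom tests present, zero leaked apps"; each non-empty entry is a deviation that must be explained as cfold-neutral or fixed.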
4. Guardrails (binding)
- No silent coverage loss — the whole point of M1's coverage diff. A custom test that stops being discovered without a loud signal is an automatic FAIL.
- No test weakening — this is a pure relocation; assertions are untouchable. The only content changes allowed are import-path adjustments forced by the move (mechanical, Adversary-checked line-by-line).
- File-location rule still applies to loop-state files (machine-docs/).
- Real-CI etiquette: ≤2-3 concurrent deploys, teardown every dev deploy on every exit path, never git-checkout `~/.abra/recipes/<recipe>` mid-build, no secrets in logs.
- Recipe mirrors: PR only, never merge. Commit author `autonomic-bot <autonomic-bot@noreply.git.autonomic.zone>`; push every commit. CI host: no python3 on default PATH (use `cc-ci-run`).
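The concurrency and teardown etiquette above can be illustrated with a bounded worker pool and a `try`/`finally` around each deploy. The `deploy`, `run_tests`, and `teardown` callables are hypothetical placeholders supplied by the caller; the real sweep goes through the drone `!testme` path rather than a local driver like this.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

MAX_CONCURRENT_DEPLOYS = 2  # stays within the 2-3 concurrent deploy limit


def run_one(
    recipe: str,
    deploy: Callable[[str], None],
    run_tests: Callable[[str], bool],
    teardown: Callable[[str], None],
) -> bool:
    """Deploy, test, and always tear down, whatever exit path is taken."""
    try:
        deploy(recipe)
        return run_tests(recipe)
    finally:
        teardown(recipe)  # runs on success, test failure, and exceptions


def sweep(
    recipes: list[str],
    deploy: Callable[[str], None],
    run_tests: Callable[[str], bool],
    teardown: Callable[[str], None],
) -> dict[str, bool]:
    """Run every recipe with at most MAX_CONCURRENT_DEPLOYS in flight."""
    with ThreadPoolExecutor(max_workers=MAX_CONCURRENT_DEPLOYS) as pool:
        futures = {
            r: pool.submit(run_one, r, deploy, run_tests, teardown) for r in recipes
        }
    results: dict[str, bool] = {}
    for recipe, future in futures.items():
        try:
            results[recipe] = future.result()
        except Exception:
            results[recipe] = False  # a crash counts as a failed recipe
    return results
```

Putting `deploy` inside the `try` means a partially failed deploy is still torn down, which is the conservative reading of "every exit path".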
5. Definition of Done
All custom tests live under tests/<recipe>/custom/ (functional/playwright collapsed),
discovery + docs + unit tests updated, no consumer keys off the old names, coverage proven
identical before/after, and a full !testme recipe sweep is green with unchanged levels
and zero leaks. M1 + M2 fresh Adversary PASSes.