Adversary REVIEW-cf55.md 2026-06-13T05:13:45Z: M1 PASS + M2 NO COVERAGE LOST. All 7 review categories passed independently. Phase cf55 complete.
STATUS — phase cf55
Phase: cf55 — GPT-5.5 post-cfold coverage-loss review
Builder: autonomic-bot
Model: claude-sonnet-4-6 (orchestrator-invoked via Claude Code; plan specified openai/gpt-5.5, but prior GPT-5.4 loops stopped on model mismatch — orchestrator relaunched on Claude)
Updated: 2026-06-13T05:18Z
DONE
Phase result: REVIEW-cf55.md 2026-06-13T05:13:45Z → M1 PASS + M2 NO COVERAGE LOST
Done criteria satisfied:
- M1 PASS at `REVIEW-cf55.md` 2026-06-13T05:13:45Z (combined M1+M2 Adversary verdict)
- M2 PASS / NO COVERAGE LOST confirmed independently by Adversary
- All 7 review categories passed: diff review, discovery parity, assertion preservation, old-folder behavior, lifecycle-overlay separation, evidence audit, cleanliness
- No blocking findings
M1 — PASS
Gate result: REVIEW-cf55.md 2026-06-13T05:13:45Z → M1 PASS
WHAT:
- cf55 review matrix complete, covering all 7 required review categories across 20 enrolled recipes
- Implementation commit under review: `44e0242` (feat(cfold): canonicalize custom test layout)
- cfold phase M1 PASS (2026-06-12T16:20Z) + M2 PASS (2026-06-13T04:11:00Z) reviewed
HOW (Adversary can verify these from a fresh clone):
- `git ls-files "tests/*/custom/test_*.py" | wc -l` → 64
- `git ls-files "tests/*/functional/*" "tests/*/playwright/*" | grep test_ | wc -l` → 0
- Per-recipe count check (exact match vs pre-cfold baseline):

      for recipe in bluesky-pds cryptpad custom-html custom-html-tiny discourse drone ghost hedgedoc immich keycloak lasuite-docs lasuite-drive lasuite-meet mailu matrix-synapse mattermost-lts mumble n8n plausible uptime-kuma; do
        count=$(git ls-files "tests/$recipe/custom/test_*.py" | wc -l)
        printf "%s %s\n" "$recipe" "$count"
      done

- `nix shell nixpkgs#python311Packages.pytest -c pytest tests/unit/test_discovery.py tests/unit/test_discovery_phase2.py tests/unit/test_manifest.py -q` → 18 passed
- Lifecycle-overlay check: `git ls-files "tests/*/custom/test_install.py" "tests/*/custom/test_upgrade.py" "tests/*/custom/test_backup.py" "tests/*/custom/test_restore.py"` → empty
- Deprecated-alias warning probe:

      # Run from repo root:
      python3 -c "
      import sys,os,tempfile,unittest.mock as mock
      sys.path.insert(0,'runner')
      from harness import discovery
      with tempfile.TemporaryDirectory() as tmp:
          d=os.path.join(tmp,'tests','probe')
          os.makedirs(os.path.join(d,'functional'))
          os.makedirs(os.path.join(d,'playwright'))
          open(os.path.join(d,'functional','test_old.py'),'w').write('#x')
          open(os.path.join(d,'playwright','test_ui.py'),'w').write('#x')
          with mock.patch.object(discovery,'cc_ci_dir',lambda r: os.path.join(tmp,'tests',r)):
              result=discovery.custom_tests('probe',None)
          print('found:',[os.path.basename(p) for _,p in result])
      " 2>&1

  Expected: 2 `WARNING [cfold]: test found in deprecated folder` lines + `found: ['test_old.py', 'test_ui.py']`
- RUNG name preserved: `grep 'functional' runner/harness/level.py` → `RUNGS = (..., "functional", ...)` still present
- `git status` → clean working tree
EXPECTED:
- Command 1: `64`
- Command 2: `0`
- Command 3: matches pre-cfold baseline exactly (see table below)
- Command 4: `18 passed`
- Command 5: empty (no lifecycle overlays in custom/)
- Command 6: 2 deprecation warnings, both test files found
- Command 7: "functional" still in RUNGS
- Command 8: `nothing to commit, working tree clean`
WHERE:
- Implementation commit: `44e0242`
- Discovery: `runner/harness/discovery.py`
- Manifest: `runner/harness/manifest.py`
- Unit tests: `tests/unit/test_discovery.py`, `tests/unit/test_discovery_phase2.py`, `tests/unit/test_manifest.py`
- Migrated custom tests: `tests/*/custom/`
- Lifecycle overlays: `tests/*/test_install.py`, `tests/*/test_upgrade.py`, etc. (top-level only)
- Level/RUNG names: `runner/harness/level.py`
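
The "top-level only" rule for lifecycle overlays can also be checked in a script rather than by eye. The following is a minimal sketch, assuming the input is a list of tracked paths such as `git ls-files` would print. The function name `misplaced_lifecycle_overlays` and the sample file names are hypothetical and are not part of the repository.

```python
from pathlib import PurePosixPath

# The four lifecycle overlay file names listed in this report.
LIFECYCLE_FILES = {
    "test_install.py",
    "test_upgrade.py",
    "test_backup.py",
    "test_restore.py",
}


def misplaced_lifecycle_overlays(paths):
    """Return paths where a lifecycle overlay sits in tests/<recipe>/custom/."""
    bad = []
    for raw in paths:
        parts = PurePosixPath(raw).parts
        if (
            len(parts) == 4
            and parts[0] == "tests"
            and parts[2] == "custom"
            and parts[3] in LIFECYCLE_FILES
        ):
            bad.append(raw)
    return bad


# Hypothetical sample input standing in for `git ls-files` output.
tracked = [
    "tests/mailu/test_backup.py",          # top-level overlay: allowed
    "tests/mailu/custom/test_smtp.py",     # ordinary custom test: allowed
    "tests/ghost/custom/test_install.py",  # overlay inside custom/: flagged
]
print(misplaced_lifecycle_overlays(tracked))
```

An empty result corresponds to the "empty" expectation of the lifecycle-overlay check.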
Review Matrix
Pre-cfold baseline (from cfold STATUS-cfold.md)
| Recipe | Pre-cfold count | Post-cfold count | Match |
|---|---|---|---|
| bluesky-pds | 4 | 4 | ✓ |
| cryptpad | 4 | 4 | ✓ |
| custom-html | 4 | 4 | ✓ |
| custom-html-tiny | 1 | 1 | ✓ |
| discourse | 3 | 3 | ✓ |
| drone | 1 | 1 | ✓ |
| ghost | 4 | 4 | ✓ |
| hedgedoc | 2 | 2 | ✓ |
| immich | 3 | 3 | ✓ |
| keycloak | 3 | 3 | ✓ |
| lasuite-docs | 5 | 5 | ✓ |
| lasuite-drive | 3 | 3 | ✓ |
| lasuite-meet | 3 | 3 | ✓ |
| mailu | 3 | 3 | ✓ |
| matrix-synapse | 3 | 3 | ✓ |
| mattermost-lts | 3 | 3 | ✓ |
| mumble | 5 | 5 | ✓ |
| n8n | 4 | 4 | ✓ |
| plausible | 2 | 2 | ✓ |
| uptime-kuma | 4 | 4 | ✓ |
| TOTAL | 64 | 64 | MATCH |
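
The table comparison can be automated. The sketch below is illustrative only: it assumes a list of paths shaped like the output of `git ls-files "tests/*/custom/test_*.py"`. The names `count_custom_tests` and `parity_report` are hypothetical, the baseline dictionary holds just three rows of the table for brevity, and the file names are made up.

```python
from collections import Counter
from pathlib import PurePosixPath


def count_custom_tests(paths):
    """Count files per recipe for paths shaped like tests/<recipe>/custom/test_*.py."""
    counts = Counter()
    for raw in paths:
        parts = PurePosixPath(raw).parts
        if (
            len(parts) == 4
            and parts[0] == "tests"
            and parts[2] == "custom"
            and parts[3].startswith("test_")
            and parts[3].endswith(".py")
        ):
            counts[parts[1]] += 1
    return dict(counts)


def parity_report(baseline, current):
    """Return {recipe: (baseline_count, current_count)} for every mismatch."""
    recipes = set(baseline) | set(current)
    return {
        r: (baseline.get(r, 0), current.get(r, 0))
        for r in sorted(recipes)
        if baseline.get(r, 0) != current.get(r, 0)
    }


# Illustrative subset of the baseline table (not all 20 recipes).
baseline = {"drone": 1, "hedgedoc": 2, "custom-html-tiny": 1}
# Hypothetical file names standing in for `git ls-files` output.
tracked = [
    "tests/drone/custom/test_a.py",
    "tests/hedgedoc/custom/test_a.py",
    "tests/hedgedoc/custom/test_b.py",
    "tests/custom-html-tiny/custom/test_a.py",
    "tests/hedgedoc/custom/_helper.py",  # helper module, not counted
]
mismatches = parity_report(baseline, count_custom_tests(tracked))
print(mismatches or "per-recipe counts match baseline")
```

Matching on path position rather than substring keeps recipes such as custom-html from being confused with the `custom/` folder.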
Category review results
1. Diff review (44e0242):
- `discovery.py`: added `custom/` as canonical; `functional/` + `playwright/` become deprecated aliases with a loud `WARNING [cfold]` on stderr. Still discovers from all 3 subdirs, so no coverage loss.
- `manifest.py`: always normalizes the `sub` key to `"custom"` for clean output. Correct.
- `tests/mailu/ops.py`, `test_backup.py`, `test_restore.py`: `sys.path.insert` updated from `functional` → `custom` to match the new location of the helper `_mailu.py`. Correct: these are lifecycle overlays importing a helper.
- `tests/ghost/recipe_meta.py`: comment-only change (`functional/_ghost.py` → `custom/_ghost.py`). No coverage loss.
- `tests/drone/install_steps.sh`: comment-only change. No coverage loss.
- Keycloak custom test files: `sys.path.insert` depth adjusted (`../..` → `../../..`) due to moving from `functional/` to `custom/` (same directory depth). Correct.
- All 60 functional + 4 playwright test files: pure `git mv` (0 insertions/deletions in stat for most; path-comment updates only for a few). No assertion changes.
- Unit tests: fixtures updated from `functional/` + `playwright/` to `custom/`; new test `test_custom_tests_prefers_custom_and_warns_on_deprecated_aliases` added. No coverage removed; one test renamed (`test_custom_tests_placement_rule_functional_playwright_only` → `test_custom_tests_placement_rule_custom_only`) with the same assertions preserved.
2. Discovery parity: PASS — 64 custom tests in tests/*/custom/test_*.py, zero in tests/*/functional/ or tests/*/playwright/. Per-recipe counts match pre-cfold baseline exactly.
3. Assertion preservation: PASS — All 64 test files contain unmodified assertion bodies. Changes were: git mv, path-comment updates, sys.path.insert depth adjustments. Zero assertions removed, zero tests skipped, zero waits relaxed.
4. Old-folder behavior: PASS — Deprecated functional/+playwright/ subdirs are still in subdirs tuple in discovery.py, still discovered, with WARNING [cfold] emitted per deprecated file found. Tests still run (no silent drop). Probe confirms: both deprecated dirs emit warnings AND return the test files.
5. Lifecycle-overlay separation: PASS — Lifecycle overlays (test_install.py, test_upgrade.py, test_backup.py, test_restore.py) remain at tests/<recipe>/ top-level. Zero lifecycle files in custom/. The RUNG name "functional" (L4) is unchanged in runner/harness/level.py:44 — only the folder name changed, not the tier name.
6. Evidence audit: PASS — cfold M1 PASS (2026-06-12T16:20Z): 64 canonical tests, zero old-tracked trees, 18 passed, deprecated-alias probe green, exact (recipe, filename) coverage set preserved. M2 PASS (2026-06-13T04:11:00Z): full real-CI !testme sweep green across all 20 enrolled recipes at L5 with expected custom junit counts; build 585 (ghost) passes at L5 with custom=4, upgrade=2; zero leaked live -pr stacks.
7. Cleanliness: PASS — Working tree clean (git status: nothing to commit). No root-level coordination files. No stale temporary scripts. No uncommitted implementation files. machine-docs/ contains only expected phase-namespaced state files.
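
The old-folder behavior described in category 4 (canonical `custom/` plus deprecated aliases that warn but are still discovered) can be illustrated with a small standalone sketch. This is not the code in `runner/harness/discovery.py`: the function `discover_custom_tests`, its `warn` parameter, and the exact message suffix are assumptions made for illustration. Only the `WARNING [cfold]: test found in deprecated folder` prefix comes from the probe output above.

```python
import os
import sys
import tempfile

CANONICAL = "custom"
DEPRECATED = ("functional", "playwright")


def discover_custom_tests(recipe_dir, warn=None):
    """Return (subdir, path) pairs for test_*.py in custom/ and the deprecated aliases."""
    if warn is None:
        warn = lambda msg: print(msg, file=sys.stderr)
    found = []
    for sub in (CANONICAL,) + DEPRECATED:
        folder = os.path.join(recipe_dir, sub)
        if not os.path.isdir(folder):
            continue
        for name in sorted(os.listdir(folder)):
            if not (name.startswith("test_") and name.endswith(".py")):
                continue
            path = os.path.join(folder, name)
            if sub in DEPRECATED:
                # Warn loudly, but keep the test: no silent drop.
                warn(f"WARNING [cfold]: test found in deprecated folder: {path}")
            found.append((sub, path))
    return found


# Build a throwaway recipe directory with one test per subfolder.
with tempfile.TemporaryDirectory() as tmp:
    layout = (
        ("custom", "test_new.py"),
        ("functional", "test_old.py"),
        ("playwright", "test_ui.py"),
    )
    for sub, name in layout:
        os.makedirs(os.path.join(tmp, sub))
        with open(os.path.join(tmp, sub, name), "w") as fh:
            fh.write("# placeholder\n")
    warnings = []
    result = discover_custom_tests(tmp, warn=warnings.append)
    names = [os.path.basename(p) for _, p in result]

print("found:", names)
print("warnings:", len(warnings))
```

Passing the warning sink as a parameter is a design choice of this sketch: it lets a test count warnings without capturing stderr.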
Final Verdict
NO COVERAGE LOST.
The cfold phase (44e0242) preserved the full pre-cfold custom-test set. All 64 custom tests are in canonical tests/<recipe>/custom/ directories with per-recipe counts matching the pre-cfold baseline exactly. No assertions were weakened during the move. Deprecated functional/ and playwright/ aliases continue to discover and warn. Lifecycle overlays remain at top-level. The RUNG name "functional" is unchanged. The full real-CI sweep is green at L5 across all 20 enrolled recipes.