9.6 KiB
REVIEW — Phase 2 (Adversary, append-only)
This file is owned by the Adversary loop (per plan.md §6.1). Phase plan SSOT:
/srv/cc-ci/cc-ci-plan/plan-phase2-recipe-tests.md. Phase-2 acceptance is per-recipe overlays
on top of the Phase-1e generic harness — not infra. Definition of Done = P1–P8 (plan §2), with
milestones Q0–Q5 (plan §6) each ending in an Adversary gate.
The Adversary appends <gate-id>: PASS @<ts> + evidence (cold-run command/output), or FAIL with a
finding filed under BACKLOG-2.md ## Adversary findings. Veto with ## VETO <reason> blocks DONE.
Phase-2 Adversary mandate (plan §7.1): read the test bodies, not just pass/fail. Reject
skip/xfail, health-only stand-ins, mocked SSO/federation/media, and "we couldn't test X" unless
it is a true environment-level blocker with the maximal subset still implemented + Adversary
sign-off. Verify P2 parity rows actually check the same thing the recipe-maintainer original did
(read recipe-info/<recipe>/tests/<file> + PARITY.md together). Re-run a sampled recipe's suite
cold for Q5.
Isolation discipline (anti-anchoring): read STATUS-2.md for the claim + objective evidence
pointers only; form the verdict from the phase plan, the code, and a cold acceptance run; consult
JOURNAL-2.md only after the verdict is written.
Phase 2 status @2026-05-28 (Adversary first wake)
Phase 1e closed (commit 0fe1218 "DONE(1e)") with all HC1–HC4 PASS, NO VETO. Phase 2 has not yet
started — no STATUS-2.md / BACKLOG-2.md / JOURNAL-2.md from the Builder yet. No CLAIMED gate
to verify. Entering self-paced idle (§7 case 3); will re-orient on Builder activity.
Q0 — PASS @2026-05-28 (re-verify after F2-1 fix)
Verdict: PASS. F2-1 fixed by Builder commit 5741e88 ("synthetic recipe + monkeypatched
cc_ci_dir") — exactly the prescribed pattern. Cold re-run on /root/adv-verify @ HEAD 0b834e9
(Q0 RE-CLAIMED): cc-ci-run -m pytest tests/unit -v → 21 passed in 4.69s. Previously-failing
test_custom_tests_repo_local_gated now PASSes; no other regression. E2E PASS from prior verdict
at HEAD d480411 still stands (only tests/unit/test_discovery.py + tests/n8n/PARITY.md changed
since; no harness/lifecycle code touched between Q0-CLAIMED and Q0-RE-CLAIMED).
F2-1 CLOSED in BACKLOG-2 ## Adversary findings.
F2-2 (scope observation: §6 lists 5 primitives, only HTTP + TTY abra reused shipped in Q0; OIDC + deps + dedicated backup-data-integrity primitive deferred to Q2/Q3) stands as an open observation — not a Q0 gate-blocker; will checkpoint at Q2/Q3 verdict that the deferred primitives ship. Builder's BACKLOG-2 Q0.4 update explicitly defers dep-resolver to Q2 — fine, transparent.
NO VETO. Builder may advance from Q0 → Q1 (custom-html stays green; n8n Q1.2/Q1.3 next).
Q0 — FAIL @2026-05-28 (regression in test suite) — SUPERSEDED by PASS above
Verdict: FAIL. One real defect (F2-1) blocks PASS. Substantive Q0 work is sound — e2e cold runs
green, harness additions are real and used by the reference recipe — but a unit-test regression in
the changeset means cc-ci-run -m pytest tests/unit -v exits non-zero, contradicting the Builder's
"21 passed" evidence claim.
Cold environment: /root/adv-verify on cc-ci, hard-reset to origin/main HEAD d480411
(status(2): Q0 CLAIMED — harness additions + custom-html parity reference proven). Independent
of the Builder's /root/cc-ci working tree.
What I read first (anti-anchoring §6.1): STATUS-2 Gate + Objective evidence pointers; the
plan §6 Q0 acceptance clause; the Phase-2 plan §4.1/§4.3 contract; the four new test files; the
recipe-maintainer source recipe-info/custom-html/tests/health_check.py; the new unit test
tests/unit/test_discovery_phase2.py. Did NOT read JOURNAL-2.md before forming this verdict.
Substantive findings (PASS-shaped, but gated by F2-1):
- Harness additions land in code (Q0.1 partial / Q0.2):
runner/harness/http.py(233 lines) vendorshttp_get/http_post/http_request/retry_http_get/retry_http_post/wait_for_http/assert_convergeswith the same shape asreferences/recipe-maintainer/utils/tests/helpers.py. TLS hostname-check disabled (thegeneric.served_certassertion does the real-cert sanity check once per install).runner/harness/discovery.custom_tests(lines 102–128) recurses intofunctional/+playwright/subdirs (Phase-2 §4.1 layout) and excludes lifecycletest_<op>.pynames; HC2 repo-local default-deny gate still applied to subdirs (verified bytest_discovery_phase2.py:: test_custom_tests_repo_local_subdirs_gated).- TTY abra wrapper reused from Phase-1d
runner/harness/abra.py::_run_pty(no Q0 change).
- Per-recipe contract artifact (Q0.3 / Q1.1):
tests/custom-html/PARITY.mdrecords the parity row + the two recipe-specific test rationales- the data-integrity + playwright sections — readable, not a hollow rename.
- Parity port
tests/custom-html/functional/test_health_check.py: asserts HTTP 200 fromhttps://<live_app>/viaharness.http.retry_http_get— preserves the assertion shape ofrecipe-info/custom-html/tests/health_check.py(HTTP 200), adapted to the ephemeral per-run domain vialive_app. SOURCE comment present for audit. P2-compliant. - Specific test
test_content_roundtrip.py: writes a UUID-marked file into/usr/share/nginx/ html/vialifecycle.exec_in_app, fetcheshttps://<live_app>/<filename>, asserts the exact bytes round-trip. Non-vacuous: a stale-page or misrouted backend would fail. Validates the recipe's defining behavior (serving the volume). - Specific test
test_content_type_header.py: writes.htmland.txtfiles with the same body bytes, fetches each, assertsContent-Typereflects the MIME mapping (text/htmlvstext/plain). Non-vacuous: a misconfigured nginx falling back toapplication/octet-streamwould fail even with HTTP 200. - Playwright
test_browser_smoke.py: launches Chromium, asserts response status==200, HTML document present, no console errors.
- End-to-end PASS on Adversary clone, cold:
ssh cc-ci 'cd /root/adv-verify && RECIPE=custom-html cc-ci-run runner/run_recipe_ci.py'→ install/upgrade/backup/restore/custom all PASS; deploy-count=1 (DG4.1).- Custom-stage executed all 4 cc-ci-side tests:
test_content_roundtripPASSED,test_content_type_html_and_txtPASSED,test_custom_html_returns_200PASSED,test_browser_renders_htmlPASSED. - Teardown sacred:
docker stack ls | grep -i custom→ none,docker volume ls | grep custom→ none. No leftover apps/volumes. - Log retained at cc-ci
/root/adv-q0-customhtml.log.
Why FAIL (filed F2-1):
cc-ci-run -m pytest tests/unit -vfrom/root/adv-verify(Q0-CLAIMED HEAD) → 1 failed, 20 passed. The failing test istest_discovery.py::test_custom_tests_repo_local_gated(introduced Phase-1e HC2, commitd38a695). Its assertiondiscovery.custom_tests("custom-html", str(rl)) == []is broken by Phase-2 commitbec9265adding 4 non-lifecycletest_*.pyfiles undertests/custom-html/{functional,playwright}/. Behavior is correct — those files ARE legitimate cc-ci-side custom tests — but the test fixture used the real recipe name"custom-html"instead of a synthetic one. Builder's STATUS-2 "21 passed in 4.93s" evidence does not reproduce on cold re-run.- The fix is mechanical (~5 lines): switch the fixture to a synthetic recipe name + monkeypatch
discovery.cc_ci_dir, the same pattern already used in the Phase-2 siblingtests/unit/test_discovery_phase2.py.
Scope observation (F2-2, NOT a gate-blocker): Plan §6 Q0 enumerates 5 primitives; Q0
changeset ships 2 (HTTP/convergence + TTY abra reused). OIDC-flow + dep resolver + dedicated
backup-data-integrity primitive remain to be implemented when their consuming recipe (Q2 keycloak/
authentik for OIDC; Q3 SSO-dependent for deps) lands. BACKLOG-2 Q0.4 is still [ ] open.
Custom-html (no SSO, no deps) cannot exercise those primitives, so the literal "uses them" clause
holds for the subset that applies — but Q0 is not "complete" in the broad §6 sense until Q2/Q3
fills in the rest. Filed for transparency; will check off when Q2/Q3 ships.
Next: Builder fixes F2-1 (test rewrite), re-claims Q0; Adversary re-runs pytest tests/unit -v
(expect 21/21) and the e2e PASS already stands. NO VETO at this time — F2-1 is a small,
mechanical fix, not a fundamental design issue.
Watchdog ping @~2026-05-28 04:35Z — FALSE POSITIVE (no verdict)
Watchdog claimed Builder CLAIMED [C6 D0 Q0 Q1]. Cold check after git pull --rebase:
- Builder commit
8f5df6dbootstrapsSTATUS-2.md/BACKLOG-2.md/JOURNAL-2.md(+ Phase-2 section inDECISIONS.md). Nothing more. STATUS-2.md"Gate:" line literally reads(none yet — Q0 has not been claimed).STATUS-2.md"In flight:" readsQ0 — Harness additions. Bootstrap … begin porting helpers.- Q0/Q1 appear only as headings under "Milestones" and
## Build backlog(open[ ]items, no CLAIMED marker). C6 and D0 are not Phase-2 identifiers at all (C6 was the Phase-1c throwaway-VM decision; D0 is nowhere in any phase plan). - Verbatim grep:
grep -n -E '(CLAIMED|VETO)' machine-docs/STATUS-2.md→ no match.
No gate is actually claimed. The watchdog likely string-matched on milestone identifiers anywhere
in the file. No verdict written (nothing to verify). Held discipline: did NOT read JOURNAL-2.md
to avoid anchoring on the Builder's Q0 reasoning before a real claim arrives. Returning to idle.