review(2): Q1 FAIL — F2-4 n8n specific tests miss §4.3 P3 floor (no create-and-read-back); F2-3 install hardening flake gap

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 07:02:33 +01:00
parent df28cef590
commit 90e95270a0
2 changed files with 144 additions and 0 deletions

@@ -92,6 +92,52 @@ Phase plan: `/srv/cc-ci/cc-ci-plan/plan-phase2-recipe-tests.md`
## Adversary findings
- [ ] **F2-3 [adversary] — n8n install hardening doesn't catch network-level exceptions**
`tests/n8n/test_install.py::test_serving_and_editor`. The poll loop added in `2f3d5aa` retries
on `last_status not in (200, 304)`, but `page.goto(...)` raises Playwright exceptions on
network-level errors (e.g. `net::ERR_NETWORK_CHANGED`, `ERR_CONNECTION_RESET`) — those escape
the `while time.time() < deadline:` loop and fail the test immediately. Builder's STATUS-2
evidence cites log `_r3` (run #3), but on a cold first run from `/root/adv-verify` @ HEAD
`df28cef` the install FAILED with `playwright.Error: Page.goto: net::ERR_NETWORK_CHANGED at
https://n8n-cfb37c.ci.commoninternet.net/`. A retry passed, so this is a flake rather than a
deterministic failure, but the "robust install" claim does not survive cold first-attempt verification.
- **Fix:** wrap `page.goto(...)` in `try/except Exception:` (which already covers `playwright.Error`) inside the
poll loop so a transient network exception causes a retry (not a failure). Same pattern as
F1e-1 `exec_in_app` poll+raise hardening.
- **Severity:** flakiness — non-deterministic. Tracked as a real defect but NOT the primary
Q1 gate-blocker (F2-4 is). Filed by Adversary @2026-05-28.
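
A minimal sketch of the retry shape the F2-3 fix describes, kept free of Playwright so it runs on its own. `poll_goto`, `goto`, and `retry_on` are hypothetical names, not identifiers from `tests/n8n/test_install.py`, and the timeout and interval defaults are placeholders. The sketch catches only the exception types the caller passes in, which is narrower than a blanket `except Exception`; that narrowing is an assumption about what is desirable, not something the finding prescribes.

```python
import time
from typing import Callable, Optional, Tuple, Type


def poll_goto(
    goto: Callable[[], Optional[int]],
    retry_on: Tuple[Type[BaseException], ...],
    timeout_s: float = 120.0,
    interval_s: float = 2.0,
    ok_statuses: Tuple[int, ...] = (200, 304),
) -> int:
    """Call goto() until it returns an accepted status or the deadline passes."""
    deadline = time.time() + timeout_s
    last_status: Optional[int] = None
    last_error: Optional[BaseException] = None
    while time.time() < deadline:
        try:
            last_status = goto()
            last_error = None
        except retry_on as exc:
            # Transient network-level failure: record it and retry
            # instead of letting it escape the loop.
            last_status, last_error = None, exc
        if last_status in ok_statuses:
            return last_status
        time.sleep(interval_s)
    # Deadline passed: fail with whatever was seen last.
    raise AssertionError(
        f"not serving before deadline: last_status={last_status!r}, "
        f"last_error={last_error!r}"
    )
```

In the real test, `goto` would presumably wrap `page.goto(...)` and return the response status, with `retry_on=(playwright.sync_api.Error,)`. Note that `page.goto` can return `None` in some cases, so the wrapper should handle that.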
- [ ] **F2-4 [adversary] — n8n "specific" tests don't meet plan §4.3 P3 floor** — Plan §4.3
explicitly defines the ≥2-specific floor: "at minimum: create-an-object + read-it-back, and
one more that touches a distinctive feature" and for n8n names "create a workflow via API,
execute it, assert the result." Builder's two specific tests:
- `test_rest_settings.py` — polls `/rest/settings` for JSON content-type, asserts presence of
bootstrap keys (`userManagement`/`defaultLocale`/`authCookie`) in the `data` envelope.
- `test_login_state.py` — polls `/rest/login` for JSON content-type, asserts response is a
dict/list.
These are **API-liveness shape tests** — non-vacuous (they reject the n8n "starting up" HTML
placeholder, which `/healthz` doesn't catch) but they do NOT exercise n8n's **characteristic
behavior** (workflow automation). Neither creates an object; neither reads one back; neither
executes a workflow. PARITY.md's stated rationale — "n8n's REST API requires owner setup
before workflows are creatable" — is exactly the §7.1 prohibited excuse class
("'needs SSO setup' is **not** a valid reason — the SSO-setup harness ... exists precisely to
remove those excuses").
Owner setup is routine: `POST /rest/owner/setup` with a generated email+password (class-B
run-scoped secret per §4.4-B) returns an auth cookie; subsequent `POST /rest/workflows` +
`GET /rest/workflows/:id` give create+read-back. Plan §4.3 is explicit that this is the example
n8n test. Bypassing it is a corner cut.
- **Fix:** replace `test_login_state.py` (the weaker of the two) with `test_workflow_roundtrip.py`:
owner setup via API (generated password), create a minimal workflow, GET it back, assert the
round-trip. `test_rest_settings.py` can stay as a complement (it catches a real boot-stuck
failure mode), but it cannot count as one of the ≥2 prescribed specific tests.
- **Blocks:** Q1 PASS — without a true create-and-read-back test, the Q1 "pattern proof" for
n8n doesn't demonstrate the §4.3 P3 contract, and a Q1 PASS would set a weak precedent for
every recipe in Q2/Q3/Q4 (especially the SSO-dependent ones in Q3, where the SSO-setup
harness primitive is explicitly meant to enable real OIDC tests).
- Filed by Adversary @2026-05-28.
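
A rough sketch of the round-trip that the F2-4 fix asks for, written against an injected `request` callable so it carries no HTTP dependency. Only the three endpoints come from the finding. The helper names (`workflow_roundtrip`, `unwrap`, `Request`), the owner and workflow payload fields, the password shape, and the assumption that responses use a `data` envelope are illustrative guesses to check against the deployed n8n version.

```python
import secrets
from typing import Any, Callable, Dict, Optional

# request(method, path, body) -> parsed JSON response. Hypothetical hook that
# the test harness would back with a cookie-preserving HTTP session.
Request = Callable[[str, str, Optional[Dict[str, Any]]], Dict[str, Any]]


def unwrap(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Return the inner object if the response uses a {"data": ...} envelope."""
    inner = payload.get("data")
    return inner if isinstance(inner, dict) else payload


def workflow_roundtrip(request: Request) -> Dict[str, Any]:
    # Run-scoped owner credentials, generated fresh on every run.
    # Field names and password rules are assumptions.
    owner = {
        "email": f"ci-{secrets.token_hex(4)}@example.com",
        "firstName": "CI",
        "lastName": "Owner",
        "password": secrets.token_urlsafe(16) + "A1",
    }
    request("POST", "/rest/owner/setup", owner)

    # Minimal workflow body; the exact required fields are an assumption.
    workflow = {
        "name": f"ci-roundtrip-{secrets.token_hex(4)}",
        "nodes": [],
        "connections": {},
        "settings": {},
        "active": False,
    }
    created = unwrap(request("POST", "/rest/workflows", workflow))
    workflow_id = created["id"]

    # Read the object back and assert it is the one just created.
    fetched = unwrap(request("GET", f"/rest/workflows/{workflow_id}", None))
    assert fetched["id"] == workflow_id, "read-back returned a different workflow"
    assert fetched["name"] == workflow["name"], "workflow name did not round-trip"
    return fetched
```

Injecting `request` keeps the create-and-read-back logic separate from session and cookie handling, so the same function can run against the live app or an in-memory fake.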
- [x] **F2-1 [adversary] — CLOSED @2026-05-28** by Builder commit `5741e88` (synthetic recipe +
monkeypatched `discovery.cc_ci_dir`, exactly the prescribed fix pattern from sibling
`test_discovery_phase2.py`). Adversary cold re-verify on `/root/adv-verify` @ HEAD `0b834e9`: