fix(2): F2-3 systemic — harness.browser.goto_with_retry; applied to all install overlays

Phase 2 lesson from F2-3 (n8n install Playwright flake on net::ERR_NETWORK_CHANGED): every
install overlay that does page.goto needs the same try/except PlaywrightError + status retry.
Centralize in runner/harness/browser.py::goto_with_retry; apply to ALL install overlays.

- runner/harness/browser.py: shared helper. Polls page.goto until status in accept_statuses;
  catches PlaywrightError (net::ERR_*) as a retryable signal, not a failure. Raises AssertionError
  with last_status + last_err diagnostic only on deadline expiry.
- tests/custom-html/test_install.py: now uses goto_with_retry (200 only, wait_until=load).
- tests/custom-html/playwright/test_browser_smoke.py: same.
- tests/n8n/test_install.py: replaced inline retry loop with goto_with_retry (200, 304).
- tests/keycloak/test_install.py: goto_with_retry for admin console (200, 302, 303; 45s goto).
- tests/cryptpad/test_install.py: goto_with_retry (200, 304; 60s goto, wait_until=load).
- tests/lasuite-docs/test_install.py: goto_with_retry (200, 301, 302; 60s goto).
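
For orientation, the retry behaviour described above could look roughly like the sketch below. It is not the contents of runner/harness/browser.py, which this commit page does not show. Only `accept_statuses`, `wait_until`, `last_status` and `last_err` come from the message; the parameter names `goto_timeout_ms`, `deadline_s` and `interval_s` and all default values are assumptions made for illustration.

```python
import time

try:
    from playwright.sync_api import Error as PlaywrightError
except ImportError:  # stand-in so the sketch can be exercised without Playwright
    class PlaywrightError(Exception):
        pass


def goto_with_retry(page, url, accept_statuses=(200,), wait_until="load",
                    goto_timeout_ms=30_000, deadline_s=120.0, interval_s=2.0):
    """Poll page.goto until the response status is in accept_statuses.

    goto_timeout_ms, deadline_s and interval_s are hypothetical names and
    defaults; the real helper may use different ones.
    """
    deadline = time.monotonic() + deadline_s
    last_status = None
    last_err = None
    while True:
        try:
            resp = page.goto(url, wait_until=wait_until, timeout=goto_timeout_ms)
            last_status = resp.status if resp is not None else None
            if last_status in accept_statuses:
                return resp
        except PlaywrightError as exc:
            # net::ERR_* and similar navigation errors count as retryable
            last_err = exc
        if time.monotonic() >= deadline:
            # fail only once the overall deadline has passed
            raise AssertionError(
                f"goto {url} never returned one of {accept_statuses}: "
                f"last_status={last_status} last_err={last_err!r}"
            )
        time.sleep(interval_s)
```

Under these assumptions, a call such as `goto_with_retry(page, url, accept_statuses=(200, 304))` keeps retrying through a transient net::ERR_NETWORK_CHANGED and raises only when the deadline expires.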

Cold-verifiable: ssh cc-ci 'RECIPE=custom-html cc-ci-run runner/run_recipe_ci.py'
  all 5 stages PASS (including the install overlay that flaked in the deps_smoke run),
  deploy-count=1, head_ref=8a026066==chaos-version=8a026066 (HC1 non-vacuous).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 07:46:34 +01:00
parent 4d6b040ba7
commit 47f7cb47c2
7 changed files with 95 additions and 36 deletions

@ -10,6 +10,12 @@ later non-lifecycle browser flow (e.g. a content-management UI) has its home alr
from __future__ import annotations
import os
import sys
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", "..", "runner"))
from harness import browser as harness_browser # noqa: E402
def test_browser_renders_html(live_app):
"""Browser-render the served root page and assert the HTML loads with no console errors."""
@ -26,7 +32,10 @@ def test_browser_renders_html(live_app):
"console",
lambda msg: console_errors.append(msg.text) if msg.type == "error" else None,
)
-resp = page.goto(url, wait_until="load", timeout=30_000)
+# F2-3 hardening (status mismatch + PlaywrightError retries)
+resp = harness_browser.goto_with_retry(
+page, url, accept_statuses=(200,), wait_until="load"
+)
assert resp is not None and resp.status == 200, f"page status {resp and resp.status}"
html = page.content()
assert "<html" in html.lower(), "page did not render an HTML document"

@ -9,7 +9,7 @@ import os
import sys
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", "runner"))
-from harness import generic # noqa: E402
+from harness import browser as harness_browser, generic # noqa: E402
def test_serving_and_content(live_app, meta):
@ -23,7 +23,11 @@ def test_serving_and_content(live_app, meta):
browser = p.chromium.launch(args=["--no-sandbox"])
try:
page = browser.new_context(ignore_https_errors=True).new_page()
-resp = page.goto(url, wait_until="load", timeout=30000)
+# F2-3-style hardening (centralized in harness.browser.goto_with_retry): retry through
+# transient PlaywrightError (net::ERR_*) and status mismatches.
+resp = harness_browser.goto_with_retry(
+page, url, accept_statuses=(200,), wait_until="load"
+)
assert resp is not None and resp.status == 200, f"page status {resp and resp.status}"
body = page.content()
assert "nginx" in body.lower() or "<html" in body.lower(), "no served HTML content"