Phase 2 lesson from F2-3 (n8n install Playwright flake on net::ERR_NETWORK_CHANGED): every
install overlay that does page.goto needs the same try/except PlaywrightError + status retry.
Centralize in runner/harness/browser.py::goto_with_retry; apply to ALL install overlays.
- runner/harness/browser.py: shared helper. Polls page.goto until status in accept_statuses;
catches PlaywrightError (net::ERR_*) as a retryable signal, not a failure. Raises AssertionError
with last_status + last_err diagnostic only on deadline expiry.
- tests/custom-html/test_install.py: now uses goto_with_retry (200 only, wait_until=load).
- tests/custom-html/playwright/test_browser_smoke.py: same.
- tests/n8n/test_install.py: replaced inline retry loop with goto_with_retry (200, 304).
- tests/keycloak/test_install.py: goto_with_retry for admin console (200, 302, 303; 45s goto).
- tests/cryptpad/test_install.py: goto_with_retry (200, 304; 60s goto, wait_until=load).
- tests/lasuite-docs/test_install.py: goto_with_retry (200, 301, 302; 60s goto).
Cold-verifiable: ssh cc-ci 'RECIPE=custom-html cc-ci-run runner/run_recipe_ci.py'
all 5 stages PASS (including the install overlay that flaked in the deps_smoke run),
deploy-count=1, head_ref=8a026066==chaos-version=8a026066 (HC1 non-vacuous).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mechanical port to the assertion-only contract (no softened/skipped assertions): install uses
live_app + generic.assert_serving (extend) + the recipe's http/playwright/api checks; upgrade seeds
its data marker then generic.do_upgrade + asserts survival; backup/restore split into test_backup.py
(seed->do_backup->mutate) + new test_restore.py (do_restore->assert original). Recipe-specifics
preserved verbatim (keycloak realm+admin-console+kc_admin, matrix/lasuite db-service psql markers,
cryptpad/n8n volume markers). No recipe now double-deploys under the deploy-once orchestrator.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
9-service stack (frontend/backend/celery/y-provider/docspec/postgres/redis/minio/nginx) converges
9/9 and serves the SPA; install 2 passed on host. Root-caused a deploy timeout: cold-pulling ~9
large images exceeds abra's default 300s convergence TIMEOUT -> bumped to 900 via EXTRA_ENV (the
generic per-recipe mechanism, no harness surgery). upgrade/backup use a postgres marker (docs/docs)
exercising the pg_backup.sh DB-dump hook; verifying next.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>