diff --git a/tests/plausible/PARITY.md b/tests/plausible/PARITY.md new file mode 100644 index 0000000..4a3d92c --- /dev/null +++ b/tests/plausible/PARITY.md @@ -0,0 +1,48 @@ +# plausible — parity & coverage map (Phase 2 Q4.7) + +## P2 — Parity port +**No recipe-maintainer corpus exists for plausible** (`/srv/recipe-maintainer/recipe-info/` has no +`plausible/` entry — confirmed: no `recipe-info/plausible/tests/*.py`). P2 (parity) is therefore +**vacuously satisfied** — there are no scripts to port. Coverage is delivered entirely via the generic +lifecycle tiers (install/upgrade/backup/restore) + the recipe-aware backup overlay + the +recipe-specific functional tests below. + +## Lifecycle (generic tiers + recipe-aware overlays — Phase 1d/1e) +plausible ships `test_install.py` (generic serving + SPA shell at `/`) and the backup-integrity +overlays (`test_backup.py` / `test_restore.py` / `test_upgrade.py` + `ops.py`). Where an overlay is +present the generic tier still runs alongside it (Phase-1e HC3 invariant). + +Readiness probe: `HEALTH_PATH = /api/health`, `HEALTH_OK = (200,)`. plausible's `app` boots before its +ClickHouse events DB is ready and `/` 500s during init (then 302s once ready, so `/` cannot +distinguish not-ready from ready). The dedicated `/api/health` endpoint returns `200` with +`{"clickhouse":"ok","postgres":"ok","sites_cache":"ok"}` **only** once both datastores are reachable — +a true readiness gate. `DEPLOY_TIMEOUT` / `HTTP_TIMEOUT` are widened to 1200s to wait out the cold +ClickHouse + migrations init. + +## P3 — Recipe-specific functional tests +- `functional/test_health_check.py` + - `test_plausible_root_serves` — GET `/api/health` → 200, proving ClickHouse + postgres + the + sites_cache are all up (plausible's self-reported backend readiness; not a Traefik fallback). +- `functional/test_event_tracking.py` — **§4.3 prescribed "track a test event, query it back"**, the + app's primary object. 
Both tests register a site row in the metadata postgres (plausible's + `sites_cache` drops events for unregistered domains — empirically confirmed), POST to the public + `/api/event` ingestion endpoint with a browser User-Agent (plausible drops bot/library UAs), then + read the row back out of the ClickHouse `events_v2` table on a poll loop (sites_cache refresh + event + write-buffer flush make the first landing non-instantaneous). Real app-state assertions, not 202-ack + stand-ins: + - `test_pageview_event_roundtrip` — a `pageview` event lands in `events_v2`; asserts the stored + `name`/`pathname`/`hostname` match what was sent. + - `test_custom_event_roundtrip` — a *custom-named* event (a goal/conversion, plausible's distinctive + non-pageview tracking path) lands under that exact name (not coerced to `pageview`). + +## P4 — Backup data-integrity (real) +`ops.py` seeds an identifiable `ci_marker` row in the metadata postgres (`db` service), `pre_restore` +drops it, and the restore tier asserts the row survives backup→restore. plausible's recipe backs up +postgres via a real `pg_dump`/`pg_restore` hook (backupbot pre-/post-hooks), so the SQL-level marker +restores cleanly — the recipe-aware data-integrity bar (P4), not a "service is up" stand-in. + +## Notes / deferrals +- Reading events back via plausible's **stats API** (rather than ClickHouse directly) requires a + registered user + API key; under `DISABLE_AUTH=true` there is no default user, and creating an API + key adds significant setup with no extra signal over the direct ClickHouse read-back (which is the + authoritative store). The ClickHouse read-back is the stronger, more direct assertion and is used. 
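Reviewer note on the P3 poll loop: the read-back described above (POST, wait, count rows in `events_v2`, give up after a deadline) can be sketched in isolation as below. This is a minimal sketch and not the harness's own helper. `poll_until_positive`, `fetch_count`, `clock`, and `sleep` are hypothetical names introduced for illustration, and the defaults only mirror the `max_wait=210` / `interval=10` values used in the test module. Using a monotonic clock and injectable `clock`/`sleep` callables is an assumption made here so the loop can be exercised without real waiting.

```python
import time
from typing import Callable


def poll_until_positive(
    fetch_count: Callable[[], int],
    max_wait: float = 210.0,
    interval: float = 10.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call fetch_count() until it returns >= 1 or max_wait seconds elapse.

    Hypothetical helper: fetch_count stands in for "run the ClickHouse
    count() query and parse the result". Raises AssertionError on timeout,
    so a broken ingestion path fails the test instead of passing vacuously.
    """
    deadline = clock() + max_wait
    while True:
        count = fetch_count()
        if count >= 1:
            return count
        if clock() >= deadline:
            raise AssertionError(
                f"no row landed within {max_wait}s (last count={count})"
            )
        # Wait before the next attempt; sites_cache refresh and the event
        # write-buffer flush are what make the first landing slow.
        sleep(interval)
```

In a test, `fetch_count` would wrap whatever issues the `SELECT count() FROM events_v2 ...` query. Passing a fake `clock` and `sleep` keeps a unit test of the loop itself instantaneous.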
diff --git a/tests/plausible/functional/test_event_tracking.py b/tests/plausible/functional/test_event_tracking.py new file mode 100644 index 0000000..190697e --- /dev/null +++ b/tests/plausible/functional/test_event_tracking.py @@ -0,0 +1,146 @@ +"""plausible — Phase-2 §4.3 recipe-specific functional tests (event tracking). + +plausible's *raison d'être* is ingesting analytics events and storing them in ClickHouse. These two +tests prove the full create-and-read-back path end to end — they are NOT health/200 stand-ins: + + * test_pageview_event_roundtrip — POST a `pageview` to the public /api/event ingestion endpoint, + then read the row back out of the ClickHouse `events_v2` table (the primary object: a tracked + event). §4.3 "track a test event, query it back". + * test_custom_event_roundtrip — POST a *custom-named* event (a goal/conversion, plausible's + distinctive non-pageview tracking path) and confirm it lands under that name. Exercises a + characteristic feature beyond the basic pageview. + +Both assert real app state (the event reached the analytics store), not just the HTTP 202 ack. + +plausible only ingests events for *known* sites — the in-memory `sites_cache` gates ingestion and +drops events for unregistered domains (empirically confirmed: an event for an unregistered domain +never appears in events_v2). So each test first registers a site row in the metadata postgres, then +POSTs repeatedly while polling ClickHouse: the sites_cache must refresh to admit the new site and the +event write-buffer must flush to ClickHouse, so the first landing is not instantaneous. Re-POSTing the +same event is safe — we assert the row count is >= 1. + +No recipe-maintainer corpus exists for plausible (recipe-info/plausible/ has no tests/), so these are +net-new recipe-specific tests rather than parity ports — see tests/plausible/PARITY.md. 
+""" + +from __future__ import annotations + +import os +import sys +import time + +sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", "..", "runner")) +from harness import http as harness_http # noqa: E402 +from harness import lifecycle # noqa: E402 + +# A real browser User-Agent — plausible's ingestion drops requests from bot/library UAs (e.g. the +# default python-urllib UA), so the event would silently never reach ClickHouse without this. +_UA = ( + "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 " + "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36" +) + + +def _ch(domain: str, sql: str) -> str: + """Run a ClickHouse query against the `plausible_events_db` service; return stdout (stripped).""" + return lifecycle.exec_in_app( + domain, + ["clickhouse-client", "--database", "plausible_events_db", "--query", sql], + service="plausible_events_db", + ).strip() + + +def _register_site(domain: str, site: str) -> None: + """Insert a site row into the metadata postgres (`db` service) so plausible will ingest events for + it. 
Idempotent (ON CONFLICT DO NOTHING).""" + sql = ( + "INSERT INTO sites (domain, timezone, inserted_at, updated_at, native_stats_start_at) " + f"VALUES ('{site}','UTC', now(), now(), now()) ON CONFLICT (domain) DO NOTHING; " + f"SELECT domain FROM sites WHERE domain = '{site}';" + ) + out = lifecycle.exec_in_app( + domain, ["psql", "-U", "plausible", "-d", "plausible", "-tAc", sql], service="db" + ).strip() + assert out == site, f"site {site!r} not registered in postgres (got {out!r})" + + +def _post_event(base_domain: str, site: str, name: str, pathname: str) -> int: + """POST one event to the public ingestion endpoint; return the HTTP status (plausible acks 202).""" + status, _ = harness_http.http_post( + f"https://{base_domain}/api/event", + data={"name": name, "url": f"https://{site}{pathname}", "domain": site}, + headers={"User-Agent": _UA, "X-Forwarded-For": "203.0.113.9"}, + timeout=15, + ) + return status + + +def _ingest_and_count( + base_domain: str, + site: str, + name: str, + pathname: str, + max_wait: int = 210, + interval: int = 10, +) -> int: + """Register the site, then POST the event on a poll loop until its row appears in ClickHouse. + + Returns the events_v2 row count for (pathname, name). 
Raises if nothing lands within max_wait — + a genuinely-broken ingestion path therefore FAILS (this is not a vacuous check).""" + _register_site(base_domain, site) + count_sql = ( + f"SELECT count() FROM events_v2 WHERE pathname = '{pathname}' AND name = '{name}'" + ) + deadline = time.time() + max_wait + last_status = None + while True: + last_status = _post_event(base_domain, site, name, pathname) + assert last_status == 202, f"POST /api/event for {name!r} → HTTP {last_status} (expected 202)" + time.sleep(interval) + raw = _ch(base_domain, count_sql) + count = int(raw) if raw.isdigit() else 0 + if count >= 1: + return count + if time.time() >= deadline: + raise AssertionError( + f"event name={name!r} pathname={pathname!r} for site={site!r} never reached " + f"ClickHouse events_v2 within {max_wait}s (last POST status={last_status}, " + f"last count={count})" + ) + + +def test_pageview_event_roundtrip(live_app): + """Track a pageview event via /api/event, read it back from ClickHouse (§4.3 primary object).""" + site = "ccci-pageview.example" + pathname = "/ccci-pageview-roundtrip" + count = _ingest_and_count(live_app, site, "pageview", pathname) + assert count >= 1, f"expected >=1 pageview row, got {count}" + + # Read-back: confirm the stored row carries the data we sent (real app state, not just a count). 
+ row = _ch( + live_app, + f"SELECT name, pathname, hostname FROM events_v2 " + f"WHERE pathname = '{pathname}' AND name = 'pageview' LIMIT 1 FORMAT TabSeparated", + ) + name, stored_path, hostname = (row.split("\t") + ["", "", ""])[:3] + assert name == "pageview", f"stored event name {name!r} != 'pageview'" + assert stored_path == pathname, f"stored pathname {stored_path!r} != {pathname!r}" + assert hostname == site, f"stored hostname {hostname!r} != site {site!r}" + + +def test_custom_event_roundtrip(live_app): + """Track a custom-named event (a goal/conversion — plausible's distinctive non-pageview path) and + confirm it lands under that exact name in ClickHouse, distinct from the pageview path.""" + site = "ccci-goal.example" + pathname = "/ccci-custom-event" + event_name = "ccci-Signup" + count = _ingest_and_count(live_app, site, event_name, pathname) + assert count >= 1, f"expected >=1 custom-event row, got {count}" + + # The row must be stored under the custom name (not coerced to 'pageview') — proves the + # custom-event/goal ingestion path works. + stored_name = _ch( + live_app, + f"SELECT name FROM events_v2 WHERE pathname = '{pathname}' LIMIT 1", + ) + assert stored_name == event_name, f"custom event stored as {stored_name!r}, expected {event_name!r}" diff --git a/tests/plausible/functional/test_health_check.py b/tests/plausible/functional/test_health_check.py index ce5cac4..0c7b3ea 100644 --- a/tests/plausible/functional/test_health_check.py +++ b/tests/plausible/functional/test_health_check.py @@ -10,9 +10,13 @@ from harness import http as harness_http # noqa: E402 def test_plausible_root_serves(live_app): - """GET / → 200 or 302 (redirect to login or app shell).""" - url = f"https://{live_app}/" + """GET /api/health → 200 (clickhouse+postgres ready). + + `/` itself 500s via auth_controller under DISABLE_AUTH, so it is NOT a + reliable health probe; the dedicated /api/health endpoint is. 
+ """ + url = f"https://{live_app}/api/health" status, _ = harness_http.retry_http_get( - url, expect_status=(200, 302), max_wait=60, interval=3 + url, expect_status=(200,), max_wait=60, interval=3 ) - assert status in (200, 302), f"GET {url} HTTP {status}" + assert status == 200, f"GET {url} HTTP {status}" diff --git a/tests/plausible/recipe_meta.py b/tests/plausible/recipe_meta.py index 40e0ec7..5a0633b 100644 --- a/tests/plausible/recipe_meta.py +++ b/tests/plausible/recipe_meta.py @@ -1,12 +1,15 @@ # Per-recipe harness config for plausible (Phase 2 Q4.7 — analytics platform). # Requires SECRET_KEY_BASE (64+ char), DISABLE_AUTH, DISABLE_REGISTRATION env vars to deploy. # We use a fixed CI value for SECRET_KEY_BASE — safe for ephemeral per-run deploys. -HEALTH_PATH = "/" -HEALTH_OK = (200, 302) +HEALTH_PATH = "/api/health" +HEALTH_OK = (200,) # plausible's app starts before its clickhouse events DB is ready (the recipe's `app` depends_on lists # `events_db` but the service is named `plausible_events_db`, so swarm applies no ordering) and returns -# 500 until clickhouse + DB migrations finish — several minutes on a cold deploy. Give a wide HTTP -# window so the health poll waits out that init (it serves 302 once ready). [v1 failed at HTTP_TIMEOUT=600.] +# 500 until clickhouse + DB migrations finish — several minutes on a cold deploy. The dedicated +# /api/health endpoint returns 200 with {"clickhouse":"ok","postgres":"ok","sites_cache":"ok"} only +# once both datastores are ready, so it is a true readiness probe; `/` is unreliable (500s during init, +# 302s once ready, so it cannot distinguish "not ready" from "ready"). Give a wide HTTP window so the +# health poll waits out that init. [v1 failed at HTTP_TIMEOUT=600 polling `/`.] DEPLOY_TIMEOUT = 1200 HTTP_TIMEOUT = 1200
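Reviewer note on the readiness probe: the health test and `HEALTH_OK` check only the status code. If a stricter gate were wanted, the response body could be checked as well. The sketch below assumes the body keeps the shape quoted in the comment above (`{"clickhouse":"ok","postgres":"ok","sites_cache":"ok"}`). `health_body_ready` and `EXPECTED_COMPONENTS` are hypothetical names and are not part of the harness.

```python
import json

# Component names taken from the body quoted in recipe_meta.py; treating
# this exact set as required is an assumption of this sketch.
EXPECTED_COMPONENTS = ("clickhouse", "postgres", "sites_cache")


def health_body_ready(body: str) -> bool:
    """Return True only if every expected component reports "ok".

    Any non-JSON body (for example an HTML error page from a proxy) or a
    JSON value that is not an object is treated as not ready.
    """
    try:
        payload = json.loads(body)
    except ValueError:  # json.JSONDecodeError is a ValueError subclass
        return False
    if not isinstance(payload, dict):
        return False
    return all(payload.get(name) == "ok" for name in EXPECTED_COMPONENTS)
```

Whether this extra check is worth adding depends on whether `/api/health` can ever return 200 with a degraded component. The diff does not say, so the status-only check may already be sufficient.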