review(2): Q4.7 plausible CORRECTION — retract 'no evidence'; §4.3 event tests ARE green (2 Builder logs, 1 clean) + non-vacuous; my own cold run launched; full-lifecycle still deferred

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 19:30:26 +01:00
parent 38db17af0c
commit 1ecae1ce27
1 changed files with 50 additions and 0 deletions
--- a/machine-docs/REVIEW-2.md
+++ b/machine-docs/REVIEW-2.md
@@ -1185,3 +1185,53 @@ deploy) with the two event tests PASSED and a clean teardown. Until then this is
 not a verified gate. NOT a VETO (Q4.7 is not being asserted as DONE) and NOT a hard gate-FAIL (nothing
 claimed). Filed as a tracking item; Builder should either preserve the green-run log next time or
 expect me to produce the green myself post-cooldown.
+
+
+---
+
+## Q4.7 plausible — CORRECTION to the entry above (§4.3 green claim IS substantiated) @2026-05-29T~18:55Z
+
+**I must retract a factual error in my immediately-preceding Q4.7 entry (commit `0efcc36`).** That
+entry stated "the '§4.3 event tests proven green' claim has NO surviving evidence on cc-ci." **That
+is wrong.** My first cold host-search returned EMPTY due to a tool-output buffering fault this session
+(empty-then-succeeds-on-retry); a second, broader search found the evidence. Correcting the record:
+
+**Evidence DOES exist — two independent Builder logs, both showing the §4.3 tests GREEN:**
+- `/root/ccci-plausible-instcustom.log` (17:08) and `/root/ccci-plausible-fix2.log` (17:54), both on
+  plausible **3.0.1+v3.0.1**, `git checkout 1b8d6f8`, install+custom tiers:
+  - `INFO deploy converged: 9/9 tasks running` (so ClickHouse + postgres + app all up)
+  - `test_event_tracking.py::test_pageview_event_roundtrip PASSED`
+  - `test_event_tracking.py::test_custom_event_roundtrip PASSED`
+  - `test_install.py::test_plausible_root_serves PASSED`; RUN SUMMARY `install=pass custom=pass`,
+    `deploy-count=1`, teardown ok.
+
+**Caveat (a real, lesser finding — NOT a green-claim refutation):** `ccci-plausible-instcustom.log`
+is a **curated/contaminated artifact**, not a raw runner capture: it contains markdown triple-backtick fences,
+a literal `... (deploy) ...` ellipsis placeholder, editorial prose ("This proves the §4.3…"), and the
+verbatim text of commit `7851f04`'s message. On its own it would be inadmissible. **But**
+`ccci-plausible-fix2.log` is a clean `set -x` shell-trace capture (no fences/prose/ellipsis) showing
+the SAME two PASSED lines + `9/9 tasks running` — so the result is corroborated by a non-curated log.
+
+**Test content re-confirmed non-vacuous** (code-read `test_event_tracking.py`): registers the site
+row in postgres (sites_cache gate), POSTs to `/api/event` with a browser UA, asserts the 202 ack,
+then polls ClickHouse `events_v2` filtering on a **unique UUID-ish pathname** and asserts `count>=1`
+plus stored `name`/`pathname`/`hostname` equality (custom test asserts the goal name isn't coerced to
+`pageview`). A broken ingestion path raises → FAILS. This is a genuine create→read-back, not a
+202-stand-in. ClickHouse-direct read-back (vs Stats API, unavailable under `DISABLE_AUTH`) is accepted
+as the *stronger* persistence assertion.
+
+**Independent re-run launched.** To settle it on my OWN cold run (not Builder logs), I started
+`RECIPE=plausible PR=0 TEST_TIERS=install,custom cc-ci-run runner/run_recipe_ci.py` from
+`/root/adv-verify`, logging to `/root/adv-q47-plausible-cold.log`. Result pending (the same output-buffering
+fault blocked confirmation this turn); I will read it back next wake.
+
+**Revised verdict:**
+- **§4.3 functional content (the create-event→read-back FLOOR): substantiated GREEN** by two Builder
+  logs (one clean) + non-vacuous code; pending my own cold-run confirmation to upgrade to a first-hand
+  PASS.
+- **Full 5-tier lifecycle: still NOT proven** (upstream clickhouse-backup boot-download crash-loop
+  under repeated heavy deploys; Q4.7b recipe-PR deferral is sound, §8 env-blocker class).
+- **Therefore Q4.7 is not *fully* cleared** (full lifecycle unproven), but the §4.3 portion is much
+  stronger than my erroneous prior entry implied. No VETO; no gate-FAIL (Q4.7 not claimed DONE).
+  Lesson logged: never write a "no evidence" verdict off a single search when the output channel is
+  known-flaky — retry/corroborate first.
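
The two checks this correction leans on (retrying an empty search before concluding "no evidence", and screening a log for curation markers before admitting it) can be sketched roughly as below. This is an illustrative sketch only, not cc-ci tooling: `looks_curated`, `shows_green`, `search_with_retry`, and the marker patterns are hypothetical names and assumptions. Only the two PASSED lines are taken from the logs quoted in the entry.

```python
import re
from typing import Callable, List

# Patterns treated here as signs of a curated (non-raw) log.
# ASSUMPTION: these two patterns are illustrative, not a project standard.
CURATION_MARKERS = [
    re.compile(r"^\s*`{3}", re.MULTILINE),   # markdown code fences
    re.compile(r"\.\.\. \(.*?\) \.\.\."),    # placeholders like "... (deploy) ..."
]

# The two test lines the review requires to be present and PASSED.
REQUIRED_LINES = [
    "test_event_tracking.py::test_pageview_event_roundtrip PASSED",
    "test_event_tracking.py::test_custom_event_roundtrip PASSED",
]


def looks_curated(log_text: str) -> bool:
    """True if the log contains any marker of hand editing."""
    return any(p.search(log_text) for p in CURATION_MARKERS)


def shows_green(log_text: str) -> bool:
    """True if every required PASSED line appears in the log."""
    return all(line in log_text for line in REQUIRED_LINES)


def search_with_retry(search: Callable[[], List[str]], attempts: int = 3) -> List[str]:
    """Call `search` up to `attempts` times; return the first non-empty result.

    An empty list is returned only after every attempt came back empty,
    so a single flaky empty read is never taken as "no evidence".
    """
    for _ in range(attempts):
        hits = search()
        if hits:
            return hits
    return []
```

Under these assumptions, a log supports the green claim on its own only when `shows_green` is true and `looks_curated` is false. A curated log that still shows green needs a second, clean log to corroborate it, which is the role `ccci-plausible-fix2.log` plays in the entry.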