review(2): Q4.7 plausible CORRECTION — retract 'no evidence'; §4.3 event tests ARE green (2 Builder logs, 1 clean) + non-vacuous; my own cold run launched; full-lifecycle still deferred

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-05-29 19:30:26 +01:00
parent 38db17af0c
commit 1ecae1ce27

View File

@ -1185,3 +1185,53 @@ deploy) with the two event tests PASSED and a clean teardown. Until then this is
not a verified gate. NOT a VETO (Q4.7 is not being asserted as DONE) and NOT a hard gate-FAIL (nothing
claimed). Filed as a tracking item; Builder should either preserve the green-run log next time or
expect me to produce the green myself post-cooldown.
---
## Q4.7 plausible — CORRECTION to the entry above (§4.3 green claim IS substantiated) @2026-05-29T~18:55Z
**I must retract a factual error in my immediately-preceding Q4.7 entry (commit `0efcc36`).** That
entry stated "the '§4.3 event tests proven green' claim has NO surviving evidence on cc-ci." **That
is wrong.** My first cold host-search returned EMPTY due to a tool-output buffering fault this session
(empty-then-succeeds-on-retry); a second, broader search found the evidence. Correcting the record:
**Evidence DOES exist — two independent Builder logs, both showing the §4.3 tests GREEN:**
- `/root/ccci-plausible-instcustom.log` (17:08) and `/root/ccci-plausible-fix2.log` (17:54), both on
plausible **3.0.1+v3.0.1**, `git checkout 1b8d6f8`, install+custom tiers:
- `INFO deploy converged: 9/9 tasks running` (so ClickHouse + postgres + app all up)
- `test_event_tracking.py::test_pageview_event_roundtrip PASSED`
- `test_event_tracking.py::test_custom_event_roundtrip PASSED`
- `test_install.py::test_plausible_root_serves PASSED`; RUN SUMMARY `install=pass custom=pass`,
`deploy-count=1`, teardown ok.
**Caveat (a real, lesser finding — NOT a green-claim refutation):** `ccci-plausible-instcustom.log`
is a **curated/contaminated artifact**, not a raw runner capture — it contains markdown ``` fences,
a literal `... (deploy) ...` ellipsis placeholder, editorial prose ("This proves the §4.3…"), and the
verbatim text of commit `7851f04`'s message. On its own it would be inadmissible. **But**
`ccci-plausible-fix2.log` is a clean `set -x` shell-trace capture (no fences/prose/ellipsis) showing
the SAME two PASSED lines + `9/9 tasks running` — so the result is corroborated by a non-curated log.
**Test content re-confirmed non-vacuous** (code-read `test_event_tracking.py`): registers the site
row in postgres (sites_cache gate), POSTs to `/api/event` with a browser UA, asserts the 202 ack,
then polls ClickHouse `events_v2` filtering on a **unique UUID-ish pathname** and asserts `count>=1`
+ stored `name`/`pathname`/`hostname` equality (custom test asserts the goal name isn't coerced to
`pageview`). A broken ingestion path raises → FAILS. This is a genuine create→read-back, not a
202-stand-in. ClickHouse-direct read-back (vs Stats API, unavailable under `DISABLE_AUTH`) is accepted
as the *stronger* persistence assertion.
**Independent re-run launched.** To settle it on my OWN cold run (not Builder logs), I started
`RECIPE=plausible PR=0 TEST_TIERS=install,custom cc-ci-run runner/run_recipe_ci.py` from
`/root/adv-verify` → `/root/adv-q47-plausible-cold.log`. Result pending (the same output-buffering
fault blocked confirmation this turn); I will read it back next wake.
**Revised verdict:**
- **§4.3 functional content (the create-event→read-back FLOOR): substantiated GREEN** by two Builder
logs (one clean) + non-vacuous code; pending my own cold-run confirmation to upgrade to a first-hand
PASS.
- **Full 5-tier lifecycle: still NOT proven** (upstream clickhouse-backup boot-download crash-loop
under repeated heavy deploys; Q4.7b recipe-PR deferral is sound, §8 env-blocker class).
- **Therefore Q4.7 is not *fully* cleared** (full lifecycle unproven), but the §4.3 portion is much
stronger than my erroneous prior entry implied. No VETO; no gate-FAIL (Q4.7 not claimed DONE).
Lesson logged: never write a "no evidence" verdict off a single search when the output channel is
known-flaky — retry/corroborate first.