review(2): Q4.7 plausible CONSOLIDATED verdict — self-corrects 0efcc36+1ecae1c (both had errors); §4.3 green in ONE clean Builder log + non-vacuous; full-lifecycle unproven (upstream clickhouse stall); not cleared, no veto

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-05-29 19:45:51 +01:00
parent 3360f1b266
commit 27abce678b

View File

@ -1235,3 +1235,59 @@ fault blocked confirmation this turn); I will read it back next wake.
stronger than my erroneous prior entry implied. No VETO; no gate-FAIL (Q4.7 not claimed DONE).
Lesson logged: never write a "no evidence" verdict off a single search when the output channel is
known-flaky — retry/corroborate first.
---
## Q4.7 plausible — CONSOLIDATED verdict (SUPERSEDES `0efcc36` + `1ecae1c`; both contained factual errors) @2026-05-29T~18:50Z
**Why this entry exists / self-correction.** My two earlier Q4.7 entries this session were each written
off partially-buffered tool output and are FACTUALLY WRONG. Correcting the record:
- `0efcc36` (and its dup `8761548`) said *"the '§4.3 event tests proven green' claim has NO surviving
evidence on cc-ci."* **FALSE** — `/root/ccci-plausible-instcustom.log` does show it. My first host
search returned empty due to an output-buffering fault and I wrote the verdict off that empty result.
- `1ecae1c` ("CORRECTION") then over-corrected with fresh errors: it claimed *"two Builder logs, both
green"*, called `instcustom.log` *"curated/contaminated"*, and called `fix2.log` *"a clean
corroborating capture."* **All three FALSE.** Only ONE log shows the tests green; `instcustom.log`
is a plain pytest capture (NOT curated); `fix2.log` shows a FAILED deploy, not corroboration.
**GROUND TRUTH (from full reads of each artifact this session):**
- `/root/ccci-plausible-instcustom.log` (4468 B, plain `cc-ci-run` pytest capture, rootdir
`/root/builder-clone`, app `plau-2f2c63`): custom tier
`test_event_tracking.py::test_pageview_event_roundtrip PASSED` +
`test_custom_event_roundtrip PASSED` (**2 passed in 73.58s**) and
`test_health_check.py::test_plausible_root_serves PASSED`. Its INSTALL tier
`tests/plausible/test_install.py::test_serving` **FAILED** (`/`→500, the pre-`b4f39cb` `/`-probe
issue, since fixed to probe `/api/health`). RUN SUMMARY: **install: fail / custom: pass**.
→ This is the ONE log that demonstrates the §4.3 event tests green. It is genuine, not curated.
- `/root/ccci-plausible-fix2.log` (full 5-tier, 3.0.0+v2.0.0): **`FATA deploy failed`**, install:fail,
all other tiers **skip**. Does NOT show the event tests. NOT corroboration.
- `/root/ccci-q47-plausible.log`: deploy not healthy (`/`→500), install:fail, custom:skip.
- **My OWN cold run** (`/root/adv-q47-plausible-cold.log`, from `/root/adv-verify`): launched ~18:28,
**hung in the deploy/install stage ~32 min in** (log frozen at 385 B / deploy-start; runner pid still
alive past the 1200s DEPLOY_TIMEOUT). First-hand confirmation that the full deploy does NOT converge
under current conditions — exactly the documented upstream clickhouse-backup boot-download stall.
**Assessment (accurate):**
- **(a) Test content NON-VACUOUS** — code-read of `tests/plausible/functional/test_event_tracking.py`:
registers the site in postgres (sites_cache gate), POSTs `/api/event` with a browser UA, asserts the
202 ack, then polls ClickHouse `events_v2` on a **unique pathname** and asserts `count>=1` plus
stored `name`/`pathname`/`hostname` equality; the custom test asserts the goal name is stored
verbatim (not coerced to `pageview`). A broken ingestion path raises → FAILS. ClickHouse-direct
read-back (Stats API unavailable under `DISABLE_AUTH`) is the *stronger* persistence assertion, accepted.
- **(b) §4.3 event tests GREEN** — demonstrated in exactly ONE clean Builder log (`instcustom.log`).
My own cold-run first-hand PASS is NOT yet obtained (the deploy hung). So §4.3-green currently rests
on a single Builder-produced log + my code-read of non-vacuousness, NOT on my own green run.
- **(c) Full 5-tier lifecycle NOT proven** — multiple deploy attempts (mine + fix2 + q47) fail to
converge at install; root cause is the upstream `entrypoint.clickhouse.sh` 22 MB boot-download with
`set -e`/no-cache/no-retry → crash-loop + GitHub secondary-rate-limit amplification. The Q4.7b
recipe-PR deferral (cache-on-volume + retry + `set +e`) is the right durable fix and is a legitimate
§8 env-blocker-class deferral (same family as lasuite-meet/drive/immich).
**VERDICT: Q4.7 NOT fully cleared.** §4.3 functional content is sound and shown green once (Builder
log) but I have not reproduced it first-hand; the full lifecycle does not converge under the active
upstream defect. **No `## VETO`** and **no gate-FAIL** — Q4.7 is not claimed DONE; this is a
documented-deferral-under-scrutiny, not a refuted gate. To upgrade to a first-hand §4.3 PASS I need a
single clean cold run (after a GitHub-rate-limit cooldown) where ClickHouse converges and both
`*_event_roundtrip` tests PASS in my own re-run. Pending items: confirm my hung cold run tears down
its `plau-0c70fd` stack cleanly (runner auto-teardown; will verify).