review(2): Q4.7 plausible — deferral sound + test content non-vacuous, but '§4.3 proven green' UNVERIFIED (no evidence log on host); Q4.7 not cleared
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@ -1137,3 +1137,51 @@ against my 4 pre-recorded criteria (REVIEW-2 754f508):

**Verdict: HQ1 PASS.** No `## VETO`. Throwaway probe app (never deployed) + bogus image cleaned up;
no test in flight, system running. Anti-anchoring honored (code-read + my own live runs; not JOURNAL-first).

---

## Q4.7 plausible — deferral REVIEWED; "§4.3 green" claim UNVERIFIED (no Q4.7 PASS) @2026-05-29T~18:30Z

**Context.** Not a formally CLAIMED gate (no `claim(` commit; STATUS-2 frames Q4.7 as "test content
green; full-lifecycle blocked on upstream clickhouse boot-download; Q4.7b recipe-PR deferred"). This
is an Adversary scrutiny pass on that deferral + the "event tests proven green" assertion, per P7/§8.
Anti-anchoring honored: verdict formed from the plan, the committed code, and my own cold host search,
NOT from JOURNAL narrative.

**What I verified (cold):**
1. **Test design is REAL and NON-VACUOUS** (code-read `tests/plausible/functional/test_event_tracking.py`).
   Each test first registers the site row in postgres (sites_cache gate), then POSTs to the public
   `/api/event` with a browser UA, then polls ClickHouse `events_v2` filtering on a **unique UUID pathname**
   (and, for the custom test, a unique event `name`) and asserts `count>=1`. The unique key means the
   match can only be the event THIS test created, so it proves the full ingestion→persist path, not just a
   202 ack. `test_custom_event_roundtrip` additionally proves a custom goal name is stored verbatim
   (not coerced to `pageview`). **No corner cut in the test content.**
2. **ClickHouse-direct read-back (vs Stats API) is ACCEPTED.** Under `DISABLE_AUTH=true` there is no
   user/API-key, so reading the authoritative store the app writes to is a *stronger* persistence proof
   than a Stats-API query, not a weaker stand-in. Defensible per §7.1 (this is not a health-only
   substitution). (Minor: dead code at L68 of the test file, `clauses = ... if False else ...`; harmless, not a defect.)
3. **The env-blocker deferral is defensible IN PRINCIPLE.** The plausible recipe's `entrypoint.clickhouse.sh`
   boot-downloads a 22 MB clickhouse-backup tarball under `set -e` with no cache and no retry, so a transient
   first-wget failure crash-loops the container, and the repeated download attempts amplify into GitHub
   secondary rate-limiting. Same env-blocker class as the already-accepted lasuite-meet/drive/immich
   deferrals; recipe-PR (Q4.7b) is the right durable fix.

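
For reference, the unique-key round-trip pattern from item 1 can be sketched as below. This is a minimal illustration, not the committed test: `unique_pathname`, `poll_until_persisted`, the `count_rows` callable and the timeout values are all hypothetical names and numbers, and the `/ci-pageview-` prefix is only assumed from the search needle used later in this review. The actual HTTP POST and ClickHouse query are deliberately left to the caller.

```python
import time
import uuid
from typing import Callable


def unique_pathname(prefix: str = "/ci-pageview-") -> str:
    """Return a pathname that no other test run can have produced."""
    return f"{prefix}{uuid.uuid4()}"


def poll_until_persisted(
    count_rows: Callable[[], int],
    timeout_s: float = 30.0,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call count_rows() until it reports >= 1 row or the timeout expires.

    count_rows is supplied by the caller (hypothetically: a query against
    events_v2 filtered on the unique pathname). A 202 from the ingest
    endpoint never satisfies this; only a stored row does.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        count = count_rows()
        if count >= 1:
            return count
        if time.monotonic() >= deadline:
            raise AssertionError("event never appeared in the store")
        sleep(interval_s)
```

Because the filter key is generated per run, a non-zero count cannot come from a previous run or another test, which is what makes the assertion non-vacuous.
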

**What I COULD NOT verify (the blocker to any Q4.7 PASS):**
- The STATUS claim **"event tests proven green"** has **NO surviving evidence on cc-ci**. Cold host
  search found: NO `ccci-plausible*.log`; NO log file anywhere under `/root` containing `events_v2`,
  `ci-pageview-`, `test_pageview_event_roundtrip`, or `test_custom_event_roundtrip`; the only
  "plausible" mentions are incidental (recipe name in adv-d4/adv-m4m5 list logs + a STATUS .bak).
- These two tests **require ClickHouse to be UP**, which is exactly what the deferral says crash-loops.
  So the "proven green" assertion is the precise claim I must disbelieve until I observe it: a green
  202+ClickHouse-readback presupposes a run where ClickHouse booted, and that run's log is not present.

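
A cold evidence search of this kind can be reproduced with a short script. The sketch below is an assumption about how such a search might be done, not a record of the commands actually run: `find_evidence` and its `suffix` parameter are hypothetical, and only the needle strings come from this review.

```python
import os
from typing import Dict, Iterable, List

# Needle strings taken from the review text above.
NEEDLES = (
    "events_v2",
    "ci-pageview-",
    "test_pageview_event_roundtrip",
    "test_custom_event_roundtrip",
)


def find_evidence(
    root: str,
    needles: Iterable[str] = NEEDLES,
    suffix: str = ".log",
) -> Dict[str, List[str]]:
    """Map each file under root (ending in suffix) to the needles it contains."""
    needle_bytes = [(n, n.encode("utf-8")) for n in needles]
    hits: Dict[str, List[str]] = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if not filename.endswith(suffix):
                continue
            path = os.path.join(dirpath, filename)
            try:
                # Whole-file read: acceptable for a sketch, not for huge logs.
                with open(path, "rb") as fh:
                    data = fh.read()
            except OSError:
                continue  # unreadable files are skipped, not treated as evidence
            matched = [n for n, b in needle_bytes if b in data]
            if matched:
                hits[path] = matched
    return hits
```

An empty result from `find_evidence("/root")` would correspond to the "no surviving evidence" finding; it shows absence of a log, not that the run never happened.
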

**Verdict: Q4.7 NOT cleared.** Test *content* PASSES adversarial code-review and the *deferral* is
sound; but I withhold any Q4.7 PASS because the §4.3 functional tests are **not independently shown
green**. To clear Q4.7 I require ONE cold run (after the GitHub/Docker-Hub rate-limit cooldown) where
ClickHouse boots and BOTH `*_event_roundtrip` tests PASS in my own re-run, i.e.
`RECIPE=plausible PR=0 cc-ci-run runner/run_recipe_ci.py` (or the functional subset against a live
deploy) with the two event tests PASSED and a clean teardown. Until then this is a documented deferral,
not a verified gate. NOT a VETO (Q4.7 is not being asserted as DONE) and NOT a hard gate-FAIL (nothing
claimed). Filed as a tracking item; Builder should either preserve the green-run log next time or
expect me to produce the green myself post-cooldown.
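
If a run log is preserved, the clearing condition can be checked mechanically. The sketch below assumes the log contains pytest verbose-style result lines (`path::test_name PASSED`); whether `cc-ci-run` emits that format is an assumption, and `passed_tests` / `gate_is_green` are hypothetical helper names.

```python
import re
from typing import Iterable, Set

# The two tests this review requires to be green.
REQUIRED = ("test_pageview_event_roundtrip", "test_custom_event_roundtrip")

# Matches lines such as "tests/x.py::test_name PASSED [ 50%]".
_RESULT = re.compile(
    r"::(?P<name>\w+)(?:\[[^\]]*\])?\s+"
    r"(?P<outcome>PASSED|FAILED|ERROR|SKIPPED|XFAIL|XPASS)\b"
)


def passed_tests(log_text: str) -> Set[str]:
    """Return names of tests that PASSED and have no other recorded outcome."""
    passed: Set[str] = set()
    not_passed: Set[str] = set()
    for match in _RESULT.finditer(log_text):
        if match.group("outcome") == "PASSED":
            passed.add(match.group("name"))
        else:
            not_passed.add(match.group("name"))
    return passed - not_passed


def gate_is_green(log_text: str, required: Iterable[str] = REQUIRED) -> bool:
    """True only if every required test appears in the log as PASSED."""
    return set(required) <= passed_tests(log_text)
```

A missing test counts as not green, so an absent or truncated log fails the check rather than passing it by default. Clean teardown is not covered by this sketch.
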