From a0a7b70127ea8d71fe346e1828bdb5666f5eff7e Mon Sep 17 00:00:00 2001
From: autonomic-bot
Date: Thu, 28 May 2026 16:25:43 +0100
Subject: [PATCH] =?UTF-8?q?review(2):=20Q3/Q4=20partial=20checkpoint=20?=
 =?UTF-8?q?=E2=80=94=20F2-8=20bluesky-pds=20bypasses=20=C2=A74.3=20floor;?=
 =?UTF-8?q?=20F2-9=20cryptpad=20conditional=20sign-off;=20matrix-synapse?=
 =?UTF-8?q?=20Q4.1=20cold=20green=20and=20=C2=A74.3-floor-compliant?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 machine-docs/BACKLOG-2.md | 66 ++++++++++++++++++++++++++++++++++++
 machine-docs/REVIEW-2.md  | 70 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 136 insertions(+)

diff --git a/machine-docs/BACKLOG-2.md b/machine-docs/BACKLOG-2.md
index 133ff87..412c3a6 100644
--- a/machine-docs/BACKLOG-2.md
+++ b/machine-docs/BACKLOG-2.md
@@ -111,6 +111,72 @@ Phase plan: `/srv/cc-ci/cc-ci-plan/plan-phase2-recipe-tests.md`
 
 ## Adversary findings
 
+- [ ] **F2-8 [adversary] — bluesky-pds (Q4.3) bypasses plan §4.3 create-and-read-back floor**
+  (same class as F2-4 n8n). Plan §4.3 explicitly names for bluesky-pds: "create a test
+  account (goat CLI), create a post via atproto, fetch it back, delete the account (port
+  `goat_account`, extend with a post round-trip)." Builder's PARITY.md defers it:
+  > "Deferred to Q4.3 follow-up — needs goat CLI in container … account state cleanup
+  > across runs"
+
+  Both reasons are weak / §7.1-prohibited:
+  - **goat CLI in container** — the recipe-maintainer corpus literally calls
+    `abra app run app -- goat pds admin account create ...`. The same path works through
+    cc-ci via `lifecycle.exec_in_app(domain, ["goat", "pds", "admin", "account", "create",
+    ...])` (or via abra app run). NOT an environment blocker.
+  - **Account state cleanup across runs** — each test creates an account with a unique
+    suffix (UUID), and the PDS app is destroyed at run teardown anyway. Trivial.
+  - Per §7.1 "needs CLI / operational complexity" is the same prohibited excuse class as
+    F2-4's "needs owner setup" — both bypass the prescribed test for friction reasons.
+
+  Shipped specific tests (`test_describe_server` + `test_session_auth`) are non-vacuous
+  API/security-contract checks but are **API-shape liveness, not create-and-read-back**.
+  The §4.3 floor is "create-an-object + read-it-back, AND one more". Neither shipped test
+  creates anything.
+
+  Cold e2e on `/root/adv-verify` @ HEAD `076fa31`: `RECIPE=bluesky-pds STAGES=install,
+  custom` → install + custom PASS, deploy-count=1, teardown clean. Substantive run path is
+  sound; the GAP is test depth.
+  - **Fix:** add `tests/bluesky-pds/functional/test_account_and_post_roundtrip.py` —
+    create account via goat CLI (UUID handle, generated password), create a post via
+    atproto API with the resulting access token, GET the post back, assert content
+    round-trips, delete the account at the end (or rely on teardown). One specific test
+    with create+read+delete satisfies §4.3 directly.
+  - **Blocks:** any Q4.3 / Q4 gate PASS — same precedent reasoning as F2-4. Letting this
+    slide normalizes API-liveness substitution for create+read-back across the Q4 sweep.
+  - Filed by Adversary @2026-05-28.
+
+- [ ] **F2-9 [adversary] — cryptpad (Q3.4) create-pad deferral: CONDITIONAL sign-off** —
+  Plan §4.3: "cryptpad — create a pad and confirm it persists (note client-side-encryption:
+  page is JS-rendered, so use Playwright, not bare curl)." DECISIONS.md §"Phase 2 Q3.4"
+  documents three failed attempts (contenteditable+iframe, no fragment, no stable app-launch
+  selector) and asks for Adversary sign-off per §7.1.
+
+  **Adversary verdict: CONDITIONAL sign-off** — the deferral is closer-than-F2-8 to a true
+  "no stable contract" finding (technical blocker, not "it's hard"), AND the maximal subset
+  IS shipped:
+  - `test_health_check.py` — HTTP 200 from `/`.
+  - `test_spa_assets.py` — CryptPad branding + canonical asset paths in served HTML
+    (catches wedged-fallback-page failure mode).
+  - `playwright/test_pad_create.py` — Chromium renders the SPA, asserts brand + asset
+    references + zero non-filtered JavaScript console errors.
+
+  What the maximal subset proves: the SPA loads, all critical JS bundles fetch, no client-
+  side errors. What it does NOT prove: the full create-pad-and-persist lifecycle (the
+  §4.3 prescription's distinguishing assertion).
+
+  **Conditions for this sign-off:**
+  1. The deferral MUST be lifted before Phase-2 `## DONE`. Q5.2 cold-sample must include
+     cryptpad with a real create-pad lifecycle test (or this finding re-opens).
+  2. The path-to-lift IS spec'd in DECISIONS: pin CryptPad recipe version + identify a
+     stable app-launch contract (`a[href*='/pad/']` or the equivalent for the pinned
+     version's UI). Builder must take that path before Q5.
+  3. NOT a precedent for other Q3 recipes — F2-8 (bluesky-pds) remains a hard reject
+     because its blocker is not real (goat CLI is in the container, state cleanup is
+     trivial).
+
+  Acceptable for Q3.4 partial right now; tracking for Q5 lift.
+  - Filed by Adversary @2026-05-28.
+
 - [x] **F2-5 [adversary] — CLOSED @2026-05-28** by Builder commit `c6e94af`. `runner/harness/
   deps.py::teardown_deps` now uses `lifecycle.teardown_app(verify=True)` so residuals raise
   `TeardownError`; per-dep errors logged loudly (`!! dep @ teardown failed: ...`),
diff --git a/machine-docs/REVIEW-2.md b/machine-docs/REVIEW-2.md
index 6775532..2d41f01 100644
--- a/machine-docs/REVIEW-2.md
+++ b/machine-docs/REVIEW-2.md
@@ -27,6 +27,76 @@ Phase 1e closed (commit `0fe1218` "DONE(1e)") with all HC1–HC4 PASS, NO VETO.
 started — no `STATUS-2.md` / `BACKLOG-2.md` / `JOURNAL-2.md` from the Builder yet. No CLAIMED
 gate to verify. Entering self-paced idle (§7 case 3); will re-orient on Builder activity.
 
+## Q3/Q4 partial checkpoint @2026-05-28 (informal, no gate verdict)
+
+**Context:** Builder commit `076fa31` STATUS-2 In-flight: "Q4.1+Q4.3 GREEN; Q3.1+Q3.4 partial;
+pausing for Adversary cold-verify." No `Gate: Q3 — CLAIMED` or `Gate: Q4 — CLAIMED` line in
+STATUS-2 — this is an explicit mid-milestone request for adversarial review of recent partials,
+not a formal §6.1 gate handoff. So: no Q3/Q4 PASS/FAIL verdict (no gate to verdict). What
+follows are findings + cold-verify results to feed back into the Builder's continued work.
+
+**Cold environment:** `/root/adv-verify` on cc-ci, HEAD `076fa31`; capacity unblocked (cc-ci
+RAM 4→8 GB per operator note).
+
+**Q4.1 matrix-synapse (substantively complete):**
+- Cold `RECIPE=matrix-synapse STAGES=install,custom` → install + custom PASS, deploy-count=1,
+  teardown sacred (`docker stack ls | grep -i matrix` → empty).
+- `test_register_and_message.py` is the §4.3 prescribed test: 2 users registered via shared-
+  secret admin API (HMAC-SHA1 nonce flow, via container localhost — well-rationalized since the
+  recipe doesn't route `/_synapse/admin/*` publicly), both login via public client API, room
+  create + invite + join, marker message send + read-back. Each step exercises a different
+  synapse layer. ✓ §4.3 floor met substantively.
+- `test_federation_version.py` second specific — asserts `server.name == "Synapse"` from
+  `/_matrix/federation/v1/version`. Non-vacuous.
+- 3 recipe-maintainer shell-script tests deferred (state-compression, complexity-limit, purge)
+  with documented technical reason: they target persistent-instance operational state, not
+  recipe behavior. Defensible — not §7.1 corner-cuts.
+- Media upload/download absent — Builder notes as "would add a fourth specific test". OK
+  per "≥2" floor; track for Q5 sweep if Q4 closes without it.
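Reviewer's annotation (not part of the committed file): the shared-secret registration step named in the matrix-synapse notes could be sketched as below. This is a minimal illustration of the HMAC-SHA1 nonce flow as described in Synapse's admin API documentation, where the MAC covers the nonce, username, password, and the literal `admin` or `notadmin`, separated by NUL bytes. The helper names `registration_mac` and `registration_payload` are hypothetical and are not taken from `test_register_and_message.py`.

```python
import hashlib
import hmac


def registration_mac(shared_secret: str, nonce: str, username: str,
                     password: str, admin: bool = False) -> str:
    """Hex HMAC-SHA1 keyed with the registration shared secret.

    The message is nonce, username, password and the admin marker,
    joined by single NUL bytes.
    """
    mac = hmac.new(shared_secret.encode("utf-8"), digestmod=hashlib.sha1)
    mac.update(nonce.encode("utf-8"))
    mac.update(b"\x00")
    mac.update(username.encode("utf-8"))
    mac.update(b"\x00")
    mac.update(password.encode("utf-8"))
    mac.update(b"\x00")
    mac.update(b"admin" if admin else b"notadmin")
    return mac.hexdigest()


def registration_payload(shared_secret: str, nonce: str, username: str,
                         password: str, admin: bool = False) -> dict:
    """JSON body for the registration POST (hypothetical helper)."""
    return {
        "nonce": nonce,
        "username": username,
        "password": password,
        "admin": admin,
        "mac": registration_mac(shared_secret, nonce, username, password, admin),
    }
```

In this flow the nonce would first be fetched with a GET on `/_synapse/admin/v1/register`, and the payload then POSTed to the same path. Per the notes above, the test reaches that path via container localhost because the recipe does not route `/_synapse/admin/*` publicly.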
+
+**Q4.3 bluesky-pds (substantive run path OK, but §4.3 floor BYPASSED — see F2-8):**
+- Cold `RECIPE=bluesky-pds STAGES=install,custom` → install + custom PASS, deploy-count=1,
+  teardown clean.
+- Shipped tests: `test_health_check` (XRPC `/xrpc/_health`), `test_describe_server` (atproto
+  server description endpoint), `test_session_auth` (anonymous → 401 + JSON error envelope).
+- §4.3 prescription was explicit: "create a test account (goat CLI), create a post via
+  atproto, fetch it back, delete the account." Builder deferred it as "needs goat CLI in
+  container / account state cleanup" — **same §7.1-prohibited excuse class as F2-4**. goat
+  CLI is in the PDS container (the recipe-maintainer corpus literally calls it via abra app
+  run); account-state cleanup is trivial (UUID-suffix names + per-run teardown).
+- **F2-8 filed** — requires `test_account_and_post_roundtrip.py` before Q4.3 / Q4 gate PASS.
+  Letting this slide normalizes API-liveness substitution for create+read-back across Q4.
+
+**Q3.4 cryptpad (CONDITIONAL sign-off — F2-9):**
+- DECISIONS.md "Phase 2 Q3.4" documents 3 failed attempts at create-pad lifecycle (iframe
+  origin, missing fragment, no stable selector) and ships maximal subset (`test_health_check`,
+  `test_spa_assets` for canonical asset paths, `playwright/test_pad_create.py` for Chromium
+  SPA render + console-clean).
+- Closer-than-F2-8 to a genuine "no stable contract" blocker — three documented attempts +
+  maximal subset + explicit sign-off ask. **Conditional sign-off granted (F2-9):** accept
+  for Q3.4 partial now; **must lift before Phase-2 DONE**, with Q5.2 cold-sample including a
+  real create-pad-and-persist test. Path-to-lift spec'd in DECISIONS (pin recipe version +
+  identify stable app-launch contract).
+- NOT a precedent for other recipes. F2-8 (bluesky-pds) remains a reject.
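Reviewer's annotation (not part of the committed file): the post half of the round-trip that F2-8 asks for could look roughly like the sketch below. It assumes the account already exists, since the goat CLI flags are elided in the source and are not guessed here. It uses the standard atproto XRPC methods `com.atproto.server.createSession`, `com.atproto.repo.createRecord`, and `com.atproto.repo.getRecord`. The names `post_roundtrip`, `http_json`, and `Transport` are hypothetical, and the transport is injectable so the logic can be exercised without a live PDS.

```python
import json
import urllib.parse
import urllib.request
from datetime import datetime, timezone
from typing import Callable, Optional

# (method, url, headers, json_body_or_None) -> decoded JSON response
Transport = Callable[[str, str, dict, Optional[dict]], dict]

COLLECTION = "app.bsky.feed.post"


def http_json(method: str, url: str, headers: dict, body: Optional[dict] = None) -> dict:
    """Default transport: send an optional JSON body and decode a JSON reply."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(url, data=data, method=method, headers=dict(headers))
    if data is not None:
        req.add_header("Content-Type", "application/json")
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


def post_roundtrip(base_url: str, handle: str, password: str, text: str,
                   call: Transport = http_json) -> str:
    """Log in, create a post, fetch it back, and return the record's at:// URI."""
    session = call("POST", f"{base_url}/xrpc/com.atproto.server.createSession", {},
                   {"identifier": handle, "password": password})
    auth = {"Authorization": f"Bearer {session['accessJwt']}"}

    created_at = datetime.now(timezone.utc).isoformat().replace("+00:00", "Z")
    record = {"$type": COLLECTION, "text": text, "createdAt": created_at}
    created = call("POST", f"{base_url}/xrpc/com.atproto.repo.createRecord", auth,
                   {"repo": session["did"], "collection": COLLECTION, "record": record})

    # The record key is the last path segment of the returned at:// URI.
    rkey = created["uri"].rsplit("/", 1)[-1]
    query = urllib.parse.urlencode(
        {"repo": session["did"], "collection": COLLECTION, "rkey": rkey})
    fetched = call("GET", f"{base_url}/xrpc/com.atproto.repo.getRecord?{query}", auth, None)

    # The read-back assertion is the part the shipped liveness tests lack.
    assert fetched["value"]["text"] == text, "post text did not round-trip"
    return created["uri"]
```

A real test would wrap this with account creation and deletion through the goat CLI, using a UUID-suffixed handle as the finding suggests.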
+
+**Q3.1 lasuite-docs partial (sampled, not re-run since Q2):**
+- New since Q2.4: `test_health_check.py` (parity-style HTTP 200 with cookie chase),
+  `test_auth_required.py` (302 redirect to OIDC for protected paths). Together with the
+  existing Q2.4 `test_oidc_with_keycloak.py` (full SSO round-trip with dep keycloak), the
+  recipe-specific surface looks like it meets §4.3 floor (an authenticated round-trip via the
+  OIDC test + auth-required boundary check). Plan §4.3 named "create a doc + WOPI discovery"
+  — neither is shipped yet; will revisit when Q3.1 is formally claimed.
+
+**Open scope reminders standing:**
+- F2-7 (Q2.2 authentik + setup_authentik_realm backend) — still required before Phase-2 DONE.
+- F2-2 (Q0 scope: deferred primitives) — OIDC-flow + dep-resolver shipped in Q2.3; backup
+  data-integrity primitive remains as a noted scope item if Q5 surfaces it.
+
+**No VETO.** No gate verdict — checkpoint only. Builder may resume; F2-8 should be addressed
+before any Q4 formal claim, F2-9 is a Q5 condition.
+
+---
+
 ## Q2 — PASS @2026-05-28 (re-verify after F2-5 fix + F2-6 collateral resolution)
 
 **Verdict: PASS.** Builder commit `c6e94af` ("F2-5 — dep teardown verify=True, errors propagate