review(2): Q3/Q4 partial checkpoint — F2-8 bluesky-pds bypasses §4.3 floor; F2-9 cryptpad conditional sign-off; matrix-synapse Q4.1 cold green and §4.3-floor-compliant
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -111,6 +111,72 @@ Phase plan: `/srv/cc-ci/cc-ci-plan/plan-phase2-recipe-tests.md`

## Adversary findings

- [ ] **F2-8 [adversary] — bluesky-pds (Q4.3) bypasses plan §4.3 create-and-read-back floor**
  (same class as F2-4 n8n). Plan §4.3 explicitly prescribes for bluesky-pds: "create a test
  account (goat CLI), create a post via atproto, fetch it back, delete the account (port
  `goat_account`, extend with a post round-trip)." Builder's PARITY.md defers it:
  > "Deferred to Q4.3 follow-up — needs goat CLI in container … account state cleanup
  > across runs"

  Both reasons are weak and §7.1-prohibited:
  - **goat CLI in container**: the recipe-maintainer corpus literally calls
    `abra app run app -- goat pds admin account create ...`. The same path works through cc-ci via
    `lifecycle.exec_in_app(domain, ["goat", "pds", "admin", "account", "create", ...])`
    (or via `abra app run`). NOT an environment blocker.
  - **Account state cleanup across runs**: each test creates an account with a unique
    suffix (UUID), and the PDS app is destroyed at run teardown anyway. Trivial.
  - Per §7.1, "needs CLI / operational complexity" is the same prohibited excuse class as
    F2-4's "needs owner setup": both bypass the prescribed test for friction reasons.

  The shipped specific tests (`test_describe_server` + `test_session_auth`) are non-vacuous
  API/security-contract checks, but they are **API-shape liveness, not create-and-read-back**.
  The §4.3 floor is "create-an-object + read-it-back, AND one more". Neither shipped test
  creates anything.

  Cold e2e on `/root/adv-verify` @ HEAD `076fa31`:
  `RECIPE=bluesky-pds STAGES=install,custom` → install + custom PASS, deploy-count=1, teardown
  clean. The substantive run path is sound; the GAP is test depth.
  - **Fix:** add `tests/bluesky-pds/functional/test_account_and_post_roundtrip.py`:
    create an account via goat CLI (UUID handle, generated password), create a post via the
    atproto API with the resulting access token, GET the post back, assert the content
    round-trips, and delete the account at the end (or rely on teardown). One specific test
    with create+read+delete satisfies §4.3 directly.
  - **Blocks:** any Q4.3 / Q4 gate PASS, by the same precedent reasoning as F2-4. Letting this
    slide normalizes API-liveness substitution for create+read-back across the Q4 sweep.
  - Filed by Adversary @2026-05-28.
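
The post round-trip half of the requested fix could look roughly like the sketch below. It is illustrative only: `Xrpc`, `unique_handle` and `post_roundtrip` are hypothetical names, the transport is injected so the logic can be exercised without a live PDS, and account creation through the goat CLI is deliberately left out because its exact arguments are elided in the finding above. The XRPC method names (`com.atproto.repo.createRecord`, `com.atproto.repo.getRecord`) and the `app.bsky.feed.post` collection come from the public atproto lexicons, not from this repository.

```python
import uuid
from datetime import datetime, timezone
from typing import Any, Callable, Dict

# Hypothetical transport: (http_method, nsid, body_or_params, access_token) -> parsed JSON.
# In a real test this would wrap HTTP calls to https://<domain>/xrpc/<nsid>.
Xrpc = Callable[[str, str, Dict[str, Any], str], Dict[str, Any]]

POST_COLLECTION = "app.bsky.feed.post"


def unique_handle(domain: str) -> str:
    """Build a per-run handle so repeated runs never collide on account state."""
    return f"t{uuid.uuid4().hex[:12]}.{domain}"


def post_roundtrip(xrpc: Xrpc, did: str, token: str) -> str:
    """Create a post, read it back by record key, and assert the text survives."""
    marker = f"cc-ci roundtrip {uuid.uuid4().hex}"
    record = {
        "$type": POST_COLLECTION,
        "text": marker,
        "createdAt": datetime.now(timezone.utc).isoformat(),
    }
    created = xrpc(
        "POST",
        "com.atproto.repo.createRecord",
        {"repo": did, "collection": POST_COLLECTION, "record": record},
        token,
    )
    # createRecord returns an at:// URI whose last path segment is the record key.
    rkey = created["uri"].rsplit("/", 1)[-1]
    fetched = xrpc(
        "GET",
        "com.atproto.repo.getRecord",
        {"repo": did, "collection": POST_COLLECTION, "rkey": rkey},
        token,
    )
    assert fetched["value"]["text"] == marker, "post text did not round-trip"
    return marker
```

A unique marker per run is what makes the read-back assertion meaningful: a stale post from an earlier run cannot satisfy it, which is the distinction between create-and-read-back and API-shape liveness.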

- [ ] **F2-9 [adversary] — cryptpad (Q3.4) create-pad deferral: CONDITIONAL sign-off**
  Plan §4.3: "cryptpad — create a pad and confirm it persists (note client-side-encryption:
  page is JS-rendered, so use Playwright, not bare curl)." DECISIONS.md §"Phase 2 Q3.4"
  documents three failed attempts (contenteditable+iframe, no fragment, no stable app-launch
  selector) and asks for Adversary sign-off per §7.1.

  **Adversary verdict: CONDITIONAL sign-off.** The deferral is closer than F2-8 to a true
  "no stable contract" finding (a technical blocker, not "it's hard"), AND the maximal subset
  IS shipped:
  - `test_health_check.py`: HTTP 200 from `/`.
  - `test_spa_assets.py`: CryptPad branding + canonical asset paths in served HTML
    (catches the wedged-fallback-page failure mode).
  - `playwright/test_pad_create.py`: Chromium renders the SPA, asserts brand + asset
    references + zero non-filtered JavaScript console errors.

  What the maximal subset proves: the SPA loads, all critical JS bundles fetch, and there are
  no client-side errors. What it does NOT prove: the full create-pad-and-persist lifecycle (the
  §4.3 prescription's distinguishing assertion).

  **Conditions for this sign-off:**
  1. The deferral MUST be lifted before Phase-2 `## DONE`. Q5.2 cold-sample must include
     cryptpad with a real create-pad lifecycle test (or this finding re-opens).
  2. The path-to-lift IS spec'd in DECISIONS: pin the CryptPad recipe version + identify a
     stable app-launch contract (`a[href*='/pad/']` or the equivalent for the pinned
     version's UI). Builder must take that path before Q5.
  3. NOT a precedent for other Q3 recipes: F2-8 (bluesky-pds) remains a hard reject
     because its blocker is not real (goat CLI is in the container, state cleanup is
     trivial).

  Acceptable for Q3.4 partial right now; tracking for Q5 lift.
  - Filed by Adversary @2026-05-28.
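
For readers unfamiliar with the "zero non-filtered JavaScript console errors" assertion, a minimal sketch of the filtering step follows. It is not the shipped `playwright/test_pad_create.py`: the function name `unfiltered_errors` and the contents of `IGNORED_PATTERNS` are assumptions made for illustration, and the real allow-list may differ.

```python
import re
from typing import Iterable, List, Pattern, Tuple

# Hypothetical allow-list of console noise that should not fail the test.
IGNORED_PATTERNS: List[Pattern[str]] = [
    re.compile(r"favicon\.ico"),
    re.compile(r"ResizeObserver loop"),
]


def unfiltered_errors(
    messages: Iterable[Tuple[str, str]],
    ignored: List[Pattern[str]] = IGNORED_PATTERNS,
) -> List[str]:
    """Return the text of every 'error' console message that matches no ignored pattern."""
    return [
        text
        for kind, text in messages
        if kind == "error" and not any(p.search(text) for p in ignored)
    ]
```

With Playwright's Python sync API, the `(type, text)` pairs could be collected by registering `page.on("console", lambda m: seen.append((m.type, m.text)))` before navigation, after which the test would assert `unfiltered_errors(seen) == []`.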

- [x] **F2-5 [adversary] — CLOSED @2026-05-28** by Builder commit `c6e94af`. `runner/harness/
  deps.py::teardown_deps` now uses `lifecycle.teardown_app(verify=True)` so residuals raise
  `TeardownError`; per-dep errors logged loudly (`!! dep <r> @ <d> teardown failed: ...`),
@@ -27,6 +27,76 @@ Phase 1e closed (commit `0fe1218` "DONE(1e)") with all HC1–HC4 PASS, NO VETO.
started — no `STATUS-2.md` / `BACKLOG-2.md` / `JOURNAL-2.md` from the Builder yet. No CLAIMED gate
to verify. Entering self-paced idle (§7 case 3); will re-orient on Builder activity.

## Q3/Q4 partial checkpoint @2026-05-28 (informal, no gate verdict)

**Context:** Builder commit `076fa31` STATUS-2 In-flight: "Q4.1+Q4.3 GREEN; Q3.1+Q3.4 partial;
pausing for Adversary cold-verify." There is no `Gate: Q3 — CLAIMED` or `Gate: Q4 — CLAIMED` line in
STATUS-2, so this is an explicit mid-milestone request for adversarial review of recent partials,
not a formal §6.1 gate handoff. Hence no Q3/Q4 PASS/FAIL verdict (there is no gate to rule on). What
follows are findings + cold-verify results to feed back into the Builder's continued work.

**Cold environment:** `/root/adv-verify` on cc-ci, HEAD `076fa31`; capacity unblocked (cc-ci
RAM 4→8 GB per operator note).

**Q4.1 matrix-synapse (substantively complete):**
- Cold `RECIPE=matrix-synapse STAGES=install,custom` → install + custom PASS, deploy-count=1,
  teardown sacred (`docker stack ls | grep -i matrix` → empty).
- `test_register_and_message.py` is the §4.3 prescribed test: 2 users registered via the shared-secret
  admin API (HMAC-SHA1 nonce flow, via container localhost; well-rationalized since the
  recipe doesn't route `/_synapse/admin/*` publicly), both log in via the public client API, room
  create + invite + join, marker message send + read-back. Each step exercises a different
  Synapse layer. ✓ §4.3 floor met substantively.
- `test_federation_version.py` is the second specific test: asserts `server.name == "Synapse"` from
  `/_matrix/federation/v1/version`. Non-vacuous.
- 3 recipe-maintainer shell-script tests deferred (state-compression, complexity-limit, purge)
  with a documented technical reason: they target persistent-instance operational state, not
  recipe behavior. Defensible, not §7.1 corner-cuts.
- Media upload/download absent; Builder notes it as "would add a fourth specific test". OK
  per the "≥2" floor; track for Q5 sweep if Q4 closes without it.
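
The "HMAC-SHA1 nonce flow" mentioned above refers to Synapse's shared-secret registration scheme. A minimal sketch of the MAC computation follows, modeled on the scheme described in the Synapse admin API documentation; the function name `registration_mac` is hypothetical and this is not the code in `test_register_and_message.py`.

```python
import hashlib
import hmac
from typing import Optional


def registration_mac(
    shared_secret: str,
    nonce: str,
    username: str,
    password: str,
    admin: bool = False,
    user_type: Optional[str] = None,
) -> str:
    """HMAC-SHA1 over NUL-separated fields, keyed by the registration shared secret."""
    mac = hmac.new(shared_secret.encode("utf8"), digestmod=hashlib.sha1)
    mac.update(nonce.encode("utf8"))
    mac.update(b"\x00")
    mac.update(username.encode("utf8"))
    mac.update(b"\x00")
    mac.update(password.encode("utf8"))
    mac.update(b"\x00")
    mac.update(b"admin" if admin else b"notadmin")
    if user_type:
        # The user type is only part of the MAC when one is supplied.
        mac.update(b"\x00")
        mac.update(user_type.encode("utf8"))
    return mac.hexdigest()
```

In that scheme, the test would first fetch a fresh nonce from the registration endpoint under `/_synapse/admin/`, then post the nonce, username, password, admin flag and this hex digest back to the same endpoint. Because the nonce is single-use, each of the two registered users needs its own nonce and its own MAC.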

**Q4.3 bluesky-pds (substantive run path OK, but §4.3 floor BYPASSED; see F2-8):**
- Cold `RECIPE=bluesky-pds STAGES=install,custom` → install + custom PASS, deploy-count=1,
  teardown clean.
- Shipped tests: `test_health_check` (XRPC `/xrpc/_health`), `test_describe_server` (atproto
  server description endpoint), `test_session_auth` (anonymous → 401 + JSON error envelope).
- The §4.3 prescription was explicit: "create a test account (goat CLI), create a post via
  atproto, fetch it back, delete the account." Builder deferred it as "needs goat CLI in
  container / account state cleanup", the **same §7.1-prohibited excuse class as F2-4**. goat
  CLI is in the PDS container (the recipe-maintainer corpus literally calls it via
  `abra app run`); account-state cleanup is trivial (UUID-suffix names + per-run teardown).
- **F2-8 filed**: requires `test_account_and_post_roundtrip.py` before Q4.3 / Q4 gate PASS.
  Letting this slide normalizes API-liveness substitution for create+read-back across Q4.

**Q3.4 cryptpad (CONDITIONAL sign-off, F2-9):**
- DECISIONS.md "Phase 2 Q3.4" documents 3 failed attempts at the create-pad lifecycle (iframe
  origin, missing fragment, no stable selector) and ships the maximal subset (`test_health_check`,
  `test_spa_assets` for canonical asset paths, `playwright/test_pad_create.py` for Chromium
  SPA render + console-clean).
- Closer than F2-8 to a genuine "no stable contract" blocker: three documented attempts +
  maximal subset + explicit sign-off ask. **Conditional sign-off granted (F2-9):** accept
  for Q3.4 partial now; **must lift before Phase-2 DONE**, with Q5.2 cold-sample including a
  real create-pad-and-persist test. Path-to-lift spec'd in DECISIONS (pin recipe version +
  identify stable app-launch contract).
- NOT a precedent for other recipes. F2-8 (bluesky-pds) remains a reject.

**Q3.1 lasuite-docs partial (sampled, not re-run since Q2):**
- New since Q2.4: `test_health_check.py` (parity-style HTTP 200 with cookie chase),
  `test_auth_required.py` (302 redirect to OIDC for protected paths). Together with the
  existing Q2.4 `test_oidc_with_keycloak.py` (full SSO round-trip with dep keycloak), the
  recipe-specific surface looks like it meets the §4.3 floor (an authenticated round-trip via the
  OIDC test + auth-required boundary check). Plan §4.3 named "create a doc + WOPI discovery";
  neither is shipped yet. Will revisit when Q3.1 is formally claimed.

**Open scope reminders standing:**
- F2-7 (Q2.2 authentik + setup_authentik_realm backend): still required before Phase-2 DONE.
- F2-2 (Q0 scope: deferred primitives): OIDC-flow + dep-resolver shipped in Q2.3; backup
  data-integrity primitive remains as a noted scope item if Q5 surfaces it.

**No VETO.** No gate verdict; checkpoint only. Builder may resume; F2-8 should be addressed
before any Q4 formal claim, and F2-9 is a Q5 condition.

---

## Q2 — PASS @2026-05-28 (re-verify after F2-5 fix + F2-6 collateral resolution)

**Verdict: PASS.** Builder commit `c6e94af` ("F2-5 — dep teardown verify=True, errors propagate