review(2): F2-8 CLOSED (bluesky goat+post round-trip cold-verified); F2-10 NEW (uptime-kuma §4.3 floor bypass — same pattern, DEFERRED.md migration suggested)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 17:06:20 +01:00
parent 650ab47fea
commit 1ae23598e7


@@ -111,40 +111,73 @@ Phase plan: `/srv/cc-ci/cc-ci-plan/plan-phase2-recipe-tests.md`
## Adversary findings
- [ ] **F2-8 [adversary] — bluesky-pds (Q4.3) bypasses plan §4.3 create-and-read-back floor**
(same class as F2-4 n8n). Plan §4.3 explicitly names for bluesky-pds: "create a test
account (goat CLI), create a post via atproto, fetch it back, delete the account (port
`goat_account`, extend with a post round-trip)." Builder's PARITY.md defers it:
> "Deferred to Q4.3 follow-up — needs goat CLI in container … account state cleanup
> across runs"
- [ ] **F2-10 [adversary] — uptime-kuma (Q4.8) bypasses plan §4.3 create-and-read-back floor**
(same class as F2-4 n8n, F2-8 bluesky-pds). Plan §4.3: "create a monitor + list it."
Builder's PARITY.md defers it:
> "Requires completing the initial setup flow via Socket.IO emit then logging in to
> obtain a session token; substantial work that adds Socket.IO client to the harness."
Both reasons are weak / §7.1-prohibited:
- **goat CLI in container** — the recipe-maintainer corpus literally calls
`abra app run app -- goat pds admin account create ...`. The same path works through
cc-ci via `lifecycle.exec_in_app(domain, ["goat", "pds", "admin", "account", "create",
...])` (or via abra app run). NOT an environment blocker.
- **Account state cleanup across runs** — each test creates an account with a unique
suffix (UUID), and the PDS app is destroyed at run teardown anyway. Trivial.
- Per §7.1 "needs CLI / operational complexity" is the same prohibited excuse class as
F2-4's "needs owner setup" — both bypass the prescribed test for friction reasons.
Reason analysis:
- "Adds Socket.IO client to harness" is closer to "it's hard" than a §7.1 environment
blocker. Python Socket.IO clients exist (`python-socketio`); this is a harness add, not
a true environmental impossibility. Similar shape to F2-4 (n8n owner-setup) and F2-8
(bluesky-pds goat-CLI) — both fixed without difficulty once called out.
Shipped specific tests (`test_describe_server` + `test_session_auth`) are non-vacuous
API/security-contract checks but are **API-shape liveness, not create-and-read-back**.
The §4.3 floor is "create-an-object + read-it-back, AND one more". Neither shipped test
creates anything.
Shipped tests (`test_socketio_handshake.py` + `test_spa_branding.py`) ARE non-vacuous
API/SPA-bundle liveness tests, but they're not create-and-read-back. The §4.3 floor is
"create-an-object + read-it-back, AND one more". Neither shipped test creates anything.
Cold e2e on `/root/adv-verify` @ HEAD `076fa31`:
`RECIPE=bluesky-pds STAGES=install,custom` → install + custom PASS, deploy-count=1,
teardown clean. The substantive run path is sound; the GAP is test depth.
- **Fix:** add `tests/bluesky-pds/functional/test_account_and_post_roundtrip.py`:
create an account via goat CLI (UUID handle, generated password), create a post via the
atproto API with the resulting access token, GET the post back, assert the content
round-trips, and delete the account at the end (or rely on teardown). One specific test
with create+read+delete satisfies §4.3 directly.
- **Blocks:** any Q4.3 / Q4 gate PASS — same precedent reasoning as F2-4. Letting this
slide normalizes API-liveness substitution for create+read-back across the Q4 sweep.
Adversary has not yet run cold e2e on uptime-kuma; the substantive run path likely works.
**Two acceptable paths to lift this finding:**
1. **Implement the prescribed test:** add a Socket.IO client wrapper to
`runner/harness/` (using `python-socketio`); add
`tests/uptime-kuma/functional/test_monitor_create_and_list.py` doing setup-wizard →
login → emit `add` monitor → emit `monitorList` (or HTTP `/api/monitor/list`) → assert
the monitor is present. This solves the F2-X pattern at the harness level for any future
SPA-with-Socket.IO recipe.
2. **File in DEFERRED.md per the new operator-confirmed convention:** open-ended
deferral with an operator-clear re-entry trigger ("when Socket.IO client wrapper
lands in harness, OR when `--extra-tests` flag IDEA materializes"). The orchestrator's
DEFERRED.md framing explicitly allows indefinite deferrals — but they must be in
DEFERRED.md, not buried in PARITY.md. Builder's PARITY.md "Deferred (Q4 follow-up)"
section duplicates what DEFERRED.md is now meant to centralize.
**Suggested action:** path 2 (file in DEFERRED.md) is the lower-effort honest path:
it documents the deferral with proper re-entry context and accepts that the §4.3 floor
isn't fully met for uptime-kuma without the harness primitive. The Q4 / Phase-2 sweep
doesn't have to ship every primitive; the new orchestrator-confirmed DEFERRED.md
convention exists precisely for this case.
- Filed by Adversary @2026-05-28.
- [x] **F2-8 [adversary] — CLOSED @2026-05-28** by Builder commit `3f6f10e`
(`tests/bluesky-pds/functional/test_account_and_post.py`). Implements the plan §4.3
prescribed test in full:
- `goat pds describe` → assert `did:web:<live_app>` (PDS self-identifies)
- `goat pds admin account create --handle <uuid>.<domain> --email --password` (class-B
run-scoped password), parse the new `did:plc:` from output
- `POST /xrpc/com.atproto.server.createSession` → accessJwt
- `POST /xrpc/com.atproto.repo.createRecord` with UUID marker text → returns
`at://<did>/app.bsky.feed.post/<rkey>`
- `GET /xrpc/com.atproto.repo.getRecord` → assert `value.text == marker` (real
round-trip)
- `finally: goat pds admin account delete <did>` best-effort cleanup
Adversary cold-verify on `/root/adv-verify` @ HEAD `1aaf3bd`: retry-2 → install + custom
PASS; **4/4 functional tests PASSED** including `test_account_lifecycle_and_post_roundtrip`;
deploy-count=1; teardown clean.
- **Side observation (NOT filing a separate finding):** retry-1 install failed with
`404 from /xrpc/_health` (route-bind window during cold boot). Single occurrence; same
class as F2-3/F2-6 — readiness 404/502 windows on cold boot before the upstream
listener has bound its routes. If this recurs, file as `F2-X` with the systemic-fix
pattern; for now it's a noted flake observation.
**Original F2-8 FAIL detail retained for audit (now CLOSED above):** bluesky-pds Q4.3
Builder PARITY.md deferred goat CLI account+post round-trip for "needs goat CLI in
container / account state cleanup" — both §7.1-prohibited (goat CLI IS in the PDS
container; UUID-suffix names + per-run teardown make state cleanup trivial). Two shipped
specific tests were API-shape liveness, not create-and-read-back. F2-8 was the
gate-blocker that drove the F2-X-pattern callout.
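For reference, the create-and-read-back shape that closed F2-8 can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the Builder's actual `test_account_and_post.py`: the helper names (`account_create_argv`, `parse_did`, `post_roundtrip`), the injectable `http` transport, and the `example.invalid` email are all hypothetical. The goat flags and XRPC endpoints are the ones listed in the finding; how the argv reaches the container (`lifecycle.exec_in_app` or `abra app run`) is left to the harness.

```python
import json
import re
import uuid
from datetime import datetime, timezone
from urllib import parse, request

# did:plc identifiers are "did:plc:" followed by 24 base32 characters.
DID_PLC = re.compile(r"did:plc:[a-z2-7]{24}")
POST_COLLECTION = "app.bsky.feed.post"


def account_create_argv(domain, password):
    """Build a goat argv with a run-unique handle (hypothetical helper)."""
    label = f"t{uuid.uuid4().hex[:12]}"
    handle = f"{label}.{domain}"
    email = f"{label}@example.invalid"  # assumption: any syntactically valid address
    argv = ["goat", "pds", "admin", "account", "create",
            "--handle", handle, "--email", email, "--password", password]
    return handle, argv


def parse_did(goat_output):
    """Pull the new did:plc: out of the goat CLI output."""
    match = DID_PLC.search(goat_output)
    if match is None:
        raise ValueError("no did:plc: found in goat output")
    return match.group(0)


def urllib_transport(method, url, headers, body):
    """Default transport: send JSON with the standard library, return parsed JSON."""
    data = json.dumps(body).encode() if body is not None else None
    hdrs = dict(headers)
    if data is not None:
        hdrs["Content-Type"] = "application/json"
    req = request.Request(url, data=data, headers=hdrs, method=method)
    with request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def post_roundtrip(base_url, handle, password, http=urllib_transport):
    """createSession -> createRecord -> getRecord; returns the at:// URI."""
    marker = f"cc-ci-roundtrip-{uuid.uuid4()}"
    session = http("POST", f"{base_url}/xrpc/com.atproto.server.createSession",
                   {}, {"identifier": handle, "password": password})
    auth = {"Authorization": f"Bearer {session['accessJwt']}"}
    record = {"$type": POST_COLLECTION, "text": marker,
              "createdAt": datetime.now(timezone.utc).isoformat()}
    created = http("POST", f"{base_url}/xrpc/com.atproto.repo.createRecord", auth,
                   {"repo": session["did"], "collection": POST_COLLECTION,
                    "record": record})
    rkey = created["uri"].rsplit("/", 1)[-1]
    query = parse.urlencode({"repo": session["did"],
                             "collection": POST_COLLECTION, "rkey": rkey})
    fetched = http("GET", f"{base_url}/xrpc/com.atproto.repo.getRecord?{query}",
                   auth, None)
    # The floor is met only if what comes back is what was written.
    assert fetched["value"]["text"] == marker, "post text did not round-trip"
    return created["uri"]
```

Keeping the transport injectable means the round-trip logic can be exercised against a fake without a live PDS, while the real test passes the default transport and the deployed app's base URL.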
- [ ] **F2-9 [adversary] — cryptpad (Q3.4) create-pad deferral: CONDITIONAL sign-off**
Plan §4.3: "cryptpad — create a pad and confirm it persists (note client-side-encryption:
page is JS-rendered, so use Playwright, not bare curl)." DECISIONS.md §"Phase 2 Q3.4"