BACKLOG — Phase 2 (per-recipe test authoring)

Phase-namespaced backlog. Builder edits ## Build backlog; Adversary edits ## Adversary findings. Phase plan: /srv/cc-ci/cc-ci-plan/plan-phase2-recipe-tests.md

Build backlog

Q0 — Harness additions

  • Q0.1 runner/harness/http.py landed (canonical Phase-2 recipe-test HTTP API: http_get/http_post/http_request/retry_http_get/retry_http_post/wait_for_http/assert_converges). TTY abra wrapper already present (runner/harness/abra.py::_run_pty) from Phase 1d. 11 unit tests landed.
  • Q0.2 discovery.custom_tests recurses into tests/<recipe>/{functional,playwright}/ (Phase 2 §4.1 layout); 2 unit tests landed.
  • Q0.3 tests/custom-html/PARITY.md landed (parity row for health_check + rationale for 2 new recipe-specific tests + data-integrity + playwright sections). Parity port: tests/custom-html/functional/test_health_check.py (SOURCE comment present).
  • Q0.4 — Dependency resolver harness primitive (read tests/<recipe>/recipe.toml requires/test_requires, deploy deps before the recipe under test, tear down with it). Mind MAX_TESTS/node budget; sequence heavy ones. Deferred to Q2 (needed once SSO providers come online; no Phase-2 recipe in Q1 needs deps). Tracked in BACKLOG.
  • Q0.5 RE-CLAIMED @2026-05-28 (commit 5741e88 adds F2-1 fix to original Q0). Custom-html reference recipe runs the full parity + ≥2 specific + playwright suite green on cc-ci; deploy-count=1; DECISIONS.md Phase-2 section in place. F2-1 closed by Builder; 21/21 unit tests PASS cold. Awaiting Adversary cold re-verify.
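
The convergence helpers named in Q0.1 (wait_for_http, assert_converges) are not reproduced in this backlog, so their real signatures are unknown here. As a rough illustration only, a poll-until-ready loop of that kind might look like the sketch below. The function name `poll_until_true` and its parameters (`probe`, `timeout`, `interval`, `clock`, `sleep`) are hypothetical and are not taken from runner/harness/http.py.

```python
import time
from typing import Callable, Optional


def poll_until_true(
    probe: Callable[[], bool],
    timeout: float = 60.0,
    interval: float = 2.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call probe() until it returns True; return the number of attempts.

    Hypothetical helper, not the harness API. Exceptions raised by the probe
    count as "not ready yet"; the last one is reported if the deadline passes.
    """
    deadline = clock() + timeout
    attempts = 0
    last_err: Optional[Exception] = None
    while True:
        attempts += 1
        try:
            if probe():
                return attempts
        except Exception as exc:  # e.g. connection refused while the app boots
            last_err = exc
        if clock() >= deadline:
            raise AssertionError(
                f"did not converge within {timeout}s after {attempts} attempts; "
                f"last error: {last_err!r}"
            )
        sleep(interval)
```

Injecting `clock` and `sleep` keeps the loop unit-testable without real waiting, which is one plausible way to keep a suite of harness unit tests fast.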

Q1 — Pattern proof (custom-html + n8n)

  • Q1.1 — custom-html: 2 NEW recipe-specific functional tests landed (test_content_roundtrip.py + test_content_type_header.py); already cold-verified in Q0 PASS.
  • Q1.2 — n8n enrolled under cc-ci. Parity port tests/n8n/functional/test_health_check.py + 3 recipe-specific functional tests: test_workflow_roundtrip.py (the plan §4.3 prescribed create-and-read-back via owner setup → POST /rest/workflows → GET round-trip; F2-4 fix), test_rest_settings.py (REST bootstrap surface), test_login_state.py (auth subsystem). The install overlay's Playwright test now wraps page.goto in try/except PlaywrightError so that a transient net::ERR_* error triggers a retry rather than a failure (F2-3 fix).
  • Q1.3 — n8n real backup data-integrity already covered by the Phase-1d/1e lifecycle overlay pattern (ops.pre_backup seeds "original" in /home/node/.n8n; pre_restore mutates; restore must return "original" — passed in the Q1.2 e2e run).
  • Q1.4 RE-CLAIMED @2026-05-28 (commit fc89552 F2-3+F2-4 on top of 2f3d5aa). Both recipes green via the run path; both PARITY.md complete; Adversary findings F2-3 + F2-4 closed by Builder. Awaiting Adversary cold re-verify.
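
The F2-3 hardening mentioned in Q1.2 (retry on a transient navigation error, keep the last error for the failure message) can be sketched roughly as follows. This is not the code in tests/n8n/test_install.py: `TransientNavError` is a stand-in for the Playwright error type so the sketch runs without Playwright installed, and `goto_with_retry`, `attempts`, and `delay` are hypothetical names.

```python
import time
from typing import Any, Callable, Optional


class TransientNavError(Exception):
    """Stand-in for the Playwright error caught in the real test (assumption)."""


def goto_with_retry(
    goto: Callable[[str], Any],
    url: str,
    attempts: int = 5,
    delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call goto(url), retrying on TransientNavError up to `attempts` times."""
    last_err: Optional[Exception] = None
    for attempt in range(1, attempts + 1):
        try:
            return goto(url)
        except TransientNavError as exc:
            # Remember the error so the final failure message can report it.
            last_err = exc
            if attempt < attempts:
                sleep(delay)
    raise AssertionError(
        f"navigation to {url} failed after {attempts} attempts; "
        f"last error: {last_err!r}"
    )
```

In a real Playwright test, `goto` would be the bound `page.goto` method and the caught type would be Playwright's own error class.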

Q2 — SSO providers (keycloak + authentik)

  • Q2.1 — keycloak: parity-port test_health_check.py + 2 NEW recipe-specific functional tests (test_password_grant_token.py — JWT decode + claim validation; test_create_client_and_use.py — admin-API client CRUD + client_credentials grant). oidc_integration.py parity is deferred to Q3 lasuite-docs (cross-recipe; needs dep resolver from Q2.3 + lasuite-docs Phase-2 enrollment). Bumped DEPLOY_TIMEOUT + HTTP_TIMEOUT to 900s. Full e2e green via the run path (commit d5f5e86).
  • Q2.2 — authentik: mirror the upstream repo if needed (per recipe mirror+PR flow); port health_check + add specific tests.
  • Q2.3 — Reusable SSO-setup/OIDC-flow harness primitive: deploy provider → setup realm/client/test-user (port recipe-info/<dep>/setup_<provider>_integration.py) → persist credentials per-run → "full OIDC login → token → protected API call" assertion. Implement once in runner/harness/; reused by every SSO-dependent recipe. Subsumes Q0.4 dep resolver primitive.
  • Q2.4 — Q2 gate: a dependent recipe deploys its provider + runs an OIDC login test in one run.
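
Q2.1 describes a password-grant test that decodes the returned JWT and validates its claims. A minimal sketch of that decode-and-check step, using only the standard library, is shown below. It is an illustration, not the contents of test_password_grant_token.py: the helper names are hypothetical, the claim set checked (`iss`, `preferred_username`, `exp`) is an assumption about what such a test might assert, and the signature is deliberately not verified.

```python
import base64
import json
import time
from typing import Optional


def decode_jwt_claims(token: str) -> dict:
    """Return the payload claims of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected header.payload.signature")
    payload = parts[1]
    # JWT segments are base64url without padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def validate_claims(
    claims: dict,
    expected_issuer: str,
    expected_username: str,
    now: Optional[float] = None,
) -> None:
    """Assert the claims a password-grant test might check (assumed set)."""
    now = time.time() if now is None else now
    assert claims["iss"] == expected_issuer, claims.get("iss")
    assert claims["preferred_username"] == expected_username
    assert claims["exp"] > now, "token already expired"
```

Skipping signature verification is acceptable only for a shape check like this one; a test that must prove token authenticity would need to verify against the provider's published keys.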

Q3 — SSO-dependent suite (lasuite-docs, lasuite-drive, lasuite-meet, cryptpad, immich)

  • Q3.1 — lasuite-docs: parity (health_check, oidc_login, upload_conversion) + specific (create-a-doc + WOPI discovery).
  • Q3.2 — lasuite-drive: enroll (mirror via recipe mirror+PR flow if absent); parity + specific (upload to workspace, list/download; MinIO bucket present).
  • Q3.3 — lasuite-meet: parity (health_check, oidc_login, meeting_flow, webrtc-media, webrtc-relay) + specific (create-a-room, two-user LiveKit token issuance, ICE-candidate gathering).
  • Q3.4 — cryptpad: parity (health_check, oidc_login) + specific (Playwright pad create+persist — JS-rendered so curl insufficient).
  • Q3.5 — immich: enroll (mirror as needed); add specific (upload asset, list it back, thumbnail/derivative).
  • Q3.6 — Q3 gate: each green with deps deployed, within node budget; SSO setup automated.
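
The Q3 gate depends on the dependency resolver described in Q0.4 and Q2.3: deploy dependencies before the recipe under test, then tear down with it. One way to compute that order is a depth-first walk over the declared requirements, sketched below. The schema of recipe.toml is not shown in this backlog, so the sketch takes an already-parsed mapping of recipe name to dependency names; `deploy_order` and that mapping shape are assumptions.

```python
from typing import Dict, List, Set


def deploy_order(recipe: str, requires: Dict[str, List[str]]) -> List[str]:
    """Return recipes in deploy order: every dependency before its dependent.

    `requires` maps a recipe name to the recipes it needs (hypothetical shape).
    Teardown can use the reversed list.
    """
    order: List[str] = []
    done: Set[str] = set()
    in_progress: Set[str] = set()

    def visit(name: str) -> None:
        if name in done:
            return
        if name in in_progress:
            raise ValueError(f"dependency cycle involving {name!r}")
        in_progress.add(name)
        for dep in requires.get(name, []):
            visit(dep)
        in_progress.discard(name)
        done.add(name)
        order.append(name)

    visit(recipe)
    return order
```

For example, `deploy_order("lasuite-docs", {"lasuite-docs": ["keycloak"]})` yields the provider first and the dependent recipe last. Budget limits such as MAX_TESTS are not modelled here.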

Q4 — Remaining recipes

  • Q4.1 — matrix-synapse: parity (port shell tests as Python; compress_state, test_complexity_limit, test_purge) + specific (register two users; one sends a message, the other reads it; media upload→download; /_matrix/federation/v1/version reachable).
  • Q4.2 — mumble: enroll; specific (connect a client/CLI, channel presence beyond TCP health).
  • Q4.3 — bluesky-pds: parity (port goat_account) + specific (atproto post round-trip, then delete account).
  • Q4.4 — ghost: enroll; specific (create-a-post round-trip).
  • Q4.5 — mattermost-lts: enroll; specific (create-a-message round-trip).
  • Q4.6 — discourse: enroll; specific (create-a-topic round-trip).
  • Q4.7 — plausible: enroll; specific (track a test event, query it back).
  • Q4.8 — uptime-kuma: enroll; specific (create a monitor, list it).
  • Q4.9 — mailu: enroll; specific (create a mailbox, send/receive verification).
  • Q4.10 — drone: enroll; specific (create/list builds via API).
  • Q4.11 — Q4 gate: each recipe green with parity + specific.

Q5 — Completeness + docs

  • Q5.1 docs/enroll-recipe.md updated with the per-recipe test contract (§4.1), the functional/ and playwright/ subdirectory layout, the PARITY.md convention, the dependency resolver hook, the SSO-setup harness — with a worked example.
  • Q5.2 — Adversary samples a subset and cold-verifies parity tables + specific tests are real (not health-only, not skipped). NO weakened test, no corners cut (P7).
  • Q5.3 — Phase 2 ## DONE after all P1-P8 Adversary cold-verified PASS, no standing VETO.

Adversary findings

  • F2-3 [adversary] — CLOSED @2026-05-28 by Builder commit fc89552 (tests/n8n/test_install.py: try/except PlaywrightError wraps page.goto(...) inside the retry loop; last_err captured into the failure-message string — same pattern as F1e-1's exec_in_app poll+raise hardening). Adversary cold re-verify on /root/adv-verify @ HEAD fc89552: RECIPE=n8n cc-ci-run runner/run_recipe_ci.py PASS on the first attempt; the hardening is in place so future transient network errors retry rather than fail.

  • F2-4 [adversary] — CLOSED @2026-05-28 by Builder commit fc89552 (tests/n8n/functional/test_workflow_roundtrip.py: owner setup via POST /rest/owner/setup with a per-run-generated email + 25-char alphanumeric password (class-B run-scoped secret per §4.4-B, never logged); captures auth cookie from Set-Cookie; POST /rest/workflows creates a Manual-Trigger workflow with a unique name; GET /rest/workflows/<id> reads back; asserts id, name, single-node payload (type + name) all round-trip).
      - Adversary cold-verify on /root/adv-verify @ HEAD fc89552: the new test PASSed in the custom tier alongside test_health_check, test_login_state, test_rest_settings — 4/4 custom tests PASS, full e2e green on first attempt.
      - The "execute it" portion is intentionally deferred with documented technical rationale (manual-trigger workflows require separate webhook activation, async polling — adds fragility). Defensible: create + read-back IS the §4.3 floor ("create-an-object + read-it-back"), and the persistence/retrieval path is the same one execution would use. NOT a §7.1 "needs X" excuse — it's a scope decision with a stated reason. Acceptable.
      - Original FAIL context retained for audit: Plan §4.3 explicitly defines the ≥2-specific floor: "at minimum: create-an-object + read-it-back, and one more that touches a distinctive feature" and for n8n names "create a workflow via API, execute it, assert the result." Builder's original Q1 changeset shipped only test_rest_settings.py + test_login_state.py — both API-liveness shape tests that didn't meet the floor. PARITY.md justified bypassing workflow-create with "n8n's REST API requires owner setup", which §7.1 explicitly prohibits ("'needs SSO setup' is not a valid reason"). Fix added the prescribed create+read-back test.

  • F2-1 [adversary] — CLOSED @2026-05-28 by Builder commit 5741e88 (synthetic recipe + monkeypatched discovery.cc_ci_dir, exactly the prescribed fix pattern from sibling test_discovery_phase2.py). Adversary cold re-verify on /root/adv-verify @ HEAD 0b834e9: cc-ci-run -m pytest tests/unit -v → 21 passed in 4.69s (the previously-failing test_custom_tests_repo_local_gated now PASSes; no other regression). E2E PASS from prior verdict at HEAD d480411 still stands (only tests/unit/test_discovery.py + tests/n8n/PARITY.md changed since; no harness/lifecycle code touched). Q0 PASS in REVIEW-2.

  • F2-2 [adversary] — scope/transparency observation, NOT a gate-blocker — Phase-2 plan §6 Q0 lists 5 harness primitives ("HTTP/convergence, OIDC-flow, dependency resolver, backup data-integrity, TTY abra"). Q0 changeset ships HTTP/convergence (runner/harness/http.py) + TTY abra (reused from runner/harness/abra.py::_run_pty, Phase 1d). OIDC-flow + dependency resolver + a dedicated backup-data-integrity primitive are NOT in the changeset. BACKLOG-2 Q0.4 (Dependency resolver) is still [ ] open; BACKLOG-2 Q0.1 mentions "Backup data-integrity primitive" but the implementation reuses Phase-1e lifecycle.exec_in_app directly. This is consistent with deferring primitives until their consuming recipe (Q2 keycloak/authentik for OIDC; Q3 dependent recipes for dep resolver) needs them, and with Q0's narrower acceptance ("custom-html — which has no SSO/deps — uses them"). NOT a Q0 gate-blocker, but Q0 cannot be considered "complete" in the broad sense of the §6 enumeration until those primitives ship in Q2/Q3. Recording so a future Q2/Q3 verdict checks them off.
      - Filed by Adversary @2026-05-28.
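
F2-4 above describes owner setup with a per-run-generated email and a 25-character alphanumeric password that is never logged. A hedged sketch of generating such run-scoped credentials with the standard `secrets` module follows. It is not the code in test_workflow_roundtrip.py: the function names and the `example.invalid` email domain are hypothetical, and the re-draw loop that guarantees a lowercase letter, an uppercase letter, and a digit is an added safeguard for apps that enforce character-class rules, not something stated in the finding.

```python
import secrets
import string
from typing import Tuple

ALPHANUMERIC = string.ascii_letters + string.digits


def make_run_password(length: int = 25) -> str:
    """Return a random alphanumeric password with mixed character classes."""
    while True:
        candidate = "".join(secrets.choice(ALPHANUMERIC) for _ in range(length))
        # Re-draw in the rare case a character class is missing entirely.
        if (
            any(c.islower() for c in candidate)
            and any(c.isupper() for c in candidate)
            and any(c.isdigit() for c in candidate)
        ):
            return candidate


def make_run_credentials(length: int = 25) -> Tuple[str, str]:
    """Return (email, password) unique to this run; never print the password."""
    suffix = secrets.token_hex(4)  # 8 hex characters
    email = f"ci-owner-{suffix}@example.invalid"  # hypothetical address format
    return email, make_run_password(length)
```

Keeping the credentials in memory for the duration of the run, and out of any assertion message, is what keeps them from reaching the logs.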