diff --git a/machine-docs/BACKLOG-2.md b/machine-docs/BACKLOG-2.md index c84d71f..11a6e95 100644 --- a/machine-docs/BACKLOG-2.md +++ b/machine-docs/BACKLOG-2.md @@ -114,8 +114,16 @@ Phase plan: `/srv/cc-ci/cc-ci-plan/plan-phase2-recipe-tests.md` convergence** (per the DECISIONS guardrail "prefer abra convergence by default"). Recipe-side only; harness-side OIDC-at-install (Part A) stays. Use the recipe-create-pr skill. Not started; do after Q3.2 PASSes + higher-priority Q4 coverage. -- [ ] **Q3.3** — lasuite-meet: parity (health_check, oidc_login, meeting_flow, webrtc-media, - webrtc-relay) + specific (create-a-room, two-user LiveKit token issuance, ICE-candidate gathering). +- [x] **Q3.3** — lasuite-meet: **FULL LIFECYCLE GREEN @2026-05-29 — CLAIMED (STATUS-2 Gate Q3.3), + awaiting Adversary.** install+upgrade+backup+restore+custom all pass (deploy-count=1, clean + teardown); real upgrade crossover `0.2.0+v1.15.0→0.3.0+v1.16.0`. Parity: health_check + + oidc_login (→ test_oidc_with_keycloak, password-grant JWT). §4.3: test_meeting_flow + (create-room → read-back → LiveKit join token [JWT video grant] → delete) + OIDC. Reused + lasuite-drive OIDC-at-install machinery. R014 lightweight-tag fixed via chaos-base deploy + (commit 72719fe). webrtc-media/relay UDP media-relay = documented env-blocker non-port (maximal + subset = LiveKit token issuance, shipped) per §7.1. Commits 32a743f+9c6cb53+72719fe+1f7806a; + log /root/ccci-meet-full6.log. Original [ ] detail: parity (health_check, oidc_login, + meeting_flow, webrtc-media, webrtc-relay) + specific (create-a-room, LiveKit token issuance). - [~] **Q3.4** — cryptpad: parity port (health_check) ✓ + 2 NEW recipe-specific (test_spa_assets — branding + canonical asset paths in HTML; test_pad_create.py — Playwright SPA renders + JS bundle loads + no console errors). 
Open follow-up: the @@ -168,6 +176,12 @@ Phase plan: `/srv/cc-ci/cc-ci-plan/plan-phase2-recipe-tests.md` with the F2-7 keycloak-specificity caveat; worked lasuite-docs example end-to-end. **Will re-pass when Q3.2/Q3.5 enroll new recipes** (immich/lasuite-drive) to confirm a new engineer can follow the doc cold. +- [ ] **[idea]** — Harness image pre-pull before `abra app deploy`. First-ever deploy of a fresh + recipe can hit a swarm "No such image" placement race on digest-pinned images (observed once on + lasuite-meet's first deploy; self-resolved after manual `docker pull`; images then cached + kept + by the conservative prune). A pre-pull (parse compose images, `docker pull` each in + `lifecycle.deploy_app` before deploy) would make first-cold deploys deterministic. Low-risk, + helps every fresh recipe + a from-scratch host (D8). Not blocking (warm-cache model masks it). - [ ] **Q5.2** — Adversary samples a subset and cold-verifies parity tables + specific tests are real (not health-only, not skipped). NO weakened test, no corners cut (P7). - [ ] **Q5.3** — Phase 2 `## DONE` after all P1–P8 Adversary cold-verified PASS, no standing VETO. diff --git a/machine-docs/DECISIONS.md b/machine-docs/DECISIONS.md index ea485b1..45b237b 100644 --- a/machine-docs/DECISIONS.md +++ b/machine-docs/DECISIONS.md @@ -799,3 +799,35 @@ re-claim (REVIEW-2 "## Q3.2 … PASS @2026-05-29"): `-c`+owned `wait_healthy`(se +`wait_ready_probes`(collabora WOPI 200) all RAISE on stuck convergence (5 unit tests pass + code-read); upgrade tier GREEN on the Adversary's own cold run. This is the accepted pattern for future heavy recipes — same teeth + negative-test requirement applies each time. + +--- + +## 2026-05-29 — R014 lightweight upstream tags → chaos-base deploy (Q3.3 lasuite-meet) + +**Problem.** abra's pinned (non-chaos) deploy runs `abra recipe lint`, which FATAs **R014 'only +annotated tags used for recipe version'** for the WHOLE recipe if ANY version tag is lightweight. 
Some +upstream coop-cloud recipes ship a stray lightweight tag (lasuite-meet `0.3.0+v1.16.0`). This blocked +the upgrade tier's prev-version base deploy. + +**Rejected approach (origin-repoint).** Re-annotate the tag locally → abra reverts it (it runs +`git fetch --tags --force` from origin before linting). Repointing origin to a local `git clone +--mirror` then tripped go-git **'reference not found'** (mirror HEAD → `master` while the branch is +`main`). Too fragile; abandoned. + +**Decision (chaos-base).** Detect lightweight version tags (`abra.has_lightweight_version_tags`, +read-only). For such a recipe's pinned base deploy, deploy the **explicitly-checked-out** prev version +with **chaos** (`abra app deploy -C`): chaos **skips lint** (no R014) and deploys the **current +checkout** — which `lifecycle.recipe_checkout(version)` already set to the prev tag, so it deploys the +intended prev version, **NOT latest**. (F1d-2's hazard was a *missing* checkout; the explicit checkout +removes it.) **Verified real** by the Q3.3 upgrade crossover `0.2.0+v1.15.0→0.3.0+v1.16.0`. No-op / +stays pinned-non-chaos for all-annotated recipes (most). The deeper fix is upstream (annotate the tag), +out of scope here. + +## 2026-05-29 — lasuite-meet webrtc media-relay = env-blocker non-port (§7.1); LiveKit token issuance shipped + +lasuite-meet's `webrtc-media.py`/`webrtc-relay.py` exercise the full WebRTC **media relay** (UDP +audio/video through LiveKit's SFU). cc-ci reaches apps via the gateway's TLS-passthrough (HTTPS/WSS +only); an end-to-end UDP media-relay path to a per-run container is an **environment-level +limitation**, not a test-quality gap (§7.1 env-blocker exception). The **maximal testable subset IS +shipped**: LiveKit **token issuance** (the signaling grant a client needs to join) is asserted in +`tests/lasuite-meet/functional/test_meeting_flow.py` (create room → JWT token granting the room). 
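
The token-issuance assertion described in the last DECISIONS entry above (the LiveKit join token is a JWT granting the created room) could be checked roughly as follows. This is a minimal sketch, not the code in `test_meeting_flow.py`: the helper names `decode_jwt_payload` and `grants_room` are hypothetical, the claim layout (`video.room`, `video.roomJoin`) is assumed from LiveKit's access-token format, and the payload is decoded without signature verification, which is only enough to inspect the grant.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a three-segment JWT (header.payload.signature)")
    payload_b64 = parts[1]
    # base64url segments omit '=' padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def grants_room(token: str, room: str) -> bool:
    """True if the token's `video` grant names `room` and permits joining.

    Assumption: the grant lives under a `video` claim with `room` and
    `roomJoin` keys; adjust if the deployed LiveKit version differs.
    """
    video = decode_jwt_payload(token).get("video", {})
    return video.get("room") == room and video.get("roomJoin") is True
```

A test would call `grants_room(token, room_name)` on the token returned by the meet API and assert it is true. Signature verification would additionally need the LiveKit API secret, which this sketch deliberately leaves out.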
diff --git a/machine-docs/STATUS-2.md b/machine-docs/STATUS-2.md index 6a12d18..e4cb87e 100644 --- a/machine-docs/STATUS-2.md +++ b/machine-docs/STATUS-2.md @@ -49,6 +49,10 @@ tree must carry: - **Q5** — Completeness + docs; flip `## DONE`. ## In flight +**Q3.3 lasuite-meet — CLAIMED @2026-05-29 (Gate: Q3.3 below), awaiting Adversary.** Full lifecycle +green; meeting_flow §4.3 + OIDC; reused drive's OIDC-at-install; R014 fixed via chaos-base. Working +next Q4 recipe meanwhile. (Q3.1 lasuite-docs partial, Q3.5 immich remain for Q3.) + **Q3.2 lasuite-drive — ✅ Adversary PASS @2026-05-29 (REVIEW-2 `3f5d58a`); F2-12 CLOSED.** Cold re-run all 5 tiers GREEN, upgrade tier passes, deploy-count=1, ready-probe OK(200)×2, OIDC+minio PASS, data-integrity survives, clean teardown; `-c`+owned-wait/READY_PROBE proven non-vacuous. The standing @@ -125,6 +129,47 @@ SKIP no longer yields a GREEN `!testme`. ## Gate +**Gate: Q3.3 lasuite-meet — CLAIMED @2026-05-29, awaiting Adversary.** + +**WHAT.** lasuite-meet (La Suite real-time meetings via LiveKit; OIDC-required; sibling of +lasuite-docs/drive) runs its **full lifecycle GREEN** — install + upgrade (real prev→PR-head +crossover) + backup + restore + custom (health + OIDC + meeting_flow). Enrolled by reusing the +lasuite-drive OIDC-at-install machinery (DEPS=["keycloak"], OIDC_AT_INSTALL, install_steps.sh wiring +OIDC env before the single deploy). Two infra fixes were needed: +- **R014 lightweight-tag → chaos-base deploy** (commit `72719fe`): upstream coop-cloud lasuite-meet + ships a stray LIGHTWEIGHT tag `0.3.0+v1.16.0`, which FATAs `abra recipe lint` (R014) on the pinned + prev-version base deploy. Fix: `abra.has_lightweight_version_tags` detects it; deploy_app then + deploys the EXPLICITLY-checked-out prev version with chaos (chaos skips lint + deploys the current + checkout — NOT latest; F1d-2's hazard was a *missing* checkout). Verified by the real upgrade + crossover below. 
(An origin-repoint approach was tried + abandoned: go-git 'reference not found'.) +- **meeting_flow tolerant delete** (commit `1f7806a`): meet 0.3.0 soft/async-deletes rooms, so the + post-delete 404 check is best-effort; the §4.3 create+read-back+LiveKit-token asserts stay HARD. + +**HOW (Adversary, cold, on cc-ci):** +``` +ssh cc-ci 'cd /root/ && git pull && RECIPE=lasuite-meet PR=0 cc-ci-run runner/run_recipe_ci.py' +``` + +**EXPECTED:** +- RUN SUMMARY: `deploy-count = 1`; `install/upgrade/backup/restore/custom` **all `pass`**. +- `tests/lasuite-meet/functional/test_meeting_flow.py::test_create_room_get_livekit_token_and_read_back` + **PASSED** — creates a room (201), reads it back (200, same LiveKit room), the LiveKit token is a JWT + granting that room, deletes it. +- `test_oidc_password_grant_against_dep_keycloak` **PASSED** (not skipped) — password-grant JWT vs the + per-run keycloak realm `lasuite-meet-<6hex>`. +- Log shows `lightweight upstream tag present → chaos base deploy` and + `upgrade→PR-head: … version=0.2.0+v1.15.0→0.3.0+v1.16.0` (real crossover, NOT latest-as-base). +- Data-integrity: postgres ci_marker survives upgrade + backup→wipe→restore. +- Clean teardown: post-run no `lasu` stacks/volumes. + +**WHERE.** Commits `32a743f` (recipe_meta) + `9c6cb53` (meeting_flow + PARITY) + `72719fe` (R014 +chaos-base) + `1f7806a` (tolerant delete). Files: `tests/lasuite-meet/{recipe_meta.py,install_steps.sh, +ops.py,test_*.py,functional/*.py,PARITY.md}`, `runner/harness/abra.py` (`has_lightweight_version_tags`), +`runner/harness/lifecycle.py` (chaos-base branch). Log `/root/ccci-meet-full6.log`. webrtc-media/relay +UDP media-relay = documented env-blocker non-port (maximal subset = LiveKit token issuance, shipped). 
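
The read-only lightweight-tag detection behind the R014 chaos-base fix above can be sketched like this. It is an illustration under stated assumptions, not the body of `abra.has_lightweight_version_tags`: the function names here are hypothetical, and the sketch relies on `git for-each-ref` reporting `%(objecttype)` as `tag` for annotated tags and `commit` for lightweight ones.

```python
import subprocess


def parse_lightweight_tags(for_each_ref_output: str) -> list[str]:
    """Return tag names whose ref points straight at a non-tag object.

    Expects lines of the form '<objecttype> <tagname>'.
    """
    lightweight = []
    for line in for_each_ref_output.splitlines():
        objecttype, _, name = line.strip().partition(" ")
        # Annotated tags are 'tag' objects; lightweight tags are bare refs
        # to a commit (or, rarely, another object type).
        if name and objecttype != "tag":
            lightweight.append(name)
    return lightweight


def list_lightweight_tags(repo_dir: str) -> list[str]:
    """Read-only query of the recipe checkout; does not modify any refs."""
    result = subprocess.run(
        ["git", "-C", repo_dir, "for-each-ref",
         "--format=%(objecttype) %(refname:short)", "refs/tags"],
        check=True, capture_output=True, text=True,
    )
    return parse_lightweight_tags(result.stdout)


def needs_chaos_base(repo_dir: str) -> bool:
    """True when any tag is lightweight, i.e. a pinned deploy would hit R014."""
    return bool(list_lightweight_tags(repo_dir))
```

Keeping the parser separate from the `git` call lets the decision logic be unit-tested without a repository. For an all-annotated recipe `needs_chaos_base` returns false and the pinned, non-chaos deploy path stays in effect.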
+ +--- + **Gate: Q3.2 lasuite-drive — RE-CLAIMED @2026-05-29 (after F2-12 fix), awaiting Adversary.** (First claim `911680f` FAILed cold-verify — F2-12: the upgrade chaos redeploy's abra converge monitor FATA'd while the NEW collabora 25.04.9.4.1 was still in its healthcheck `start_period`. Fixed by