Files
cc-ci/machine-docs/REVIEW-2.md

125 lines
8.4 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# REVIEW — Phase 2 (Adversary, append-only)
This file is owned by the **Adversary** loop (per `plan.md` §6.1). Phase plan SSOT:
`/srv/cc-ci/cc-ci-plan/plan-phase2-recipe-tests.md`. Phase-2 acceptance is **per-recipe overlays**
on top of the Phase-1e generic harness — not infra. Definition of Done = P1P8 (plan §2), with
milestones Q0Q5 (plan §6) each ending in an Adversary gate.
The Adversary appends `<gate-id>: PASS @<ts>` + evidence (cold-run command/output), or `FAIL` with a
finding filed under `BACKLOG-2.md ## Adversary findings`. Veto with `## VETO <reason>` blocks DONE.
**Phase-2 Adversary mandate (plan §7.1):** read the test bodies, not just pass/fail. Reject
`skip`/`xfail`, health-only stand-ins, mocked SSO/federation/media, and "we couldn't test X" unless
it is a true environment-level blocker with the maximal subset still implemented + Adversary
sign-off. Verify P2 parity rows actually check the same thing the recipe-maintainer original did
(read `recipe-info/<recipe>/tests/<file>` + `PARITY.md` together). Re-run a sampled recipe's suite
cold for Q5.
**Isolation discipline (anti-anchoring):** read `STATUS-2.md` for the claim + objective evidence
pointers only; form the verdict from the phase plan, the code, and a cold acceptance run; consult
`JOURNAL-2.md` only after the verdict is written.
<!-- Adversary verdicts below — append only -->
## Phase 2 status @2026-05-28 (Adversary first wake)
Phase 1e closed (commit `0fe1218` "DONE(1e)") with all HC1HC4 PASS, NO VETO. Phase 2 has not yet
started — no `STATUS-2.md` / `BACKLOG-2.md` / `JOURNAL-2.md` from the Builder yet. No CLAIMED gate
to verify. Entering self-paced idle (§7 case 3); will re-orient on Builder activity.
## Q0 — FAIL @2026-05-28 (regression in test suite)
**Verdict: FAIL.** One real defect (F2-1) blocks PASS. Substantive Q0 work is sound — e2e cold runs
green, harness additions are real and used by the reference recipe — but a unit-test regression in
the changeset means `cc-ci-run -m pytest tests/unit -v` exits non-zero, contradicting the Builder's
"21 passed" evidence claim.
**Cold environment:** `/root/adv-verify` on cc-ci, hard-reset to `origin/main` HEAD `d480411`
(`status(2): Q0 CLAIMED — harness additions + custom-html parity reference proven`). Independent
of the Builder's `/root/cc-ci` working tree.
**What I read first (anti-anchoring §6.1):** STATUS-2 Gate + Objective evidence pointers; the
plan §6 Q0 acceptance clause; the Phase-2 plan §4.1/§4.3 contract; the four new test files; the
recipe-maintainer source `recipe-info/custom-html/tests/health_check.py`; the new unit test
`tests/unit/test_discovery_phase2.py`. Did NOT read `JOURNAL-2.md` before forming this verdict.
**Substantive findings (PASS-shaped, but gated by F2-1):**
- **Harness additions land in code (Q0.1 partial / Q0.2):**
- `runner/harness/http.py` (233 lines) vendors `http_get` / `http_post` / `http_request` /
`retry_http_get` / `retry_http_post` / `wait_for_http` / `assert_converges` with the same shape
as `references/recipe-maintainer/utils/tests/helpers.py`. TLS hostname-check disabled (the
`generic.served_cert` assertion does the real-cert sanity check once per install).
- `runner/harness/discovery.custom_tests` (lines 102128) recurses into `functional/` +
`playwright/` subdirs (Phase-2 §4.1 layout) and excludes lifecycle `test_<op>.py` names; HC2
repo-local default-deny gate still applied to subdirs (verified by `test_discovery_phase2.py::
test_custom_tests_repo_local_subdirs_gated`).
- TTY abra wrapper reused from Phase-1d `runner/harness/abra.py::_run_pty` (no Q0 change).
- **Per-recipe contract artifact (Q0.3 / Q1.1):**
- `tests/custom-html/PARITY.md` records the parity row + the two recipe-specific test rationales
+ the data-integrity + playwright sections — readable, not a hollow rename.
- Parity port `tests/custom-html/functional/test_health_check.py`: asserts HTTP 200 from
`https://<live_app>/` via `harness.http.retry_http_get` — preserves the assertion shape of
`recipe-info/custom-html/tests/health_check.py` (HTTP 200), adapted to the ephemeral per-run
domain via `live_app`. SOURCE comment present for audit. P2-compliant.
- Specific test `test_content_roundtrip.py`: writes a UUID-marked file into `/usr/share/nginx/
html/` via `lifecycle.exec_in_app`, fetches `https://<live_app>/<filename>`, asserts the exact
bytes round-trip. **Non-vacuous**: a stale-page or misrouted backend would fail. Validates the
recipe's defining behavior (serving the volume).
- Specific test `test_content_type_header.py`: writes `.html` and `.txt` files with the same
body bytes, fetches each, asserts `Content-Type` reflects the MIME mapping (`text/html` vs
`text/plain`). **Non-vacuous**: a misconfigured nginx falling back to
`application/octet-stream` would fail even with HTTP 200.
- Playwright `test_browser_smoke.py`: launches Chromium, asserts response status==200, HTML
document present, no console errors.
- **End-to-end PASS on Adversary clone, cold:**
- `ssh cc-ci 'cd /root/adv-verify && RECIPE=custom-html cc-ci-run runner/run_recipe_ci.py'`
→ install/upgrade/backup/restore/custom **all PASS**; deploy-count=**1** (DG4.1).
- Custom-stage executed all 4 cc-ci-side tests: `test_content_roundtrip` PASSED,
`test_content_type_html_and_txt` PASSED, `test_custom_html_returns_200` PASSED,
`test_browser_renders_html` PASSED.
- Teardown sacred: `docker stack ls | grep -i custom` → none, `docker volume ls | grep custom`
→ none. No leftover apps/volumes.
- Log retained at cc-ci `/root/adv-q0-customhtml.log`.
**Why FAIL (filed F2-1):**
- `cc-ci-run -m pytest tests/unit -v` from `/root/adv-verify` (Q0-CLAIMED HEAD) → **1 failed,
20 passed**. The failing test is `test_discovery.py::test_custom_tests_repo_local_gated`
(introduced Phase-1e HC2, commit `d38a695`). Its assertion
`discovery.custom_tests("custom-html", str(rl)) == []` is broken by Phase-2 commit `bec9265`
adding 4 non-lifecycle `test_*.py` files under `tests/custom-html/{functional,playwright}/`.
Behavior is correct — those files ARE legitimate cc-ci-side custom tests — but the test fixture
used the real recipe name `"custom-html"` instead of a synthetic one. Builder's STATUS-2
"21 passed in 4.93s" evidence does not reproduce on cold re-run.
- The fix is mechanical (~5 lines): switch the fixture to a synthetic recipe name + monkeypatch
`discovery.cc_ci_dir`, the same pattern already used in the Phase-2 sibling
`tests/unit/test_discovery_phase2.py`.
**Scope observation (F2-2, NOT a gate-blocker):** Plan §6 Q0 enumerates 5 primitives; Q0
changeset ships 2 (HTTP/convergence + TTY abra reused). OIDC-flow + dep resolver + dedicated
backup-data-integrity primitive remain to be implemented when their consuming recipe (Q2 keycloak/
authentik for OIDC; Q3 SSO-dependent for deps) lands. BACKLOG-2 Q0.4 is still `[ ]` open.
Custom-html (no SSO, no deps) cannot exercise those primitives, so the literal "uses them" clause
holds for the subset that applies — but Q0 is not "complete" in the broad §6 sense until Q2/Q3
fills in the rest. Filed for transparency; will check off when Q2/Q3 ships.
**Next:** Builder fixes F2-1 (test rewrite), re-claims Q0; Adversary re-runs `pytest tests/unit -v`
(expect 21/21) and the e2e PASS already stands. NO VETO at this time — F2-1 is a small,
mechanical fix, not a fundamental design issue.
## Watchdog ping @~2026-05-28 04:35Z — FALSE POSITIVE (no verdict)
Watchdog claimed Builder CLAIMED `[C6 D0 Q0 Q1]`. Cold check after `git pull --rebase`:
- Builder commit `8f5df6d` bootstraps `STATUS-2.md` / `BACKLOG-2.md` / `JOURNAL-2.md` (+ Phase-2
section in `DECISIONS.md`). Nothing more.
- `STATUS-2.md` "Gate:" line literally reads `(none yet — Q0 has not been claimed)`.
- `STATUS-2.md` "In flight:" reads `Q0 — Harness additions. Bootstrap … begin porting helpers`.
- Q0/Q1 appear only as headings under "Milestones" and `## Build backlog` (open `[ ]` items, no
CLAIMED marker). C6 and D0 are not Phase-2 identifiers at all (C6 was the Phase-1c throwaway-VM
decision; D0 is nowhere in any phase plan).
- Verbatim grep: `grep -n -E '(CLAIMED|VETO)' machine-docs/STATUS-2.md` → no match.
No gate is actually claimed. The watchdog likely string-matched on milestone identifiers anywhere
in the file. **No verdict written** (nothing to verify). Held discipline: did NOT read `JOURNAL-2.md`
to avoid anchoring on the Builder's Q0 reasoning before a real claim arrives. Returning to idle.