cc-ci-orchestrator/cc-ci-plan/adversary-verify-pr6.md
autonomic-bot 1f52795534 skill(ci-dev-workflow): capture the cc-ci feature-dev flow + adversary plan template
Documents the end-to-end workflow used to land the intentional-skips/4-rung-ladder
feature: explore harness → branch a local cc-ci clone → implement + unit-verify
cold on cc-ci → live full-stage check → open PR (never push main) → independent
adversary verdict → squash-merge on PASS → deploy via /root/builder-clone rebuild.
Includes the adversary-verify-pr6.md plan as a reusable template.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 03:16:47 +00:00


Adversary verification plan — cc-ci PR #6

PR: recipe-maintainers/cc-ci#6
Branch: feat/expected-na-and-tiny-functional (cc-ci repo)
Stance: disbelieve and verify. Try to make it fail. Default to REJECT unless every check below is green with evidence. You did NOT author this code; verify it independently, cold.

What the PR claims

  1. A custom-html-tiny functional test (tests/custom-html-tiny/functional/test_serves_content.py): exact-byte round-trip from the served content volume + a real 404.
  2. recipe_meta.EXPECTED_NA = {rung: reason} lists rungs a recipe intentionally skips; any essential rung skipped and NOT listed is unintentional. Skips still cap the level (never inflate).
  3. The level ladder is the FOUR essential rungs only: install · upgrade · backup_restore · functional (top = L4). integration and recipe_local are OPTIONAL — not rungs, never cap, not shown as skips. SSO must still be enforced for the run VERDICT (the sso_dep_unverified / F2-11 path in run_recipe_ci.py must be intact).
  4. results.json carries skips:{intentional:{rung:reason}, unintentional:[rung]} + level_cap_rung; the card shows INTENTIONAL/UNINTENTIONAL SKIP rows; the badge gains a third segment reading either expected or gap?.
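
To make claims 2 and 3 concrete, here is a minimal sketch of the semantics the PR describes, not the PR's actual code. The function names classify_skips and compute_level and the "pass"/"skip" status strings are hypothetical; only the four rung names and the EXPECTED_NA shape come from the claims above.

```python
from typing import Dict, List, Optional, Tuple

# The four essential rungs, in ladder order (claim 3).
RUNGS: Tuple[str, ...] = ("install", "upgrade", "backup_restore", "functional")


def classify_skips(skipped: List[str], expected_na: Dict[str, str]) -> dict:
    """Split skipped essential rungs into intentional and unintentional.

    A skipped rung is intentional only if the recipe lists it in EXPECTED_NA.
    """
    intentional = {r: expected_na[r] for r in skipped if r in expected_na}
    unintentional = [r for r in skipped if r not in expected_na]
    return {"intentional": intentional, "unintentional": unintentional}


def compute_level(status: Dict[str, str]) -> Tuple[int, Optional[str]]:
    """Return (level, level_cap_rung).

    The level is the number of consecutive passing rungs from the bottom.
    Any rung that did not pass, including a skip, caps the level there,
    so a skip can never inflate the result.
    """
    level = 0
    for rung in RUNGS:
        if status.get(rung) != "pass":  # "pass" is an assumed status string
            return level, rung
        level += 1
    return level, None


# custom-html-tiny as described in this plan: backup_restore is skipped.
status = {"install": "pass", "upgrade": "pass",
          "backup_restore": "skip", "functional": "pass"}
print(compute_level(status))  # -> (2, 'backup_restore')
```

Under these assumptions a passing functional rung above a skipped backup_restore rung does not raise the level, which is the behavior step 4 below checks for (level == 2).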

Verification steps (run on the cc-ci host; creds in /srv/cc-ci/.testenv)

  1. Fresh independent checkout of the PR head (do NOT reuse my working dirs): git clone … cc-ci, then git checkout feat/expected-na-and-tiny-functional. Record the HEAD sha.
  2. Full unit suite cold (not just the touched files): cc-ci-run -m pytest tests/unit/ -q. ALL must pass. A single failure → REJECT. Capture the count + any failure.
  3. Diff regression review: git diff origin/main...HEAD. Confirm: (a) level.py RUNGS is the 4 essential rungs; (b) derive_rungs no longer emits integration/recipe_local; (c) the SSO VERDICT logic in run_recipe_ci.py (sso_dep_unverified, the F2-11 fail-the-run block) is UNCHANGED — the PR must not have weakened SSO enforcement; (d) no test was weakened/skipped; (e) no secrets.
  4. Live end-to-end harness run, FULL stages on the real CI server: RECIPE=custom-html-tiny STAGES=install,upgrade,backup,restore,custom cc-ci-run runner/run_recipe_ci.py (set CCCI_RUNS_DIR to a temp dir; source .testenv). Then read its results.json and assert:
    • install PASS and upgrade PASS: the upgrade tier MUST actually run and pass (essential rung; prove it's not silently skipped). The custom tier (functional serve test) must also PASS.
    • level == 2, level_cap_rung == "backup_restore", level_cap_reason mentions L3 backup/restore.
    • rungs has exactly install/upgrade/backup_restore/functional — no integration/recipe_local.
    • skips.intentional == {"backup_restore": <reason>}, skips.unintentional == [].
    • backup_restore is N/A because there is no backupbot.backup label (a real intentional skip, NOT a masked failure) — confirm by inspecting the recipe compose.
    • badge.svg contains the muted expected third segment (not gap?).
    • The summary card renders and shows a green INTENTIONAL SKIP row for backup/restore with the reason.
  5. Non-vacuity of the functional test: confirm it asserts the exact random bytes round-trip and a 404 on a random path (so a 200-everything stub would fail it) — read the test; reason about whether it could pass against a broken server. Bonus: confirm it writes via the volume mountpoint (the SWS image is shell-less).
  6. Teardown + hygiene: after the run, confirm the harness left NO orphan stack/volume/container for custom-html-tiny (deploy-count == 1; clean teardown). Confirm no secret value appears in results.json.
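
The results.json assertions in step 4 can be scripted so the evidence is reproducible. The sketch below is an illustration under stated assumptions: check_results and load_and_check are hypothetical helpers, and rungs is assumed to be either a mapping keyed by rung name or a plain list of rung names. Only the field names and expected values come from the steps above.

```python
import json
from pathlib import Path
from typing import List

EXPECTED_RUNGS = {"install", "upgrade", "backup_restore", "functional"}


def check_results(results: dict) -> List[str]:
    """Return human-readable failures; an empty list means every check held."""
    failures = []

    if results.get("level") != 2:
        failures.append(f"level is {results.get('level')!r}, expected 2")

    if results.get("level_cap_rung") != "backup_restore":
        failures.append("level_cap_rung is not 'backup_restore'")

    # Assumption: rungs is a dict keyed by rung name, or a list of names.
    rungs = results.get("rungs", {})
    if set(rungs) != EXPECTED_RUNGS:
        failures.append(f"unexpected rung set: {sorted(rungs)}")

    skips = results.get("skips", {})
    intentional = skips.get("intentional", {})
    if set(intentional) != {"backup_restore"} or not intentional.get("backup_restore"):
        failures.append("skips.intentional must be {'backup_restore': <reason>}")

    if skips.get("unintentional") != []:
        failures.append("skips.unintentional must be an empty list")

    return failures


def load_and_check(path: str) -> List[str]:
    """Read a results.json file (path supplied by the caller) and check it."""
    return check_results(json.loads(Path(path).read_text()))
```

A non-empty return value is grounds for REJECT. The tier verdicts, the badge segment, the summary card, and the teardown state still need to be inspected separately, since their on-disk format is not specified here.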

Verdict

Return a clear PASS or REJECT with the evidence for each numbered step (the HEAD sha, the unit count, the key results.json fields, the upgrade-tier verdict, the teardown state). REJECT if ANYTHING is unproven. Do not merge — the orchestrator merges on your PASS.
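
One way to keep the verdict honest is to derive it mechanically from per-step evidence. The following is a minimal sketch, assuming a hypothetical StepResult record and verdict helper; it encodes the rule that any failed, missing, or evidence-free step means REJECT.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class StepResult:
    step: int        # numbered verification step, 1 through 6
    name: str
    passed: bool
    evidence: str    # e.g. HEAD sha, unit count, results.json excerpt


def verdict(steps: List[StepResult],
            required_steps: Iterable[int] = range(1, 7)) -> str:
    """PASS only if every required step is present, green, and evidenced."""
    proven = set()
    for s in steps:
        # A failed step, or a green step without evidence, is unproven.
        if not s.passed or not s.evidence.strip():
            return "REJECT"
        proven.add(s.step)
    return "PASS" if proven >= set(required_steps) else "REJECT"
```

Defaulting to REJECT when a step is simply absent matches the stance at the top of this plan: nothing counts as verified until it is shown.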