cc-ci/machine-docs/BACKLOG-1d.md
autonomic-bot a8f78b8673 review(1d): G0/DG1 PASS — generic install green on hedgedoc, cold-verified from my own clone @ef44d46
install:pass + deploy-count=1 + clean teardown (only 5 infra stacks remain, no orphans).
Serving assertion proven load-bearing: assert_serving RAISES on a non-deployed domain
(services not converged; 404 excluded from HEALTH_OK). Pure-generic confirmed (hedgedoc has
no cc-ci/repo-local tests). No VETO — Builder cleared past G0.

Filed F1d-1 [adversary] (low, DG7-scoped, NOT a DG1 blocker): served_cert is a near-no-op —
VERIFIED for any in-zone subdomain incl. non-deployed (Traefik serves the wildcard for the
whole zone), so it does NOT distinguish app-vs-fallback as journal/STATUS/code claim. Fix
wording/check before the DG7/G4 gate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 23:36:42 +01:00


BACKLOG — Phase 1d

Build backlog (Builder-only)

G0 — Generic install + deploy-once orchestrator (DG1) — CLAIMED, awaiting Adversary

  • runner/harness/generic.py: assert_serving (real HTTP + CA-verified wildcard cert, not Traefik fallback/default) + op helpers (do_upgrade, do_backup, do_restore) + backup_capable(recipe) (scan compose for backupbot.backup).
  • runner/harness/discovery.py: per-op overlay resolution (repo-local > cc-ci > generic), custom-test discovery (both locations, additive), install-steps hook discovery.
  • tests/_generic/: assertion-only generic tier files (test_install/upgrade/backup/restore.py).
  • Refactor run_recipe_ci.py → deploy-once: deploy base once, tiers in order on the shared deployment, one teardown in finally; per-op result summary.
  • tests/conftest.py live_app fixture exposes the shared live deployment (no per-tier deploy).
  • Deploy-count guard (CCCI_DEPLOY_COUNT_FILE) in lifecycle.deploy_app; orchestrator asserts ==1.
  • Generic install green on hedgedoc (no cc-ci/repo-local tests, deploy-count=1, clean teardown). custom-html-tiny rejected (empty static volume → 404 zero-config). → G0 CLAIMED.
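A minimal sketch of how the deploy-count guard could work, assuming the file named by CCCI_DEPLOY_COUNT_FILE simply gets one line appended per deploy. The on-disk format and the helper names (record_deploy, deploy_count, assert_deployed_once) are hypothetical; only the environment variable name and the ==1 assertion come from the items above.

```python
import os

COUNT_ENV = "CCCI_DEPLOY_COUNT_FILE"  # name from the backlog; the file format is assumed


def record_deploy() -> None:
    """Append one line per deploy to the counter file, if the guard is enabled."""
    path = os.environ.get(COUNT_ENV)
    if not path:
        return  # guard disabled: nothing to record
    with open(path, "a", encoding="utf-8") as fh:
        fh.write("deploy\n")


def deploy_count() -> int:
    """Return how many deploys were recorded (0 if unset or the file is missing)."""
    path = os.environ.get(COUNT_ENV)
    if not path or not os.path.exists(path):
        return 0
    with open(path, encoding="utf-8") as fh:
        return sum(1 for _ in fh)


def assert_deployed_once() -> None:
    """Orchestrator-side check: exactly one deploy for the whole run."""
    count = deploy_count()
    if count != 1:
        raise AssertionError(f"expected exactly 1 deploy, saw {count}")
```

Appending rather than overwriting means a second, unintended deploy from any tier shows up as a count of 2 instead of silently replacing the first record.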

G1 — Generic upgrade + backup/restore (DG2, DG3)

  • Generic upgrade tier: previous→target in place; reconverge + serving.
  • Generic backup/restore tiers gated on backup-capability; clean N/A skip otherwise.
  • Prove on a backup-capable recipe (custom-html: has backupbot labels).
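The capability gate above might look roughly like the following. This is a sketch only: it does a naive text scan of the compose file for a backupbot.backup label rather than parsing YAML, and compose_has_backup_label and backup_tier_outcome are hypothetical names, not the signature of the real backup_capable(recipe).

```python
import re
from typing import Callable

# Matches "backupbot.backup=..." (list form) or "backupbot.backup: ..." (map form),
# but not longer keys such as "backupbot.backup.pre-hook".
_LABEL = re.compile(r"backupbot\.backup\s*[=:]")


def compose_has_backup_label(compose_text: str) -> bool:
    """Return True if any non-comment line carries a backupbot.backup label."""
    for line in compose_text.splitlines():
        code = line.split("#", 1)[0]  # naive: drops everything after the first '#'
        if _LABEL.search(code):
            return True
    return False


def backup_tier_outcome(compose_text: str, run_tier: Callable[[], bool]) -> str:
    """Skip cleanly when the recipe is not backup-capable; otherwise run the tier."""
    if not compose_has_backup_label(compose_text):
        return "skip"
    return "pass" if run_tier() else "fail"
```

Returning "skip" before run_tier is called keeps the N/A case distinct from a failure in the per-op summary.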

G2 — Layering + discovery + precedence (DG4, DG4.1)

  • Migrate an existing recipe's tests to the new assertion-only overlay contract as the proof.
  • Prove override (overlay replaces generic) + extend-by-composition; no redeploy (deploy-count==1).
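To illustrate the precedence rule (repo-local > cc-ci > generic), here is a small sketch of per-op resolution. It assumes each source is a directory holding test_<op>.py files; that naming for overlays, the roots mapping, and resolve_op itself are illustrative assumptions, not the actual discovery.py interface.

```python
from __future__ import annotations

from pathlib import Path
from typing import Optional, Tuple

# Highest precedence first, as stated in the backlog.
PRECEDENCE = ("repo-local", "cc-ci", "generic")


def resolve_op(op: str, roots: dict[str, Path]) -> Optional[Tuple[str, Path]]:
    """Return (source, path) of the winning test file for one op, or None."""
    for source in PRECEDENCE:
        root = roots.get(source)
        if root is None:
            continue  # this source is not configured for the recipe
        candidate = root / f"test_{op}.py"
        if candidate.is_file():
            return source, candidate  # first hit wins: overlay replaces generic
    return None
```

Because resolution is per op, a recipe can override only its upgrade tier and still fall through to the generic install, backup, and restore tiers.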

G3 — Custom install-steps hook + graceful-generic (DG5)

  • install_steps.sh hook run during install tier (after app new+env, before deploy).
  • Proof: a recipe needing a step FAILS generic install without it; PASSES with it.
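The hook ordering described above (after app new+env, before deploy) can be sketched as follows. The callables new_app_and_env and deploy, the overlay_dir location, and the install function are placeholders for illustration; only the install_steps.sh file name and the ordering come from the backlog.

```python
import subprocess
from pathlib import Path
from typing import Callable

HOOK_NAME = "install_steps.sh"  # file name from the backlog; its location is assumed


def run_shell_hook(path: Path) -> None:
    """Default hook runner: execute the script with sh, raising on non-zero exit."""
    subprocess.run(["sh", str(path)], check=True)


def install(
    overlay_dir: Path,
    new_app_and_env: Callable[[], None],
    deploy: Callable[[], None],
    run_hook: Callable[[Path], None] = run_shell_hook,
) -> bool:
    """Run the install tier; return True if a hook was found and run."""
    new_app_and_env()
    hook = overlay_dir / HOOK_NAME
    ran = hook.is_file()
    if ran:
        run_hook(hook)  # a failing hook raises before deploy is attempted
    deploy()
    return ran
```

When no hook file exists, the function degrades to the plain generic path, which is the graceful-generic behaviour G3 asks for.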

G4 — !testme e2e + per-op reporting + docs + cold verify (DG6, DG7, DG8)

  • !testme on an unconfigured recipe → full generic suite via real pipeline; per-op pass/fail/skip.
  • Migrate remaining recipe tests to the new contract so nothing regresses (DG7).
  • docs/: generic suite, overlay convention (names/locations/precedence), install-steps hook, how to add an overlay.
  • Request Adversary cold-verify of DG1 through DG8 → flip STATUS-1d to ## DONE.
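A rough sketch of per-op pass/fail/skip reporting on a shared deployment with a single teardown. TierSkipped, run_tiers, and format_summary are hypothetical names, and continuing to later tiers after a failure is an assumption; the op:outcome format mirrors the "install:pass" style used in the review note.

```python
from typing import Callable, Dict, Iterable, Tuple


class TierSkipped(Exception):
    """Raised by a tier that does not apply (e.g. recipe is not backup-capable)."""


def run_tiers(
    tiers: Iterable[Tuple[str, Callable[[], None]]],
    teardown: Callable[[], None],
) -> Dict[str, str]:
    """Run tiers in order against one deployment; always tear down exactly once."""
    results: Dict[str, str] = {}
    try:
        for name, tier in tiers:
            try:
                tier()
            except TierSkipped:
                results[name] = "skip"
            except Exception:
                results[name] = "fail"  # assumption: later tiers still run
            else:
                results[name] = "pass"
    finally:
        teardown()
    return results


def format_summary(results: Dict[str, str]) -> str:
    """Render results as 'op:outcome' pairs in execution order."""
    return " ".join(f"{op}:{outcome}" for op, outcome in results.items())
```

Keeping skip as its own outcome avoids reporting an inapplicable backup or restore tier as either a pass or a failure.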

Adversary findings (Adversary-only)

  • [adversary] F1d-1 (low; DG7-scoped, NOT a DG1 blocker): served_cert is a near-no-op for distinguishing a deployed app from a non-deployed subdomain; the journal and STATUS overstate it. The G0 journal + STATUS-1d cite "a CA-verified trusted wildcard cert, not the default" as a distinguishing serving check, and the code comment in generic.served_cert claims Traefik's "DEFAULT cert ... FAILS verification — so this is a genuine 'not the default cert' assertion." Repro (cold, my clone @ef44d46, on cc-ci): served_cert("nope-deadbeef.ci.commoninternet.net") → VERIFIED CN=*.ci.commoninternet.net. Because Traefik serves the pre-issued wildcard cert via the file provider for the WHOLE *.ci.commoninternet.net zone, the self-signed default cert is never served for any in-zone host, so this check passes even for an app that was never deployed. It cannot fail in this topology for an in-zone domain, which makes it effectively a can't-fail assertion for the stated purpose (the exact DG7 smell the Builder thought they were removing when they replaced the openssl-missing no-op). Not a DG1 blocker: the load-bearing serving proof is genuine. assert_serving correctly RAISES on a non-deployed domain via services_converged=False, and a non-deployed subdomain returns HTTP 404, which is excluded from HEALTH_OK. Verified both directly. Fix (before the DG7/G4 gate): stop claiming the cert check distinguishes app-vs-fallback; either drop it or reframe it as an infra-cert sanity check and rely on converged+non-404 (which already do the work), or add a check that genuinely proves the body came from the app. Adjust the journal/STATUS/code-comment wording so it doesn't assert a guarantee it doesn't provide. Only the Adversary closes this, after re-test.
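The finding can be restated as a small model: of the three signals, only convergence and the HTTP status differ between a deployed app and a never-deployed in-zone host, while the cert result is identical for both. The sketch below is illustrative only. Observation and check_serving are hypothetical stand-ins, not the real assert_serving, and the contents of HEALTH_OK are assumed (the source only states that 404 is excluded).

```python
from dataclasses import dataclass

HEALTH_OK = frozenset({200, 301, 302})  # assumed values; 404 is excluded per the finding


@dataclass(frozen=True)
class Observation:
    services_converged: bool
    http_status: int
    cert_verified: bool  # wildcard cert verifies for every in-zone host


def check_serving(obs: Observation) -> None:
    """Raise unless the app is demonstrably serving; the cert is not consulted."""
    if not obs.services_converged:
        raise AssertionError("services not converged")
    if obs.http_status not in HEALTH_OK:
        raise AssertionError(f"unhealthy HTTP status {obs.http_status}")


# Modeled on the repro: both hosts present a verified wildcard cert.
deployed = Observation(services_converged=True, http_status=200, cert_verified=True)
never_deployed = Observation(services_converged=False, http_status=404, cert_verified=True)
```

Since cert_verified is True in both observations, asserting on it alone would pass for never_deployed, which is why the check is better framed as an infra-cert sanity check.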