cc-ci/machine-docs/STATUS-1e.md
autonomic-bot 0fe12188f2 DONE(1e): Phase 1e complete — HC1-HC4 all Adversary cold-verified PASS, NO VETO
build #155 (own !testme on custom-html PR#2): head_ref=db9a9502 == chaos-version=db9a9502
(1.10.0→1.13.0), additive generic+overlay both ran (8 assertions PASS), HC2 default-deny held under
load, deploy-count=1, teardown sacred, D6 secret-leak grep 0/58. F1e-1 CLOSED. F1e-2 pre-existing
(not a 1e regression). The generic-harness corrections are landed; foundation ready for Phase 2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 04:26:42 +01:00


# STATUS — Phase 1e (generic-harness corrections HC1-HC4)
## DONE
**Phase 1e COMPLETE @2026-05-28.** All HC1-HC4 Adversary cold-verified PASS within 24 h, NO VETO
(REVIEW-1e final summary). The Adversary explicitly cleared `## DONE` ("Builder may write `## DONE`").
- **HC1 ✓** (E2, commit 7472561): upgrade tier upgrades to PR-HEAD via `abra app deploy --chaos`;
`assert_upgraded` requires `chaos-version == head_ref` (non-vacuous). Adversary cold-verified on
custom-html + a monkey-patch probe; production build **#155** (own `!testme` on custom-html PR#2)
showed `head_ref=db9a9502 == chaos-version=db9a9502`, version `1.10.0+1.28.0→1.13.0+1.31.1`,
deploy-count=1. `$REF` flows bridge→Drone→runner→re-checkout→chaos correctly.
- **HC2 ✓** (E0, commit c7ae296): repo-local default-deny via `tests/repo-local-approved.txt`;
Adversary hostile-code probe + production build #155 (custom-html not on allowlist → cc-ci+generic
only, no repo-local consulted under load).
- **HC3 ✓** (E1 re-claim e75ec1b; F1e-1 fix 6eabfdc): generic runs additively alongside overlays;
opt-out via `CCCI_SKIP_GENERIC[_OP]` / `recipe_meta.SKIP_GENERIC`; op runs ONCE; deploy-count=1.
Production build #155: every tier ran BOTH `assert (generic)` and `assert (cc-ci)` (8 assertions
PASSED across install/upgrade/backup/restore). **F1e-1 CLOSED** (Adversary fix-verified the
`exec_in_app` poll+raise hardening on commit 6eabfdc).
- **HC4 ✓** (E3, commit 6397cd5 + Adversary build #155): no regression — D1 trigger 9 s latency, D6
secret-leak grep clean (0/58 patterns), DG4.1 deploy-count=1, teardown sacred (no leftover
stack/volume), DG1-DG8 surface preserved or per DECISIONS-documented evolution. **F1e-2**
(pre-existing concurrent `abra recipe fetch` race) confirmed not a 1e regression; tracked in
BACKLOG-1e for breadth-ramp; not blocking DONE (Drone caps `MAX_TESTS=1`).
**The generic-harness corrections are landed and the foundation is ready for Phase 2.** Builder loop
stops; next is Phase 2 (recipe-test authoring on top of this corrected harness).
---
**Phase plan (SSOT):** `/srv/cc-ci/cc-ci-plan/plan-phase1e-harness-corrections.md`
**Loop state for THIS phase:** STATUS-1e / BACKLOG-1e / REVIEW-1e / JOURNAL-1e (DECISIONS.md shared).
Phase-1/1b/1c/1d STATUS/BACKLOG/REVIEW files are HISTORY (1d DONE) — not this phase's state.
## Phase
Phase 1e corrects the Phase-1d shared generic-test harness, before Phase 2 authors overlays on top.
Three corrections (HC1-HC3) plus a cold regression re-verification (HC4), each Adversary cold-verified, no test weakened:
- **HC1** — upgrade tier upgrades to the **PR head** (code under test) via `abra app deploy --chaos`,
not a published tag.
- **HC2** — repo-local (PR-authored) `test_*.py`/`install_steps.sh` run **only for recipes on an
explicit cc-ci approval allowlist** (default-deny); else cc-ci+generic only.
- **HC3** — the **generic runs by default (additive)** alongside any overlay; skipping it is explicit
(env/recipe_meta opt-out). Op runs once (harness-owned); generic + overlay assertions both evaluate
post-op state.
- **HC4** — Adversary cold re-verifies no regression (D1-D10/DG1-DG8) + the three new behaviors.
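
A minimal sketch of the HC3 layering decision described above, not the harness's actual code. The function names, the per-op variable spelling `CCCI_SKIP_GENERIC_<OP>`, and the shape of `recipe_meta` (a dict whose `SKIP_GENERIC` is either a bool or a collection of op names) are assumptions made for illustration; only `CCCI_SKIP_GENERIC` and `recipe_meta.SKIP_GENERIC` are named in this file.

```python
import os


def _truthy(value):
    """Treat unset, empty, '0', 'false' and 'no' as off; anything else as on."""
    return value is not None and value.strip().lower() not in ("", "0", "false", "no")


def generic_enabled(op, recipe_meta=None, env=None):
    """Return True when the generic assertions should run for `op` (default: run)."""
    env = os.environ if env is None else env
    recipe_meta = recipe_meta or {}
    # Global opt-out, e.g. CCCI_SKIP_GENERIC=1
    if _truthy(env.get("CCCI_SKIP_GENERIC")):
        return False
    # Per-op opt-out; the exact variable spelling is an assumption
    if _truthy(env.get(f"CCCI_SKIP_GENERIC_{op.upper()}")):
        return False
    # Recipe-level opt-out; bool-or-collection shape is an assumption
    skip = recipe_meta.get("SKIP_GENERIC", False)
    if skip is True:
        return False
    if isinstance(skip, (list, tuple, set, frozenset)) and op in skip:
        return False
    return True


def assertion_layers(op, has_overlay, recipe_meta=None, env=None):
    """List the assertion layers to evaluate against the single post-op state."""
    layers = []
    if generic_enabled(op, recipe_meta, env):
        layers.append("generic")
    if has_overlay:
        layers.append("cc-ci")
    return layers
```

With no opt-out set, `assertion_layers("upgrade", has_overlay=True, env={})` yields both layers; the op itself is assumed to have been run once by the harness before either layer is evaluated.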
## Definition of Done (Phase 1e) — HC1-HC4, each Adversary cold-verified in REVIEW-1e
- [x] **HC1** — PR-head upgrade proven to deploy PR-head; deploy-count guard reconciled (==1).
Adversary PASS @2026-05-28 (commit 7472561): own custom-html cold-verify
`head_ref=8a026066 == chaos-version=8a026066`, version 1.10.0→1.11.0, deploy-count=1, additive
generic+overlay both ran post-op, clean teardown; plus an adversarial monkey-patch probe proved
`assert_upgraded` fails loudly on a wrong PR-head — strictly non-vacuous.
- [x] **HC2** — repo-local ignored for a non-approved recipe, run for an approved one.
Adversary PASS @2026-05-28 (hostile-code probe, no finding; commit c7ae296).
- [x] **HC3** — generic runs alongside an overlay by default; skipped only with the opt-out set.
Adversary PASS @2026-05-28 (re-claim commit e75ec1b; F1e-1 fix commit 6eabfdc; opt-out + default
cold-verified, deploy-count=1, no assertion weakened).
- [x] **HC4** — no regression cold-verified; deploy-once + teardown still sacred.
Adversary PASS @2026-05-28 (build #155, own `!testme` on custom-html PR#2): D1 trigger 9 s, HC1
live (`head_ref=db9a9502 == chaos-version=db9a9502`), HC3 additive in production (both generic
and overlay tiers ran, 8 assertions PASSED), HC2 default-deny under load, deploy-count=1,
teardown sacred, D6 secret-leak grep clean (0/58). F1e-2 not a 1e regression.
## Milestones (plan §3)
- **E0** — HC2 trust gate (allowlist, default-deny). *Accept: repo-local ignored unless approved.*
- **E1** — HC3 additive + op/assertion split. *Accept: overlay+generic both run; opt-out skips; count=1.*
- **E2** — HC1 upgrade-to-PR-head. *Accept: upgrade demonstrably deploys PR-head.*
- **E3** — HC4 cold re-verification + docs → DONE.
## In flight
(none) — **Phase 1e DONE.** See top.
## Gate
**Gate: E3/HC4 — Adversary PASS @2026-05-28** (build #155, custom-html PR#2; full Adversary
production-pipeline verification — see REVIEW-1e "Final summary"). NO VETO.
**Gate: E3/HC4 — CLAIMED, awaiting Adversary @2026-05-28** (cleared by the PASS above). All three HC corrections are
Adversary-PASS; no regression introduced (rationale per HC4 line in Definition-of-Done above):
deploy-once + clean teardown demonstrated in every HC1 and HC3 cold run (deploy-count=1; no leftover
stack/volume); no assertion weakened (already verified per HC3 PASS — overlays migrated to
assertion-only, all data-survival/return checks kept); the comment-bridge / Drone / `!testme` trigger
path is unchanged from Phase 1d (DG6 still holds); intentional behaviour evolutions are documented in
DECISIONS (HC2 default-denies repo-local, HC3 makes layering additive, HC1 upgrades to PR-head via
chaos). **F1e-2** (concurrent same-recipe `fetch_recipe` race) is pre-existing in Phase 1d, filed by
the Adversary for HC4 visibility but explicitly "not blocking E1" (Drone caps `MAX_TESTS=1`); not a
1e regression — tracked for a future phase (per plan §1 HC4 scope: "no test weakened, deploy-once
still holds, teardown sacred, three new behaviors demonstrated" — all met).
**Gate: E2/HC1 — Adversary PASS @2026-05-28** (commit 7472561; own custom-html cold-verify
`head_ref==chaos-version`, deploy-count=1, additive, clean; monkey-patch probe confirmed
non-vacuous). The upgrade tier now
upgrades to the PR-HEAD code under test via `abra app deploy --chaos`, not a published tag. After
`fetch_recipe` the orchestrator captures `head_ref` (preferring `$REF` — the PR head sha; falls back
to the recipe checkout HEAD for non-PR `!testme`). On the upgrade tier: re-checkout the recipe to
`head_ref`, capture pre-upgrade identity, then `abra.deploy(chaos=True)` redeploys in place. The op
calls abra.deploy directly (NOT deploy_app), so `_record_deploy()` does not fire — **deploy-count
stays 1** (HC1/DG4.1 reconciled). `generic.assert_upgraded`, when head_ref is known, REQUIRES the
deployed `coop-cloud.<stack>.chaos-version` commit to MATCH head_ref — direct, non-vacuous proof the
code under test was deployed (a stale prev-checkout chaos redeploy would stamp prev's commit ≠
head_ref → FAIL). Fallback to version/image/chaos move check when head_ref is unknown.
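
The `head_ref` capture and the `assert_upgraded` rule in this gate can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `generic`: the dict keys (`version`, `image`, `chaos_version`), the helper `same_commit`, and the abbreviated-sha comparison are all hypothetical.

```python
import os


def resolve_head_ref(checkout_head, env=None):
    """Prefer $REF (the PR head sha); fall back to the recipe checkout HEAD."""
    env = os.environ if env is None else env
    return env.get("REF") or checkout_head or None


def same_commit(a, b):
    """Compare two shas that may be abbreviated (assumption: at least 7 chars)."""
    n = min(len(a), len(b))
    return n >= 7 and a[:n].lower() == b[:n].lower()


def assert_upgraded(before, after, head_ref=None):
    """Check post-upgrade identity.

    `before` and `after` are dicts with the hypothetical keys
    'version', 'image' and 'chaos_version'.
    """
    if head_ref:
        # Strict path: the deployed chaos-version must be the PR head.
        deployed = after.get("chaos_version")
        assert deployed, "no chaos-version label on the deployed stack"
        assert same_commit(deployed, head_ref), (
            f"chaos-version {deployed} != head_ref {head_ref}"
        )
        return
    # Fallback path when head_ref is unknown: something must have moved.
    moved = any(
        before.get(key) != after.get(key)
        for key in ("version", "image", "chaos_version")
    )
    assert moved, "upgrade did not change version, image, or chaos-version"
```

The strict branch is what makes the check non-vacuous: a redeploy from a stale checkout would carry a different commit in `chaos_version` and fail the comparison even if the version string moved.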
**Cold-verifiable evidence on cc-ci** (hedgedoc, log `/root/ccci-1e-hc1-hed4.log`):
```
== cc-ci run: recipe=hedgedoc ref=None pr=0 stages=['install', 'upgrade']
===== TIER: upgrade (generic=run, overlay=none) =====
upgrade→PR-head: head_ref=09bf4d54 chaos-version=09bf4d54 version=3.0.9+1.10.7→3.0.10+1.10.8
PASSED tests/_generic/test_upgrade.py::test_upgrade_reconverges
===== RUN SUMMARY =====
deploy-count = 1 (expect 1)
install : pass
upgrade : pass
```
`head_ref == chaos-version` (09bf4d54) — deterministic proof of PR-head deploy. Plus a real version
move (3.0.9→3.0.10). deploy-count=1; clean teardown. The HC1 path also covers F1e-1's exec hardening
(used by the data-continuity overlays' exec_in_app reads).
**Gate: E1/HC3 — Adversary PASS @2026-05-28** (REVIEW-1e final; F1e-1 fix commit 6eabfdc verified
cold under opt-out; deploy-count=1; no assertion weakened; no concurrency confound).
Prior CLAIM detail:
Adversary FAILed the prior claim (REVIEW-1e) with F1e-1: under `CCCI_SKIP_GENERIC=1` the backup
overlay flaked (`'' == 'original'`) because `lifecycle.exec_in_app` silently returned the empty stdout
of a failed `docker exec` (post-backup container cycle, no readiness buffer; the generic pytest spawn
had been an accidental ~1s buffer). **Fix (no assertion weakened):** `exec_in_app` now polls
(re-resolve container + re-exec) until `rc==0` or 90s, then RAISES — never masks an exec failure as
empty data. **Re-verified cold on cc-ci** (commit 6eabfdc): opt-out
`STAGES=install,backup,restore CCCI_SKIP_GENERIC=1` → install/backup/restore=pass, **0** generic files
ran, deploy-count=1, clean teardown (log `/root/ccci-1e-f1e1.log`). HC3 additive (default + opt-out)
otherwise unchanged from the prior claim's PASS evidence on commit b7e6cbd.
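
The poll-then-raise behavior of the F1e-1 fix can be sketched as below. This is a minimal illustration, not the `lifecycle.exec_in_app` implementation: the signature, the `resolve_container` callback, the `ExecFailed` exception, and the one-second interval are assumptions; only the "until `rc==0` or 90s, then raise" contract comes from the text above.

```python
import subprocess
import time


class ExecFailed(RuntimeError):
    """Raised when the exec never succeeds within the timeout."""


def exec_in_app(resolve_container, cmd, timeout=90.0, interval=1.0, run=subprocess.run):
    """Run `cmd` in the app container, retrying until rc == 0 or `timeout` seconds.

    `resolve_container` is a hypothetical callable returning the current
    container id/name (or None); it is called again on every attempt because
    the container may have been cycled since the last one.
    """
    deadline = time.monotonic() + timeout
    last = "no attempt made"
    while True:
        container = resolve_container()
        if container:
            proc = run(
                ["docker", "exec", container, *cmd],
                capture_output=True,
                text=True,
            )
            if proc.returncode == 0:
                return proc.stdout
            last = f"rc={proc.returncode} stderr={proc.stderr.strip()!r}"
        else:
            last = "no running container resolved"
        if time.monotonic() >= deadline:
            # Never return empty stdout from a failed exec: fail loudly instead.
            raise ExecFailed(f"exec {cmd!r} did not succeed within {timeout}s ({last})")
        time.sleep(interval)
```

The key design point is that a failed exec can no longer look like empty data, so an assertion such as `'' == 'original'` is replaced by either the real value or an explicit exec error.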
**Gate: E0/HC2 — Adversary PASS @2026-05-28** (REVIEW-1e; hostile-code probe, no finding).
Prior CLAIM detail: Repo-local (PR-authored)
`test_*.py`/`install_steps.sh`/`ops.py` is default-deny: consulted only for recipes on the cc-ci
approval allowlist `tests/repo-local-approved.txt` (empty ⇒ deny). Centralized gate in
`discovery.py` (`repo_local_approved`/`_gated`); `resolve_overlay_op`/`custom_tests`/`install_steps`/
`pre_op_hook` all honor it. Evidence: `cc-ci-run -m pytest tests/unit -q` → **8 passed** on cc-ci
(commit d38a695), incl. repo-local ignored-when-unapproved / honored-when-approved; cc-ci hook
(custom-html-tiny) still resolves so DG5 is unaffected. Allowlist location overridable via
`CCCI_REPO_LOCAL_APPROVED_FILE` for cold demonstration.
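
A rough sketch of a default-deny gate of this kind, assuming a one-recipe-per-line allowlist with `#` comments. The file format, the `load_approved` and `gated` helpers, and their signatures are assumptions for illustration and may differ from what `discovery.py` actually does; the allowlist path and the `CCCI_REPO_LOCAL_APPROVED_FILE` override are taken from the text above.

```python
import os
from pathlib import Path

DEFAULT_ALLOWLIST = Path("tests/repo-local-approved.txt")


def load_approved(path=None, env=None):
    """Return the set of approved recipe names; a missing or empty file means none."""
    env = os.environ if env is None else env
    path = Path(path or env.get("CCCI_REPO_LOCAL_APPROVED_FILE") or DEFAULT_ALLOWLIST)
    try:
        lines = path.read_text(encoding="utf-8").splitlines()
    except FileNotFoundError:
        return set()  # no allowlist file: deny everything
    # Assumption: one recipe per line, '#' starts a comment
    return {line.split("#", 1)[0].strip() for line in lines} - {""}


def repo_local_approved(recipe, path=None, env=None):
    """Default-deny: True only when the recipe is explicitly listed."""
    return recipe in load_approved(path, env)


def gated(recipe, repo_local_paths, path=None, env=None):
    """Return repo-local paths only for approved recipes, otherwise nothing."""
    if repo_local_approved(recipe, path, env):
        return list(repo_local_paths)
    return []
```

Routing every repo-local lookup through one function like `gated` keeps the deny decision in a single place, so an unapproved recipe contributes no PR-authored code to the run.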
## Blocked
(none) — bootstrap access re-verified @2026-05-28: `ssh cc-ci` ok (root, NixOS).