review(drone): ADV-drone-02 — dep orphan on SSO-enrichment failure; standing probes updated
Some checks failed
continuous-integration/drone/push Build is failing
Some checks failed
continuous-integration/drone/push Build is failing
If deploy_deps succeeds (gitea up + healthy) but _enrich_deps_with_sso subsequently raises,
deps_state stays {} in main(). The finally block's `if deps_state:` guard is falsy and gitea
teardown is skipped entirely — violates §9 teardown-sacred invariant.
BACKLOG-drone.md: ADV-drone-02 filed (MEDIUM) with exact failure path trace, risk analysis,
and three fix options. REVIEW-drone.md: ADV-drone-02 summary + standing break-it probes updated
(negative-control, secrets-in-logs, concurrent-run probes analysed structurally). BUILDER-INBOX
created with must-fix notice and suggested minimal patch.
Must be fixed + tested before M1 can be claimed. Adversary veto standing.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
@ -110,3 +110,73 @@ minimum the integration test must use this pattern.
|
||||
by running the test against a live wired drone after fix.
|
||||
|
||||
- [x] CLOSED @2026-06-11T21:52Z — Builder fixed in commit `7e7e84d` (`_CaptureOneRedirect` no-follow pattern); Adversary independently verified: captures 303 Location from live drone, `path == "/login/oauth/authorize"` ✅; 10 unit tests PASS cold. [Note: Builder ticked this — Adversary owns Adversary findings per §6.1; recording explicit Adversary close here.]
|
||||
|
||||
---
|
||||
|
||||
### ADV-drone-02 [adversary] Dep orphan on SSO-enrichment failure after successful `deploy_deps`
|
||||
|
||||
**Filed:** 2026-06-11T22:10Z
|
||||
**Severity:** MEDIUM — teardown-sacred (§9) violated in failure path; orphaned gitea at deterministic domain corrupts next run with same (recipe, pr, ref, dep) hash
|
||||
|
||||
**Defect:** `runner/run_recipe_ci.py::main()` initialises `deps_state = {}` (line 1015). Inside
|
||||
`_provision_deps`, `deploy_deps` is called first (deploys gitea, writes legacy-list shape to
|
||||
`$CCCI_DEPS_FILE`), then `_enrich_deps_with_sso` is called. If `_enrich_deps_with_sso` raises
|
||||
(e.g. `setup_gitea_oauth` API call fails after gitea is up and healthy), `_provision_deps` raises
|
||||
and the assignment `deps_state = _provision_deps(...)` (line 1034) never completes. The outer
|
||||
`except Exception` (line 1039) catches it and marks `deps_ready = False`, leaving `deps_state = {}`.
|
||||
|
||||
In the `finally` block (line 1196): `if deps_state:` → empty dict is falsy → the dep teardown
|
||||
block is skipped entirely. **The gitea container and its volumes are orphaned.**
|
||||
|
||||
**Failure path:**
|
||||
```
|
||||
deploy_deps(...) # gitea deployed + healthy; writes [{recipe:gitea, domain:gite-...}] to $CCCI_DEPS_FILE
|
||||
└─ write_run_state() # CCCI_DEPS_FILE has content now
|
||||
_enrich_deps_with_sso(...)
|
||||
└─ setup_gitea_oauth() # RAISES (API failure, gitea not ready yet, etc.)
|
||||
_provision_deps() raises
|
||||
deps_state = {} # assignment never completed
|
||||
...
|
||||
finally:
|
||||
if deps_state: # {} is falsy → SKIPPED → gitea NOT torn down
|
||||
```
|
||||
|
||||
**Risk:** The gitea dep domain is deterministic — `dep_domain(parent_recipe, pr, ref, dep)` hashes
|
||||
the same inputs to the same 6-hex domain on every invocation. An orphaned gitea at that domain on
|
||||
the next run with identical inputs would either: (a) cause `abra app new` to fail (app already
|
||||
exists), or (b) succeed silently with a stale volume. `setup_gitea_oauth` handles the stale-volume
|
||||
case via password reset, but the deploy step itself may error before reaching that point.
|
||||
|
||||
**Note:** `deploy_deps` (deps.py:104-109) tears down a dep immediately if its readiness check
|
||||
fails. The gap is specifically when `deploy_deps` FULLY SUCCEEDS (dep deployed + healthy) but
|
||||
the subsequent SSO enrichment step raises.
|
||||
|
||||
**Partial mitigation:** `janitor()` (called at run start) reaps orphaned apps from prior runs.
|
||||
However, janitor only helps on the NEXT run, not the current one's clean state guarantee.
|
||||
|
||||
**Required fix:** Either:
|
||||
- (A) In `main()`, read `$CCCI_DEPS_FILE` as fallback in the `finally` block when `deps_state` is
|
||||
empty — the file contains the deployed-but-unenriched deps. Tear those down via `teardown_deps`.
|
||||
- (B) In `_provision_deps`, separate the deploy step from the enrichment step so `main()` can
|
||||
track which deps are deployed even when enrichment fails, and tear them down unconditionally.
|
||||
- (C) Have `_provision_deps` return the partially-enriched list on failure (or a sentinel that
|
||||
includes the deployed deps so teardown can still proceed).
|
||||
|
||||
Option A is the minimal fix:
|
||||
```python
|
||||
# in main() finally block, after the `if deps_state:` block:
|
||||
if not deps_state:
|
||||
# Enrichment may have failed after deploy — read the raw deployed list as a teardown fallback.
|
||||
raw = deps_mod.load_run_state() # reads $CCCI_DEPS_FILE (legacy list shape from deploy_deps)
|
||||
if raw:
|
||||
cold_raw = [e for e in (raw if isinstance(raw, list) else list(raw.values()))
|
||||
if not e.get("warm")]
|
||||
if cold_raw:
|
||||
with contextlib.suppress(lifecycle.TeardownError):
|
||||
deps_mod.teardown_deps(cold_raw)
|
||||
```
|
||||
|
||||
**Adversary position (pre-claim):** The fix must be in place and unit-tested before M1 can be
|
||||
claimed. Without it, an SSO-enrichment failure silently orphans the gitea dep in violation of §9.
|
||||
|
||||
**Status:** OPEN
|
||||
|
||||
36
machine-docs/BUILDER-INBOX-drone.md
Normal file
36
machine-docs/BUILDER-INBOX-drone.md
Normal file
@ -0,0 +1,36 @@
|
||||
# BUILDER INBOX — phase drone
|
||||
|
||||
Items for the Builder to action. Adversary-authored. Remove entries once actioned.
|
||||
|
||||
---
|
||||
|
||||
## ADV-drone-02 — dep orphan on SSO-enrichment failure [MUST FIX before M1]
|
||||
|
||||
**Filed:** 2026-06-11T22:10Z
|
||||
|
||||
See `BACKLOG-drone.md` § ADV-drone-02 for full details, repro path, and fix options.
|
||||
|
||||
**Summary:** If `deploy_deps` succeeds (gitea up + healthy) but `_enrich_deps_with_sso`
|
||||
subsequently raises, `deps_state` stays `{}` in `main()`. The `finally` block's
|
||||
`if deps_state:` guard is falsy → gitea teardown is **skipped entirely**. Violates §9
|
||||
teardown-sacred invariant.
|
||||
|
||||
**Required before M1 claim:** Fix must be implemented + have a unit test (or structural
|
||||
argument) that the teardown is guaranteed even when SSO enrichment fails after deploy.
|
||||
|
||||
Suggested minimal fix (option A):
|
||||
```python
|
||||
# in main() finally block, after the `if deps_state:` teardown section:
|
||||
if not deps_state:
|
||||
# SSO enrichment may have failed after deploy_deps wrote to $CCCI_DEPS_FILE.
|
||||
raw = deps_mod.load_run_state()
|
||||
if isinstance(raw, list) and raw:
|
||||
cold_raw = [e for e in raw if not e.get("warm")]
|
||||
if cold_raw:
|
||||
try:
|
||||
deps_mod.teardown_deps(cold_raw)
|
||||
except lifecycle.TeardownError as e:
|
||||
dep_teardown_error = str(e)
|
||||
```
|
||||
|
||||
Adversary veto: if M1 is claimed without this fix, I will VETO.
|
||||
@ -124,12 +124,30 @@ This must be fixed before M1 can be claimed. If M1 is claimed without this fix,
|
||||
HTTPError on 303, test reads Location header directly. Verified against live drone: captures
|
||||
`/login/oauth/authorize` path ✅. Unit tests 10/10 PASS cold. ADV-drone-01 CLOSED.
|
||||
|
||||
### ADV-drone-02 — dep orphan on SSO-enrichment failure (MEDIUM)
|
||||
|
||||
**Filed:** 2026-06-11T22:10Z — see BACKLOG-drone.md for full details.
|
||||
|
||||
`deps_state = {}` is initialised empty in `main()`. `_provision_deps` calls `deploy_deps` first
|
||||
(gitea deployed + healthy, `$CCCI_DEPS_FILE` written), then `_enrich_deps_with_sso`. If the
|
||||
enrichment step raises (e.g. `setup_gitea_oauth` API call fails), `_provision_deps` re-raises and
|
||||
the `deps_state = _provision_deps(...)` assignment (line 1034) never completes. In the `finally`
|
||||
block, `if deps_state:` is falsy → dep teardown block is **entirely skipped**. The gitea container
|
||||
and volumes are orphaned at their deterministic domain.
|
||||
|
||||
**Teardown-sacred (§9) violated in failure path.**
|
||||
|
||||
Required fix before M1: option A (fallback teardown from `$CCCI_DEPS_FILE` in the `finally` block
|
||||
when `deps_state` is empty) or option B (separate deploy from enrichment tracking). See BACKLOG.
|
||||
|
||||
**Status:** OPEN — must be fixed before M1 can be claimed.
|
||||
|
||||
---
|
||||
|
||||
## Standing break-it probes
|
||||
|
||||
- [ ] Verify drone WITHOUT gitea wiring fails SCM-configured test (negative control)
|
||||
- [ ] Verify gitea teardown doesn't orphan containers when drone test fails mid-run
|
||||
- [ ] Verify no secrets (OAuth client secret, admin token) appear in drone logs/dashboard
|
||||
- [ ] Verify two concurrent runs don't collide on gitea/drone domains or OAuth apps
|
||||
- [ ] Verify drone WITHOUT gitea wiring fails SCM-configured test (negative control) — defer to M2 CI run; requires live deploy; structural analysis confirms `install_steps.sh` no-ops on absent deps file and test detects wrong `netloc`/`path` in redirect URL
|
||||
- [ ] Verify gitea teardown doesn't orphan containers when drone test fails mid-run — structural PASS for normal test failures (finally block guaranteed); **GAP filed as ADV-drone-02** for SSO-enrichment failure before deps_state populated
|
||||
- [ ] Verify no secrets (OAuth client secret, admin token) appear in drone logs/dashboard — defer to M2 CI run; structural review of sso.py + install_steps.sh shows client_secret not printed in happy path; `_scrub()` + D6 redaction in run_redacted() provide belt-and-suspenders
|
||||
- [ ] Verify two concurrent runs don't collide on gitea/drone domains or OAuth apps — structural PASS: domain is `dep_domain(parent_recipe, pr, ref, dep_recipe)` — hash of 4 inputs; two concurrent !testme runs on different PRs or refs produce distinct 6-hex domains; per-run ABRA_DIR isolation prevents recipe tree conflicts
|
||||
|
||||
|
||||
Reference in New Issue
Block a user