STATUS: M3 CLAIMED (polling primary verified) + resource-safety section; clear webhook blocker

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 02:56:28 +01:00
parent 72ff8e213d
commit 6bdf43febd
3 changed files with 79 additions and 27 deletions

@@ -446,3 +446,44 @@ remains 3-stage green (M5). docs/enroll-recipe.md written.
**M6 CLAIMED.** keycloak's full 3-stage (DB data survival via a realm marker) folds into M6.5.
**Next:** M6.5 — keycloak upgrade/backup, then recipes 36 across the remaining D10 categories.
---
## 2026-05-27 — Trigger redesign (polling primary) + resource safety + M3 verified
Session restarted by watchdog (prior tmux died mid-turn with uncommitted bridge WIP). Re-oriented
from STATUS + plan; two orchestrator design changes landed and are now implemented + verified.
**(1) Trigger: POLLING PRIMARY, webhook optional, org-membership auth** (plan §4.1/§1.5; commit
7addb96). Rewrote `bridge/bridge.py`: a poll thread (`poll_loop`, always-on, primary) scans each
`POLL_REPOS` repo's open PRs every 30s for new `!testme`; the `/hook` webhook stays as an optional
admin-registered push optimization. Both share an in-memory comment-id seen-set → a comment seen by
both fires once. The first poll marks pre-existing comments as seen (no startup re-fire). Authorization
now uses `GET /orgs/{owner}/members/{user}` (204 = member, works at read level) plus an optional
`AUTH_ALLOWLIST`, replacing the admin-requiring `/collaborators/{user}/permission`. The bot never
self-registers webhooks.
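The shared seen-set and the first-poll priming described above can be sketched roughly as follows. This is an illustration only, not the code in `bridge/bridge.py`: the names `SeenSet`, `claim`, and `poll_once`, the `(comment_id, body)` pair shape, and the substring match on `!testme` are all assumptions.

```python
import threading


class SeenSet:
    """Thread-safe set of comment ids shared by the poll thread and the webhook handler."""

    def __init__(self):
        self._ids = set()
        self._lock = threading.Lock()

    def claim(self, comment_id):
        """Return True only for the first caller to present this comment id."""
        with self._lock:
            if comment_id in self._ids:
                return False
            self._ids.add(comment_id)
            return True


def poll_once(comments, seen, primed, trigger="!testme"):
    """Process one poll result and return the comment ids that should fire a build.

    comments: iterable of (comment_id, body) pairs (hypothetical shape).
    primed:   False on the very first poll, which only records what already exists.
    """
    fired = []
    for comment_id, body in comments:
        if trigger not in body:  # assumed matching rule: plain substring
            continue
        is_new = seen.claim(comment_id)
        if primed and is_new:
            fired.append(comment_id)
    return fired
```

A webhook handler would call `seen.claim(comment_id)` on the same instance before triggering, so whichever path sees a comment first wins and the other becomes a no-op.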
- Verified org endpoint at read level (bot basic-auth):
`members/{autonomic-bot,trav,notplants}` → 204; `members/definitely-not-a-member-xyz` → 404.
- Deployed (nixos-rebuild, deploy-bridge reconcile); new container logs:
`poller (primary) watching ['recipe-maintainers/cc-ci'] every 30s` + `(poll primary + optional webhook)`.
- **End-to-end M3 trigger (poll path):** posted `!testme` on PR #1 (comment 13705, by bot) →
Drone build **#26** appeared after **6s** (latest was #25); bridge logged
`[poll] triggered build 26 for cc-ci@d397720a (PR #1, comment 13705) by autonomic-bot`; bridge
posted back `cc-ci: started CI run for cc-ci @ d397720a → https://drone.ci.commoninternet.net/...`.
Satisfies D1 (<60s) over the read-only outbound path — no operator webhook whitelist needed.
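The authorization decision verified in the first bullet above reduces to a mapping from the membership endpoint's status code. The sketch below is hypothetical: `is_authorized` is not a name from the bridge, and it assumes the allowlist grants access in addition to org membership (the log does not say how the two combine) and that any status other than 204 is treated as a denial.

```python
def is_authorized(user, membership_status, allowlist=frozenset()):
    """Decide whether `user` may trigger a build.

    membership_status: HTTP status returned by
        GET /orgs/{owner}/members/{user}   (204 = member, 404 = not a member)
    allowlist: optional extra usernames (stands in for AUTH_ALLOWLIST).
    Assumption: allowlist entries are accepted even without org membership.
    """
    if user in allowlist:
        return True
    # Fail closed: only an explicit 204 counts as membership.
    return membership_status == 204
```

Failing closed on unexpected statuses (for example a 5xx from the forge) means an outage can delay a build but cannot authorize a non-member.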
**(2) Resource safety: bound live test apps** (plan §4.2/§4.3; commit 72ff8e2). MAX_TESTS =
`DRONE_RUNNER_CAPACITY` = 1 (`modules/drone-runner.nix`) → Drone runs ≤1 build at once, queues the
rest natively. Per-build timeout = 60m, reconciled best-effort in `modules/drone.nix`
(`PATCH /api/repos/.../cc-ci {"timeout":60}`, non-fatal). Janitor remains the backstop for
SIGKILL'd/timed-out builds (reaps orphaned run apps at run-start before each deploy).
- Verified on host after rebuild: `DRONE_RUNNER_CAPACITY=1`; deploy-drone logged
`set cc-ci build timeout = 60m`; Drone API confirms repo `timeout: 60`.
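For illustration, the best-effort timeout reconcile could look like the Python sketch below. The real logic lives in `modules/drone.nix` and is not shown in this log; the function names, the `owner`/`repo` parameters, the example host, and the bearer-token header are assumptions, and the elided part of the API path is left to the caller.

```python
import json
import urllib.request


def build_timeout_request(base_url, owner, repo, token, minutes=60):
    """Build (but do not send) a PATCH that sets the repo build timeout in minutes."""
    url = f"{base_url.rstrip('/')}/api/repos/{owner}/{repo}"
    body = json.dumps({"timeout": minutes}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {token}",  # assumed auth scheme
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, method="PATCH", headers=headers)


def reconcile_timeout(request, opener=urllib.request.urlopen):
    """Send the request; return False instead of raising so a deploy never fails on it."""
    try:
        with opener(request, timeout=10) as resp:
            return 200 <= resp.status < 300
    except OSError:  # URLError and HTTPError are OSError subclasses
        return False
```

Returning a boolean rather than raising mirrors the "non-fatal" behaviour noted above: the caller can log the outcome and continue the deploy either way.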
**Gap noted (next item):** `.drone.yml` still only has the `self-test` pipeline — a bridge-triggered
build runs the self-test, NOT `runner/run_recipe_ci.py`. M4/M5 ran the orchestrator by hand
(`cc-ci-run`). Need a recipe-CI pipeline keyed on the `RECIPE` build param (runs
`cc-ci-run runner/run_recipe_ci.py` with STAGES=install,upgrade,backup, `CCCI_JANITOR_MAX_AGE=0`,
`concurrency:{limit:1}`) to connect bridge→Drone→harness end-to-end (required for D2/D10 via real
`!testme`). Added to Build backlog.
**M3 CLAIMED** (gate). Trigger + auth + comment-back demoed live; the webhook-delivery blocker is
moot now that polling is primary.