bridge: polling primary + org-membership auth (orchestrator design change)

Polling is now the primary, read-only trigger (always-on thread); the /hook
webhook is an optional admin-registered push optimization deduped by comment id.
Authorize commenters via GET /orgs/{owner}/members/{user} (204, read-level) +
optional allowlist, replacing the admin-requiring /collaborators permission
endpoint. Bot never self-registers webhooks. Enroll = POLL_REPOS + tests/<recipe>/.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 02:41:25 +01:00
parent 25b628e959
commit 7addb9686c
4 changed files with 179 additions and 65 deletions
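
The authorization rule in the commit message (org membership via a 204, or the optional allowlist, fail-closed otherwise) can be sketched as a pure function. This is a minimal illustration, not the bridge's code: `membership_status` is a hypothetical stand-in for the `GET /orgs/{owner}/members/{user}` call, and the org and user names are invented.

```python
from typing import Callable, Optional, Set


def is_authorized(
    full_name: str,
    user: str,
    allowlist: Set[str],
    membership_status: Callable[[str, str], Optional[int]],
) -> bool:
    """Allow iff the user is allowlisted or the membership lookup returns exactly 204.

    membership_status(owner, user) is a hypothetical stand-in for the HTTP call; it
    returns the status code, or None when the request itself failed.
    """
    if not user:
        return False
    if user in allowlist:
        return True
    owner = full_name.partition("/")[0]
    # Fail-closed: 404, 5xx and None (transport error) are all rejections.
    return membership_status(owner, user) == 204


def fake_status(owner: str, user: str) -> Optional[int]:
    """Canned responses for the example; an unknown pair models a network failure."""
    table = {
        ("example-org", "alice"): 204,    # member
        ("example-org", "mallory"): 404,  # not a member
        ("example-org", "flaky"): 500,    # server error
    }
    return table.get((owner, user))


if __name__ == "__main__":
    for who in ("alice", "mallory", "flaky", "unreachable", ""):
        print(repr(who), is_authorized("example-org/recipe", who, {"carol"}, fake_status))
    print("'carol'", is_authorized("example-org/recipe", "carol", {"carol"}, fake_status))
```

Keeping the HTTP call behind a callable makes the fail-closed behaviour easy to exercise without a Gitea instance.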

@ -48,6 +48,27 @@ Architecture decisions and dead-ends. One line of rationale each. (§0, §8)
wildcard means bumping `SECRET_WILDCARD_*_VERSION` (operator) so the next reconcile re-inserts.
Documented in docs/secrets.md at M7.
- **Trigger: POLLING primary, webhook optional — SETTLED (orchestrator design change 2026-05-27,
supersedes the earlier "keep webhook, do NOT pivot to polling" steer).** Hard constraint: the
bot/server runs at **READ level, never repo-admin**, and **never self-registers a webhook**.
- **Polling is PRIMARY and the source of truth for D1.** The bridge polls each enrolled repo's
open PRs for new `!testme` comments every `POLL_INTERVAL` (default 30s, inside the 60s D1 budget). Outbound
(cc-ci → git.autonomic.zone, the reliably-working direction), needs only read+comment. On
startup the first poll marks pre-existing comments seen so it doesn't fire on old comments.
- **Webhook is an OPTIONAL push optimization.** The `/hook` endpoint stays live (HMAC-verified)
so an *admin-registered* `issue_comment` webhook lowers latency, but the bridge never registers
one. Manual registration is documented in `docs/enroll-recipe.md`. Both paths share an
in-memory seen-set keyed by comment id → a comment seen by both fires at most once.
- **Commenter authorization via org membership (read-level, no admin).** Allowed iff
`GET /orgs/{owner}/members/{user}` → 204 (verified 2026-05-27: admits bot/trav/notplants, 404
for a non-member, works with bot read-level basic-auth) **or** the user is in the optional
`AUTH_ALLOWLIST`. Replaces the earlier `/collaborators/{user}/permission` check, which needs
repo-admin. Fail-closed on any error.
- **Enrollment** = add the repo to the bridge `POLL_REPOS` csv + ensure `tests/<recipe>/` exists.
No webhook is required for CI to work. (The root cause of the old webhook non-delivery no longer
matters because polling makes it irrelevant: the operator was whitelisting `ci.commoninternet.net` in
Gitea's `ALLOWED_HOST_LIST`, but D1 no longer depends on that.)
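
A small simulation may make the startup pass and the shared seen-set concrete. It is a sketch under assumptions: `SeenSet` and `poll_pass` are illustrative names, comments are plain dicts rather than Gitea API responses, and the webhook is modelled as a direct `claim` call.

```python
import threading


class SeenSet:
    """Thread-safe set of comment ids; claim() succeeds once per id."""

    def __init__(self):
        self._ids = set()
        self._lock = threading.Lock()

    def claim(self, comment_id) -> bool:
        with self._lock:
            if comment_id in self._ids:
                return False
            self._ids.add(comment_id)
            return True


def poll_pass(comments, seen, first_pass):
    """Return the ids of `!testme` comments that should fire on this pass."""
    fired = []
    for c in comments:
        if c["body"].strip() != "!testme":
            continue
        if first_pass:
            seen.claim(c["id"])  # mark pre-existing comments seen, never fire
            continue
        if seen.claim(c["id"]):
            fired.append(c["id"])
    return fired


seen = SeenSet()
existing = [{"id": 1, "body": "!testme"}, {"id": 2, "body": "lgtm"}]

startup = poll_pass(existing, seen, first_pass=True)

# The webhook path delivers comment 3 before the next poll sees it.
webhook_fired = seen.claim(3)

newer = existing + [{"id": 3, "body": "!testme"}, {"id": 4, "body": " !testme\n"}]
later = poll_pass(newer, seen, first_pass=False)

print(startup, webhook_fired, later)
```

Comment 1 never fires because the first pass claimed it, comment 3 fires only through the webhook, and comment 4 fires through the poller.
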
## Open (defaults from §8, to confirm as reality lands)
- **Deploy mechanism — SETTLED (M0):** `nixos-rebuild switch --flake /root/cc-ci#cc-ci` run *on

@ -1,24 +1,38 @@
#!/usr/bin/env python3
"""cc-ci comment-bridge (§4.1).
Receives Gitea `issue_comment` webhooks; when a *collaborator* comments exactly `!testme` on an
open PR, triggers a parameterized Drone build of the cc-ci pipeline for that PR's head commit and
posts a PR comment linking the run. Everything else is ignored. Python stdlib only.
When an *authorized* user comments exactly `!testme` on an open PR in an enrolled recipe repo,
trigger a parameterized Drone build of the cc-ci pipeline for that PR's head commit and post a PR
comment linking the run. Everything else is ignored.
Config (env):
BRIDGE_LISTEN host:port to bind (default 0.0.0.0:8080)
GITEA_API e.g. https://git.autonomic.zone/api/v1
DRONE_URL e.g. https://drone.ci.commoninternet.net
CI_REPO the pipeline repo, e.g. recipe-maintainers/cc-ci
HMAC_FILE file with the webhook HMAC secret
DRONE_TOKEN_FILE file with the Drone API token
GITEA_TOKEN_FILE file with the Gitea API token
Trigger paths (§4.1, SETTLED):
* POLLING is PRIMARY (always on): the bridge polls each enrolled repo's open PRs for new
`!testme` comments every POLL_INTERVAL seconds. This is outbound (cc-ci -> git.autonomic.zone)
and needs only READ + comment access — never repo-admin. It is the source of truth for D1.
* WEBHOOK is an OPTIONAL push optimization: the `/hook` endpoint stays live so a Gitea
`issue_comment` webhook, *if an admin registered one*, lowers latency. The bridge NEVER
self-registers a webhook (that needs repo-admin, which we refuse). Manual registration is
documented in docs/enroll-recipe.md.
Both paths share an in-memory seen-set keyed by comment id, so a comment seen by both fires at most
once (no double-trigger). On startup the first poll marks pre-existing comments seen so old comments
don't re-fire. Python stdlib only.
Authorization: a commenter is allowed iff they are a member of the repo's owning org
(`GET /orgs/{owner}/members/{user}` -> 204), which is readable by any org member (read-level, no
admin). An optional AUTH_ALLOWLIST (csv of usernames) is also honored. Fail-closed on any error.
Config (env): BRIDGE_LISTEN, GITEA_API, DRONE_URL, CI_REPO, HMAC_FILE, DRONE_TOKEN_FILE,
GITEA_TOKEN_FILE, POLL_INTERVAL (default 30), POLL_REPOS (csv of enrolled repos), AUTH_ALLOWLIST
(csv, optional).
"""
import hashlib
import hmac
import json
import os
import sys
import threading
import time
import urllib.error
import urllib.parse
import urllib.request
@ -28,6 +42,7 @@ GITEA_API = os.environ.get("GITEA_API", "https://git.autonomic.zone/api/v1")
DRONE_URL = os.environ.get("DRONE_URL", "https://drone.ci.commoninternet.net")
CI_REPO = os.environ.get("CI_REPO", "recipe-maintainers/cc-ci")
TRIGGER = "!testme"
ALLOWLIST = {u.strip() for u in os.environ.get("AUTH_ALLOWLIST", "").split(",") if u.strip()}
def _read(path):
@ -39,13 +54,18 @@ HMAC_SECRET = _read(os.environ["HMAC_FILE"]).encode()
DRONE_TOKEN = _read(os.environ["DRONE_TOKEN_FILE"])
GITEA_TOKEN = _read(os.environ["GITEA_TOKEN_FILE"])
# Shared dedup across the poll + webhook paths: a comment id triggers at most one run.
_PROCESSED: set = set()
_PROCESSED_LOCK = threading.Lock()
def log(*a):
print(*a, file=sys.stderr, flush=True)
def _api(url, token, method="GET", data=None):
headers = {"Authorization": "token " + token} if token else {}
def _api(url, token, method="GET", data=None, scheme="token"):
# Gitea wants "Authorization: token <t>"; Drone wants "Authorization: Bearer <t>".
headers = {"Authorization": f"{scheme} {token}"} if token else {}
body = None
if data is not None:
body = json.dumps(data).encode()
@ -57,11 +77,22 @@ def _api(url, token, method="GET", data=None):
return resp.status, (json.loads(raw) if raw else None)
except urllib.error.HTTPError as e:
return e.code, None
except (urllib.error.URLError, OSError, ValueError) as e:  # ValueError: non-JSON response body
log("api error", url, e)
return None, None
def is_collaborator(full_name, user):
# 204 => the user has push access (collaborator or org member with access).
status, _ = _api(f"{GITEA_API}/repos/{full_name}/collaborators/{user}", GITEA_TOKEN)
def is_authorized(full_name, user):
"""Allowed iff the user is a member of the repo's owning org (read-level membership check) or in
the static AUTH_ALLOWLIST. Uses GET /orgs/{owner}/members/{user} (204=member), which any org
member can read — no repo-admin needed. Fail-closed: anything other than a clean 204/allowlist
hit is rejected."""
if not user:
return False
if user in ALLOWLIST:
return True
owner = full_name.partition("/")[0]
status, _ = _api(f"{GITEA_API}/orgs/{owner}/members/{user}", GITEA_TOKEN)
return status == 204
@ -79,7 +110,7 @@ def trigger_build(recipe, ref, pr, src):
{"branch": "main", "RECIPE": recipe, "REF": ref, "PR": str(pr), "SRC": src}
)
url = f"{DRONE_URL}/api/repos/{CI_REPO}/builds?{q}"
status, build = _api(url, DRONE_TOKEN, method="POST")
status, build = _api(url, DRONE_TOKEN, method="POST", scheme="Bearer")
if status in (200, 201) and build:
return build.get("number")
log("drone trigger failed", status)
@ -87,12 +118,52 @@ def trigger_build(recipe, ref, pr, src):
def post_comment(owner, repo, number, body):
_api(
f"{GITEA_API}/repos/{owner}/{repo}/issues/{number}/comments",
GITEA_TOKEN,
method="POST",
data={"body": body},
)
_api(f"{GITEA_API}/repos/{owner}/{repo}/issues/{number}/comments", GITEA_TOKEN,
method="POST", data={"body": body})
def list_open_prs(full_name):
status, prs = _api(f"{GITEA_API}/repos/{full_name}/pulls?state=open&limit=50", GITEA_TOKEN)
return prs if status == 200 and prs else []
def list_comments(full_name, number):
status, cs = _api(f"{GITEA_API}/repos/{full_name}/issues/{number}/comments", GITEA_TOKEN)
return cs if status == 200 and cs else []
def _claim(comment_id) -> bool:
"""Atomically claim a comment id for processing. Returns False if already claimed (dedup)."""
if comment_id is None:
return True
with _PROCESSED_LOCK:
if comment_id in _PROCESSED:
return False
_PROCESSED.add(comment_id)
return True
def process_testme(full_name, owner, name, number, user, comment_id, source):
"""Shared by both paths. Dedupes by comment id, checks authorization, resolves the PR head,
triggers the build, comments the run link. Returns (run_url|None, reason)."""
if not _claim(comment_id):
return None, "duplicate"
if not is_authorized(full_name, user):
log(f"rejected: {user} is not an authorized org member on {full_name}")
return None, "not authorized"
head = pr_head(owner, name, number)
if not head or not head["sha"]:
return None, "cannot resolve PR head"
num = trigger_build(name, head["sha"], number, head["repo"] or full_name)
if not num:
post_comment(owner, name, number, "cc-ci: failed to start a CI run (see bridge logs).")
return None, "trigger failed"
run_url = f"{DRONE_URL}/{CI_REPO}/{num}"
post_comment(owner, name, number,
f"cc-ci: started CI run for `{name}` @ `{head['sha'][:8]}` → {run_url}")
log(f"[{source}] triggered build {num} for {name}@{head['sha'][:8]} "
f"(PR #{number}, comment {comment_id}) by {user}")
return run_url, "ok"
class Handler(BaseHTTPRequestHandler):
@ -103,80 +174,81 @@ class Handler(BaseHTTPRequestHandler):
self.wfile.write(msg.encode())
def do_GET(self):
# health endpoint
if self.path.rstrip("/") in ("/hook/healthz", "/healthz"):
return self._send(200, "ok")
return self._send(404, "not found")
def do_POST(self):
# Optional push optimization; polling is primary. Deduped against the poller by comment id.
length = int(self.headers.get("Content-Length", 0))
body = self.rfile.read(length)
# 1) verify HMAC (Gitea sends hex sha256 in X-Gitea-Signature)
sig = self.headers.get("X-Gitea-Signature", "")
expected = hmac.new(HMAC_SECRET, body, hashlib.sha256).hexdigest()
if not hmac.compare_digest(sig, expected):
log(f"rejected: bad signature event={self.headers.get('X-Gitea-Event')} "
f"got={sig[:12]} want={expected[:12]} bodylen={len(body)} seclen={len(HMAC_SECRET)} "
f"hub256={(self.headers.get('X-Hub-Signature-256') or '')[:20]}")
log(f"rejected: bad signature event={self.headers.get('X-Gitea-Event')}")
return self._send(401, "bad signature")
if self.headers.get("X-Gitea-Event") != "issue_comment":
return self._send(204, "ignored")
try:
payload = json.loads(body)
except ValueError:
return self._send(400, "bad json")
action = payload.get("action")
comment = (payload.get("comment") or {}).get("body", "")
c = payload.get("comment") or {}
issue = payload.get("issue") or {}
repo = payload.get("repository") or {}
user = (payload.get("comment") or {}).get("user", {}).get("login", "")
full_name = repo.get("full_name", "")
owner = (repo.get("owner") or {}).get("login", "")
name = repo.get("name", "")
number = issue.get("number")
# 2) only a created comment, exactly "!testme", on a PR
if action != "created" or comment.strip() != TRIGGER:
if action != "created" or (c.get("body") or "").strip() != TRIGGER:
return self._send(204, "ignored")
if not issue.get("pull_request"):
return self._send(204, "not a PR")
# 3) commenter must be a collaborator / org member with access
if not is_collaborator(full_name, user):
log(f"rejected: {user} not a collaborator on {full_name}")
return self._send(403, "not authorized")
# 4) resolve PR head (test the code at the PR head commit)
head = pr_head(owner, name, number)
if not head or not head["sha"]:
return self._send(502, "cannot resolve PR head")
# 5) trigger the parameterized Drone build
num = trigger_build(name, head["sha"], number, head["repo"] or full_name)
if not num:
post_comment(owner, name, number, "cc-ci: failed to start a CI run (see bridge logs).")
return self._send(502, "trigger failed")
run_url = f"{DRONE_URL}/{CI_REPO}/{num}"
post_comment(
owner, name, number,
f"cc-ci: started CI run for `{name}` @ `{head['sha'][:8]}` → {run_url}",
)
log(f"triggered build {num} for {name}@{head['sha'][:8]} (PR #{number}) by {user}")
run_url, reason = process_testme(
repo.get("full_name", ""), (repo.get("owner") or {}).get("login", ""),
repo.get("name", ""), issue.get("number"),
(c.get("user") or {}).get("login", ""), c.get("id"), "webhook")
if not run_url:
if reason == "duplicate":
return self._send(200, "already handled")
return self._send(403 if reason == "not authorized" else 502, reason)
return self._send(201, run_url)
def log_message(self, *a): # quiet default access logging
def log_message(self, *a):
pass
def poll_loop():
"""Primary trigger path. Outbound, read-only. Fires on NEW `!testme` comments only (the first
pass marks pre-existing comments seen)."""
repos = [r.strip() for r in os.environ.get("POLL_REPOS", CI_REPO).split(",") if r.strip()]
interval = int(os.environ.get("POLL_INTERVAL", "30"))
first = True
log(f"poller (primary) watching {repos} every {interval}s")
while True:
for full_name in repos:
owner, _, name = full_name.partition("/")
for pr in list_open_prs(full_name):
number = pr.get("number")
for c in list_comments(full_name, number):
if (c.get("body") or "").strip() != TRIGGER:
continue
cid = c.get("id")
if first:
_claim(cid) # mark pre-existing comments seen; don't fire on startup
continue
user = (c.get("user") or {}).get("login", "")
process_testme(full_name, owner, name, number, user, cid, "poll")
first = False
time.sleep(interval)
def main():
# Polling is the primary trigger; start it unconditionally.
threading.Thread(target=poll_loop, daemon=True).start()
host, _, port = os.environ.get("BRIDGE_LISTEN", "0.0.0.0:8080").rpartition(":")
srv = ThreadingHTTPServer((host or "0.0.0.0", int(port)), Handler)
log(f"comment-bridge listening on {host or '0.0.0.0'}:{port}")
log(f"comment-bridge listening on {host or '0.0.0.0'}:{port} (poll primary + optional webhook)")
srv.serve_forever()

@ -41,11 +41,27 @@ If the recipe's own repo contains `tests/test_*.py`, the runner snapshots them r
runs them against the **live deployment** as a `recipe-local` stage. Contract: those tests receive
env `CCCI_BASE_URL` (e.g. `https://<app>.ci.commoninternet.net/`) and `CCCI_APP_DOMAIN`.
## 4. Register the trigger webhook
## 4. Add the repo to the bridge poll list
The trigger is **polling** (primary): add the repo's full name to the comment-bridge `POLL_REPOS`
csv (`modules/bridge.nix`) and `nixos-rebuild switch`. The bridge then polls that repo's open PRs
every 30s and fires a run on a new `!testme` comment from an authorized org member. This needs only
**read + comment** access — no webhook, no repo-admin.
Add the per-repo Gitea webhook so `!testme` on a PR starts a run (see the bridge / runbook). Then
`!testme` on a PR runs install/upgrade/backup + any recipe-local tests, and reports back to the PR.
### Optional: lower-latency webhook (admin-registered)
Polling already satisfies D1 (<60s). For lower latency an **admin** may *optionally* register a
Gitea `issue_comment` webhook (the bot does **not** self-register one, because that needs repo-admin):
- URL `https://ci.commoninternet.net/hook`, content-type `application/json`, event `Issue Comment`,
secret = the shared webhook HMAC (`secrets/secrets.yaml` `webhook_hmac`).
- The Gitea instance must allow the host (admin: add `ci.commoninternet.net` to the
`[webhook] ALLOWED_HOST_LIST`).
The webhook and poller are deduped by comment id, so a comment seen by both fires only once.
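
To check a manually registered webhook, it can help to see how the signature is derived. The sketch below assumes what the bridge's `/hook` handler checks: a hex HMAC-SHA256 of the raw request body in `X-Gitea-Signature`. The secret and payload are placeholders, and `sign_payload` / `verify` are illustrative helpers, not bridge functions.

```python
import hashlib
import hmac
import json


def sign_payload(secret: bytes, body: bytes) -> str:
    """Hex HMAC-SHA256 of the exact bytes that will be sent as the request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify(secret: bytes, body: bytes, signature: str) -> bool:
    """Constant-time comparison of a received signature against the expected one."""
    return hmac.compare_digest(signature, sign_payload(secret, body))


secret = b"example-secret"  # placeholder; the real value is the shared webhook_hmac secret
body = json.dumps(
    {"action": "created", "comment": {"id": 1, "body": "!testme"}}
).encode()

signature = sign_payload(secret, body)
headers = {
    "Content-Type": "application/json",
    "X-Gitea-Event": "issue_comment",
    "X-Gitea-Signature": signature,
}

print(headers["X-Gitea-Signature"])
print(verify(secret, body, signature))
```

The signature covers the raw bytes, so re-serializing the JSON between signing and sending (different key order or whitespace) invalidates it.
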
## Run locally
```sh

@ -31,6 +31,11 @@ let
- DRONE_URL=https://drone.ci.commoninternet.net
- CI_REPO=recipe-maintainers/cc-ci
- BRIDGE_LISTEN=0.0.0.0:8080
# Polling is PRIMARY (outbound, read-only, always on); the /hook webhook is an optional
# admin-registered push optimization deduped against the poller (§4.1). Enrollment = add
# the repo to POLL_REPOS (csv) + ensure tests/<recipe>/ exists.
- POLL_INTERVAL=30
- POLL_REPOS=recipe-maintainers/cc-ci
- HMAC_FILE=/run/secrets/webhook_hmac
- DRONE_TOKEN_FILE=/run/secrets/drone_token
- GITEA_TOKEN_FILE=/run/secrets/gitea_token