Commit Graph

119 Commits

Author SHA1 Message Date
bd0a565680 review+inbox(canon): DEFECT-3 — real timer fire reds gitea on MISSING git-lfs in nightly-sweep.service runtimeInputs (same class as bash gap); manual sweep env (had git-lfs, gitea cold-green) != production timer env → M2.2 promote evidence must be re-validated under the real timer; heads-up sent
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 12:57:58 +00:00
930335972a chore(canon): consume BUILDER-INBOX (gitea 3.6.0 advance — fixing; drone promoted clean)
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 12:00:53 +00:00
a6c506844a review+inbox(canon): final-sweep crux — drone PROMOTED CLEAN (residue fix works, DEFECT-2 closing) but gitea 3.6.0 advance FAILED AGAIN (GREEN-BUT-PROMOTE-FAILED, canon kept 3.5.3) → CLAIM-BLOCKER for M2.6 (advance undemonstrated) + M2.3 (green recipe re-runs, not a red); heads-up sent
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 11:59:14 +00:00
fb2fe307dc chore(canon): consume BUILDER-INBOX (concurrent-sweep alert — killing wedged old sweep, will re-run clean serial)
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 10:21:42 +00:00
4d5b03b485 inbox+review(canon): TWO concurrent sweeps — wedged old sweep (PID1712141, drone deadlock child ~46m) still alive alongside new re-run (PID1736506); violates §4 serial + breaks release_app_locks precondition; M2 evidence from overlapping run not acceptable
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 10:20:49 +00:00
24579383f4 fix(canon): mirror-sync detects upstream default branch (master vs main)
All checks were successful
continuous-integration/drone/push Build is passing
Adversary-flagged: drone/gitea mirror-sync hit rc=128 ('couldn't find remote ref main') —
coopcloud/coop-cloud/{drone,gitea} use `master`, not `main`. The script hardcoded
`git fetch upstream main` → sync skipped (non-fatal) so the mirror wasn't reconciled (the trigger
still used correct upstream tags from the local abra-fetch clone, so the version tested was right;
only the mirror push was missed). Now resolves the upstream HEAD symref and fetches that branch,
force-pushing it to the mirror's `main`. Consumes BUILDER-INBOX.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-17 09:37:24 +00:00
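The default-branch detection described in this commit can be sketched as below. This is a minimal illustration and not the project's script: it assumes the symref is read from the output of `git ls-remote --symref <remote> HEAD`, and the function name `parse_default_branch` is hypothetical.

```python
from typing import Optional


def parse_default_branch(ls_remote_output: str) -> Optional[str]:
    """Return the branch that HEAD points at, or None if no symref line is present.

    Expects the text printed by `git ls-remote --symref <remote> HEAD`, whose
    first line has the form "ref: refs/heads/<branch>", a tab, then "HEAD".
    """
    prefix = "ref: refs/heads/"
    for line in ls_remote_output.splitlines():
        if line.startswith(prefix):
            target, _, name = line.partition("\t")
            if name.strip() == "HEAD":
                return target[len(prefix):]
    return None


# Hypothetical sample output for a remote whose default branch is `master`.
sample = "ref: refs/heads/master\tHEAD\n0123456789abcdef0123456789abcdef01234567\tHEAD\n"
print(parse_default_branch(sample))  # master
```

The resolved name would then replace the hardcoded `main` in the fetch, while the push target on the mirror stays `main`.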
d9987a0fbf inbox(canon): heads-up to Builder before M2 claim — (1) drone mirror-sync rc=128 swallowed (clarify §2.C); (2) determinism run-twice-skip-all vs red/promote-failed recipes (reconcile in claim evidence)
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 09:35:35 +00:00
4cf1b32f4c chore(canon): consume BUILDER-INBOX (promote failing ~4/5 + misleading PASS label — diagnosing)
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 08:41:28 +00:00
ba28a8897a inbox(canon): heads-up — sweep logs PASS(promoted) but 4/5 promotes FAILED (only cryptpad wrote a canonical); label derives from rc not record; determinism M2.3 at risk
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 08:40:16 +00:00
0f2f57b5ca chore(canon): consume BUILDER-INBOX (discourse wedge heads-up; will time out → RED → sweep continues)
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 08:17:27 +00:00
7ca77f95ca inbox(canon): heads-up — M2.2 sweep stuck on discourse ~51m (abra deploy hung, 0 containers, ~08:24Z timeout); canonical count 2
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 08:15:59 +00:00
a3d115d6e3 diagnose(regall): A-regall-2 root cause — recipe bug in 3.0.1+v2.0.0, NOT prevb
All checks were successful
continuous-integration/drone/push Build is passing
backupbot.backup.path: "/postgres.dump.gz" places dump in container writable
layer (not a volume), so restic never captures it. Restore post-hook fails
with "No such file or directory". PR#3 (3.1.0+v2.0.0) fixes this with
backupbot.backup.volumes.db-data.path. Baseline run 658 tested PR#3 (working
mechanism), not 3.0.1+v2.0.0 (broken). Re-opened PR#3 + !testme triggered
(comment 14651) to demonstrate backup_restore=pass. BUILDER-INBOX consumed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-17 02:58:06 +00:00
3edd0713d2 review(regall): A-regall-2 CONFIRMED — plausible backup_restore=fail 2/2 (genuine regression)
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone Build is passing
Runs 750 and 754 both fail: ci_marker absent after restore.
No-op upgrade (3.0.1+v2.0.0→3.0.1+v2.0.0) via UPGRADE_BASE_VERSION path is prevb-specific.
Baseline run 658 had genuine git-ref upgrade and passed L5.

Builder-INBOX written. M1 blocked pending plausible fix.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-17 02:34:04 +00:00
7c6134a773 fix(regall): correct mailu baseline upgrade=pass (A-regall-1); consume Adversary inbox; batch 2 in flight
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 02:05:42 +00:00
4ad3c9d907 review(regall): BP-1 baseline verified (A-regall-1: mailu upgrade=pass not skip); BP-2 upgrade-base=main-tip confirmed; batch-1 all L5
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 02:04:48 +00:00
e1b32ea650 fix(prevb): prune orphan services on upgrade redeploy (head's dropped services); re-add EXPECTED_NA-other-rung test; consume Adversary inbox
All checks were successful
continuous-integration/drone/push Build is passing
docker stack deploy doesn't prune services the head compose dropped (discourse PR#4 drops sidekiq),
leaving them orphaned on the base image. perform_upgrade now reconciles the live stack to the head
compose service set (lifecycle.prune_orphan_services). Makes the deployed stack faithfully reflect
the head — no test weakened. No-op when service sets match / compose unresolvable.
2026-06-17 00:29:00 +00:00
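The reconciliation in this commit amounts to a set difference between the services running in the live stack and the services the head compose defines. The sketch below is an assumption-laden illustration, not `lifecycle.prune_orphan_services` itself: the function name `orphan_services` is hypothetical, and treating an empty head set as "compose unresolvable" is a simplification.

```python
from typing import Iterable, List


def orphan_services(live: Iterable[str], head: Iterable[str]) -> List[str]:
    """Services running in the stack that the head compose no longer defines."""
    head_set = set(head)
    if not head_set:
        # Assumption: an empty head set means the compose could not be resolved,
        # so do nothing rather than prune every service.
        return []
    return sorted(set(live) - head_set)


# Example modelled on the commit message: the head compose drops `sidekiq`.
print(orphan_services(["app", "db", "sidekiq"], ["app", "db"]))  # ['sidekiq']
```

When the two service sets match, the result is empty and nothing is pruned, which matches the no-op behaviour the message describes.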
7f3e7c26f6 recon(prevb): M1 code pre-review (sound; 63 prevb unit tests pass cold) + builder heads-up (pre-existing red test)
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-17 00:27:06 +00:00
d832b353e4 fix(gtea): UPGRADE_SECRET_PREP hook — pre-insert lfs_jwt_secret with correct 43-char format
Some checks failed
continuous-integration/drone/push Build is failing
Blocker 4 fix: abra `secret generate --all` uses .env.sample for length hints; the
lfs-plain-gitea PR has SECRET_LFS_JWT_SECRET_VERSION=v1 COMMENTED OUT, so abra produces
a wrong-length secret. gitea requires exactly 43 chars (32 bytes base64 URL-safe); wrong
length → gitea fatals trying to save the JWT secret to the read-only Docker Config
app.ini → health check fails → swarm rolls back.

Fix: new UPGRADE_SECRET_PREP hook (meta.py) called before `abra secret generate --all`
in the upgrade path. abra's `--all` is idempotent (skips existing secrets), so the
correctly pre-inserted secret survives. gitea's recipe_meta.py implements the hook using
`docker secret create` directly to guarantee correct format regardless of .env.sample.

Also consumes machine-docs/BUILDER-INBOX.md (Adversary Blocker 4 digest).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-15 21:46:28 +00:00
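The 43-character requirement follows from the encoding: 32 bytes in base64 give 44 characters including one `=` of padding, so the unpadded URL-safe form is 43 characters. A minimal sketch of generating such a value, assuming the padding is stripped (the function name `make_lfs_jwt_secret` is hypothetical and this is not the hook's actual code):

```python
import base64
import secrets


def make_lfs_jwt_secret() -> str:
    """32 random bytes, URL-safe base64, '=' padding stripped (43 characters)."""
    raw = secrets.token_bytes(32)
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


secret = make_lfs_jwt_secret()
print(len(secret))  # 43
```

A value produced this way could be what the hook passes to `docker secret create` before `abra secret generate --all` runs.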
1efab2e1e6 review(gtea): M2 re-verify — #684 PASS, #685 FAIL (LFS upgrade rollback blocker)
Some checks failed
continuous-integration/drone/push Build is failing
Build #684 (RECIPE=gitea REF=main PR=0): PASS level=5 — all tiers pass, LFS correctly
SKIP on main, HC1 SHA match (e6a1cc79=e6a1cc79). M2 main-branch DoD MET.

Build #685 (RECIPE=gitea PR=1 REF=357926f26e69): FAIL level=1 — new critical blocker:
upgrade chaos redeploy to PR head with compose.lfs.yml fails with rollback_completed.
Root cause: lfs_jwt_secret generated by abra --all with wrong length/format because
.env.sample in PR #1 has `SECRET_LFS_JWT_SECRET_VERSION=v1 # length=43` COMMENTED OUT.
Gitea starts but fails health check on bad JWT secret → Docker swarm rolls back.

Also filed: cc-ci self-test lint failures (9 ruff format violations in gtea files),
drone dep path not re-verified via live CI since a121d2c.

M2 still NOT claimable — Builder must fix lfs_jwt_secret generation and re-trigger #685.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-15 21:30:42 +00:00
a121d2c069 fix(gtea): fix M2 blockers — LFS upgrade and REF=main HC1
Some checks failed
continuous-integration/drone/push Build is failing
continuous-integration/drone Build is failing
Blocker 1 (LFS roundtrip fails on PR #1):
- Add UPGRADE_EXTRA_ENV to gitea recipe_meta.py — after PR-head checkout
  (compose.lfs.yml now in ABRA_DIR), add compose.lfs.yml to COMPOSE_FILE
  and set SECRET_LFS_JWT_SECRET_VERSION=v1 so the upgrade chaos redeploy
  actually runs with LFS enabled. Without this, the base install checks out
  the 3.5.x tag (compose.lfs.yml removed), EXTRA_ENV sees no LFS, and the
  upgrade chaos redeploy inherits the no-LFS .env — so the LFS test runs
  (compose.lfs.yml is restored by recipe_checkout_ref) but LFS is off.
- Add abra.secret_generate(domain) in generic.perform_upgrade when
  upgrade_env is non-empty — generates lfs_jwt_secret before chaos redeploy.

Blocker 2 (REF=main upgrade fails HC1):
- Always use recipe_head_commit (git rev-parse HEAD) for head_ref instead
  of using ref directly. When ref="main" (a branch name), the HC1 commit
  check "head_ref.startswith(chaos_commit)" always fails since "main" ≠ SHA.
  recipe_head_commit returns the actual SHA after the fetch/checkout.

Side-fix (stale creds — build #675):
- ops.py pre_install: delete the per-domain creds file before calling
  _ensure_admin. A fresh install wipes gitea's DB; any creds file from a
  prior run on the same domain is stale and causes 401s in all API calls.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-15 21:01:21 +00:00
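The Blocker 2 failure can be reproduced in isolation: a prefix check against a commit SHA never succeeds when the left-hand side is a branch name. The sketch below assumes the check has the shape quoted in the message. The SHA values are hypothetical, and `resolve_head_sha` is an illustrative stand-in for `recipe_head_commit`, not its real implementation.

```python
import subprocess


def hc1_matches(head_ref: str, chaos_commit: str) -> bool:
    """Prefix check quoted in the commit message."""
    return head_ref.startswith(chaos_commit)


def resolve_head_sha(repo_dir: str) -> str:
    """Return the full SHA of HEAD in repo_dir via `git rev-parse HEAD`."""
    result = subprocess.run(
        ["git", "-C", repo_dir, "rev-parse", "HEAD"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


short_sha = "0123abcd"              # hypothetical abbreviated commit
full_sha = short_sha + "0" * 32     # hypothetical 40-character SHA

print(hc1_matches("main", short_sha))    # False: a branch name is not a SHA
print(hc1_matches(full_sha, short_sha))  # True: resolved SHA starts with the prefix
```

Resolving the ref to a SHA after the fetch/checkout, as the fix does, makes the comparison meaningful for both branch names and explicit commit refs.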
05bf5d5264 review(gtea): file M2 blockers to Builder-INBOX — LFS deploy + upgrade-REF=main
Some checks failed
continuous-integration/drone/push Build is failing
Two critical issues prevent M2: (1) lfs_jwt_secret not generated via disk .env → LFS disabled in
container; (2) upgrade tier fails when REF=main. Details + fix hints in BUILDER-INBOX.md.
2026-06-15 20:53:34 +00:00
446bafe408 inbox(gtea): consume BUILDER-INBOX (Adversary pre-M1 findings addressed)
Some checks failed
continuous-integration/drone/push Build is failing
Both issues fixed in 893a7b0:
- Issue 1 (git-lfs missing): added to nix/hosts/cc-ci/configuration.nix systemPackages
- Issue 2 (double /api/v1): fixed path in test_lfs_roundtrip.py restart poll

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-15 20:01:50 +00:00
4a4b75661e inbox(gtea): heads-up to Builder — git-lfs absent on cc-ci (M2 blocker) + double /api/v1 bug in LFS test
Some checks failed
continuous-integration/drone/push Build is failing
2026-06-15 19:58:17 +00:00
d12d8a12ca inbox(poe2e): consume BUILDER-INBOX; take JOURNAL ownership (baseline preserved); set up STATUS/BACKLOG; heads-up to Adversary
Some checks failed
continuous-integration/drone/push Build is failing
2026-06-13 19:30:10 +00:00
62efd76bc1 chore(poe2e): init Adversary phase files — D5 baseline snapshot, awaiting Builder
Some checks failed
continuous-integration/drone/push Build is failing
2026-06-13 19:27:09 +00:00
b97d1e5345 inbox: remove orphan pxgate cold-boot note (phase already DONE; loops stopped) — evidence in orchestrator JOURNAL
Some checks failed
continuous-integration/drone/push Build is failing
2026-06-13 13:52:55 +00:00
f09b7bf21f inbox(pxgate): cold-boot proof PASSED — deploy-proxy active 11s before dashboard on real reboot
Some checks failed
continuous-integration/drone/push Build is failing
2026-06-13 13:52:13 +00:00
162f731e91 status(pxgate): ## DONE — M1+M2 PASS, cycle broken, cold-boot sim confirms no deadlock
Some checks failed
continuous-integration/drone/push Build is failing
M2 verified: nixos-rebuild @13:43Z deployed /api/version probe; deploy-proxy
active(exited) in 279ms (nixos-rebuild) and 17ms (cold-boot sim) — no alert, no
deadlock. All 9 services 1/1. Running server unaffected. Adversary PASS @13:44Z.
BUILDER-INBOX consumed.
2026-06-13 13:47:42 +00:00
927cbfa747 inbox(pxgate): orchestrator completed M2 nixos-rebuild — deploy-proxy on /api/version, cycle broken
Some checks failed
continuous-integration/drone/push Build is failing
2026-06-13 13:45:39 +00:00
0e9fd388d2 claim(pxgate-M1): change traefik health probe to /api/version (A1 cycle fix)
Some checks failed
continuous-integration/drone/push Build is failing
Break the deploy-proxy ↔ dashboard health-gate circular dependency (Adversary A1, pvfix):

- runner/warm_reconcile.py: remove health_domain override (was ci.commoninternet.net,
  the dashboard). Change health_path from / to /api/version. The probe now uses
  traefik.ci.commoninternet.net/api/version — traefik's own API, no backend/dashboard dep.
- nix/modules/proxy.nix: update comment to reflect new health probe.
- machine-docs/DECISIONS.md: pxgate fix logged (supersedes pvfix manual workaround).
- machine-docs/DEFERRED.md: 2026-06-13 circular-dependency entry closed.
- Consumed BUILDER-INBOX.md (Adversary orientation msg).

Controlled reproduction (dashboard swarm scaled to 0):
  OLD probe (ci.commoninternet.net): HTTP 404  ← gate would loop → timeout
  NEW probe (traefik.../api/version): HTTP 200  ← passes immediately
Stale false-alarm alert 20260613T054428Z-traefik-unhealthy-on-latest.json cleared on host.
No After=deploy-proxy consumers changed (ordering preserved).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-13 12:46:34 +00:00
c798292598 chore(pxgate): BUILDER-INBOX — orientation done, live bug proven
Some checks failed
continuous-integration/drone/push Build is failing
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-13 12:43:32 +00:00
f73bcf225e inbox(cf55): consume adversary launcher mismatch note
Some checks failed
continuous-integration/drone/push Build is failing
2026-06-13 04:13:36 +00:00
d1fc6b9747 review(cf55): record launcher mismatch blocker
Some checks failed
continuous-integration/drone/push Build is failing
2026-06-13 04:12:38 +00:00
87928a9096 status(cfold): seed phase state and consume inbox
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-12 15:57:50 +00:00
8fba68e27c review(cfold): record cold pre-claim audit
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-12 15:57:02 +00:00
87566b1c95 review(cfold): note missing phase status file
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-12 15:55:55 +00:00
7e7e84df34 fix(drone): ADV-drone-01 — no-follow redirect pattern in SCM test
Some checks failed
continuous-integration/drone/push Build is failing
test_scm_configured.py was following ALL redirects via urlopen; gitea redirects
unauthenticated users from /login/oauth/authorize → /user/login, so the path
assertion always failed even for a correctly-wired drone.

Fix: _CaptureOneRedirect urllib handler stops after drone's first 303 and reads
the Location header directly, before gitea's own redirect chain runs.

- Consume BUILDER-INBOX.md (ADV-drone-01 finding delivered and addressed)
- Close ADV-drone-01 in BACKLOG-drone.md
- Update test_gitea_dep.py terminology: "location_url" not "final_url"
- All 10 unit tests pass

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-11 21:48:36 +00:00
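The no-follow pattern in this commit can be sketched with the standard library alone. This is an illustration under stated assumptions, not the project's `_CaptureOneRedirect` handler: the names `NoFollowRedirect` and `first_redirect` are hypothetical. It relies on the documented behaviour that when `redirect_request` returns `None`, urllib does not follow the redirect and the 3xx response surfaces as an `HTTPError`.

```python
import urllib.error
import urllib.request
from typing import Optional, Tuple


class NoFollowRedirect(urllib.request.HTTPRedirectHandler):
    """Refuse to follow redirects so the first 3xx response surfaces as HTTPError."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def first_redirect(url: str, timeout: float = 10.0) -> Tuple[int, Optional[str]]:
    """Return (status, Location header) of the first response without following it."""
    opener = urllib.request.build_opener(NoFollowRedirect)
    try:
        with opener.open(url, timeout=timeout) as resp:
            return resp.status, resp.headers.get("Location")
    except urllib.error.HTTPError as err:
        # The unfollowed 3xx arrives here; its headers still carry Location.
        return err.code, err.headers.get("Location")
```

A test can then assert on the path of the returned `Location` value directly, before any downstream redirect to a login page takes place.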
d20bffd597 review(drone): BUILDER-INBOX — ADV-drone-01 critical, fix before M1 claim
Some checks failed
continuous-integration/drone/push Build is failing
2026-06-11 21:43:40 +00:00
89dec5188f inbox(rcust): consumed 01:12Z be2026a-cleared note; bluesky-pds filed in DEFERRED.md as non-rcust upstream image breakage (justified M2 exclusion, A/B-proven harness-neutral)
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-11 01:00:32 +00:00
24a203a098 review(rcust): be2026a fix-forward CLEARED (all 3 conditions met, independently verified) + ACCEPT L5≡L4+OIDC-pass equivalence — lasuite-* L5 baselines stale (c51cd84 4-rung predates rcust, git-proven), rcust innocent, OIDC coverage preserved. Consumed 01:10Z inbox. M2 still open: bluesky upstream-breakage note, drone-path runs, zero-leak, my sample re-check
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-11 00:59:29 +00:00
914c1663b5 inbox(rcust): consumed 00:31Z conditional APPROVE — merging be2026a, post-merge lasuite-drive re-run queued behind discourse A/B pair
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-11 00:33:07 +00:00
a531746e53 review(rcust): APPROVE fix-forward be2026a (services_converged completed-one-shot rule) — cold-verified diff+7 tests+199 unit+lint on fresh checkout, no false-green path (HTTP floor + minio custom test independent); conditional on post-merge lasuite-drive L5 + merged-diff==branch-diff + discourse PR=2 A/B cold re-check. Consumed 00:40Z inbox
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-11 00:31:54 +00:00
1ec0e772e8 inbox(rcust): consumed 23:53Z asks — lasuite-drive proof RUNNING, discourse same-ref 2x2 queued (new-main PR=2 + old-main PR=2 @7ae7b0f); m2b-discourse HC1 facts pinned (re-checkout persisted, eb96de94=base tag, sidekiq line benign); bluesky-pds = upstream image breakage (MODULE_NOT_FOUND x3, harness-neutral)
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-11 00:06:13 +00:00
40b59b356b review(rcust): M2 proof-run cold analysis — 3/6 (immich/mattermost/plausible) reproduce baseline L4 at baseline ref on merged main (restructure innocent); discourse L4->L1 upgrade-HC1 at baseline ref UNexplained (A/B was at wrong ref) + lasuite-drive needs fresh L5 post-fix-forward; M2 OPEN
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-10 23:54:36 +00:00
efd7efc32b inbox(rcust): consumed 20:53Z approval — fix-forward pushed as 57c66ad; proof re-run at baseline REF queued behind tests 2+3
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-10 20:53:52 +00:00
57c66add51 review(rcust): APPROVE lasuite-drive pre_install fix-forward (scoped to line-54 bucket-poll raise→best-effort; verified old=best-effort, custom MinIO test is real gate, no coverage loss); conditioned on L5 re-run + my diff re-verify. Auditing other shell->python hook ports for same drift
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-10 20:52:53 +00:00
b9abf48116 inbox(rcust): consumed 20:33Z ACK — ref-mismatch independently confirmed; tests 2+3 concurred; proceeding
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-10 20:34:36 +00:00
4cb1f57e2c inbox(rcust): consumed Builder 20:35Z ref-mismatch heads-up + ACK — independently confirmed sweep ran default-branch heads (7d53d4ec/da159375) != baseline PR refs; concur tests 2+3 separate harness×content; will run own cold A/B at claim
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-10 20:33:56 +00:00
41033b4500 inbox(rcust): consumed 20:15Z follow-up — restore cluster confirmed pre-existing, VETO threat withdrawn; proceeding to satisfy the 4 M2 PASS conditions (re-runs at baseline, canary+zero-leak, log sample, !testme x2)
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-10 20:19:12 +00:00
a7a558ada3 note(rcust): M2 follow-up — confirmed restore cluster is the PRE-EXISTING truncated-dump race (documented in discourse BACKUP_VERIFY docstring on pre-merge 49fb818); VETO-threat withdrawn; stated M2 PASS conditions (re-runs at baseline + spot-checks)
All checks were successful
continuous-integration/drone/push Build is passing
2026-06-10 20:18:26 +00:00