memory: commit session notes (drone P0, weekly-upgrade-queued, mailu/index updates)

Per AGENTS.md 'Agent memory lives in memory/ (in this repo)' — memory notes
must be committed + pushed like any repo change, not left only in the local
~/.claude symlink target.
autonomic-bot
2026-06-11 20:56:24 +00:00
parent e144354668
commit c33b21fe8d
4 changed files with 55 additions and 8 deletions

@@ -8,5 +8,7 @@
- [Shared recipe-checkout race](shared-recipe-checkout-race.md) — never git-checkout ~/.abra/recipes/<recipe> on cc-ci while its CI build runs; harness deploys from that tree
- [immich pgvecto.rs DROP DATABASE panic](immich-pgvectors-drop-database-panic.md) — DROP DATABASE crashes immich's postgres image; use pg_dump --clean --if-exists + search_path rewrite
- [Drone sqlite log extraction](drone-sqlite-log-extraction.md) — copy /data/database.sqlite from drone container, query builds→stages→steps→logs for full step output
- [plausible upgrade-base trap](plausible-upgrade-base-trap.md) — CI REDs from published 3.0.0 base (no x86_64 arch → 404 → silent exit 1), not the PR; needs UPGRADE_BASE_VERSION=3.0.1+v2.0.0 in cc-ci tests
- [plausible upgrade-base trap](plausible-upgrade-base-trap.md) — RESOLVED: PR#3 GREEN L4; lessons: check harness base version pre-!testme; backupbot v2 label syntax; TinyLog not FREEZEable; BEAM exit-0 needs restart_policy any
- [Swarm UpdateStatus convergence gotchas](swarm-updatestatus-convergence-gotchas.md) — N/N is not converged mid stop-first update; paused flag persists forever; only updating/rollback_started are active
- [drone phase P0 host deploy](drone-phase-p0-host-deploy.md) — orchestrator must nixos-rebuild cc-ci host (/etc/timezone, commit 3bde76f) before phase `drone` can run gitea
- [Weekly upgrade queued after phases](weekly-upgrade-queued-after-phases.md) — 06-12 cron skipped; auto-runs /upgrade-all when phase queue (drone) finishes; don't systemctl start the timer

@@ -0,0 +1,18 @@
---
name: drone-phase-p0-host-deploy
description: "Phase `drone` P0 — orchestrator must deploy /etc/timezone host fix (nixos-rebuild on cc-ci) before gitea/drone work can run"
metadata:
node_type: memory
type: project
originSessionId: 85355980-5e4f-4f90-b1ca-d0e4fe82f04b
---
When the cc-ci phase queue reaches phase `drone` (or STATUS-drone.md says BLOCKED on P0),
the ORCHESTRATOR must deploy the committed host fix: cc-ci repo commit `3bde76f` adds
`environment.etc."timezone" = "UTC\n"` to `nix/hosts/cc-ci/configuration.nix`. Deploy =
sync `/root/cc-ci` on the CI host (operator-synced non-git copy, STALE re this commit)
with the current repo state, then `nixos-rebuild switch --flake /root/cc-ci#cc-ci` (ssh
alias `cc-ci`, root). Verify `test -f /etc/timezone` (content `UTC`). Do it at a quiet
CI moment (rebuild may restart services). Loops must NOT do host changes themselves —
plan: cc-ci-plan/plan-phase-drone-enroll.md §0. Queued 2026-06-11; phases before it:
bsky → dstamp → mailu → kuma.
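
The deploy sequence above can be sketched as an ordered command plan. This is a minimal dry-run sketch, not the orchestrator's actual tooling: the helper name `plan_host_deploy`, the `local_repo` parameter, and the use of `rsync` for the sync step are assumptions (the note only says to sync `/root/cc-ci` with the current repo state). The `nixos-rebuild` invocation, the `cc-ci` ssh alias, and the `/etc/timezone` check come from the note.

```python
import shlex
from typing import List


def plan_host_deploy(local_repo: str, host: str = "cc-ci") -> List[List[str]]:
    """Return the P0 deploy steps as argv lists, in execution order.

    Nothing is executed here; the caller decides when (a quiet CI moment)
    and how to run each step.
    """
    src = local_repo.rstrip("/") + "/"
    return [
        # ASSUMPTION: rsync is one way to refresh the operator-synced,
        # non-git copy on the host; the note does not name a sync tool.
        ["rsync", "-a", "--exclude", ".git", src, f"{host}:/root/cc-ci/"],
        # From the note: rebuild the host from the synced flake.
        ["ssh", host, "nixos-rebuild", "switch", "--flake", "/root/cc-ci#cc-ci"],
        # From the note: verify the file exists and inspect its content (UTC).
        ["ssh", host, "test", "-f", "/etc/timezone"],
        ["ssh", host, "cat", "/etc/timezone"],
    ]


if __name__ == "__main__":
    # Hypothetical local checkout path, for illustration only.
    for argv in plan_host_deploy("/path/to/local/cc-ci"):
        print(shlex.join(argv))
```

Keeping the plan as data rather than running it directly makes it easy to review the exact commands before touching the host, which matters because the rebuild may restart services.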

@@ -18,10 +18,12 @@ immutable history.
**Why:** the PR adds 3.1.0 above the newest published tag — the harness-documented case
where `[-2]` is the wrong base and `[-1]` (3.0.1) is correct.
**How to apply:** the fix is one line in the cc-ci repo (gated by --with-tests / operator):
`tests/plausible/recipe_meta.py: UPGRADE_BASE_VERSION = "3.0.1+v2.0.0"`. The recipe-side
hardening (verified cached binary on the persistent volume, Altinity URL, retries+timeout,
loud hard-fail, depends_on fix) is on PR #3 (commit 9f8bcbc). Diagnosis + ask posted at
https://git.autonomic.zone/recipe-maintainers/plausible/pulls/3#issuecomment-14261.
Before burning a !testme on an upgrade-stage recipe, check what base version the harness
will pick and whether that base can actually converge. See [[abra-chaos-deploy-checkout-gotcha]].
**How to apply:** RESOLVED 2026-06-09 — `UPGRADE_BASE_VERSION = "3.0.1+v2.0.0"` is merged to
cc-ci main, and PR #3 went GREEN level 4 (build 247). Full debrief:
/srv/cc-ci/.cc-ci-logs/upgrades/plausible-upgrade-2026-06-09.md. Lasting lessons: before
burning a !testme on an upgrade-stage recipe, check what base version the harness picks
(`recipe_versions[-2]`) and whether that base can converge; backup-bot-two 2.4.0 ignores v1
`backupbot.backup.path` labels (use `backupbot.backup.volumes.<vol>.path`, dump INTO a
volume); clickhouse-backup cannot FREEZE TinyLog tables (ecto's schema_migrations — TSV
roundtrip needed); BEAM apps exit 0 on supervision escalation, so swarm `restart_policy`
must be `any`, not `on-failure`. See [[abra-chaos-deploy-checkout-gotcha]].
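
The base-version check described above can be illustrated with a small sketch. It assumes the harness defaults to `recipe_versions[-2]` and lets `UPGRADE_BASE_VERSION` override that default, as the note describes; the function name `pick_upgrade_base` and the example version list are hypothetical and not taken from the cc-ci code.

```python
from typing import Optional, Sequence


def pick_upgrade_base(
    recipe_versions: Sequence[str],
    override: Optional[str] = None,
) -> str:
    """Pick the version an upgrade test deploys before upgrading.

    ASSUMPTION: mirrors the behaviour described in the note, where the
    default is the second-newest entry and an explicit override wins.
    """
    if override is not None:
        return override
    if len(recipe_versions) < 2:
        raise ValueError("need at least two versions to pick a default base")
    return recipe_versions[-2]


if __name__ == "__main__":
    # Illustrative list only: the full tag of the 3.0.0 release is not
    # given in the note, so it is abbreviated here.
    published = ["3.0.0", "3.0.1+v2.0.0"]

    # Default pick: the base that could not converge in the case above.
    print(pick_upgrade_base(published))

    # With the override from tests/plausible/recipe_meta.py.
    print(pick_upgrade_base(published, override="3.0.1+v2.0.0"))
```

Running the default pick by hand before a `!testme` shows which base would be deployed, so a base that cannot converge is caught before a CI run is spent on it.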

@@ -0,0 +1,25 @@
---
name: weekly-upgrade-queued-after-phases
description: Weekly /upgrade-all cron skipped for 2026-06-12; queued to auto-run when the current phase queue finishes (drone)
metadata:
node_type: memory
type: project
originSessionId: 85355980-5e4f-4f90-b1ca-d0e4fe82f04b
---
Operator (2026-06-11) cancelled tonight's weekly `/upgrade-all` cron run and queued it to
run once after the current phase queue (…mailu→drone) completes instead.
- `cc-ci-upgrade-all.timer` was STOPPED (couldn't `disable` — /etc/systemd is read-only).
Its persistent stamp was forwarded to `2026-06-12 03:00 UTC` so a reboot/nixos-rebuild
tonight schedules the NEXT run for 06-19, not a catch-up of tonight's slot.
- **GOTCHA:** `systemctl start cc-ci-upgrade-all.timer` fires the service IMMEDIATELY
(Persistent=true). Do NOT `start` it to re-arm — let a host reboot/`nixos-rebuild`
reactivate it (the [[drone-phase-p0-host-deploy]] rebuild will); the forward stamp
prevents a catch-up fire.
- Post-phase auto-run is wired in launch.py (commit 3fa3178): the watchdog launches
`launch-upgrader.py start` when the LAST phase reaches `## DONE`, gated by the one-shot
flag `/srv/cc-ci/.cc-ci-logs/.run-upgrade-on-complete` (set 2026-06-11, consumed on fire).
So when phase `drone` finishes, `/upgrade-all` starts automatically (upgrader on sonnet).
Once this fires (or if the plan changes), this memory is stale — delete it.
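
The one-shot gating described above can be sketched as follows. This is an illustrative sketch, not the code in launch.py: the function names are hypothetical, and reading the `## DONE` marker from a status text is an assumption about how the watchdog detects completion. The flag path and the consume-on-fire behaviour come from the note.

```python
from pathlib import Path

DONE_MARKER = "## DONE"
# Flag path from the note; it is set once and consumed when the run fires.
FLAG_PATH = Path("/srv/cc-ci/.cc-ci-logs/.run-upgrade-on-complete")


def phase_is_done(status_text: str) -> bool:
    """True if the status text contains a line that is exactly the marker."""
    return any(line.strip() == DONE_MARKER for line in status_text.splitlines())


def consume_flag(flag: Path) -> bool:
    """Delete the flag and report whether it existed.

    Deleting first and treating 'missing' as 'already consumed' keeps the
    gate one-shot even if the watchdog polls again after firing.
    """
    try:
        flag.unlink()
    except FileNotFoundError:
        return False
    return True


def should_start_upgrader(last_phase_status: str, flag: Path = FLAG_PATH) -> bool:
    """Fire only when the last phase is done AND the flag is still set.

    The flag is consumed only once the phase is done, so polls made while
    the phase is still running leave it in place.
    """
    return phase_is_done(last_phase_status) and consume_flag(flag)
```

A watchdog loop would call `should_start_upgrader` on each poll and launch the upgrader only on the single poll where it returns `True`.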