memory: move agent memory into repo (memory/), note in AGENTS.md

Persistent agent memories now live in memory/ in this repo; the Claude
auto-memory path is symlinked here so future memories land in the repo
and get committed like any other change.
autonomic-bot
2026-06-09 19:25:20 +00:00
parent 330378d30d
commit 542ed0afe3
11 changed files with 194 additions and 0 deletions

@ -85,6 +85,15 @@ cc-ci VM"). The orchestrator is the human's steering wheel; the loops are the en
Never commit secret values. `.testenv`, `*.tfstate`, `*.key`/`*.pem`, and the loop runtime/clone
dirs are gitignored. Reference secret *locations*, never their contents (`plan.md` §9).
## Agent memory lives in `memory/` (in this repo)
The orchestrator's persistent agent memory is the **`memory/`** directory of this repo — one file
per fact with frontmatter, indexed by `memory/MEMORY.md`. The Claude auto-memory path
(`~/.claude/projects/-srv-cc-ci-orch/memory`) is a **symlink** to it, so memories written the normal
way land in the repo automatically. **Future memories must also go there**: after writing or
updating a memory file (and its `MEMORY.md` index line), commit it here and push, like any other
intentional repo change. Never put secret values in a memory file (see Hard rule).
## Commit discipline
When the orchestrator, Builder, or assistant makes intentional repository changes here, commit them

memory/MEMORY.md
@ -0,0 +1,11 @@
# Memory index
- [Orchestrator host: Hetzner](orchestrator-host-hetzner.md) — runs on Hetzner cpx22; rebuild cmd, loops-service bounce, git-identity gotcha
- [Push commits to remote](push-commits-to-remote.md) — push to git.autonomic.zone right after every commit in this repo
- [Regression canary cadence](regression-canary-cadence.md) — server E2E canaries run on polish/review/release, not every commit
- [Recipe-mirrors public / org blocker](recipe-mirrors-public-org-blocker.md) — mirrors public but recipe-maintainers ORG is private → live PR-STATUS column dark until operator flips org public
- [abra chaos-deploy checkout gotcha](abra-chaos-deploy-checkout-gotcha.md) — `abra app new` moves recipe checkout to release tag; checkout PR branch after, or chaos deploys wrong tree
- [Shared recipe-checkout race](shared-recipe-checkout-race.md) — never git-checkout ~/.abra/recipes/<recipe> on cc-ci while its CI build runs; harness deploys from that tree
- [immich pgvecto.rs DROP DATABASE panic](immich-pgvectors-drop-database-panic.md) — DROP DATABASE crashes immich's postgres image; use pg_dump --clean --if-exists + search_path rewrite
- [Drone sqlite log extraction](drone-sqlite-log-extraction.md) — copy /data/database.sqlite from drone container, query builds→stages→steps→logs for full step output
- [plausible upgrade-base trap](plausible-upgrade-base-trap.md) — CI REDs from published 3.0.0 base (no x86_64 arch → 404 → silent exit 1), not the PR; needs UPGRADE_BASE_VERSION=3.0.1+v2.0.0 in cc-ci tests

@ -0,0 +1,22 @@
---
name: abra-chaos-deploy-checkout-gotcha
description: "abra app new moves the recipe checkout to the release tag — checkout the PR branch AFTER app new, or chaos deploys the wrong tree"
metadata:
node_type: memory
type: project
originSessionId: fc17c9c2-ab6e-4c11-856e-a6a6e160a0ec
---
On cc-ci, `abra app new <recipe>` checks out the latest *published release tag* in
`~/.abra/recipes/<recipe>`, silently discarding whatever commit you had checked out. A
subsequent `abra app deploy --chaos` then deploys that tag's tree, not your WIP.
**Why:** abra pins app creation to the recipe's released version and moves the recipe
checkout to do it; `--chaos` only means "deploy the working tree as-is at deploy time".
**How to apply:** in the step-2b direct-deploy loop, order matters: `abra app new` first,
*then* `git checkout <PR-branch>` in the recipe dir, then `abra app deploy --chaos`.
Verify with the deploy overview (config versions / images) that the intended tree went out.
Also: plausible's `.env.sample` ships `DISABLE_AUTH/DISABLE_REGISTRATION=replace-me`, which
crash-loops the app (`binary_to_existing_atom("replace-me")`) — set them to true/false in
any dev env. See [[regression-canary-cadence]] for related CI cadence.
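
The ordering above can be sketched as a small helper that only builds the command list and executes nothing. This is an illustrative sketch, not part of the harness: `chaos_deploy_steps`, its parameters, and the positional `domain` argument to `abra app deploy` are assumptions added here.

```python
import os


def chaos_deploy_steps(recipe, domain, pr_branch, recipes_root="~/.abra/recipes"):
    """Return the three commands in the order the note requires.

    Hypothetical helper: nothing is executed, the caller decides how to run them.
    """
    recipe_dir = os.path.join(os.path.expanduser(recipes_root), recipe)
    return [
        # 1. App creation moves the recipe checkout to the latest release tag.
        ["abra", "app", "new", recipe],
        # 2. Only now restore the PR branch in the recipe checkout.
        ["git", "-C", recipe_dir, "checkout", pr_branch],
        # 3. --chaos deploys whatever tree is checked out at this moment.
        #    Passing the app domain here is an assumption, not from the note.
        ["abra", "app", "deploy", domain, "--chaos"],
    ]
```

Running the steps from one list keeps the `git checkout` from drifting ahead of `abra app new`, which is the failure the note describes.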

@ -0,0 +1,14 @@
---
name: drone-sqlite-log-extraction
description: How to read full drone CI step logs on cc-ci — copy /data/database.sqlite from the drone container and query it
metadata:
node_type: memory
type: reference
originSessionId: 85355980-5e4f-4f90-b1ca-d0e4fe82f04b
---
Drone on cc-ci has no on-disk logs and no API token handy. To get full step logs:
1. `ssh cc-ci 'docker cp $(docker ps -qf name=drone):/data/database.sqlite /tmp/drone.sqlite'` then scp to orchestrator (no python3 on cc-ci PATH).
2. Query with python3 sqlite3: `builds` (build_number → build_id) → `stages` (stage_build_id) → `steps` (step_stage_id) → `logs` where log_id = step_id; `log_data` is a JSON array of `{pos,out,time}` lines.
**Why:** this is how the real root cause of immich CI builds 229/230 ("bash: /pg_backup.sh: No such file or directory" in the backup hook) was found after results.json/junit gave only the assertion failure. Related: [[shared-recipe-checkout-race]]
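
A minimal sketch of the query in step 2, assuming the table and column names listed above. `stage_id` and `step_name` are not named in the note and are assumptions about the Drone schema, and `step_logs` is a hypothetical helper name. If the database holds several repos, `build_number` alone may match more than one build, so an extra repo filter could be needed.

```python
import json
import sqlite3


def step_logs(conn, build_number):
    """Yield (step_name, full_text) for every step of one build."""
    rows = conn.execute(
        """
        SELECT steps.step_name, logs.log_data
        FROM builds
        JOIN stages ON stages.stage_build_id = builds.build_id
        JOIN steps  ON steps.step_stage_id = stages.stage_id
        JOIN logs   ON logs.log_id = steps.step_id
        WHERE builds.build_number = ?
        ORDER BY stages.stage_id, steps.step_id
        """,
        (build_number,),
    )
    for name, raw in rows:
        # log_data is a JSON array of {pos, out, time} objects.
        lines = sorted(json.loads(raw), key=lambda line: line["pos"])
        yield name, "".join(line["out"] for line in lines)
```

Usage on the orchestrator, after copying the file over: `for name, text in step_logs(sqlite3.connect("/tmp/drone.sqlite"), 229): print(name, text, sep="\n")`.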

@ -0,0 +1,14 @@
---
name: immich-pgvectors-drop-database-panic
description: "Never DROP DATABASE on immich's postgres image — pgvecto.rs worker PANICs and crashes postgres; use pg_dump --clean --if-exists instead"
metadata:
node_type: memory
type: project
originSessionId: 85355980-5e4f-4f90-b1ca-d0e4fe82f04b
---
On immich's DB image (ghcr.io/immich-app/postgres:14-vectorchord0.4.3-pgvectors0.2.0), `DROP DATABASE` destabilises the legacy pgvecto.rs (`vectors`) background worker: it loops on "IPC connection is closed unexpected" until `PANIC: ERRORDATA_STACK_SIZE exceeded` → postgres aborts (signal 6) → the app never reconverges. Per-table `DROP TABLE` is safe; only `DROP DATABASE` triggers it.
**Why:** confirmed live in dev-immich and in CI build 225 DB-service logs during the immich backup/restore fix (PR #2, June 2026).
**How to apply:** for a true point-in-time restore without dropping the DB, back up with `pg_dump --clean --if-exists` (per-object DROP+recreate). On restore, rewrite pg_dump's `set_config('search_path', '', false)` to `set_config('search_path', 'public, pg_catalog', true)`; VectorChord types are unresolvable otherwise (this is the same rewrite as docs.immich.app/administration/backup-and-restore). See the recipe's pg_backup.sh. Related: [[shared-recipe-checkout-race]], [[drone-sqlite-log-extraction]]
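
The search_path rewrite can be sketched as a plain string substitution over the dump text. This is an illustration only: the recipe's pg_backup.sh may do it differently, `rewrite_search_path` is a hypothetical name, and the exact statement text (including the `SELECT pg_catalog.` prefix) is assumed to match what pg_dump writes.

```python
# Statement pg_dump writes near the top of a plain-text dump (assumed exact form).
DUMPED = "SELECT pg_catalog.set_config('search_path', '', false);"
# Replacement that lets VectorChord types resolve during restore.
RESTORABLE = "SELECT pg_catalog.set_config('search_path', 'public, pg_catalog', true);"


def rewrite_search_path(dump_sql):
    """Return the dump text with the empty search_path statement replaced."""
    return dump_sql.replace(DUMPED, RESTORABLE)
```

A dump without that statement passes through unchanged, so applying the rewrite twice is harmless.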

@ -0,0 +1,26 @@
---
name: orchestrator-host-hetzner
description: The cc-ci orchestrator runs on a Hetzner cpx22; key host facts + the git-identity gotcha
metadata:
node_type: memory
type: project
originSessionId: cd772f12-1978-47c3-894b-0ebbe0d2987f
---
The cc-ci orchestrator (loops + watchdog + this session) runs on a **Hetzner cpx22** as of
2026-05-31, replacing the Incus VM (100.116.55.106).
- Hetzner server **134487234**, public **168.119.126.100**, tailnet **cc-ci-orchestrator-1** @
**100.84.190.30**. Flake host **cc-ci-orchestrator-hetzner**.
- Rebuild: `sudo nixos-rebuild switch --flake .#cc-ci-orchestrator-hetzner` from `/srv/cc-ci-orch`
(`/srv/cc-ci` is a symlink to it). The Bash tool runs as user **loops** (uid 1000, passwordless
sudo) — plain `nixos-rebuild switch` fails on the profile symlink; use `sudo`.
- Reboot-resilience: `cc-ci-loops.service` is **enabled** (wantedBy multi-user.target); ExecStartPre
`reboot-log.sh` auto-logs reboots to REBOOTS.md. Its `script` runs `launch.sh start`, which
**stops+restarts the loops** — so any rebuild that (re)starts the unit bounces the loops (they
re-orient from git; harmless but noticeable).
- **Git-identity gotcha:** the box has no global git user.name/email configured, so commits fail with
"Author identity unknown" until an identity is set. Set it per-repo to match prior commits:
`autonomic-bot <autonomic-bot@git.autonomic.zone>`.
Full record: `cc-ci-plan/plan-orchestrator-hetzner-migration.md`.

@ -0,0 +1,27 @@
---
name: plausible-upgrade-base-trap
description: "plausible CI REDs come from the published 3.0.0 base deploy (no x86_64 arch → 404 → silent exit 1), not the PR tree; needs UPGRADE_BASE_VERSION=3.0.1+v2.0.0 in cc-ci tests"
metadata:
node_type: memory
type: project
originSessionId: fc17c9c2-ab6e-4c11-856e-a6a6e160a0ec
---
cc-ci's upgrade tier deploys `recipe_versions[-2]` as the base before upgrading to the PR
head (deploy-once design: the install tier asserts against that base too). For plausible,
tags are `…, 3.0.0+v2.0.0, 3.0.1+v2.0.0` so the default base is **3.0.0+v2.0.0**, whose
entrypoint lacks an x86_64 ARCH mapping → requests `clickhouse-backup-linux-x86_64.tar.gz`
→ HTTP 404 always → `set -e` + silenced wget → container exits 1 with **empty service
logs** → crash-loop → install timeout RED. Nothing in the PR can fix this: the base tag is
immutable history.
**Why:** the PR adds 3.1.0 above the newest published tag — the harness-documented case
where `[-2]` is the wrong base and `[-1]` (3.0.1) is correct.
**How to apply:** the fix is one line in the cc-ci repo (gated by --with-tests / operator):
`tests/plausible/recipe_meta.py: UPGRADE_BASE_VERSION = "3.0.1+v2.0.0"`. The recipe-side
hardening (verified cached binary on the persistent volume, Altinity URL, retries+timeout,
loud hard-fail, depends_on fix) is on PR #3 (commit 9f8bcbc). Diagnosis + ask posted at
https://git.autonomic.zone/recipe-maintainers/plausible/pulls/3#issuecomment-14261.
Before burning a !testme on an upgrade-stage recipe, check what base version the harness
will pick and whether that base can actually converge. See [[abra-chaos-deploy-checkout-gotcha]].
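
The base selection described above can be sketched as follows. This is a hedged reconstruction from this note, not the harness's actual code: `pick_upgrade_base` and its signature are hypothetical, and only the `[-2]` default and the `UPGRADE_BASE_VERSION` override come from the text.

```python
def pick_upgrade_base(recipe_versions, override=None):
    """Pick the version deployed as the base before upgrading to the PR head.

    recipe_versions: published tags, oldest first.
    override: value of UPGRADE_BASE_VERSION from recipe_meta.py, if set.
    """
    if override is not None:
        return override
    if len(recipe_versions) < 2:
        raise ValueError("need at least two published versions for a default base")
    # Default: the second-newest published tag.
    return recipe_versions[-2]
```

With the two plausible tags named above, the default picks `3.0.0+v2.0.0` (the base that cannot converge), while passing `override="3.0.1+v2.0.0"` selects the working base.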

@ -0,0 +1,14 @@
---
name: push-commits-to-remote
description: "Operator wants every commit pushed to git.autonomic.zone right after it's made"
metadata:
node_type: memory
type: feedback
originSessionId: 7b5366a6-263c-421b-be7d-9f888067336b
---
In the cc-ci orchestrator repo (`/srv/cc-ci-orch`), push to `origin` (git.autonomic.zone/recipe-maintainers/cc-ci-orchestrator) immediately after committing — don't leave commits sitting locally waiting to be asked.
**Why:** the operator treats the remote as the source of truth / backup; local-only commits are a loss risk on this autonomous box.
**How to apply:** after any `git commit` here, run `git push origin main` (or the current branch) in the same turn. The remote is already credentialed in the URL. Mind the [[orchestrator-host-hetzner]] git-identity gotcha (commit as `autonomic-bot`). This standing preference replaces the default "commit/push only when asked" for this repo.

@ -0,0 +1,29 @@
---
name: recipe-mirrors-public-org-blocker
description: "Recipe mirrors are public repos but the recipe-maintainers ORG is private-visibility, so anon reads 404; bot can't flip the org"
metadata:
node_type: memory
type: project
originSessionId: f7960036-d990-4a21-a81e-f7c486d97fea
---
As of 2026-06-09 all 21 recipe mirrors under `recipe-maintainers` were flipped `private=false`
(secret-scanned first), to power the Recipe Report's live PR-STATUS column via the tokenless
same-origin proxy `report.ci.commoninternet.net/pr/<recipe>/<n>` (shipped in cc-ci
`nix/modules/reports.nix`). BUT the **org itself is `visibility: private`**, which makes Gitea 404
all its repos for anonymous users — so the live STATUS column shows a muted "?" instead of open/✓.
**Blocker:** `autonomic-bot` cannot flip the org (PATCH `/orgs/recipe-maintainers` → 403 "Must be an
organization owner"; `is_admin=false`; the basic-auth credential lacks `write:organization` scope,
even though the bot is in the Owners team). Confirmed model: `autonomic-cooperative` is a public org
and its repos ARE anonymously visible; `recipe-maintainers` is private and they are not.
**Why:** the whole live-status feature is dark until this is resolved. Private repos stay hidden even
in a public org, so flipping the org public does NOT expose the four locked-private repos (`cc-ci`,
`cc-ci-secrets`, `cc-ci-orchestrator`, `archived-cc-ci-orchestrator`).
**How to apply:** operator (an org owner) must set `recipe-maintainers` org visibility to **public**
in the Gitea UI (Settings → make org public), OR provision a token with `write:organization` scope.
The instant that happens, the proxy returns 200 PR JSON and the column lights up — no redeploy needed.
Verify: `curl https://report.ci.commoninternet.net/pr/cryptpad/5` should return PR JSON, not a 404.
Related: [[push-commits-to-remote]].
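
The verify step could be scripted roughly like this, using only the standard library. It is a sketch under assumptions: the helper names and the "live" / "dark" / "unexpected" labels are invented here, and the only facts taken from the note are the proxy URL shape and the 200-JSON versus 404 distinction.

```python
import json
import urllib.error
import urllib.request

PROXY = "https://report.ci.commoninternet.net/pr/{recipe}/{number}"


def interpret(status, body):
    """Classify a proxy response: 'live', 'dark' (404), or 'unexpected'."""
    if status == 404:
        return "dark"
    if status == 200:
        try:
            payload = json.loads(body)
        except ValueError:
            return "unexpected"
        return "live" if isinstance(payload, dict) else "unexpected"
    return "unexpected"


def check(recipe, number, timeout=10.0):
    """Fetch one PR through the proxy and classify the answer."""
    url = PROXY.format(recipe=recipe, number=number)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return interpret(resp.status, resp.read().decode("utf-8", "replace"))
    except urllib.error.HTTPError as err:
        # urlopen raises for 4xx/5xx; the status code is on the exception.
        return interpret(err.code, "")
```

`check("cryptpad", 5)` mirrors the curl command above.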

@ -0,0 +1,14 @@
---
name: regression-canary-cadence
description: "The cc-ci server regression canaries are expensive — run on polish/review/release, not every commit"
metadata:
node_type: memory
type: feedback
originSessionId: 7b5366a6-263c-421b-be7d-9f888067336b
---
The cc-ci **server regression canaries** (the codified E2E pytest suite — full lifecycle on `custom-html-tiny` + `lasuite-docs` good canaries plus a known-bad false-green-guard fixture; plan: `cc-ci-plan/plan-server-regression-canaries.md`) must **NOT** run on every commit/PR.
**Why:** they're slow and resource-heavy: a full lifecycle on lasuite-docs takes minutes and needs the live server/abra/Swarm. Running them per-commit would be wasteful and would slow the loop.
**How to apply:** run them **deliberately at milestones** — polishing passes, code reviews, and releases of the cc-ci server — before trusting a batch of changes, not per incremental commit. Keep them opt-in behind the `@pytest.mark.canary` marker; if ever wired to `!testme` on the cc-ci repo, gate behind a deliberate trigger (label / `--canary`), never an automatic run on every PR.
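
One way to keep the canaries opt-in is the usual pytest pattern of skipping marked tests unless a flag is passed. This is a sketch of a possible `conftest.py`, not the repo's actual file. The `canary` marker and the `--canary` flag are named above; wiring them together this way is an assumption.

```python
import pytest


def pytest_addoption(parser):
    # Opt-in switch; without it, canary tests are collected but skipped.
    parser.addoption("--canary", action="store_true", default=False,
                     help="run the slow server regression canaries")


def pytest_configure(config):
    # Register the marker so `--strict-markers` does not reject it.
    config.addinivalue_line("markers", "canary: slow E2E regression canary")


def pytest_collection_modifyitems(config, items):
    if config.getoption("--canary"):
        return  # deliberate run: leave canaries enabled
    skip = pytest.mark.skip(reason="canary: pass --canary to run")
    for item in items:
        if "canary" in item.keywords:
            item.add_marker(skip)
```

A plain `pytest` run then skips every `@pytest.mark.canary` test, and `pytest --canary` runs them at a milestone.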

@ -0,0 +1,14 @@
---
name: shared-recipe-checkout-race
description: Never run git checkout on ~/.abra/recipes/<recipe> on cc-ci while a CI build for that recipe is running — the harness chaos-deploys from that same working tree
metadata:
node_type: memory
type: feedback
originSessionId: 85355980-5e4f-4f90-b1ca-d0e4fe82f04b
---
The cc-ci harness (run_recipe_ci.py) deploys the upgrade tier from the SHARED `~/.abra/recipes/<recipe>` working tree on cc-ci via `abra app deploy --chaos`. Dev debugging that switches that checkout (`git checkout -f`, repro scripts) while a CI build runs makes CI deploy the wrong tree.
**Why:** immich builds 229/230 went RED with "bash: /pg_backup.sh: No such file or directory" — the configs stanza wasn't in the tree CI deployed, because concurrent dev repro scripts were flipping the same checkout between base tag and PR head. A faithful manual repro with no concurrent churn mounted the config fine.
**How to apply:** before triggering !testme, park the recipe checkout clean at the PR head and do zero abra/git activity on cc-ci for that recipe until the build reports its verdict. Also remember [[abra-chaos-deploy-checkout-gotcha]] (`abra app new` moves the checkout to the release tag). Related: [[drone-sqlite-log-extraction]], [[immich-pgvectors-drop-database-panic]]