memory: move agent memory into repo (memory/), note in AGENTS.md
Persistent agent memories now live in memory/ in this repo; the Claude auto-memory path is symlinked here so future memories land in the repo and get committed like any other change.
@@ -85,6 +85,15 @@ cc-ci VM"). The orchestrator is the human's steering wheel; the loops are the en
Never commit secret values. `.testenv`, `*.tfstate`, `*.key`/`*.pem`, and the loop runtime/clone
dirs are gitignored. Reference secret *locations*, never their contents (`plan.md` §9).

## Agent memory lives in `memory/` (in this repo)

The orchestrator's persistent agent memory is the **`memory/`** directory of this repo — one file
per fact with frontmatter, indexed by `memory/MEMORY.md`. The Claude auto-memory path
(`~/.claude/projects/-srv-cc-ci-orch/memory`) is a **symlink** to it, so memories written the normal
way land in the repo automatically. **Future memories must also go there**: after writing or
updating a memory file (and its `MEMORY.md` index line), commit it here and push, like any other
intentional repo change. Never put secret values in a memory file (see Hard rule).

## Commit discipline

When the orchestrator, Builder, or assistant makes intentional repository changes here, commit them
11
memory/MEMORY.md
Normal file
@@ -0,0 +1,11 @@
# Memory index

- [Orchestrator host: Hetzner](orchestrator-host-hetzner.md) — runs on Hetzner cpx22; rebuild cmd, loops-service bounce, git-identity gotcha
- [Push commits to remote](push-commits-to-remote.md) — push to git.autonomic.zone right after every commit in this repo
- [Regression canary cadence](regression-canary-cadence.md) — server E2E canaries run on polish/review/release, not every commit
- [Recipe-mirrors public / org blocker](recipe-mirrors-public-org-blocker.md) — mirrors public but recipe-maintainers ORG is private → live PR-STATUS column dark until operator flips org public
- [abra chaos-deploy checkout gotcha](abra-chaos-deploy-checkout-gotcha.md) — `abra app new` moves recipe checkout to release tag; checkout PR branch after, or chaos deploys wrong tree
- [Shared recipe-checkout race](shared-recipe-checkout-race.md) — never git-checkout ~/.abra/recipes/<recipe> on cc-ci while its CI build runs; harness deploys from that tree
- [immich pgvecto.rs DROP DATABASE panic](immich-pgvectors-drop-database-panic.md) — DROP DATABASE crashes immich's postgres image; use pg_dump --clean --if-exists + search_path rewrite
- [Drone sqlite log extraction](drone-sqlite-log-extraction.md) — copy /data/database.sqlite from drone container, query builds→stages→steps→logs for full step output
- [plausible upgrade-base trap](plausible-upgrade-base-trap.md) — CI REDs from published 3.0.0 base (no x86_64 arch → 404 → silent exit 1), not the PR; needs UPGRADE_BASE_VERSION=3.0.1+v2.0.0 in cc-ci tests
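
Since every memory file is expected to have a matching line in `MEMORY.md`, a small consistency check can catch files that were written but never indexed. The following is only a sketch: `unindexed_memories` is a hypothetical helper, not something in this repo, and it assumes index entries are markdown links of the form `[title](file.md)` as in the index above.

```python
import re
from pathlib import Path

# Captures the "file.md" target of a markdown link such as "[title](file.md)".
LINK_TARGET = re.compile(r"\]\(([^)]+\.md)\)")


def unindexed_memories(memory_dir: Path) -> list[str]:
    """Return memory files that MEMORY.md does not link to (hypothetical helper)."""
    index_text = (memory_dir / "MEMORY.md").read_text(encoding="utf-8")
    indexed = set(LINK_TARGET.findall(index_text))
    # Every *.md file except the index itself counts as a memory file.
    on_disk = {p.name for p in memory_dir.glob("*.md") if p.name != "MEMORY.md"}
    return sorted(on_disk - indexed)
```

Running `unindexed_memories(Path("memory"))` before committing would list any file that still needs an index line.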
22
memory/abra-chaos-deploy-checkout-gotcha.md
Normal file
@@ -0,0 +1,22 @@
---
name: abra-chaos-deploy-checkout-gotcha
description: "abra app new moves the recipe checkout to the release tag — checkout the PR branch AFTER app new, or chaos deploys the wrong tree"
metadata:
  node_type: memory
  type: project
  originSessionId: fc17c9c2-ab6e-4c11-856e-a6a6e160a0ec
---

On cc-ci, `abra app new <recipe>` checks out the latest *published release tag* in
`~/.abra/recipes/<recipe>`, silently discarding whatever commit you had checked out. A
subsequent `abra app deploy --chaos` then deploys that tag's tree, not your WIP.

**Why:** abra pins app creation to the recipe's released version and moves the recipe
checkout to do it; `--chaos` only means "deploy the working tree as-is at deploy time".

**How to apply:** in the step-2b direct-deploy loop, order matters: `abra app new` first,
*then* `git checkout <PR-branch>` in the recipe dir, then `abra app deploy --chaos`.
Verify with the deploy overview (config versions / images) that the intended tree went out.
Also: plausible's `.env.sample` ships `DISABLE_AUTH/DISABLE_REGISTRATION=replace-me`, which
crash-loops the app (`binary_to_existing_atom("replace-me")`) — set them to true/false in
any dev env. See [[regression-canary-cadence]] for related CI cadence.
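
The ordering described above can be written down as data so that a wrapper script cannot get it wrong. This is a minimal sketch, not part of the harness: `chaos_deploy_steps` is a hypothetical helper, and passing the app domain as a positional argument to `abra app deploy` is an assumption (the note writes the commands without it). The function only builds the argument lists and runs nothing.

```python
from pathlib import Path


def chaos_deploy_steps(recipe: str, pr_branch: str, app_domain: str) -> list[list[str]]:
    """Return the three commands in the order the note requires (hypothetical helper)."""
    recipe_dir = Path.home() / ".abra" / "recipes" / recipe
    return [
        # 1. Create the app first: this moves the recipe checkout to the release tag.
        ["abra", "app", "new", recipe],
        # 2. Only now switch the shared recipe checkout to the PR branch.
        ["git", "-C", str(recipe_dir), "checkout", pr_branch],
        # 3. Deploy the working tree as it is at this moment.
        #    Assumption: the app is addressed by a positional domain argument.
        ["abra", "app", "deploy", app_domain, "--chaos"],
    ]
```

Each list could be handed to `subprocess.run(step, check=True)` in sequence; keeping the steps as plain data makes the order easy to review before anything touches the checkout.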
14
memory/drone-sqlite-log-extraction.md
Normal file
@@ -0,0 +1,14 @@
---
name: drone-sqlite-log-extraction
description: How to read full drone CI step logs on cc-ci — copy /data/database.sqlite from the drone container and query it
metadata:
  node_type: memory
  type: reference
  originSessionId: 85355980-5e4f-4f90-b1ca-d0e4fe82f04b
---

Drone on cc-ci has no on-disk logs and no API token handy. To get full step logs:
1. `ssh cc-ci 'docker cp $(docker ps -qf name=drone):/data/database.sqlite /tmp/drone.sqlite'` then scp to orchestrator (no python3 on cc-ci PATH).
2. Query with python3 sqlite3: `builds` (build_number → build_id) → `stages` (stage_build_id) → `steps` (step_stage_id) → `logs` where log_id = step_id; `log_data` is a JSON array of `{pos,out,time}` lines.

**Why:** this is how the real root cause of immich CI builds 229/230 ("bash: /pg_backup.sh: No such file or directory" in the backup hook) was found after results.json/junit gave only the assertion failure. Related: [[shared-recipe-checkout-race]]
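
The query chain in step 2 can be sketched as a single join. This is a sketch under stated assumptions rather than a verified script: the columns `stage_id` and `step_name` are not named in the note and are assumed here, the function name `step_logs` is hypothetical, and the query does not filter by repository, so it assumes the build number is unambiguous in the copied database.

```python
import json
import sqlite3

# Join chain from the note: builds -> stages -> steps -> logs (log_id = step_id).
# Assumed columns not named in the note: stages.stage_id, steps.step_name.
QUERY = """
SELECT steps.step_name, logs.log_data
FROM builds
JOIN stages ON stages.stage_build_id = builds.build_id
JOIN steps  ON steps.step_stage_id   = stages.stage_id
JOIN logs   ON logs.log_id           = steps.step_id
WHERE builds.build_number = ?
ORDER BY stages.stage_id, steps.step_id
"""


def step_logs(db_path: str, build_number: int) -> list[tuple[str, str]]:
    """Return (step name, full log text) pairs for one build."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(QUERY, (build_number,)).fetchall()
    finally:
        conn.close()
    result = []
    for step_name, log_data in rows:
        # log_data is a JSON array of {pos, out, time}; order by pos, join the text.
        lines = sorted(json.loads(log_data), key=lambda line: line["pos"])
        result.append((step_name, "".join(line["out"] for line in lines)))
    return result
```

For example, `step_logs("/tmp/drone.sqlite", 229)` would return the step outputs for build 229, assuming the database was copied to that path on the orchestrator.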
14
memory/immich-pgvectors-drop-database-panic.md
Normal file
@@ -0,0 +1,14 @@
---
name: immich-pgvectors-drop-database-panic
description: "Never DROP DATABASE on immich's postgres image — pgvecto.rs worker PANICs and crashes postgres; use pg_dump --clean --if-exists instead"
metadata:
  node_type: memory
  type: project
  originSessionId: 85355980-5e4f-4f90-b1ca-d0e4fe82f04b
---

On immich's DB image (ghcr.io/immich-app/postgres:14-vectorchord0.4.3-pgvectors0.2.0), `DROP DATABASE` destabilises the legacy pgvecto.rs (`vectors`) background worker: it loops on "IPC connection is closed unexpected" until `PANIC: ERRORDATA_STACK_SIZE exceeded` → postgres aborts (signal 6) → the app never reconverges. Per-table `DROP TABLE` is safe; only `DROP DATABASE` triggers it.

**Why:** confirmed live in dev-immich and in CI build 225 DB-service logs during the immich backup/restore fix (PR #2, June 2026).

**How to apply:** for a true point-in-time restore without dropping the DB, back up with `pg_dump --clean --if-exists` (per-object DROP+recreate) and on restore rewrite pg_dump's `set_config('search_path', '', false)` to `'public, pg_catalog', true` (VectorChord types unresolvable otherwise — same rewrite as docs.immich.app/administration/backup-and-restore). See the recipe's pg_backup.sh. Related: [[shared-recipe-checkout-race]], [[drone-sqlite-log-extraction]]
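
The restore-time rewrite described above is a plain string substitution on the dump. A minimal sketch, assuming the dump is small enough to hold in memory as text; `rewrite_search_path` is a hypothetical name, and the recipe's own pg_backup.sh may do this differently.

```python
# The call pg_dump emits, and the replacement the note prescribes.
OLD_CALL = "set_config('search_path', '', false)"
NEW_CALL = "set_config('search_path', 'public, pg_catalog', true)"


def rewrite_search_path(dump_sql: str) -> str:
    """Replace pg_dump's empty search_path setting so extension types resolve on restore."""
    return dump_sql.replace(OLD_CALL, NEW_CALL)
```

Any other statement in the dump passes through unchanged, so the function is safe to apply to the whole file before piping it into `psql`.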
26
memory/orchestrator-host-hetzner.md
Normal file
@@ -0,0 +1,26 @@
---
name: orchestrator-host-hetzner
description: The cc-ci orchestrator runs on a Hetzner cpx22; key host facts + the git-identity gotcha
metadata:
  node_type: memory
  type: project
  originSessionId: cd772f12-1978-47c3-894b-0ebbe0d2987f
---

The cc-ci orchestrator (loops + watchdog + this session) runs on a **Hetzner cpx22** as of
2026-05-31, replacing the Incus VM (100.116.55.106).

- Hetzner server **134487234**, public **168.119.126.100**, tailnet **cc-ci-orchestrator-1** @
  **100.84.190.30**. Flake host **cc-ci-orchestrator-hetzner**.
- Rebuild: `sudo nixos-rebuild switch --flake .#cc-ci-orchestrator-hetzner` from `/srv/cc-ci-orch`
  (`/srv/cc-ci` is a symlink to it). The Bash tool runs as user **loops** (uid 1000, passwordless
  sudo) — plain `nixos-rebuild switch` fails on the profile symlink; use `sudo`.
- Reboot-resilience: `cc-ci-loops.service` is **enabled** (wantedBy multi-user.target); ExecStartPre
  `reboot-log.sh` auto-logs reboots to REBOOTS.md. Its `script` runs `launch.sh start`, which
  **stops+restarts the loops** — so any rebuild that (re)starts the unit bounces the loops (they
  re-orient from git; harmless but noticeable).
- **Git-identity gotcha:** the box had no git user.name/email configured; commits fail with "Author
  identity unknown". Set per-repo to match prior commits: `autonomic-bot
  <autonomic-bot@git.autonomic.zone>`.

Full record: `cc-ci-plan/plan-orchestrator-hetzner-migration.md`.
27
memory/plausible-upgrade-base-trap.md
Normal file
@@ -0,0 +1,27 @@
---
name: plausible-upgrade-base-trap
description: "plausible CI REDs come from the published 3.0.0 base deploy (no x86_64 arch → 404 → silent exit 1), not the PR tree; needs UPGRADE_BASE_VERSION=3.0.1+v2.0.0 in cc-ci tests"
metadata:
  node_type: memory
  type: project
  originSessionId: fc17c9c2-ab6e-4c11-856e-a6a6e160a0ec
---

cc-ci's upgrade tier deploys `recipe_versions[-2]` as the base before upgrading to the PR
head (deploy-once design: the install tier asserts against that base too). For plausible,
tags are `…, 3.0.0+v2.0.0, 3.0.1+v2.0.0` so the default base is **3.0.0+v2.0.0**, whose
entrypoint lacks an x86_64 ARCH mapping → requests `clickhouse-backup-linux-x86_64.tar.gz`
→ HTTP 404 always → `set -e` + silenced wget → container exits 1 with **empty service
logs** → crash-loop → install timeout RED. Nothing in the PR can fix this: the base tag is
immutable history.

**Why:** the PR adds 3.1.0 above the newest published tag — the harness-documented case
where `[-2]` is the wrong base and `[-1]` (3.0.1) is correct.

**How to apply:** the fix is one line in the cc-ci repo (gated by --with-tests / operator):
`tests/plausible/recipe_meta.py: UPGRADE_BASE_VERSION = "3.0.1+v2.0.0"`. The recipe-side
hardening (verified cached binary on the persistent volume, Altinity URL, retries+timeout,
loud hard-fail, depends_on fix) is on PR #3 (commit 9f8bcbc). Diagnosis + ask posted at
https://git.autonomic.zone/recipe-maintainers/plausible/pulls/3#issuecomment-14261.
Before burning a !testme on an upgrade-stage recipe, check what base version the harness
will pick and whether that base can actually converge. See [[abra-chaos-deploy-checkout-gotcha]].
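
To make the base-selection rule concrete, here is a sketch of how a default of `recipe_versions[-2]` with an optional per-recipe override could behave. It illustrates the rule as described in this note and is not the harness's actual code: `pick_upgrade_base` and its `override` parameter are hypothetical, and only the two tags named above are used in the example.

```python
from typing import Optional, Sequence


def pick_upgrade_base(recipe_versions: Sequence[str], override: Optional[str] = None) -> str:
    """Pick the version to deploy before upgrading to the PR head (hypothetical helper)."""
    if override is not None:
        # An explicit UPGRADE_BASE_VERSION-style override wins, but must be a real tag.
        if override not in recipe_versions:
            raise ValueError(f"override {override!r} is not a published version")
        return override
    if len(recipe_versions) < 2:
        raise ValueError("need at least two published versions for the default base")
    # Default rule from the note: second-newest published tag.
    return recipe_versions[-2]


published = ["3.0.0+v2.0.0", "3.0.1+v2.0.0"]
default_base = pick_upgrade_base(published)
fixed_base = pick_upgrade_base(published, override="3.0.1+v2.0.0")
```

With these inputs `default_base` is the broken `3.0.0+v2.0.0` tag and `fixed_base` is `3.0.1+v2.0.0`, which is the pre-check the note recommends doing by hand before triggering `!testme`.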
14
memory/push-commits-to-remote.md
Normal file
@@ -0,0 +1,14 @@
---
name: push-commits-to-remote
description: "Operator wants every commit pushed to git.autonomic.zone right after it's made"
metadata:
  node_type: memory
  type: feedback
  originSessionId: 7b5366a6-263c-421b-be7d-9f888067336b
---

In the cc-ci orchestrator repo (`/srv/cc-ci-orch`), push to `origin` (git.autonomic.zone/recipe-maintainers/cc-ci-orchestrator) immediately after committing — don't leave commits sitting locally waiting to be asked.

**Why:** the operator treats the remote as the source of truth / backup; local-only commits are a loss risk on this autonomous box.

**How to apply:** after any `git commit` here, run `git push origin main` (or the current branch) in the same turn. The remote is already credentialed in the URL. Mind the [[orchestrator-host-hetzner]] git-identity gotcha (commit as `autonomic-bot`). This standing preference replaces the default "commit/push only when asked" for this repo.
29
memory/recipe-mirrors-public-org-blocker.md
Normal file
@@ -0,0 +1,29 @@
---
name: recipe-mirrors-public-org-blocker
description: "Recipe mirrors are public repos but the recipe-maintainers ORG is private-visibility, so anon reads 404; bot can't flip the org"
metadata:
  node_type: memory
  type: project
  originSessionId: f7960036-d990-4a21-a81e-f7c486d97fea
---

As of 2026-06-09 all 21 recipe mirrors under `recipe-maintainers` were flipped `private=false`
(secret-scanned first), to power the Recipe Report's live PR-STATUS column via the tokenless
same-origin proxy `report.ci.commoninternet.net/pr/<recipe>/<n>` (shipped in cc-ci
`nix/modules/reports.nix`). BUT the **org itself is `visibility: private`**, which makes Gitea 404
all its repos for anonymous users — so the live STATUS column shows a muted "?" instead of open/✓.

**Blocker:** `autonomic-bot` cannot flip the org (PATCH `/orgs/recipe-maintainers` → 403 "Must be an
organization owner"; `is_admin=false`; the basic-auth credential lacks `write:organization` scope,
even though the bot is in the Owners team). Confirmed model: `autonomic-cooperative` is a public org
and its repos ARE anonymously visible; `recipe-maintainers` is private and they are not.

**Why:** the whole live-status feature is dark until this is resolved. Private repos stay hidden even
in a public org, so flipping the org public does NOT expose the four locked-private repos (`cc-ci`,
`cc-ci-secrets`, `cc-ci-orchestrator`, `archived-cc-ci-orchestrator`).

**How to apply:** operator (an org owner) must set `recipe-maintainers` org visibility to **public**
in the Gitea UI (Settings → make org public), OR provision a token with `write:organization` scope.
The instant that happens, the proxy returns 200 PR JSON and the column lights up — no redeploy needed.
Verify: `curl https://report.ci.commoninternet.net/pr/cryptpad/5` should return PR JSON, not a 404.
Related: [[push-commits-to-remote]].
14
memory/regression-canary-cadence.md
Normal file
@@ -0,0 +1,14 @@
---
name: regression-canary-cadence
description: "The cc-ci server regression canaries are expensive — run on polish/review/release, not every commit"
metadata:
  node_type: memory
  type: feedback
  originSessionId: 7b5366a6-263c-421b-be7d-9f888067336b
---

The cc-ci **server regression canaries** (the codified E2E pytest suite — full lifecycle on `custom-html-tiny` + `lasuite-docs` good canaries plus a known-bad false-green-guard fixture; plan: `cc-ci-plan/plan-server-regression-canaries.md`) must **NOT** run on every commit/PR.

**Why:** they're slow and resource-heavy — full lifecycle on lasuite-docs is minutes and needs the live server/abra/Swarm. Running them per-commit would be wasteful and slow the loop.

**How to apply:** run them **deliberately at milestones** — polishing passes, code reviews, and releases of the cc-ci server — before trusting a batch of changes, not per incremental commit. Keep them opt-in behind the `@pytest.mark.canary` marker; if ever wired to `!testme` on the cc-ci repo, gate behind a deliberate trigger (label / `--canary`), never an automatic run on every PR.
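
One common way to keep a marker opt-in is a `conftest.py` hook that skips marked tests unless a flag is passed. The sketch below follows the standard pytest pattern for this and is an assumption about how the gating could be wired, not a copy of the cc-ci suite's actual `conftest.py`; the `--canary` flag name is taken from the note above.

```python
import pytest


def pytest_addoption(parser):
    # Register the opt-in flag; canaries stay off unless it is given.
    parser.addoption(
        "--canary",
        action="store_true",
        default=False,
        help="run the expensive server regression canaries",
    )


def pytest_configure(config):
    # Declare the marker so pytest does not warn about an unknown mark.
    config.addinivalue_line("markers", "canary: expensive server regression canary (opt-in)")


def pytest_collection_modifyitems(config, items):
    if config.getoption("--canary"):
        return  # Flag given: run everything, canaries included.
    skip_canary = pytest.mark.skip(reason="canary tests only run with --canary")
    for item in items:
        if "canary" in item.keywords:
            item.add_marker(skip_canary)
```

With this in place a plain `pytest` run reports canary tests as skipped, and `pytest --canary` runs them, which fits the milestone-only cadence described above.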
14
memory/shared-recipe-checkout-race.md
Normal file
@@ -0,0 +1,14 @@
---
name: shared-recipe-checkout-race
description: Never run git checkout on ~/.abra/recipes/<recipe> on cc-ci while a CI build for that recipe is running — the harness chaos-deploys from that same working tree
metadata:
  node_type: memory
  type: feedback
  originSessionId: 85355980-5e4f-4f90-b1ca-d0e4fe82f04b
---

The cc-ci harness (run_recipe_ci.py) deploys the upgrade tier from the SHARED `~/.abra/recipes/<recipe>` working tree on cc-ci via `abra app deploy --chaos`. Dev debugging that switches that checkout (`git checkout -f`, repro scripts) while a CI build runs makes CI deploy the wrong tree.

**Why:** immich builds 229/230 went RED with "bash: /pg_backup.sh: No such file or directory" — the configs stanza wasn't in the tree CI deployed, because concurrent dev repro scripts were flipping the same checkout between base tag and PR head. A faithful manual repro with no concurrent churn mounted the config fine.

**How to apply:** before triggering !testme, park the recipe checkout clean at the PR head and do zero abra/git activity on cc-ci for that recipe until the build reports its verdict. Also remember [[abra-chaos-deploy-checkout-gotcha]] (`abra app new` moves the checkout to the release tag). Related: [[drone-sqlite-log-extraction]], [[immich-pgvectors-drop-database-panic]]