# JOURNAL — phase mailu
Design rationale, dead-ends, investigation notes. Not for Adversary pre-verdict reading.
---
## 2026-06-11 ADV-mailu-01 fix — build #477 LEVEL 5 re-verified
### ADV-mailu-01 resolution confirmed
Build #477 result confirms both volumes are now specifically tested:
- `test_backup_captures_mail_message` PASS: `ccci-backup-probe` message in INBOX at backup time
- `test_restore_returns_mail_message` PASS: message survives Maildir wipe + restore from snapshot
- Both maildir-specific tests ran in the `backup` and `restore` stages respectively
- Full build level 5, clean_teardown=true, no_secret_leak=true
The `sendmail` delivery path (smtp container → postfix → dovecot deliver) worked correctly
for injecting the test message. The `doveadm search` poll with 60s timeout was sufficient.
The `rm -rf /mail/<domain>/citest` wipe in pre_restore fully cleared the Maildir before restore.
Re-claiming M1 with build #477 as the evidence build.
---
## 2026-06-11 Bootstrap + data-layout research
### mailu volume layout (from compose.yml analysis)
Services and their durable volumes:
- `admin` service: mounts `mailu` vol → `/data` (sqlite DB: users, mailboxes, domains, settings)
- `imap` (dovecot) service: mounts `mail` vol → `/mail` (Maildir message storage)
- `admin` service also mounts `dkim` vol → `/dkim` (DKIM private keys)
- `antispam` service: mounts `rspamd` vol → `/var/lib/rspamd` (antispam training data — ephemeral)
- `db` (redis) service: mounts `redis` vol → `/data` (session cache — ephemeral)
- `webmail` service: mounts `webmail` vol → `/data` (roundcube prefs — ephemeral)
- `smtp` service: mounts `mailqueue` vol → `/queue` (postfix queue — ephemeral)
- `app` (nginx) + `certdumper`: mount `certs` vol (TLS cert dumps — regenerable)
### Backup decision: admin/data + imap/mail
For genuine backup/restore coverage:
- **`admin:/data`** = sqlite DB → primary source of truth for mailboxes/users. If this is lost,
all accounts are gone. Must backup.
- **`imap:/mail`** = Maildir storage → the actual messages. Loss = all mail gone. Must backup.
- `dkim:/dkim` = DKIM keys. In production, loss = need re-keying + DNS update. BUT: for CI testing,
we don't have DNS-side DKIM records anyway, so DKIM regeneration is harmless. NOT labeled for
CI simplicity (can add in a follow-up if operator wants DKIM key recovery tested).
- Other volumes: ephemeral / regenerable. Not labeled.
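A minimal sketch of the decision above as data, so the labeled set can be derived rather than remembered. The `VOLUMES` table and `backup_targets` helper are hypothetical names, not part of cc-ci. The `certs` volume is left out because its container path is not recorded in these notes.

```python
# Volume inventory transcribed from the layout notes above.
# The "backup" flag mirrors the decision in this journal entry.
VOLUMES = [
    {"service": "admin", "volume": "mailu", "path": "/data", "backup": True},
    {"service": "imap", "volume": "mail", "path": "/mail", "backup": True},
    {"service": "admin", "volume": "dkim", "path": "/dkim", "backup": False},
    {"service": "antispam", "volume": "rspamd", "path": "/var/lib/rspamd", "backup": False},
    {"service": "db", "volume": "redis", "path": "/data", "backup": False},
    {"service": "webmail", "volume": "webmail", "path": "/data", "backup": False},
    {"service": "smtp", "volume": "mailqueue", "path": "/queue", "backup": False},
]


def backup_targets(volumes):
    """Return {service: container_path} for every volume flagged for backup."""
    return {v["service"]: v["path"] for v in volumes if v["backup"]}


print(backup_targets(VOLUMES))
```

Flipping the `dkim` row to `True` would not work with this dict shape, since `admin` would then own two paths. That is one more reason DKIM recovery is deferred to a follow-up.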
### Backupbot v2 syntax decision
From studying n8n and discourse examples:
- v2 uses `backupbot.backup: "true"` + `backupbot.backup.path: "<container-path>"`
- v1 used `backupbot.volumes.<name>=true/false` (immich pattern — do NOT use for new work)
- mailu has no Postgres (uses SQLite), so no pg_dump hook needed
- For `admin`: `backupbot.backup.path: "/data"` (whole sqlite DB dir)
- For `imap`: `backupbot.backup.path: "/mail"` (whole Maildir)
### mailu compose.yml structure note
mailu uses `deploy.labels` (list form with `- "key=value"` strings) for the `app` (nginx) service's traefik labels. The backupbot labels need to go on the services that own the data, `admin` and `imap`.
Initial assumption: `admin` and `imap` already carry a top-level `labels:` key (not `deploy.labels`). Checking compose.yml showed this is wrong: neither service has any `labels:` at all.
In Docker Swarm, backupbot reads service labels, which are the deploy-time labels set under `deploy.labels`, not the container labels set by a top-level `labels:` key. So `admin` and `imap` each need a new `deploy:` → `labels:` section, using the same list form already present on `app`.
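A rough sketch of the resulting shape, working on the compose file as a plain dict. `add_backupbot_labels` is a hypothetical helper. Writing the labels in list form is an assumption made for consistency with the `app` service, and the input dict is a stripped-down stand-in for the real compose.yml.

```python
import copy


def add_backupbot_labels(compose, service, path):
    """Return a copy of `compose` with backupbot v2 labels under deploy.labels."""
    result = copy.deepcopy(compose)
    # Create deploy: and labels: if the service does not have them yet.
    deploy = result["services"][service].setdefault("deploy", {})
    labels = deploy.setdefault("labels", [])
    # List form ("key=value" strings), as used for traefik on the app service.
    labels.append("backupbot.backup=true")
    labels.append(f"backupbot.backup.path={path}")
    return result


compose = {"services": {"admin": {}, "imap": {}}}
compose = add_backupbot_labels(compose, "admin", "/data")
compose = add_backupbot_labels(compose, "imap", "/mail")
print(compose["services"]["imap"]["deploy"]["labels"])
```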
### Version bump
Current version: `3.0.1+2024.06.52` (on the `app` service, `deploy.labels` → `coop-cloud.${STACK_NAME}.version`)
New version: `3.1.0+2024.06.52` (minor version bump for backupbot feature addition)
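The bump can be sketched as below. `bump_minor` is a hypothetical helper, and the reading of the string as `<major>.<minor>.<patch>+<upstream>` is inferred from the two versions above rather than from a spec.

```python
def bump_minor(version):
    """Bump the minor part of '<major>.<minor>.<patch>+<upstream>', reset patch."""
    recipe, sep, upstream = version.partition("+")
    major, minor, _patch = (int(part) for part in recipe.split("."))
    # The upstream suffix after '+' is carried over unchanged.
    return f"{major}.{minor + 1}.0{sep}{upstream}"


print(bump_minor("3.0.1+2024.06.52"))  # 3.1.0+2024.06.52
```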
### CI test design
**ops.py hooks** (consistent with n8n pattern):
- `pre_backup(ctx)`: create a test mailbox `citest@<domain>` via `flask mailu user citest <domain> '<password>'` in the admin container
- `pre_restore(ctx)`: delete the mailbox via `flask mailu user delete citest@<domain>` (or equivalent) to simulate data loss
**test_backup.py**: assert `citest@<domain>` is in `config-export` at backup time
**test_restore.py**: assert `citest@<domain>` is back in `config-export` after restore
The `_mailu.py` helpers already provide:
- `flask_mailu(domain, cmd)` → runs flask mailu CLI in admin container
- `config_export(domain)` → parses config-export JSON
- `user_emails(cfg)` → list of email addresses from config
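A sketch of the assertion both tests share. The `user_emails` below is a stand-in, not the real `_mailu.py` helper. It assumes the export is a dict with a top-level `user` list whose entries have an `email` key, which still needs checking against actual `config-export` output. `assert_seed_user_present` and the sample dicts are illustrative only.

```python
def user_emails(cfg):
    """Stand-in for the _mailu.py helper: list addresses in an export dict.

    ASSUMPTION: entries live under cfg["user"] and carry an "email" key.
    """
    return [user["email"] for user in cfg.get("user", [])]


def assert_seed_user_present(cfg, domain):
    """Fail if the seeded citest mailbox is not in the exported config."""
    email = f"citest@{domain}"
    assert email in user_emails(cfg), f"{email} missing from config-export"


# Illustrative export data, not captured from a real run.
at_backup = {"user": [{"email": "citest@example.test"}]}
after_wipe = {"user": []}

assert_seed_user_present(at_backup, "example.test")
```

The same check run against `after_wipe` should fail, which is the divergence `pre_restore` is meant to create.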
### Delete-user CLI for pre_restore
Need to confirm the delete command. From mailu docs, the admin CLI:
- Create: `flask mailu user <local> <domain> '<password>'`
- Delete: `flask mailu user delete <email>` (where email = `<local>@<domain>`)
The exact delete syntax is still unverified. Will use `flask mailu user delete citest@<domain>` and add error handling.
---
## 2026-06-11 ADV-mailu-01 fix — extend seed to cover /mail Maildir
### Adversary finding (M1 FAIL)
The M1 claim was rejected because ops.py only proved SQLite (`/data`) backup/restore. The `/mail`
Maildir volume was labeled and backed up but never specifically tested for restoration. If backupbot
silently skipped restoring `/mail`, the test would still PASS.
### Fix (cc-ci commit b9352e8)
Extended the seed and its tests in four places:
**ops.py `pre_backup`**: After creating `citest@<domain>`, inject a test message via in-container
`sendmail` (smtp container → postfix → rspamd → dovecot deliver). Subject: `ccci-backup-probe`.
Wait up to 60s for dovecot to deliver (polling `doveadm search`). This is identical to the pattern
proven in `test_mail_flow.py`.
**ops.py `pre_restore`**: Now wipes BOTH:
1. The user from sqlite: `DELETE FROM user WHERE localpart='citest'` via python3 in admin container
2. The user's Maildir: `rm -rf /mail/<domain>/citest` in imap container
**test_backup.py**: Added `test_backup_captures_mail_message` — asserts the message is present
at backup time via `doveadm search` in imap container.
**test_restore.py**: Added `test_restore_returns_mail_message` — asserts the message is back in
INBOX after restore via `doveadm search` in imap container.
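The delivery wait can be sketched as a bounded poll. `wait_for_message` and the injected `run` callable are hypothetical: `run` is assumed to execute the command in the imap container and return its stdout. The query follows doveadm's IMAP-style search syntax; the exact invocation in ops.py is not recorded here.

```python
import time


def wait_for_message(run, user, subject, timeout=60.0, interval=2.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll `doveadm search` until the subject shows up or the timeout passes."""
    cmd = ["doveadm", "search", "-u", user,
           "mailbox", "INBOX", "subject", subject]
    deadline = clock() + timeout
    while True:
        # doveadm search prints one line per match; empty output means no hit.
        if run(cmd).strip():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Demo with a fake runner: two empty polls, then a match.
fake_outputs = iter(["", "", "0123abcd 1\n"])
found = wait_for_message(lambda cmd: next(fake_outputs),
                         "citest@example.test", "ccci-backup-probe",
                         sleep=lambda seconds: None)
print(found)  # True
```

Injecting `clock` and `sleep` keeps the loop testable without waiting out the real 60s.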
### Why rm -rf over doveadm expunge
Used `rm -rf /mail/<domain>/citest/` in pre_restore rather than `doveadm expunge` because:
- `rm -rf` directly wipes the Maildir from disk — observable, immediate, unambiguous
- `doveadm expunge` marks messages for deletion but depends on dovecot's expunge/purge cycle
- The goal is a clear divergence: after pre_restore, the maildir DOES NOT EXIST; after restore, it DOES
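A local analogue of that divergence, using a temp directory instead of the imap container. Everything here is illustrative: `copytree` stands in for backupbot's snapshot and restore, `rmtree` for the `rm -rf`, and the domain and file names are made up.

```python
import shutil
import tempfile
from pathlib import Path


def wipe_and_restore_demo():
    """Return (gone_after_wipe, back_after_restore) for a toy Maildir."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        maildir = root / "mail" / "example.test" / "citest"
        probe = maildir / "cur" / "probe"
        probe.parent.mkdir(parents=True)
        probe.write_text("Subject: ccci-backup-probe\n")

        snapshot = root / "snapshot"
        shutil.copytree(root / "mail", snapshot)  # stands in for the backup

        shutil.rmtree(maildir)                    # stands in for `rm -rf`
        gone = not maildir.exists()

        shutil.rmtree(root / "mail")
        shutil.copytree(snapshot, root / "mail")  # stands in for the restore
        back = probe.exists()
        return gone, back


print(wipe_and_restore_demo())  # (True, True)
```

If the restore step silently skipped the mail tree, `back` would be `False`, which is the failure the new tests are meant to catch.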
### Build #477 in flight to verify