# JOURNAL — phase mailu

Design rationale, dead-ends, investigation notes. Not for Adversary pre-verdict reading.

---

## 2026-06-11 ADV-mailu-01 fix — build #477 LEVEL 5 re-verified

### ADV-mailu-01 resolution confirmed

Build #477 result confirms both volumes are now specifically tested:
- `test_backup_captures_mail_message` PASS: `ccci-backup-probe` message in INBOX at backup time
- `test_restore_returns_mail_message` PASS: message survives Maildir wipe + restore from snapshot
- Both maildir-specific tests ran in the `backup` and `restore` stages respectively
- Full build level 5, clean_teardown=true, no_secret_leak=true

The `sendmail` delivery path (smtp container → postfix → dovecot deliver) worked correctly
for injecting the test message. The `doveadm search` poll with 60s timeout was sufficient.
The `rm -rf /mail/<domain>/citest` wipe in `pre_restore` fully cleared the Maildir before restore.

Re-claiming M1 with build #477 as the evidence build.

---

## 2026-06-11 Bootstrap + data-layout research

### mailu volume layout (from compose.yml analysis)

Services and the volumes they mount:
- `admin` service: mounts `mailu` vol → `/data` (sqlite DB: users, mailboxes, domains, settings)
- `imap` (dovecot) service: mounts `mail` vol → `/mail` (Maildir message storage)
- `admin` service also mounts `dkim` vol → `/dkim` (DKIM private keys)
- `antispam` service: mounts `rspamd` vol → `/var/lib/rspamd` (antispam training data, ephemeral)
- `db` (redis) service: mounts `redis` vol → `/data` (session cache, ephemeral)
- `webmail` service: mounts `webmail` vol → `/data` (roundcube prefs, ephemeral)
- `smtp` service: mounts `mailqueue` vol → `/queue` (postfix queue, ephemeral)
- `app` (nginx) + `certdumper`: mount `certs` vol (TLS cert dumps, regenerable)

### Backup decision: admin/data + imap/mail

For genuine backup/restore coverage:
- **`admin:/data`** = sqlite DB → primary source of truth for mailboxes/users. If this is lost,
  all accounts are gone. Must back up.
- **`imap:/mail`** = Maildir storage → the actual messages. Loss = all mail gone. Must back up.
- `dkim:/dkim` = DKIM keys. In production, loss = re-keying + DNS update. For CI testing, though,
  we don't have DNS-side DKIM records anyway, so DKIM regeneration is harmless. NOT labeled, for
  CI simplicity (can add in a follow-up if the operator wants DKIM key recovery tested).
- Other volumes: ephemeral / regenerable. Not labeled.

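The decision above can be written down as data, which makes the backup set easy to check mechanically. This is a minimal sketch and not code from the repo: the `Volume` dataclass and `backup_targets` helper are hypothetical names, and the `certs` volume is left out because the journal does not record its mount path.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Volume:
    service: str     # compose service that mounts the volume
    name: str        # named volume
    mount: str       # path inside the container
    backed_up: bool  # whether the service gets backupbot labels
    note: str


# Values copied from the volume layout and backup decision notes.
VOLUMES = [
    Volume("admin", "mailu", "/data", True, "sqlite DB: users, mailboxes, domains"),
    Volume("imap", "mail", "/mail", True, "Maildir message storage"),
    Volume("admin", "dkim", "/dkim", False, "DKIM keys; regeneration is harmless in CI"),
    Volume("antispam", "rspamd", "/var/lib/rspamd", False, "ephemeral"),
    Volume("db", "redis", "/data", False, "ephemeral"),
    Volume("webmail", "webmail", "/data", False, "ephemeral"),
    Volume("smtp", "mailqueue", "/queue", False, "ephemeral"),
]


def backup_targets(volumes):
    """Return sorted 'service:mount' strings for every labeled volume."""
    return sorted(f"{v.service}:{v.mount}" for v in volumes if v.backed_up)


print(backup_targets(VOLUMES))
```

Keeping the reason next to each entry means a later change of mind (for example, labeling `dkim`) is a one-line edit.
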
### Backupbot v2 syntax decision

From studying the n8n and discourse examples:
- v2 uses `backupbot.backup: "true"` + `backupbot.backup.path: "<container-path>"`
- v1 used `backupbot.volumes.<name>=true/false` (immich pattern; do NOT use for new work)
- mailu has no Postgres (uses SQLite), so no pg_dump hook needed
- For `admin`: `backupbot.backup.path: "/data"` (whole sqlite DB dir)
- For `imap`: `backupbot.backup.path: "/mail"` (whole Maildir)

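As a small illustration of the v2 convention noted above, the two labels per service can be generated from the container path. The helper name `backupbot_v2_labels` is hypothetical; only the label keys and the two paths come from these notes.

```python
def backupbot_v2_labels(path):
    """Build the v2 label pair for one service (mapping form)."""
    if not path.startswith("/"):
        raise ValueError(f"expected an absolute container path, got {path!r}")
    return {
        "backupbot.backup": "true",
        "backupbot.backup.path": path,
    }


# The two services chosen in the backup decision.
labels_by_service = {
    "admin": backupbot_v2_labels("/data"),
    "imap": backupbot_v2_labels("/mail"),
}

for service, labels in labels_by_service.items():
    print(service, labels)
```
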
### mailu compose.yml structure note

mailu uses `deploy.labels` (list form with `- "key=value"` strings) for the app service's traefik labels. The backupbot labels need to go on the services that own the data. First assumption:
- `admin` service uses `labels:` directly (not `deploy.labels`), with no traefik label there
- `imap` service similarly uses `labels:` directly

Wait, actually checking the compose.yml: there's no `labels:` on `admin` or `imap` at all.
Only the `app` (nginx) service has `deploy.labels`, for traefik. For backupbot, the labels need to be
on the DEPLOYED service. In Docker Swarm, backupbot uses service labels, which are the deploy-time
labels under `deploy.labels`; top-level `labels` would only label the containers. So we need
`deploy.labels` on admin + imap.

Since the `app` service already uses `deploy.labels` (list form), for admin + imap we need
to add `deploy:` → `labels:` sections.

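The conclusion above (labels must sit under `deploy.labels` on `admin` and `imap`) is easy to check on a parsed compose file. The sketch below assumes the compose file has already been loaded into a plain dict; `labels_as_dict` and `missing_backup_labels` are hypothetical helpers, and the sample dict is made up for illustration.

```python
def labels_as_dict(labels):
    """Normalise compose labels (list of 'key=value' strings or a mapping)."""
    if isinstance(labels, dict):
        return {str(k): str(v) for k, v in labels.items()}
    result = {}
    for item in labels or []:
        key, _, value = item.partition("=")
        result[key] = value
    return result


def missing_backup_labels(compose, services=("admin", "imap")):
    """Return the services that lack backupbot v2 labels under deploy.labels."""
    missing = []
    for name in services:
        service = compose.get("services", {}).get(name, {})
        labels = labels_as_dict(service.get("deploy", {}).get("labels"))
        if labels.get("backupbot.backup") != "true" or "backupbot.backup.path" not in labels:
            missing.append(name)
    return missing


# Made-up sample: admin is labeled correctly, imap only has container labels.
sample = {
    "services": {
        "admin": {"deploy": {"labels": ["backupbot.backup=true", "backupbot.backup.path=/data"]}},
        "imap": {"labels": ["backupbot.backup=true", "backupbot.backup.path=/mail"]},
    }
}
print(missing_backup_labels(sample))
```

The sample reports `imap` as missing because top-level `labels` are deliberately ignored, which mirrors the dead-end recorded in this note.
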
### Version bump

- Current version: `3.0.1+2024.06.52` (on `app` service `deploy.labels` → `coop-cloud.${STACK_NAME}.version`)
- New version: `3.1.0+2024.06.52` (minor version bump for backupbot feature addition)

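For reference, the bump rule applied here (increment the minor part, reset the patch, keep the build suffix after `+`) can be sketched as follows. The function name `bump_minor` is hypothetical and not part of any recipe tooling.

```python
def bump_minor(version):
    """Bump MAJOR.MINOR.PATCH+BUILD to MAJOR.(MINOR+1).0+BUILD."""
    core, sep, build = version.partition("+")
    major, minor, _patch = (int(part) for part in core.split("."))
    return f"{major}.{minor + 1}.0{sep}{build}"


print(bump_minor("3.0.1+2024.06.52"))
```

Only the part before `+` changes; the suffix after it is carried over untouched.
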
### CI test design

**ops.py hooks** (consistent with n8n pattern):
- `pre_backup(ctx)`: create a test mailbox `citest@<domain>` via `flask mailu user citest <domain> '<password>'` in the admin container
- `pre_restore(ctx)`: delete the mailbox via `flask mailu user delete citest@<domain>` (or equivalent) to simulate data loss

**test_backup.py**: assert `citest@<domain>` is in `config-export` at backup time

**test_restore.py**: assert `citest@<domain>` is back in `config-export` after restore

The `_mailu.py` helpers already provide:
- `flask_mailu(domain, cmd)` → runs flask mailu CLI in admin container
- `config_export(domain)` → parses config-export JSON
- `user_emails(cfg)` → list of email addresses from config

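The backup and restore assertions described above reduce to one membership check on the exported config. The sketch below is an illustration only: `user_emails` here is a stand-in for the real `_mailu.py` helper, and the `{"user": [{"email": ...}]}` shape of the export is an assumption that should be checked against actual `config-export` output.

```python
import json


def user_emails(cfg):
    """Stand-in for the `_mailu.py` helper; assumes cfg['user'] is a list of dicts."""
    return [user["email"] for user in cfg.get("user", [])]


def assert_mailbox_present(cfg, email):
    """Shared assertion for test_backup.py and test_restore.py."""
    emails = user_emails(cfg)
    assert email in emails, f"{email} missing from config-export: {emails}"


# Made-up export payload; the real one comes from config_export(domain).
sample_cfg = json.loads('{"user": [{"email": "citest@example.test"}]}')
assert_mailbox_present(sample_cfg, "citest@example.test")
```

Using the same assertion before backup and after restore keeps the two tests symmetric, so a failure points at the restore step and not at a difference in how the check is written.
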
### Delete-user CLI for pre_restore

Need to confirm the delete command. From mailu docs, the admin CLI:
- Create: `flask mailu user <local> <domain> '<password>'`
- Delete: `flask mailu user delete <email>` (where email = `<local>@<domain>`)

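The candidate commands above can be kept as argv builders so the unverified part lives in one place. This is a sketch under the journal's own assumption: the delete form is not confirmed against the Mailu CLI, and both function names are hypothetical.

```python
import shlex


def create_user_argv(local, domain, password):
    """argv for the create command noted above."""
    return ["flask", "mailu", "user", local, domain, password]


def delete_user_argv(local, domain):
    """argv for the candidate delete command. UNVERIFIED syntax, per the note above."""
    return ["flask", "mailu", "user", "delete", f"{local}@{domain}"]


# shlex.join quotes arguments safely if the command has to pass through a shell.
print(shlex.join(create_user_argv("citest", "example.test", "s3cret pw")))
print(shlex.join(delete_user_argv("citest", "example.test")))
```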
The exact syntax is still unverified. Will use `flask mailu user delete citest@<domain>` and add error handling.

---

## 2026-06-11 ADV-mailu-01 fix — extend seed to cover /mail Maildir

### Adversary finding (M1 FAIL)

The M1 claim was rejected because ops.py only proved SQLite (`/data`) backup/restore. The `/mail`
Maildir volume was labeled and backed up but never specifically tested for restoration. If backupbot
silently skipped restoring `/mail`, the test would still PASS.

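The gap the Adversary pointed at can be reproduced in a toy model. Everything below is made up for illustration (the `restore` function and the set-based state are not backupbot behavior); it only shows why a check on `/data` alone cannot detect a skipped `/mail` restore.

```python
def restore(snapshot, volumes_restored):
    """Toy restore: bring back only the listed volumes from the snapshot."""
    return {vol: data for vol, data in snapshot.items() if vol in volumes_restored}


snapshot = {
    "/data": {"citest@example.test"},  # accounts in sqlite
    "/mail": {"ccci-backup-probe"},    # messages in the Maildir
}

# A faulty restore that silently skips /mail.
state = restore(snapshot, volumes_restored={"/data"})

sqlite_only_check = "citest@example.test" in state.get("/data", set())
maildir_check = "ccci-backup-probe" in state.get("/mail", set())

print("sqlite-only test passes:", sqlite_only_check)
print("maildir test passes:", maildir_check)
```

The first check passes and the second fails, which is the case the original seed could not tell apart from a good restore.
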
### Fix (cc-ci commit b9352e8)

Extended the seed across both ops.py hooks and both test files:

**ops.py `pre_backup`**: After creating `citest@<domain>`, inject a test message via in-container
`sendmail` (smtp container → postfix → rspamd → dovecot deliver). Subject: `ccci-backup-probe`.
Wait up to 60s for dovecot to deliver (polling `doveadm search`). This is identical to the pattern
proven in `test_mail_flow.py`.

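The "wait up to 60s" step is a plain poll-with-deadline loop. The sketch below is not the code from ops.py or `test_mail_flow.py`: `wait_until` is a hypothetical helper, the 2s interval is an assumption, and the `check` callable stands in for whatever runs `doveadm search` in the imap container.

```python
import time


def wait_until(check, timeout=60.0, interval=2.0, clock=time.monotonic, sleep=time.sleep):
    """Call check() until it returns True or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Fake check that succeeds on the third poll, with sleeping disabled for the demo.
attempts = iter([False, False, True])
delivered = wait_until(lambda: next(attempts), sleep=lambda _seconds: None)
print("delivered:", delivered)
```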
**ops.py `pre_restore`**: Now wipes BOTH:
1. The user from sqlite: `DELETE FROM user WHERE localpart='citest'` via python3 in admin container
2. The user's Maildir: `rm -rf /mail/<domain>/citest` in imap container

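Step 1 of the wipe can be sketched with the standard `sqlite3` module. This is an illustration against an in-memory database: the two-column `user` table is a made-up stand-in for the real Mailu schema, and `delete_user` is a hypothetical helper, not the code in ops.py.

```python
import sqlite3


def delete_user(conn, localpart):
    """Delete matching rows and return how many were removed."""
    cursor = conn.execute("DELETE FROM user WHERE localpart = ?", (localpart,))
    conn.commit()
    return cursor.rowcount


# In-memory stand-in for the admin container's sqlite DB.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (email TEXT PRIMARY KEY, localpart TEXT)")
conn.executemany(
    "INSERT INTO user VALUES (?, ?)",
    [("citest@example.test", "citest"), ("admin@example.test", "admin")],
)

removed = delete_user(conn, "citest")
remaining = [row[0] for row in conn.execute("SELECT email FROM user ORDER BY email")]
print(removed, remaining)
```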
**test_backup.py**: Added `test_backup_captures_mail_message`, which asserts the message is present
at backup time via `doveadm search` in imap container.

**test_restore.py**: Added `test_restore_returns_mail_message`, which asserts the message is back in
INBOX after restore via `doveadm search` in imap container.

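Both new tests come down to running a search and checking for at least one hit. The sketch below is an assumption about how that could look, not the actual test code: the query shape (`mailbox INBOX subject ...`) and both helper names are illustrative, and the exact command used in the tests is not recorded in this journal.

```python
def doveadm_search_argv(user, subject, mailbox="INBOX"):
    """argv for a subject search in one mailbox (assumed query shape)."""
    return ["doveadm", "search", "-u", user, "mailbox", mailbox, "subject", subject]


def message_found(stdout):
    """True if the search printed at least one non-empty result line."""
    return any(line.strip() for line in stdout.splitlines())


argv = doveadm_search_argv("citest@example.test", "ccci-backup-probe")
print(" ".join(argv))

# Made-up outputs: one hit, then no hits.
print(message_found("0f3a9c 1\n"))
print(message_found(""))
```

Separating the argv builder from the output check lets the output check be unit-tested without a running imap container.
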
### Why rm -rf over doveadm expunge

Used `rm -rf /mail/<domain>/citest/` in pre_restore rather than `doveadm expunge` because:
- `rm -rf` directly wipes the Maildir from disk: observable, immediate, unambiguous
- `doveadm expunge` goes through dovecot and only removes messages, so the result depends on dovecot's expunge/purge handling and the mailbox directory itself stays in place
- The goal is a clear divergence: after pre_restore, the maildir DOES NOT EXIST; after restore, it DOES

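The divergence argued for above can be demonstrated on a throwaway directory tree. This sketch uses only the standard library and a temporary directory; the paths mimic the `/mail/<domain>/citest` layout but nothing here touches a real mail volume, and the copy-based "restore" is a stand-in for backupbot.

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    maildir = root / "mail" / "example.test" / "citest"
    (maildir / "cur").mkdir(parents=True)
    (maildir / "cur" / "1.msg").write_text("Subject: ccci-backup-probe\n")

    # "Backup": copy the whole mail tree to a snapshot directory.
    snapshot = root / "snapshot"
    shutil.copytree(root / "mail", snapshot)

    # pre_restore wipe, equivalent to rm -rf on the user's Maildir.
    shutil.rmtree(maildir)
    exists_after_wipe = maildir.exists()

    # "Restore": copy the snapshot back over the mail tree.
    shutil.copytree(snapshot, root / "mail", dirs_exist_ok=True)
    exists_after_restore = (maildir / "cur" / "1.msg").exists()

print("after wipe:", exists_after_wipe)
print("after restore:", exists_after_restore)
```

Because the wipe removes the directory itself, the post-restore check can only pass if the restore actually wrote the files back.
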
### Build #477 in flight to verify