JOURNAL — phase mailu

Design rationale, dead-ends, investigation notes. Not for Adversary pre-verdict reading.


2026-06-11 ADV-mailu-01 fix — build #477 LEVEL 5 re-verified

ADV-mailu-01 resolution confirmed

Build #477 result confirms both volumes are now specifically tested:

  • test_backup_captures_mail_message PASS: ccci-backup-probe message in INBOX at backup time
  • test_restore_returns_mail_message PASS: message survives Maildir wipe + restore from snapshot
  • Both maildir-specific tests ran in the backup and restore stages respectively
  • Full build level 5, clean_teardown=true, no_secret_leak=true

The sendmail delivery path (smtp container → postfix → dovecot deliver) worked correctly for injecting the test message. The doveadm search poll with 60s timeout was sufficient. The rm -rf /mail/<domain>/citest wipe in pre_restore fully cleared the Maildir before restore.

Re-claiming M1 with build #477 as the evidence build.


2026-06-11 Bootstrap + data-layout research

mailu volume layout (from compose.yml analysis)

Services and their durable volumes:

  • admin service: mounts mailu vol → /data (sqlite DB: users, mailboxes, domains, settings)
  • imap (dovecot) service: mounts mail vol → /mail (Maildir message storage)
  • admin service also mounts dkim vol → /dkim (DKIM private keys)
  • antispam service: mounts rspamd vol → /var/lib/rspamd (antispam training data — ephemeral)
  • db (redis) service: mounts redis vol → /data (session cache — ephemeral)
  • webmail service: mounts webmail vol → /data (roundcube prefs — ephemeral)
  • smtp service: mounts mailqueue vol → /queue (postfix queue — ephemeral)
  • app (nginx) + certdumper: mount certs vol (TLS cert dumps — regenerable)

Backup decision: admin/data + imap/mail

For genuine backup/restore coverage:

  • admin:/data = sqlite DB → primary source of truth for mailboxes/users. If this is lost, all accounts are gone. Must backup.
  • imap:/mail = Maildir storage → the actual messages. Loss = all mail gone. Must backup.
  • dkim:/dkim = DKIM keys. In production, loss = need re-keying + DNS update. BUT: for CI testing, we don't have DNS-side DKIM records anyway, so DKIM regeneration is harmless. NOT labeled for CI simplicity (can add in a follow-up if operator wants DKIM key recovery tested).
  • Other volumes: ephemeral / regenerable. Not labeled.

Backupbot v2 syntax decision

From studying n8n and discourse examples:

  • v2 uses backupbot.backup: "true" + backupbot.backup.path: "<container-path>"
  • v1 used backupbot.volumes.<name>=true/false (immich pattern — do NOT use for new work)
  • mailu has no Postgres (uses SQLite), so no pg_dump hook needed
  • For admin: backupbot.backup.path: "/data" (whole sqlite DB dir)
  • For imap: backupbot.backup.path: "/mail" (whole Maildir)

mailu compose.yml structure note

mailu uses deploy.labels (list form with - "key=value" strings) for the app service's traefik labels. The backupbot labels need to go on the services that own the data:

  • admin service uses labels: directly (not deploy.labels) — no traefik label there
  • imap service similarly uses labels: directly

Correction after re-checking compose.yml: neither admin nor imap has a labels: key at all. Only the app (nginx) service carries labels, under deploy.labels, for traefik. In Docker Swarm, backupbot reads service labels, which are set at deploy time through deploy.labels; a top-level labels: key would only label the containers. So admin and imap both need deploy.labels.

The app service already uses deploy.labels (list form) for traefik. For admin and imap we need to add a deploy: section with a labels: list.
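
A minimal sketch of that label change, assuming the compose file has already been loaded into a plain dict (for example by a YAML parser). The helper name add_backupbot_labels is hypothetical, the sample dict is illustrative and not the real mailu compose.yml, and the key=value strings are simply the list-form rendering of the v2 labels noted above.

```python
from copy import deepcopy


def add_backupbot_labels(compose: dict, service: str, path: str) -> dict:
    """Return a copy of compose with backupbot v2 labels in deploy.labels (list form)."""
    out = deepcopy(compose)
    svc = out["services"][service]
    # Create deploy: and labels: if the service does not have them yet.
    labels = svc.setdefault("deploy", {}).setdefault("labels", [])
    for label in ("backupbot.backup=true", f"backupbot.backup.path={path}"):
        if label not in labels:  # keep the helper idempotent
            labels.append(label)
    return out


# Illustrative input only: admin and imap start without any labels.
compose = {"services": {"admin": {}, "imap": {}}}
compose = add_backupbot_labels(compose, "admin", "/data")
compose = add_backupbot_labels(compose, "imap", "/mail")
print(compose["services"]["imap"]["deploy"]["labels"])
```

Working on a copy keeps the original structure available for a before/after diff when reviewing the recipe change.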

Version bump

Current version: 3.0.1+2024.06.52 (on the app service, in the deploy.labels entry coop-cloud.${STACK_NAME}.version)

New version: 3.1.0+2024.06.52 (minor version bump for the backupbot feature addition)

CI test design

ops.py hooks (consistent with n8n pattern):

  • pre_backup(ctx): create a test mailbox citest@<domain> via flask mailu user citest <domain> '<password>' in the admin container
  • pre_restore(ctx): delete the mailbox via flask mailu user delete citest@<domain> (or equivalent) to simulate data loss

test_backup.py: assert citest@<domain> is in config-export at backup time

test_restore.py: assert citest@<domain> is back in config-export after restore

The _mailu.py helpers already provide:

  • flask_mailu(domain, cmd) → runs flask mailu CLI in admin container
  • config_export(domain) → parses config-export JSON
  • user_emails(cfg) → list of email addresses from config
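
A rough sketch of how both tests could share one assertion on top of these helpers. The shape of the parsed config-export (a top-level "user" list whose entries have an "email" key) is an assumption made for this example, as are the helper assert_mailbox_present and the sample addresses; the real _mailu.py may differ.

```python
def user_emails(cfg: dict) -> list[str]:
    """Collect email addresses from a parsed config-export dict.

    Assumes a top-level "user" list whose entries carry an "email" key.
    """
    return [u["email"] for u in cfg.get("user", []) if "email" in u]


def assert_mailbox_present(cfg: dict, email: str) -> None:
    """Shared assertion for test_backup.py and test_restore.py (hypothetical helper)."""
    emails = user_emails(cfg)
    assert email in emails, f"{email} missing from config-export: {emails}"


# Stand-in for config_export(domain); the values are made up for the example.
sample_cfg = {"user": [{"email": "admin@example.test"}, {"email": "citest@example.test"}]}
assert_mailbox_present(sample_cfg, "citest@example.test")
```

Including the full address list in the failure message makes it easier to tell a missing mailbox apart from an empty or malformed export.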

Delete-user CLI for pre_restore

Need to confirm the delete command. From mailu docs, the admin CLI:

  • Create: flask mailu user <local> <domain> '<password>'
  • Delete: flask mailu user delete <email> (where email = local@domain)
  • Or: flask mailu user delete <local>@<domain>

Need to verify the exact syntax. Will use flask mailu user delete citest@<domain> and add error handling.
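
One possible shape for that error handling, sketched with an injectable runner so it can be exercised without a container. The delete syntax is the unverified one from the notes above; delete_user, Runner, and fake_run are hypothetical names, and how the argv actually reaches the admin container is left out.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], subprocess.CompletedProcess]


def delete_user(run: Runner, email: str) -> None:
    """Run the (unverified) delete command and fail loudly if it does not succeed."""
    argv = ["flask", "mailu", "user", "delete", email]
    result = run(argv)
    if result.returncode != 0:
        raise RuntimeError(
            f"{' '.join(argv)} exited {result.returncode}: {result.stderr!r}"
        )


def fake_run(argv: Sequence[str]) -> subprocess.CompletedProcess:
    """Test double that pretends the CLI rejected the syntax."""
    return subprocess.CompletedProcess(argv, returncode=2, stdout="", stderr="No such command")


try:
    delete_user(fake_run, "citest@example.test")
except RuntimeError as exc:
    print(f"pre_restore would abort: {exc}")
```

Raising instead of logging matters here: if the delete silently fails, the mailbox is never removed and the restore test would pass without proving anything.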

2026-06-11 ADV-mailu-01 fix — extend seed to cover /mail Maildir

Adversary finding (M1 FAIL)

The M1 claim was rejected because ops.py only proved SQLite (/data) backup/restore. The /mail Maildir volume was labeled and backed up but never specifically tested for restoration. If backupbot silently skipped restoring /mail, the test would still PASS.

Fix (cc-ci commit b9352e8)

Extended the seed, the pre-restore wipe, and the assertions:

ops.py pre_backup: After creating citest@<domain>, inject a test message via in-container sendmail (smtp container → postfix → rspamd → dovecot deliver). Subject: ccci-backup-probe. Wait up to 60s for dovecot to deliver (polling doveadm search). This is identical to the pattern proven in test_mail_flow.py.

ops.py pre_restore: Now wipes BOTH:

  1. The user from sqlite: DELETE FROM user WHERE localpart='citest' via python3 in admin container
  2. The user's Maildir: rm -rf /mail/<domain>/citest in imap container

test_backup.py: Added test_backup_captures_mail_message — asserts the message is present at backup time via doveadm search in imap container.

test_restore.py: Added test_restore_returns_mail_message — asserts the message is back in INBOX after restore via doveadm search in imap container.
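
The 60s delivery wait used by pre_backup can be sketched as a generic poll loop. This is an illustration, not the code from commit b9352e8: wait_for_message and its parameters are hypothetical, and the search callable stands in for whatever runs doveadm search in the imap container. It assumes that non-empty output means at least one matching message.

```python
import time
from typing import Callable


def wait_for_message(
    search: Callable[[], str],
    timeout: float = 60.0,
    interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll search() until it returns non-empty output or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        if search().strip():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Fake search: empty twice, then one hit, mimicking delayed delivery.
outputs = iter(["", "", "guid-placeholder 1\n"])
found = wait_for_message(lambda: next(outputs), timeout=60, sleep=lambda _s: None)
print(found)
```

Injecting sleep and clock keeps the loop testable in milliseconds, and using a monotonic clock avoids a wall-clock adjustment shortening or stretching the wait.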

Why rm -rf over doveadm expunge

Used rm -rf /mail/<domain>/citest/ in pre_restore rather than doveadm expunge because:

  • rm -rf directly wipes the Maildir from disk — observable, immediate, unambiguous
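  • doveadm expunge goes through dovecot's index layer and removes only the matching messages; the mailbox directory itself stays, and with some storage formats (e.g. mdbox) on-disk removal waits for a later doveadm purge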
  • The goal is a clear divergence: after pre_restore, the maildir DOES NOT EXIST; after restore, it DOES
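
The divergence argument can be illustrated locally with a throwaway directory. This sketch only simulates the /mail/<domain>/citest layout in a temp dir; wipe_maildir, the example.test domain, and the probe file name are made up for the example, and shutil.rmtree stands in for the rm -rf run in the imap container.

```python
import shutil
import tempfile
from pathlib import Path


def wipe_maildir(mail_root: Path, domain: str, local: str) -> Path:
    """Remove <mail_root>/<domain>/<local> entirely, like rm -rf on that path."""
    target = mail_root / domain / local
    shutil.rmtree(target, ignore_errors=True)
    return target


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    inbox = root / "example.test" / "citest" / "cur"
    inbox.mkdir(parents=True)
    (inbox / "probe-message").write_text("Subject: ccci-backup-probe\n")

    target = wipe_maildir(root, "example.test", "citest")
    wiped = not target.exists()            # state after pre_restore
    domain_kept = target.parent.exists()   # the domain directory is untouched
    print(wiped, domain_kept)
```

Because the whole user directory is gone, a later check that finds the message again can only be explained by the restore having written it back.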

Build #477 in flight to verify