M10 finding: Docker Hub rate limit blocks lasuite-docs upgrade — A1 registry creds needed (5/6 green)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 10:09:23 +01:00
parent 432487f4e8
commit dc5aca90bd
3 changed files with 36 additions and 2 deletions


@@ -105,6 +105,16 @@ Architecture decisions and dead-ends. One line of rationale each. (§0, §8)
matrix-synapse, multi-service+S3/object-storage=lasuite-docs); n8n adds a 6th real deployable app
(workflow automation) behind the normal terminate-at-Traefik path.
- **Docker Hub rate limit + mid-breadth prune — FINDING (2026-05-27).** D10 real-`!testme` breadth
runs exhausted Docker Hub's anonymous pull rate limit (lasuite-docs, 9 images, upgrade stage:
`toomanyrequests`). Two lessons: (1) **registry pull creds are an A1 operator input** needed for
reliable heavy-recipe deploys under load (request + sops-store + wire into docker daemon). (2)
**Don't `docker image prune -af` mid-breadth** — it evicts cached recipe images and forces re-pulls
that hit the limit. The first lasuite failure was disk pressure (90% full); pruning fixed disk but
triggered re-pulls → rate limit. Better: rely on the daily autoprune, prune only dangling images
(`docker image prune -f`, no `-a`) between runs, or grow disk so heavy images stay cached. Net for
D10: 5/6 recipes green via real `!testme`; lasuite-docs upgrade gated on the rate limit (transient,
resets in ~hours; durable fix = creds).
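
A minimal sketch of the "prune only dangling, never `-a`" rule from the finding above, written as a helper the runner could call between recipe runs. The function names and the threshold constant are hypothetical (nothing in this repo is confirmed to look like this); the only Docker behaviour relied on is that `docker image prune` without `-a` removes dangling images only.

```python
import shutil
from typing import List, Optional

# Hypothetical threshold, taken from the 90%-full disk-pressure failure noted above.
DISK_PRESSURE_THRESHOLD = 0.90


def used_fraction(path: str = "/") -> float:
    """Return the fraction of the filesystem at `path` that is in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def plan_prune(fraction_used: float,
               threshold: float = DISK_PRESSURE_THRESHOLD) -> Optional[List[str]]:
    """Return the prune command to run between recipe runs, or None.

    The command never includes -a/--all, so tagged recipe images stay
    cached and are not re-pulled from Docker Hub on the next run.
    """
    if fraction_used < threshold:
        return None  # enough headroom: leave cleanup to the daily autoprune
    # Without -a, `docker image prune` removes dangling images only.
    return ["docker", "image", "prune", "--force"]
```

A caller would pass the result to `subprocess.run(cmd, check=True)` when it is not `None`, for example with `plan_prune(used_fraction("/var/lib/docker"))`, assuming the Docker data root sits at its default path. If dangling-only pruning does not free enough space, that is a signal to grow the disk rather than to fall back to `-a`.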
## Open (defaults from §8, to confirm as reality lands)
- **Deploy mechanism — SETTLED (M0):** `nixos-rebuild switch --flake /root/cc-ci#cc-ci` run *on


@@ -708,3 +708,15 @@ Fired !testme on all 6 recipe PRs (capacity=1, sequential). Results (real PR-tri
So the real-!testme path + the upgrade fixes (upstream tags + `upgrade -o`) work across simple, DB,
DB+media, workflow, and stateful recipes. lasuite-docs (the object-storage/S3 category, required)
needs its upgrade to pass on the real path for the 6/6 D10 proof.
---
## 2026-05-27 — M10: 5/6 real-!testme green; lasuite-docs blocked on Docker Hub rate limit (A1)
lasuite-docs #88/#92 upgrade failed "deploy failed" → diagnosed: node disk at 90% (2.7G free) — a
9-service rolling upgrade couldn't converge. Pruned 30 unused images (reclaimed 12G, leaving 15G free).
Retry #93: got further (5/8 services up) but redis task Rejected "No such image: redis:8.2.6" →
`docker pull redis:8.2.6` on the node = `toomanyrequests: unauthenticated pull rate limit`. So the
prune fixed disk but forced re-pulls that hit Docker Hub's anonymous limit (A1 registry-creds
finding, §1.5/§4.4). Recorded in STATUS ## Blocked + DECISIONS; surfaced to operator (provide Docker
Hub creds). 5/6 recipes green via real !testme; lasuite install+backup green, upgrade gated.
Pivoting to M9 (docs/reproducibility, unblocked) while the limit resets / creds arrive.
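
The diagnosis above went in two steps: the swarm task only reported "No such image", and the real cause (`toomanyrequests`) showed up when pulling by hand. A small sketch of how that triage could be encoded follows; the function name and the category labels are hypothetical, and the matched substrings are the ones quoted in this entry.

```python
def classify_pull_failure(message: str) -> str:
    """Classify a deploy/pull error message into a coarse cause.

    The rate-limit check runs first: a message that mentions both
    conditions should be reported as rate-limited, since that is the
    root cause and the missing image is only its symptom.
    """
    text = message.lower()
    if "toomanyrequests" in text:
        return "rate_limited"
    if "no such image" in text:
        # Symptom only: follow up with a direct `docker pull` on the node
        # to see why the image is absent.
        return "image_missing"
    return "other"
```

With this split, an `image_missing` result is a prompt to re-run the pull manually, while `rate_limited` maps directly to the A1 registry-creds finding.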


@@ -59,8 +59,20 @@ Drone build with RECIPE=<r> (or `cc-ci-run runner/run_recipe_ci.py` with RECIPE/
the recipe-CI pipeline will set `CCCI_JANITOR_MAX_AGE=0` (safe — no concurrent runs). See DECISIONS.
## Blocked
- (none) — M3 webhook blocker cleared by the polling-primary redesign (polling is
read-only/outbound and needs no Gitea `ALLOWED_HOST_LIST` whitelist).
- **Docker Hub anonymous pull rate limit — registry pull creds needed (A1, operator).** During the
D10 real-`!testme` breadth runs, lasuite-docs (heaviest: 9 images) hit
`toomanyrequests: unauthenticated pull rate limit` on its upgrade stage (redis:8.2.6 task
Rejected "No such image" → couldn't pull). Confirmed: `docker pull redis:8.2.6` on the node →
rate-limited. This is the plan's flagged A1 input (§1.5/§4.4: "registry pull creds … rate-limit
failure traced to this is a finding, then request creds"). **Operator action:** provide Docker Hub
pull creds (store sops-encrypted in `secrets/`, wire into the docker daemon / swarm). NOT globally
blocking: **5/6 recipes already green via real `!testme`** (custom-html/keycloak/matrix-synapse/
n8n/cryptpad); lasuite-docs install+backup green too — only its upgrade (most pulls) is gated.
Contributing factor: my mid-breadth `docker image prune -af` evicted cached images → forced
re-pulls → tipped the node over the limit (see DECISIONS). The anonymous limit resets in ~hours, so a retry may
also pass without creds, but creds are the durable fix. Working M9 (docs) meanwhile.
- (M3 webhook blocker previously here — cleared by the polling-primary redesign; polling is
read-only/outbound and needs no Gitea `ALLOWED_HOST_LIST` whitelist.)
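
Before retrying the gated upgrade, the node's remaining anonymous quota can be checked instead of guessed. The sketch below assumes the `ratelimit-limit` / `ratelimit-remaining` response headers and the `ratelimitpreview/test` probe repository that Docker Hub documents for this purpose; the function names are hypothetical, and the endpoint behaviour should be verified against the current Docker documentation before relying on it.

```python
import json
import urllib.request
from typing import Optional, Tuple

# Endpoints as described in Docker Hub's rate-limit documentation (assumption:
# still current; verify before use).
TOKEN_URL = ("https://auth.docker.io/token?service=registry.docker.io"
             "&scope=repository:ratelimitpreview/test:pull")
MANIFEST_URL = "https://registry-1.docker.io/v2/ratelimitpreview/test/manifests/latest"


def parse_ratelimit(value: str) -> Tuple[int, Optional[int]]:
    """Parse a header value such as '100;w=21600' into (count, window_seconds)."""
    count, _, rest = value.partition(";")
    window = None
    for part in rest.split(";"):
        key, _, val = part.strip().partition("=")
        if key == "w" and val.isdigit():
            window = int(val)
    return int(count.strip()), window


def fetch_remaining() -> Optional[Tuple[int, Optional[int]]]:
    """Ask Docker Hub how many anonymous pulls remain for this node's IP.

    Uses a HEAD request, which the Docker documentation describes as not
    counting against the limit. Returns None if the header is absent;
    urllib raises HTTPError if the registry rejects the request.
    """
    with urllib.request.urlopen(TOKEN_URL, timeout=10) as resp:
        token = json.load(resp)["token"]
    request = urllib.request.Request(
        MANIFEST_URL, method="HEAD",
        headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(request, timeout=10) as resp:
        header = resp.headers.get("ratelimit-remaining")
    return parse_ratelimit(header) if header else None
```

Comparing the remaining count against the number of images a recipe needs (9 for lasuite-docs) gives a cheap go/no-go check before firing `!testme` again.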
## Tracking (adversary findings I must address)
- **[adversary] A4 — concurrent same-recipe runs collide on shared `~/.abra/recipes/<recipe>`.**