feat(po): drop periodic fleet sweep — operator-driven, recover-if-dead only

The PO's job is to manage projects on request, not watch them live. Remove the hourly wake/sweep entirely:

- agents.toml: watch="heal" (recover-if-dead), no `wake` field
- prompts/supervise.md: deleted
- prompts/orchestrator.md, README.md, docs/bootstrap.md, docs/manage-projects.md: drop sweep/wake references; document operator-driven, no periodic sweep

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 15:04:08 +00:00
parent 6cc3ed4f13
commit 0456837444
6 changed files with 19 additions and 33 deletions
--- a/README.md
+++ b/README.md
@@ -10,7 +10,7 @@ is no special "control-plane" code path.
 project-orchestrator/
  agents.toml          this project's harness config (one persistent fleet-management agent)
  engine/              the agent-orchestrator harness, pinned as a submodule @ v0.1.0
-  prompts/             the PO agent's role (orchestrator.md) + periodic sweep (supervise.md)
+  prompts/             the PO agent's role (orchestrator.md)
  fleet.toml           THE fleet registry — the only record of project ↔ harness ↔ ref ↔ location
  scripts/             fleet.py + create/start/stop/update-project.sh — the management helpers
  docs/                runbooks: manage-projects.md, fleet-registry.md, bootstrap.md
@@ -41,9 +41,10 @@ python3 engine/agents.py up                   # start the PO agent + its watchdo
 python3 engine/agents.py down                 # stop everything
 ```

-The PO agent itself is one persistent `claude` session (`prompts/orchestrator.md`) with an hourly
-fleet sweep (`prompts/supervise.md`), supervised by the harness watchdog. See `engine/README.md`
-for the full harness reference.
+The PO agent itself is one persistent `claude` session (`prompts/orchestrator.md`), kept alive by the
+harness watchdog (recover-if-dead) but **never** woken on a timer: it is operator-driven and manages
+projects on request rather than watching them live. See `engine/README.md` for the full harness
+reference.

 ## Managing the fleet

--- a/agents.toml
+++ b/agents.toml
@@ -45,7 +45,8 @@ fatal_re  = "redacted_thinking|blocks cannot be modified|cannot be modified"
 # A single persistent fleet-management agent is enough to start (the plan: "add a loop only if
 # useful"). It is NOT a build loop — it manages a fleet of *other* projects: create / start / stop
 # / update / list / status, reading each project's harness docs to work out how to drive it. Its
-# startup prompt + periodic wake nudge live in prompts/.
+# startup prompt lives in prompts/. It is operator-driven: NO periodic wake — the PO manages
+# projects on request, it does not watch them live.

 [[agent]]
 name        = "project-orchestrator"   # tmux session: po-project-orchestrator
@@ -53,6 +54,6 @@ kind        = "persistent"
 backend     = "claude"
 model       = "claude-opus-4-8"
 resume      = true                      # resume its session across restarts (--resume <state id>)
-watch       = "heal"                    # keep it alive/healed; never reboot just for being idle
+watch       = "heal"                    # recover-if-dead (crash/wedge/wrong-backend); never reboot for idle
 prompt_file = "prompts/orchestrator.md" # startup prompt: read your role + fleet, then report
-wake        = { interval = 3600, prompt_file = "prompts/supervise.md" }   # hourly fleet sweep
+# no `wake`: the watchdog sends NO periodic prompts. It heals a dead session but never nudges a live one.
--- a/docs/bootstrap.md
+++ b/docs/bootstrap.md
@@ -36,13 +36,13 @@ fleet pieces.

 3. **Write the harness config** — `agents.toml` declaring the PO's own agent(s). A single
   `kind = "persistent"` `project-orchestrator` agent (backend `claude`) is enough to start; its
-   startup prompt is `prompts/orchestrator.md` and it gets an hourly `wake` →
-   `prompts/supervise.md`. (You can scaffold a starter with `python3 engine/agents.py init .` and
-   then edit it, or copy this repo's `agents.toml`.)
+   startup prompt is `prompts/orchestrator.md`, with `watch = "heal"` (recover-if-dead) and **no**
+   `wake` — the PO is operator-driven, not woken on a timer. (You can scaffold a starter with
+   `python3 engine/agents.py init .` and then edit it, or copy this repo's `agents.toml`.)

 4. **Add the fleet pieces** (what makes this project a PO):
   - `fleet.toml` — the registry (schema: `docs/fleet-registry.md`), starting empty or with a sample.
-   - `prompts/orchestrator.md` + `prompts/supervise.md` — the PO agent's role and periodic sweep.
+   - `prompts/orchestrator.md` — the PO agent's role / startup prompt.
   - `scripts/` — `fleet.py` (read/validate the registry) and `create/start/stop/update-project.sh`.
   - `docs/` — these runbooks.

--- a/docs/manage-projects.md
+++ b/docs/manage-projects.md
@@ -91,5 +91,5 @@ python3 scripts/fleet.py status    # + a total/enabled/disabled summary
 ```

 This reads only `fleet.toml`. To also check live state, drive each enabled project's harness
-(`engine/agents.py status --config <project>/agents.toml`) — `prompts/supervise.md` does this on the
-PO's periodic wake.
+(`engine/agents.py status --config <project>/agents.toml`). The PO does this **on request** — there
+is no periodic fleet sweep; this repo manages projects when asked, it does not watch them live.
--- a/prompts/orchestrator.md
+++ b/prompts/orchestrator.md
@@ -41,7 +41,8 @@ For each flow, follow the runbook in `docs/manage-projects.md`. In short:
 1. Read `fleet.toml` and `docs/manage-projects.md` so you know the current fleet and your runbooks.
 2. Run `python3 scripts/fleet.py status` to see the fleet's declared state.
 3. Report a short summary: how many projects, which are enabled, anything that looks wrong. Then idle
-   until your next wake or an operator instruction.
+   until an operator instruction.

-Do not invent work. You act when an operator asks you to create/start/stop/update a project, or when
-your periodic wake (`prompts/supervise.md`) tells you to sweep the fleet.
+Do not invent work. You are operator-driven: you act when an operator asks you to
+create/start/stop/update/list/status a project. There is no periodic fleet sweep — this repo's job is
+to *manage* projects on request, not to watch them live.
--- a/prompts/supervise.md
+++ b/prompts/supervise.md
@@ -1,17 +0,0 @@
-# Periodic fleet sweep
-
-A scheduled wake. Do a light, read-only sweep of the fleet — do not start work unless something is
-clearly wrong and a runbook covers the fix.
-
-1. `python3 scripts/fleet.py status` — list every project in `fleet.toml` with its location, harness,
-   pinned ref, and enabled flag.
-2. For each **enabled** project whose location is reachable from this host, optionally check whether
-   its harness reports it running (for an `agent-orchestrator` project:
-   `engine/agents.py status --config <project>/agents.toml`). Reading its harness docs first if the
-   harness is unfamiliar.
-3. Report a one-paragraph summary: total / enabled / disabled, anything unreachable or stopped that
-   should be running. If a fix is needed and `docs/manage-projects.md` covers it, you may apply it;
-   otherwise just flag it.
-
-Remember the one-directional rule: never write fleet/PO state into a project repo. The fleet's truth
-is `fleet.toml` here.
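
Review note: the behavioural change in this commit (heal a dead session, never nudge a live one) can be illustrated with a small decision function. This is a minimal sketch only, not the harness's actual watchdog code, which this diff does not show. `AgentConfig`, `watchdog_action`, and the returned action names are hypothetical; only `watch = "heal"` and the `wake` interval of 3600 seconds come from `agents.toml` above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AgentConfig:
    """Hypothetical mirror of one [[agent]] table in agents.toml."""
    name: str
    watch: str = "heal"
    wake_interval: Optional[int] = None  # seconds; None models "no `wake` field"


def watchdog_action(cfg: AgentConfig, session_alive: bool, seconds_since_wake: float) -> str:
    """Return what one (assumed) watchdog tick would do: "recover", "wake", or "none"."""
    # Recover-if-dead: a dead session is healed regardless of any wake setting.
    if cfg.watch == "heal" and not session_alive:
        return "recover"
    # A periodic nudge is only possible when a wake interval is configured.
    if session_alive and cfg.wake_interval is not None and seconds_since_wake >= cfg.wake_interval:
        return "wake"
    # Live session with no wake configured: leave it alone.
    return "none"


# Config before this commit (hourly wake) and after it (no wake).
before = AgentConfig("project-orchestrator", wake_interval=3600)
after = AgentConfig("project-orchestrator")

print(watchdog_action(before, True, 4000))   # wake
print(watchdog_action(after, True, 4000))    # none
print(watchdog_action(after, False, 4000))   # recover
```

Under these assumptions, dropping `wake` removes only the second branch: a live, idle PO is never prompted, while a dead one is still recovered.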