diff --git a/README.md b/README.md
index 273d219..7d031dc 100644
--- a/README.md
+++ b/README.md
@@ -10,7 +10,7 @@ is no special "control-plane" code path.
 project-orchestrator/
   agents.toml   this project's harness config (one persistent fleet-management agent)
   engine/       the agent-orchestrator harness, pinned as a submodule @ v0.1.0
-  prompts/      the PO agent's role (orchestrator.md) + periodic sweep (supervise.md)
+  prompts/      the PO agent's role (orchestrator.md)
   fleet.toml    THE fleet registry — the only record of project ↔ harness ↔ ref ↔ location
   scripts/      fleet.py + create/start/stop/update-project.sh — the management helpers
   docs/         runbooks: manage-projects.md, fleet-registry.md, bootstrap.md
@@ -41,9 +41,10 @@ python3 engine/agents.py up # start the PO agent + its watchdo
 python3 engine/agents.py down # stop everything
 ```
 
-The PO agent itself is one persistent `claude` session (`prompts/orchestrator.md`) with an hourly
-fleet sweep (`prompts/supervise.md`), supervised by the harness watchdog. See `engine/README.md`
-for the full harness reference.
+The PO agent itself is one persistent `claude` session (`prompts/orchestrator.md`), kept alive by the
+harness watchdog (recover-if-dead) but **never** woken on a timer: it is operator-driven and manages
+projects on request rather than watching them live. See `engine/README.md` for the full harness
+reference.
 
 ## Managing the fleet
 
diff --git a/agents.toml b/agents.toml
index 3c516ae..1604806 100644
--- a/agents.toml
+++ b/agents.toml
@@ -45,7 +45,8 @@ fatal_re = "redacted_thinking|blocks cannot be modified|cannot be modified"
 # A single persistent fleet-management agent is enough to start (the plan: "add a loop only if
 # useful"). It is NOT a build loop — it manages a fleet of *other* projects: create / start / stop
 # / update / list / status, reading each project's harness docs to work out how to drive it. Its
-# startup prompt + periodic wake nudge live in prompts/.
+# startup prompt lives in prompts/. It is operator-driven: NO periodic wake — the PO manages
+# projects on request, it does not watch them live.
 
 [[agent]]
 name = "project-orchestrator" # tmux session: po-project-orchestrator
@@ -53,6 +54,6 @@ kind = "persistent"
 backend = "claude"
 model = "claude-opus-4-8"
 resume = true # resume its session across restarts (--resume )
-watch = "heal" # keep it alive/healed; never reboot just for being idle
+watch = "heal" # recover-if-dead (crash/wedge/wrong-backend); never reboot for idle
 prompt_file = "prompts/orchestrator.md" # startup prompt: read your role + fleet, then report
-wake = { interval = 3600, prompt_file = "prompts/supervise.md" } # hourly fleet sweep
+# no `wake`: the watchdog sends NO periodic prompts. It heals a dead session but never nudges a live one.
diff --git a/docs/bootstrap.md b/docs/bootstrap.md
index 41ee4d0..b2dabc6 100644
--- a/docs/bootstrap.md
+++ b/docs/bootstrap.md
@@ -36,13 +36,13 @@ fleet pieces.
 
 3. **Write the harness config** — `agents.toml` declaring the PO's own agent(s). A single
    `kind = "persistent"` `project-orchestrator` agent (backend `claude`) is enough to start; its
-   startup prompt is `prompts/orchestrator.md` and it gets an hourly `wake` →
-   `prompts/supervise.md`. (You can scaffold a starter with `python3 engine/agents.py init .` and
-   then edit it, or copy this repo's `agents.toml`.)
+   startup prompt is `prompts/orchestrator.md`, with `watch = "heal"` (recover-if-dead) and **no**
+   `wake` — the PO is operator-driven, not woken on a timer. (You can scaffold a starter with
+   `python3 engine/agents.py init .` and then edit it, or copy this repo's `agents.toml`.)
 
 4. **Add the fleet pieces** (what makes this project a PO):
    - `fleet.toml` — the registry (schema: `docs/fleet-registry.md`), starting empty or with a sample.
-   - `prompts/orchestrator.md` + `prompts/supervise.md` — the PO agent's role and periodic sweep.
+   - `prompts/orchestrator.md` — the PO agent's role / startup prompt.
    - `scripts/` — `fleet.py` (read/validate the registry) and `create/start/stop/update-project.sh`.
    - `docs/` — these runbooks.
 
diff --git a/docs/manage-projects.md b/docs/manage-projects.md
index 10c15aa..8550ffc 100644
--- a/docs/manage-projects.md
+++ b/docs/manage-projects.md
@@ -91,5 +91,5 @@ python3 scripts/fleet.py status # + a total/enabled/disabled summary
 ```
 
 This reads only `fleet.toml`. To also check live state, drive each enabled project's harness
-(`engine/agents.py status --config /agents.toml`) — `prompts/supervise.md` does this on the
-PO's periodic wake.
+(`engine/agents.py status --config /agents.toml`). The PO does this **on request** — there
+is no periodic fleet sweep; this repo manages projects when asked, it does not watch them live.
diff --git a/prompts/orchestrator.md b/prompts/orchestrator.md
index 2c5b612..cc8d445 100644
--- a/prompts/orchestrator.md
+++ b/prompts/orchestrator.md
@@ -41,7 +41,8 @@ For each flow, follow the runbook in `docs/manage-projects.md`. In short:
 1. Read `fleet.toml` and `docs/manage-projects.md` so you know the current fleet and your runbooks.
 2. Run `python3 scripts/fleet.py status` to see the fleet's declared state.
 3. Report a short summary: how many projects, which are enabled, anything that looks wrong. Then idle
-   until your next wake or an operator instruction.
+   until an operator instruction.
 
-Do not invent work. You act when an operator asks you to create/start/stop/update a project, or when
-your periodic wake (`prompts/supervise.md`) tells you to sweep the fleet.
+Do not invent work. You are operator-driven: you act when an operator asks you to
+create/start/stop/update/list/status a project. There is no periodic fleet sweep — this repo's job is
+to *manage* projects on request, not to watch them live.
diff --git a/prompts/supervise.md b/prompts/supervise.md
deleted file mode 100644
index 918c8e3..0000000
--- a/prompts/supervise.md
+++ /dev/null
@@ -1,17 +0,0 @@
-# Periodic fleet sweep
-
-A scheduled wake. Do a light, read-only sweep of the fleet — do not start work unless something is
-clearly wrong and a runbook covers the fix.
-
-1. `python3 scripts/fleet.py status` — list every project in `fleet.toml` with its location, harness,
-   pinned ref, and enabled flag.
-2. For each **enabled** project whose location is reachable from this host, optionally check whether
-   its harness reports it running (for an `agent-orchestrator` project:
-   `engine/agents.py status --config /agents.toml`). Reading its harness docs first if the
-   harness is unfamiliar.
-3. Report a one-paragraph summary: total / enabled / disabled, anything unreachable or stopped that
-   should be running. If a fix is needed and `docs/manage-projects.md` covers it, you may apply it;
-   otherwise just flag it.
-
-Remember the one-directional rule: never write fleet/PO state into a project repo. The fleet's truth
-is `fleet.toml` here.