10 Commits

Author SHA1 Message Date
90375f004e docs(examples): add builder-adversary-deferred — verify after a long segment
Coarsest review cadence: the Builder self-certifies the build phases and the
Adversary does ONE comprehensive cold-verification of the whole accumulated build
in a final `review` phase (versus per-phase review in orig and per-gate review in
lean). Uses the full original prompts plus a DEFERRED REVIEW CADENCE override, so
it isolates verification cadence. Cheapest coordination; the trade-off is that the
independent check arrives late (risk of late rework, plus self-certification drift
during the build phases). README spells it out.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 00:02:44 +00:00
c6c7ce8640 change: base stateless + lean on the FULL original prompts (not minimal)
So that "stateless vs builder-adversary" and "lean vs stateless" isolate context
hygiene / review granularity WITHOUT the confound of the minimal prompts' reduced
testing pressure (which we found cuts ~25% of test methods). stateless = orig +
context hygiene; lean = orig + context hygiene + per-gate review. min stays the
pure minimal-prompt variant (isolates verbosity vs orig).
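
The isolation argument above can be checked mechanically. The following is a minimal sketch, not part of the repository: the factor names and value strings are illustrative labels rather than config keys, and it assumes that stateless keeps orig's per-phase review, as "stateless = orig + context hygiene" suggests. It lists the variant pairs that differ in exactly one factor.

```python
from itertools import combinations

# Factor settings per variant, read off the commit message above.
# Keys and values are hypothetical labels chosen for this sketch.
VARIANTS = {
    "orig":      {"prompts": "full",    "context_hygiene": False, "review": "per-phase"},
    "min":       {"prompts": "minimal", "context_hygiene": False, "review": "per-phase"},
    "stateless": {"prompts": "full",    "context_hygiene": True,  "review": "per-phase"},
    "lean":      {"prompts": "full",    "context_hygiene": True,  "review": "per-gate"},
}


def differing_factors(a, b):
    """Return the sorted names of the factors on which variants a and b differ."""
    return sorted(k for k in VARIANTS[a] if VARIANTS[a][k] != VARIANTS[b][k])


def single_factor_pairs():
    """Map each variant pair that differs in exactly one factor to that factor."""
    pairs = {}
    for a, b in combinations(sorted(VARIANTS), 2):
        diff = differing_factors(a, b)
        if len(diff) == 1:
            pairs[(a, b)] = diff[0]
    return pairs


if __name__ == "__main__":
    for pair, factor in single_factor_pairs().items():
        print(pair, "isolates", factor)
```

Under these assumptions, only three comparisons are clean: orig vs min (prompt verbosity), orig vs stateless (context hygiene), and stateless vs lean (review granularity). A comparison such as min vs lean changes all three factors at once, which is the confound this commit removes.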

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 03:17:47 +00:00
a0f7652e9e docs(examples): add builder-solo — single builder, no adversary (control)
A single Builder that builds AND self-verifies (same DoD rigor), with NO
independent Adversary and no claim/review handoff. The control for measuring
what the AI adversary costs (its tokens, ~half of a loop-pair run) and buys
(independent cold verification vs self-certification).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 02:34:50 +00:00
e0425e6108 docs(examples): add builder-adversary-lean — context hygiene + per-gate review
Isolates the two effects conflated in builder-adversary-stateless: keeps all the
CONTEXT HYGIENE (compact/diffs/lean loads) but ENFORCES full per-gate review
granularity (one claim per gate, one independent verdict per gate, no batching).
Tests whether the token saving is real efficiency vs reduced scrutiny.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 21:42:12 +00:00
985d33dd51 docs(examples): add builder-adversary-stateless — context-lean variant
Same pattern + AI-as-adversary verification as builder-adversary-min, but the
role prompts add CONTEXT HYGIENE: /compact at every checkpoint (lossless, since
state is on disk), read diffs not trees, spill bulk output to files, and the
adversary loads only {plan, STATUS, diff}. Loop agents are not resumed, so each
phase starts in a fresh session. Targets cache-read cost (the dominant cost in a
long loop) without changing what the agents do or how they verify.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 20:47:58 +00:00
737ef81066 docs(examples): add builder-adversary-min — minimal-prompt variant
Same topology/behaviour as builder-adversary (loop pair, phase machine,
claim()/review() handoff, machine-docs coordination, cold verification) but the
role + kickoff prompts are compressed to minimal tokens, keeping every
load-bearing rule. Config and plans are unchanged. The separate
agent-orchestrator-benchmark repo runs a head-to-head token comparison.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 20:18:33 +00:00
11843f41a4 docs(examples): add IDEAS.md — backlog of creative example topologies
A sketch backlog of further examples, each teaching a distinct orchestration
topology (anthill/stigmergy, kitchen line/pipeline, incident room/blackboard,
senate/debate, baton/mutex+failover, immune system/reactive, evolution chamber,
plus ATC and day-night extras). Not implemented — ideas only.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 18:13:48 +00:00
e4453dcfdd docs(examples): add the "snake pit" worker-pool example
Based on @ponder.ooo's "snake pit agent orchestrator" idea (bsky 2026-05-28) and
Claude's metaphor-mapping elaboration: agents are snakes, tasks are food tossed
into a shared pit; snakes devour/digest/regurgitate/excrete.

A worker-pool-over-a-shared-queue topology (contrast the builder-adversary phase
machine):
- pit/ is a filesystem queue; snakes claim by atomic mv (no two eat the same food)
- species = specialized agents: keeper (zookeeper), planner (regurgitation IS
  task decomposition), snake-1..3 (worker pool), cleanup (scavenger + coprophagy)
- no [loop] phase machine; persistent agents self-pace via /loop
- README carries the full bio→compute mapping table from the thread image
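
The claim-by-atomic-mv idea in the first bullet can be sketched as follows. This is an illustrative sketch only, not the example's actual code: the function names `try_claim` and `claim_next` and the per-worker directory layout are assumptions. It relies on the fact that a rename within one filesystem is atomic on POSIX, so when several workers race for the same file exactly one rename succeeds.

```python
import os
import tempfile


def try_claim(pit_dir, worker_dir, name):
    """Try to claim one task file by renaming it into worker_dir.

    Returns the new path on success, or None if another worker got there
    first (the source no longer exists). os.rename is atomic on POSIX when
    source and destination are on the same filesystem.
    """
    src = os.path.join(pit_dir, name)
    dst = os.path.join(worker_dir, name)
    try:
        os.rename(src, dst)
    except FileNotFoundError:
        return None
    return dst


def claim_next(pit_dir, worker_dir):
    """Claim the first available task in name order, or return None if none is left."""
    os.makedirs(worker_dir, exist_ok=True)
    for name in sorted(os.listdir(pit_dir)):
        path = try_claim(pit_dir, worker_dir, name)
        if path is not None:
            return path
    return None


if __name__ == "__main__":
    # Hypothetical layout: one shared pit/ plus one directory per worker.
    root = tempfile.mkdtemp()
    pit = os.path.join(root, "pit")
    os.makedirs(pit)
    with open(os.path.join(pit, "task-001.md"), "w") as f:
        f.write("example task\n")

    first = claim_next(pit, os.path.join(root, "snake-1"))
    second = claim_next(pit, os.path.join(root, "snake-2"))
    print("snake-1 claimed:", first)
    print("snake-2 claimed:", second)
```

Giving each worker its own destination directory means a successful rename doubles as the ownership record, so no separate lock file is needed.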

Verified: `agents.py status --config agents.toml` lists all 6 agents + service.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 17:50:42 +00:00
7f237a522c docs(examples): add a Builder/Adversary loop-pair example (the cc-ci pattern)
A self-contained examples/builder-adversary/ that distills the cc-ci production
loop pair into a tiny, fully-local task (build a `wc` CLI in two phases):

- agents.toml: builder + adversary loops, persistent orchestrator, on_complete
  reporter, cleanlogs service; phase machine with a per-phase model override
- prompts/: kickoff template + builder/adversary roles carrying the load-bearing
  protocol (claim()/review() handoff, machine-docs file-location rule,
  WHAT+HOW+EXPECTED+WHERE=STATUS / WHY=JOURNAL anti-anchoring, WAITING-UNTIL liveness)
- plans/: two phase plans (wc, json) each with a cold-verifiable Definition of Done
- README: how to run, the work-repo two-clone isolation model, how to adapt

Verified: `agents.py status --config agents.toml` parses and lists all agents.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 17:50:42 +00:00
289ef07df4 feat: agent-orchestrator v0.1.0 — generic multi-agent harness
Extracted and generalized from a project-specific agent launch engine. No project
specifics remain in code: paths, the loop kickoff preamble, handoff conventions, and the
on-complete hook are all config/template driven; session_prefix + log_dir are required.

- agents.py: driver + watchdog (data-driven backends via prompt_delivery arg|ping|exec;
  required session_prefix/log_dir; project-rooted path resolution; configurable kickoff
  template, handoff patterns, on_complete task; tmux-safe; selftest + init verbs)
- agent-log.py: config-driven claude transcript renderer
- agents.example.toml: self-contained 2-agent example (dependency-free demo backend)
- prompts/: generic builder/adversary/kickoff templates
- smoke.sh: isolated up+down sandbox proof that cleans up after itself
- flake.nix/.lock: devShell (python311 + tmux + git)
- README.md: schema + verbs + AI-PO usage + nix

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-13 18:39:00 +00:00