Commit Graph

  • 64bc360fc0 chore: gitignore the runner's regenerated .data.hdr artifact [main] mfowler 2026-06-16 03:13:23 +00:00
  • 2d6dab7e22 docs: rewrite README — full study, how it works, points to FINDINGS (2026-06-16) mfowler 2026-06-16 02:33:14 +00:00
  • 8e6290e7e0 docs: finalize deferred at N=5 (median 12.89M, ~tied with orig) mfowler 2026-06-16 02:12:12 +00:00
  • 3bf3316572 docs: FINDINGS.md — benchmark synthesis; track raw results data mfowler 2026-06-16 01:53:34 +00:00
  • 819000417b feat: benchmark builder-adversary-deferred (4-phase incl review); limit-detect across all roots mfowler 2026-06-16 00:07:43 +00:00
  • aeee484395 results: full 5-variant campaign complete (incl. builder-solo control) mfowler 2026-06-15 07:42:46 +00:00
  • 29b89140e7 results: 4-variant campaign complete (5/5 each); analysis with ratios mfowler 2026-06-15 06:40:34 +00:00
  • 5f9805173a fix: veto check matches all-caps '## VETO <reason>', not '## Veto log' header mfowler 2026-06-15 06:22:40 +00:00
  • 583fc2a0dc chore: append-mode data file; engine -> c6c7ce8 (orig-based stateless/lean) mfowler 2026-06-15 03:19:09 +00:00
  • fc0608ede1 feat: builder-solo control runner (run after campaign) + limit-detect for it mfowler 2026-06-15 02:36:58 +00:00
  • 25a77f5d3c fix: flag usage-limit-affected runs; correct tok/sec mfowler 2026-06-15 01:29:54 +00:00
  • 33eeb3ce6b feat: analyze.py — efficiency ratios (tokens/LOC, tokens/sec, tokens/commit) mfowler 2026-06-15 00:15:46 +00:00
  • dbe9ef9c72 feat: keep run repos + record commits/LOC per run mfowler 2026-06-15 00:13:08 +00:00
  • 37032ee363 feat: campaign mode — repeat each variant N times, aggregate distributions mfowler 2026-06-14 22:19:10 +00:00
  • b46dca003c results: 4-way + the variance finding (N=1 is not enough) mfowler 2026-06-14 22:06:21 +00:00
  • cca5c895b2 feat: add builder-adversary-lean variant; runner takes variant args mfowler 2026-06-14 21:43:11 +00:00
  • 0fa3d726a5 results: full-harness 3-way (orig/min/stateless) on the calculator mfowler 2026-06-14 21:35:47 +00:00
  • a1b59e1bc5 feat: add stateless variant, pre-trust work dirs, loop over 3 variants mfowler 2026-06-14 20:52:29 +00:00
  • 11eda4a8b1 chore: gitignore the runner's transient .tmp file mfowler 2026-06-14 20:40:26 +00:00
  • 8c3f38dbf4 feat: multi-phase calculator problem + full-harness benchmark runner mfowler 2026-06-14 20:40:14 +00:00
  • 27df2c7b55 feat: agent-orchestrator-benchmark — prompt token comparison harness mfowler 2026-06-14 20:20:05 +00:00