docs: FINDINGS.md — benchmark synthesis; track raw results data

Capstone summary of the Builder/Adversary prompt + verification-cadence study:
- the adversary's EXISTENCE costs ~4.7x (solo 2.8M vs ~13M); cadence is ~token-neutral
- context hygiene is the one clean win (-22%); minimal prompts save 25% but test less
- deferred review saves nothing (its single comprehensive pass is expensive) and arrives late
- cost tracks process, not product (correlation of tokens with duration 0.83, commits 0.79, LOC -0.04)
All results are now in-repo: FINDINGS.md + RESULTS-campaign.md + raw .data + runners.
(Deferred: N=3 so far, finalizing to N=5.)
2026-06-16 01:53:34 +00:00
parent 819000417b
commit 3bf3316572
4 changed files with 164 additions and 9 deletions

.gitignore (2 changes)

@@ -4,5 +4,3 @@ __pycache__/
 *.pyc
 *.tmp
 RESULTS-harness.md.tmp
-RESULTS-campaign.md.data
-RESULTS-campaign.md.data.hdr