I built evo for those of you who are autoresearch-pilled, or who have been meaning to get into autoresearch but don't know how. It's an open-source Claude Code & Codex plugin that optimizes code through experiments.
You hand it a codebase. It finds a benchmark, runs the baseline, then fires off parallel agents that try to beat it. Changes are kept if better, discarded if worse.
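The keep-if-better loop can be sketched in a few lines (names are illustrative, not evo's actual internals):

```python
def keep_if_better(baseline_score, experiment_scores):
    """Greedy accept loop: an experiment's change is kept only if it
    beats the current best score; otherwise it is discarded."""
    best = baseline_score
    kept = []
    for idx, score in enumerate(experiment_scores):
        if score > best:  # improvement: keep the change
            best = score
            kept.append(idx)
        # regression: discard (nothing to do in this sketch)
    return best, kept

# baseline 1.00; experiments 1 and 3 improve on the running best
best, kept = keep_if_better(1.00, [0.95, 1.10, 1.05, 1.30])
# best == 1.30, kept == [1, 3]
```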
Inspired by Karpathy's autoresearch, but with structure on top:
- tree search instead of greedy hill-climbing: multiple forks can branch from any committed node
- N parallel agents in git worktrees
- shared failure traces so agents don't repeat each other's mistakes
- regression gates
Under the hood, each experiment is a git worktree branching from its parent. On a score improvement that passes the gates, the change is committed; on a regression, the change is discarded and the worktree is cleaned up. Everything is observable in a local dashboard.
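That lifecycle maps onto plain git plumbing; here's a rough Python sketch of it (the function names are mine, not evo's API):

```python
import os
import subprocess


def git(repo, *args):
    # thin wrapper: run a git command inside the given repo/worktree
    subprocess.run(["git", "-C", repo, *args], check=True)


def fork_experiment(repo, name, parent="HEAD"):
    """Create a worktree on a new branch forked from the parent commit."""
    path = os.path.join(os.path.dirname(os.path.abspath(repo)), name)
    git(repo, "worktree", "add", "-b", name, path, parent)
    return path


def commit_experiment(path, message):
    """Score improved and gates passed: commit on the experiment branch."""
    git(path, "add", "-A")
    git(path, "commit", "-m", message)


def discard_experiment(repo, name, path):
    """Regression: remove the worktree and delete its branch."""
    git(repo, "worktree", "remove", "--force", path)
    git(repo, "branch", "-D", name)
```

Because each experiment lives in its own worktree and branch, parallel agents never step on each other's working directories, and discarding a failed experiment is just deleting the worktree and branch.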
Apache 2.0, no signup, no API keys beyond what Claude Code already has:
```
/plugin marketplace add evo-hq/evo
/plugin install evo@evo-hq-evo
```
This article was originally published by DEV Community and written by Alok Bishoyi.