Methodology
20 tasks. 5 trials per task. 100 trials total.
Task sets
Rain
Event-sourced DSL built in Zig. A purpose-built language with a variety of sophisticated modules that make for a realistic engineering playground with no pretraining biases.
- 4 feature tasks
- 3 bug tasks
- 3 refactor tasks (including a full Rust rewrite)
pgx
PostgreSQL driver in Go, a real GitHub project. A larger project with real-world complexity and scale to navigate.
- 3 feature tasks
- 5 bug tasks
- 2 refactor tasks
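
As a quick sanity check on the totals, the task matrix above can be tallied in a few lines. This is a minimal sketch: the counts are copied from the two lists, while the dictionary layout and the function names are illustrative assumptions and not part of any actual harness.

```python
# Task counts copied from the lists above.
# The structure and names here are hypothetical, for illustration only.
TRIALS_PER_TASK = 5

TASK_SETS = {
    "Rain": {"feature": 4, "bug": 3, "refactor": 3},
    "pgx": {"feature": 3, "bug": 5, "refactor": 2},
}


def tasks_per_set(task_sets):
    """Return the number of tasks in each task set."""
    return {name: sum(kinds.values()) for name, kinds in task_sets.items()}


def total_tasks(task_sets):
    """Return the number of tasks across all task sets."""
    return sum(tasks_per_set(task_sets).values())


def total_trials(task_sets, trials_per_task):
    """Return the number of trials when every task runs trials_per_task times."""
    return total_tasks(task_sets) * trials_per_task


if __name__ == "__main__":
    print(tasks_per_set(TASK_SETS))
    print(total_tasks(TASK_SETS))
    print(total_trials(TASK_SETS, TRIALS_PER_TASK))
```

Each task set contributes 10 tasks, which gives the 20 tasks and 100 trials stated at the top of the section.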