ruby-skill-bench orchestrates evaluation runs of AI coding agents inside isolated git sandboxes, then scores the results using deterministic and LLM-powered judges.
Required Ruby Version
>= 3.1
Authors
Ismael Marin