ComputeBench: Instruction-following benchmarks for long, step-by-step arithmetic (notdian.github.io)
1 point by notdian 57 days ago | 1 comment


I vibecoded this after seeing models do amazing things but still drift on simple recursive steps. It tracks three metrics: exact match, answer accuracy, and prefix correctness. Feedback welcome.
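
For anyone wondering what the three metrics might look like in practice, here is a minimal sketch. It is not ComputeBench's actual code: the function names and the definitions are assumptions. It assumes a model's output is parsed into a list of intermediate values, that exact match means every step agrees, that answer accuracy looks only at the final value, and that prefix correctness is the fraction of expected steps matched before the first drift.

```python
from typing import Sequence


def exact_match(predicted: Sequence[int], expected: Sequence[int]) -> bool:
    """Assumed definition: every step matches, including the length."""
    return list(predicted) == list(expected)


def answer_accuracy(predicted: Sequence[int], expected: Sequence[int]) -> bool:
    """Assumed definition: only the final value has to match."""
    if not predicted or not expected:
        return False
    return predicted[-1] == expected[-1]


def prefix_correctness(predicted: Sequence[int], expected: Sequence[int]) -> float:
    """Assumed definition: share of expected steps matched before the first error."""
    if not expected:
        return 1.0
    correct = 0
    for got, want in zip(predicted, expected):
        if got != want:
            break
        correct += 1
    return correct / len(expected)


# Hypothetical task: start at 1 and apply x -> 2*x + 1 four times.
expected_steps = [1, 3, 7, 15, 31]
drifted_steps = [1, 3, 7, 14, 29]    # goes wrong at the fourth value
recovered_steps = [1, 3, 7, 16, 31]  # wrong step, but the final answer is right

print(exact_match(drifted_steps, expected_steps))
print(answer_accuracy(recovered_steps, expected_steps))
print(prefix_correctness(drifted_steps, expected_steps))
```

Under these assumed definitions, the drifted run fails exact match and answer accuracy but still gets credit for its first three steps, which is the kind of drift the post describes.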



