
IMO by far the best improvement would be to make it easier to force the agent to use a success criterion.

Right now it's not easy to prompt Claude Code (for example) to keep fixing until a test suite passes. It always does some fixed amount of work until it feels it's most of the way there, and then stops. So I have to babysit it and keep telling it that yes, I really do mean for it to make the tests pass.
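To make "success criterion" concrete, here is a rough sketch of the outer loop I end up doing by hand. It is not a feature of any real agent. `run_until_green` and `invoke_agent` are hypothetical names, `invoke_agent` stands in for however you hand a prompt to your agent, and the default `pytest` command is just an assumption about the test runner. The only criterion is the test command's exit code.

```python
import subprocess
import sys
from typing import Callable, Sequence


def run_until_green(
    invoke_agent: Callable[[str], None],
    # Assumption: the project's tests run under pytest; swap in any command.
    test_cmd: Sequence[str] = (sys.executable, "-m", "pytest", "-q"),
    max_rounds: int = 10,
) -> bool:
    """Re-invoke the agent until test_cmd exits 0 or max_rounds is used up."""
    for _ in range(max_rounds):
        result = subprocess.run(list(test_cmd), capture_output=True, text=True)
        if result.returncode == 0:
            return True  # success criterion met: the suite passes
        # Hand the failing output back to the (hypothetical) agent callback.
        invoke_agent(result.stdout + result.stderr)
    # One last check after the final agent round.
    final = subprocess.run(list(test_cmd), capture_output=True, text=True)
    return final.returncode == 0
```

The point is that the stopping condition lives outside the agent: it stops when the exit code is 0 or the round budget runs out, not when it feels mostly done.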


