pfunctional's comments

You're super right -- this is probably the one crack in our narrative and one that I sorely need to address. Hope to be back with something positive on this front soon; we're setting up all the benchmark harnesses to do this more equitably.


I think we have one on the site right now -- it's roughly 4.1-mini pricing. We're not aiming to make money off of individual users, which is why we're trialing a free thing (and trying to partner with open-source frameworks). Our bread and butter is more companies doing this at scale & licensing.


(Preston, other guy on the team)

Yes, they can -- I actually tried a semantic edit implementation in Aider. It got the "correct edit format" percentage to 100%, but didn't really budge the overall percent correct on SOTA models. I should push it sometime, since it really helps the reliability of these local models like Qwen3. If you reach out to me, I can try to share some of this code with you as well (it needs to be cleaned up).

But yes: 1. have some code, 2. create a patch (semantic, diff, or udiff formats all work), and 3. Apply will return the merged result to you very fast. When we last benchmarked, using Claude 3.7 Sonnet to create diff patches had roughly a 10-15% merge error rate, and with us it was 4%. You can also use Apply as a backup if the merge fails.
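
To make the "backup" idea concrete, here is a rough sketch of how that could look on the caller's side: try a plain search/replace patch locally first, and only hand off to an apply model when the patch does not merge cleanly. This is not our actual client code. `MergeError`, `apply_search_replace`, `edit_with_fallback`, and the `fallback_apply` callable are all hypothetical names for illustration, and the real Apply call would go wherever `fallback_apply` is passed in.

```python
from typing import Callable


class MergeError(Exception):
    """Raised when a search/replace patch does not match the source exactly once."""


def apply_search_replace(source: str, search: str, replace: str) -> str:
    """Apply one search/replace edit; fail if the search block is missing or ambiguous."""
    if not search:
        raise MergeError("empty search block")
    count = source.count(search)
    if count != 1:
        raise MergeError(f"expected exactly 1 match for search block, found {count}")
    return source.replace(search, replace, 1)


def edit_with_fallback(
    source: str,
    search: str,
    replace: str,
    fallback_apply: Callable[[str, str], str],
) -> str:
    """Try the cheap local merge first; on failure, defer to a fallback.

    `fallback_apply(source, update)` is a hypothetical stand-in for whatever
    apply model or service you use. It receives the original code and the
    intended new snippet, and returns the merged file.
    """
    try:
        return apply_search_replace(source, search, replace)
    except MergeError:
        return fallback_apply(source, replace)
```

The exactly-one-match check is deliberately strict: a search block that matches zero or several places is the kind of merge error the percentages above refer to, so those cases are the ones routed to the fallback.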


What's the semantic diff format?

