Hacker News

My point was that a prompt that simple could be handled and executed very well by Sonnet, while all the other models (especially reasoning models) crash and burn.

It's a 15-line tsx file, so context shouldn't be an issue.

Makes me wonder whether reasoning models are really the right models for coding in existing codebases.



Your last point matches what I’ve seen some people (simonw?) say they’re doing currently: using aider with two models, a reasoning model as the architect and a standard LLM as the actual coder. Surprisingly, the results seem better than putting everything on one model.
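For reference, aider exposes this split directly via its architect mode: one model plans the change, a second "editor" model writes the actual code. A hypothetical invocation might look like the following (the model names are placeholders; check aider's docs for the identifiers your API keys support):

```shell
# Architect mode: the --model plans edits at a high level,
# the --editor-model turns that plan into concrete file edits.
# Model names below are illustrative, not a recommendation.
aider --architect \
      --model o1 \
      --editor-model claude-3-5-sonnet-latest
```

In practice this mirrors the thread's observation: the reasoning model never touches the files, so its weakness at line-level implementation doesn't matter.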


This is probably the right way to think about it. o1-pro is an absolute monster when it comes to architecture; the breadth and depth it sees is staggering. Ask it to actually implement something, though, and it trips over its shoelaces almost immediately.


Can you give an example of this monstrous capability you speak of? What have you used it for professionally with respect to architecture?


The biggest delta over regular o1 that I've seen is asking it to make a PRD of an app that I describe as stream-of-consciousness bullet points.

It's fantastic at finding needles in the haystack, so the contradictions are nonexistent. In other words, it seems to identify which objects would interrelate and builds around those nodes, where o1 seems to think more in "columns."

To sum it up, where o1 feels like "5 human minutes of thinking," o1-pro feels like "1 human hour of thinking."



