> They are bad at deciding requirements by themselves.
What do you mean by requirements here? In my experience the frontier models today are pretty good at figuring out requirements, even when you don't explicitly state them.
> They are bad at original research
Sure, I don't have any experience with that, so I'll trust you on that.
> for example developing a new algorithm.
This is just not correct. I used to think so too, but a couple of months ago I was working on a pretty complicated multi-dimensional pattern-matching algorithm (I can't go into the details). It was something I could have figured out on my own, and I was halfway through it, but I decided to write up a description of it and feed it to gemini 2.5 pro - and I was stunned.
It came up with a really clever approach - exactly the kind of thing I had previously been convinced the models weren't capable of.
In hindsight, since they are getting so good at math in general, there's probably some overlap with algorithm design, but you should revisit your views on this.
--
Your 'bad at' list is missing a few things though:
- Calculations (they can work out how to calculate something, or write a program that calculates it from given data, but they are not good at doing the arithmetic directly in their responses)
- Even though the frontier models are multi-modal, they are still bad at visualizing html/css - that is, predicting what the rendered page would actually look like
- The same goes for visualizing or diagnosing visual errors in graphics programming, such as game programming or 3d modeling (z-index issues, orientation, etc.)
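To make the calculations point concrete: rather than trusting a model's in-text arithmetic, the reliable pattern is to have it emit a small program and run that. A minimal sketch (the numbers here are made up for illustration):

```python
# Instead of asking the model "what is the total and mean of these values?"
# and trusting the arithmetic in its reply, ask it to produce code like this
# and execute it - the interpreter does the arithmetic exactly.
values = [19.99, 4.25, 103.5, 0.07]

total = sum(values)          # exact summation, no mental-math drift
mean = total / len(values)   # exact division

print(f"total={total:.2f} mean={mean:.4f}")
```

The same idea is why tool-use/code-interpreter setups exist: the model is good at deciding *how* to compute, so delegate the *computing* itself.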