Obviously, they haven't figured out anything remotely sentient. It's cool as fuck, but it's not actually thinking. Thinking requires learning. You could show it a cat and it would still tell you it's a dog, no matter how many times you try to tell it otherwise.
Nothing about sentience is obvious. If the trees were sentient, would it be obvious? Is it therefore obvious that they're not? I think it's a no in both cases. The same argument applies to AI models.
Sentience is at once too high a standard and too low a standard for AI.
It's too high in that it requires actual consciousness, which may be a very tough architectural problem at best (if functionalism is true) or an unknowable metaphysical mystery at worst (if some form of substance or property dualism is true).
And it's much too low a standard in that many, many sentient creatures are nowhere near intelligent enough to be useful assistants in the domains where we want to use AI.
My hunch is generative pre-trained transformers aren't going to do it just by scaling. Humans learn and modify their models as they go, it isn't all pre-training and then fixed. We need a modified algorithm.
The current situation is kind of like a grand prize where Zuck or similar will hand $1bn to anyone who cracks it. That's a huge incentive for people to have a go.
It's a perfect situation for Nvidia. You can see that after months of trying to squeeze out every % of marginal improvement, sama and co decided to brand this GPT-4.0.0.1 version as GPT-5. This is all happening on NVDA hardware, and they are gonna continue desperately iterating on tiny model efficiencies until all that sweet sweet VC cash propping up these valuations runs out (most of it going directly or indirectly to NVDA).
Yeah if 'worlds apart in style' means 'kinda similar'.
There was a joke in this thread about the ChatGPT sommeliers who discuss the subtle differences between the various models nowadays.
It's funny 'cause in the last year the models have kind of converged in almost every aspect, but the fanbase, like pretentious sommeliers, keeps trying to convince us that the subtle 0.05% difference on some obscure benchmark is really significant and that they, the experts, can really feel the difference.
Yes, it has the familiar hints of oak that us chat lovers so enjoy, but even a non-initiated pleb like me definitely feels it's less refined than the citrus notes of o4.
Putting on my speculator hat here, it's as much about psychology and crowd behavior as fundamentals. Probably wait till it drops 30% and the news has "is it all over for AI?" stories. It'll then bounce back and then you sell the top of the bounce back.
I think one thing to look out for are "deliberately" slow models. We are currently using basically all models as if we needed them in an instant loop, but many of these applications do not have to run that fast.
To tell a made-up anecdote: A colleague told me how his professor friend was running statistical models over night because the code was extremely unoptimized and needed 6+ hours to compute. He helped streamline the code and took it down to 30 minutes, which meant the professor could run it before breakfast instead.
We are completely fine with giving a task to a junior dev for a couple of days and seeing what happens. Right now we love the quick feedback of running Claude Max for a hundred bucks, but if we could run it for a buck overnight? That would be quite fine for me as well.
I don’t really see how this works though — Isn’t it the case that longer “compute” times are more expensive? Hogging a gpu overnight is going to be more expensive than hogging it for an hour.
Nah, it’d take all night because it would be using the GPU for only a fraction of the time, splitting the time with other customers’ tokens, and letting higher-priority workloads preempt it.
If you buy enough GPUs to do 1000 customers’ requests in a minute, you could run 60 requests for each of those customers in an hour, or you could run a single request each for 60,000 customers in that same hour. The latter can be much cheaper per customer if people are willing to wait. (In reality it’s a big N x M scheduling problem, and there are tons of ways to offer tiered pricing where cost and time are the main tradeoffs.)
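To make that arithmetic concrete, here's a toy sketch. All the numbers are made up for illustration (the $6000/hour fleet cost especially); the point is only that a fixed hourly fleet cost divided across more waiting customers drives the per-customer price down.

```python
# Hypothetical fleet: serves 1000 requests per minute, run for one hour.
fleet_capacity_per_minute = 1000
minutes = 60
total_requests_per_hour = fleet_capacity_per_minute * minutes  # 60,000

# Option A: interactive tier -- 1000 customers each firing 60 requests/hour.
interactive_customers = total_requests_per_hour // 60  # 1000

# Option B: overnight/batch tier -- every customer gets exactly one request.
batch_customers = total_requests_per_hour  # 60,000

# With a fixed hourly fleet cost, cost per customer scales inversely with
# how many customers share that hour of capacity.
fleet_cost_per_hour = 6000.0  # made-up dollar figure
cost_per_interactive_customer = fleet_cost_per_hour / interactive_customers
cost_per_batch_customer = fleet_cost_per_hour / batch_customers

print(f"interactive: ${cost_per_interactive_customer:.2f} per customer-hour")
print(f"batch:       ${cost_per_batch_customer:.2f} per customer-hour")
```

Same hardware, same hour, 60x cheaper per customer in the batch tier; that's the margin a "deliberately slow" overnight pricing tier would be carving up.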