Obviously, they haven't figured out anything remotely sentient. It's cool as fuck, but it's not actually thinking. Thinking requires learning. You could show it a cat and it would still tell you it's a dog, no matter how many times you try to tell it otherwise.
Nothing about sentience is obvious. If the trees were sentient, would it be obvious? Is it therefore obvious that they're not? I think it's a no in both cases. The same argument applies to AI models.
Sentience is at once too high a standard and too low a standard for AI.
It's too high in that it requires actual consciousness, which may be a very tough architectural problem at best (if functionalism is true) or an unknowable metaphysical mystery at worst (if some form of substance or property dualism is true).
And it's much too low a standard in that many, many sentient creatures are nowhere near intelligent enough to be useful assistants in the domains where we want to use AI.
My hunch is generative pre-trained transformers aren't going to do it just by scaling. Humans learn and modify their models as they go, it isn't all pre-training and then fixed. We need a modified algorithm.
The current situation is kind of like a grand prize where Zuck or similar will hand $1bn to anyone who cracks it. That's a huge incentive for people to have a go.
It's a perfect situation for Nvidia. You can see that after months of trying to squeeze out every % of marginal improvement, sama and co decided to brand this GPT-4.0.0.1 version as GPT-5. This is all happening on NVDA hardware, and they are gonna continue desperately iterating on tiny model efficiencies until all that sweet sweet VC cash propping up these valuations runs out (most of it going directly or indirectly to NVDA).
Yeah if 'worlds apart in style' means 'kinda similar'.
There was a joke in this thread about the ChatGPT sommeliers who discuss the subtle differences between the various models nowadays.
It's funny 'cause in the last year the models have kind of converged in almost every aspect, but the fanbase, like pretentious sommeliers, keeps trying to convince us that the subtle 0.05% difference on some obscure benchmark is really significant and that they, the experts, can really feel the difference.
Yes, it has the familiar hints of oak that us chat lovers so enjoy, but even a non-initiated pleb like me definitely feels it's less refined than the citrus notes of o4.
Putting on my speculator hat here, it's as much about psychology and crowd behavior as fundamentals. Probably wait till it drops 30% and the news has "is it all over for AI?" stories. It'll then bounce back and then you sell the top of the bounce back.
I think one thing to look out for are "deliberately" slow models. We are currently using basically all models as if we needed them in an instant loop, but many of these applications do not have to run that fast.
To tell a made-up anecdote: A colleague told me how his professor friend was running statistical models over night because the code was extremely unoptimized and needed 6+ hours to compute. He helped streamline the code and took it down to 30 minutes, which meant the professor could run it before breakfast instead.
We are completely fine with giving a task to a junior dev for a couple of days and seeing what happens. Right now we love the quick feedback of running Claude Max for a hundred bucks, but if we could run it for a buck overnight? That would be quite fine for me as well.
I don’t really see how this works though — Isn’t it the case that longer “compute” times are more expensive? Hogging a gpu overnight is going to be more expensive than hogging it for an hour.
Nah, it’d take all night because it would be using the GPU for only a fraction of the time, splitting the time with other customers’ tokens, and letting higher-priority workloads preempt it.
If you buy enough GPUs to do 1000 customers’ requests in a minute, you could run 60 requests for each of those customers in an hour, or you could run a single request each for 60,000 customers in that same hour. The latter can be much cheaper per customer if people are willing to wait. (In reality it’s a big N x M scheduling problem, and there are tons of ways to offer tiered pricing where cost and time are the main tradeoffs.)
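To make that arithmetic concrete, here's a toy sketch. All the numbers are made up for illustration (the $6000/hour fleet cost especially); the point is only that a fixed hourly fleet cost divided across more waiting customers drives the per-customer price down.

```python
# Hypothetical fleet: serves 1000 requests per minute, run for one hour.
fleet_capacity_per_minute = 1000
minutes = 60
total_requests_per_hour = fleet_capacity_per_minute * minutes  # 60,000

# Option A: interactive tier -- 1000 customers each firing 60 requests/hour.
interactive_customers = total_requests_per_hour // 60  # 1000

# Option B: overnight/batch tier -- every customer gets exactly one request.
batch_customers = total_requests_per_hour  # 60,000

# With a fixed hourly fleet cost, cost per customer scales inversely with
# how many customers share that hour of capacity.
fleet_cost_per_hour = 6000.0  # made-up dollar figure
cost_per_interactive_customer = fleet_cost_per_hour / interactive_customers
cost_per_batch_customer = fleet_cost_per_hour / batch_customers

print(f"interactive: ${cost_per_interactive_customer:.2f} per customer-hour")
print(f"batch:       ${cost_per_batch_customer:.2f} per customer-hour")
```

Same hardware, same hour, 60x cheaper per customer in the batch tier; that's the margin a "deliberately slow" overnight pricing tier would be carving up.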