
When to short NVIDIA? I guess when the Chinese get their card production going.


Short?

It's a perfect situation for Nvidia. You can see that after months of trying to squeeze out every last percent of marginal improvement, sama and co decided to brand this GPT-4.0.0.1 version as GPT-5. This is all happening on NVDA hardware, and they're going to keep desperately iterating on tiny model efficiencies until all that sweet, sweet VC cash propping up the valuations runs out (most of it going directly or indirectly to NVDA).


I'd rather they just call it GPT-5 than something like GPT 4.1o-Pro-Max, per their current nightmare naming convention. I've lost track of what the 'best' model is.


They are all... kinda the same?


No, they're really not. o3 and 4o are worlds apart in style and substance. Two completely different models


Yeah if 'worlds apart in style' means 'kinda similar'.

There was a joke earlier in this thread about the ChatGPT sommeliers discussing the subtle differences between the various models nowadays.

It's funny cause in the last year the models have kind of converged in almost every aspect, but the fanbase, kind of like pretentious sommeliers, is trying to convince us that the subtle 0.05% difference on some obscure benchmark is really significant and that they, the experts, can really feel the difference.

It's hilarious and sad at the same time.


Have you used o3 more than 10 times?


Yes, it has the familiar hints of oak that we chat lovers so enjoy, but even an uninitiated pleb like me definitely feels it's less refined than the citrus notes of o4.


Putting on my speculator hat here: it's as much about psychology and crowd behavior as fundamentals. Probably wait till it drops 30% and the news is full of "is it all over for AI?" stories. It'll then bounce, and you sell the top of that bounce.


It's good for NVDA if the AI companies can't squeeze more performance out of the same compute, which is the case if GPT-5 underperforms


At some point the AI companies will run out of fools to give them money.


I think one thing to look out for is "deliberately" slow models. We currently use basically all models as if we needed them in an instant feedback loop, but many of these applications don't have to run that fast.

To tell a made-up anecdote: a colleague told me how his professor friend was running statistical models overnight because the code was extremely unoptimized and needed 6+ hours to compute. He helped streamline the code and got it down to 30 minutes, which meant the professor could run it before breakfast instead.

We're completely fine with giving a task to a junior dev for a couple of days and seeing what happens. Right now we love the quick feedback of running Claude Max for a hundred bucks, but if we could run it overnight for a buck? That would be quite fine with me as well.
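
To make that concrete, here's a toy sketch of a "submit it tonight, read it over breakfast" flow. BatchClient, submit(), and poll() are made-up names purely for illustration, not any provider's actual batch API:

    # Hypothetical low-priority "slow lane": trade latency for price.
    class BatchClient:
        """Toy stand-in for a provider's batch endpoint (not a real API)."""

        def __init__(self) -> None:
            self._jobs: dict[str, dict] = {}

        def submit(self, prompt: str, priority: str = "overnight") -> str:
            # The provider would queue this behind interactive traffic and run it
            # whenever GPUs would otherwise sit idle, which is why it can be cheap.
            job_id = f"job-{len(self._jobs)}"
            self._jobs[job_id] = {"prompt": prompt, "priority": priority, "status": "queued"}
            return job_id

        def poll(self, job_id: str) -> dict:
            return self._jobs[job_id]

    client = BatchClient()
    job_id = client.submit("Work through this backlog of code-review comments.")
    print(client.poll(job_id))  # check again in the morning; still {'status': 'queued', ...} here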


I don't really see how this works, though. Isn't it the case that longer "compute" times are more expensive? Hogging a GPU overnight is going to be more expensive than hogging it for an hour.


Nah, it'd take all night because it would be using the GPU for only a fraction of the time, splitting the time with other customers' tokens and letting higher-priority workloads preempt it.

If you buy enough GPUs to serve 1000 customers' requests in a minute, you could run 60 requests for each of those customers in an hour, or you could run a single request each for 60,000 customers in that same hour. The latter can be much cheaper per customer if people are willing to wait. (In reality it's a big N x M scheduling problem, and there are tons of ways to offer tiered pricing where cost and time are the main tradeoffs.)
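
Back-of-the-envelope in code, with made-up capacity and fleet-cost numbers just to show the per-customer arithmetic:

    # Same GPU fleet, same total throughput; only how it's sliced across customers changes.
    fleet_requests_per_minute = 1000      # capacity sized for 1000 customers' requests per minute
    fleet_cost_per_hour = 600.0           # hypothetical hourly cost of that fleet, in dollars

    requests_per_hour = fleet_requests_per_minute * 60    # 60,000 requests either way

    # Interactive tier: 1000 customers each firing 60 requests in the hour.
    interactive_cost_per_customer = fleet_cost_per_hour / 1000           # $0.60

    # Batch tier: 60,000 customers each getting a single request in the hour.
    batch_cost_per_customer = fleet_cost_per_hour / requests_per_hour    # $0.01

    print(interactive_cost_per_customer, batch_cost_per_customer)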



