Hacker News | atleastoptimal's comments

I feel like I'm caught in between two schizophrenically myopic perspectives on AI.

One being:

>Generative AI is a product of VC-funding enabled hype, enormous subsidies and fraudulent results. No AI code "really" works or contributes to productivity, and soon the bubble will burst, returning Real Software Engineers to their former peerless ascendency.

And the other perspective:

>The AI boom will be the last chance to make money, after which point your socioeconomic status circa 2028 will be the permanent station of all your progeny, who will enjoy a heavenly post-scarcity experience with luxury amenities scaled by your PageRank equivalent of social connections to employees at leading AI labs.


Only about 20% of front-page links are related to AI. I think it's impossible to have a productive discussion on the tech industry nowadays without AI in context.

¯\_(ツ)_/¯ There are far fewer posts about "politics" than AI, and they all get flagged ruthlessly even when they have a legitimate tech angle.

If people don't like AI related content then I encourage them to treat those posts the way they do politics.


On the contrary, I don't think productive discussions (or even interesting ones) can be had about AI. We've seen what it has to offer (not much), now we are just waiting for the hype bubble to burst like it did for blockchain and so many other things before.

No matter what, people are going to keep using cars because they have an absolute advantage over public transportation for certain use cases. It is better to improve the existing status quo to reduce death rates than to hope for a much larger-scale change in infrastructure (when we have already seen that attempts at infrastructure overhaul in the US, like high-speed rail, are just infinitely deep money pits).

Even though the train system in Japan is 10x better than the US as a whole, the per-capita vehicle ownership rate in Japan is not much lower than in the US (about 670 per 1,000 vs. 779 per 1,000). It would be a pipe dream for American trains/subways to be as good as Japan's, but even a change that significant would reduce vehicle ownership by only about 13%.


It's very simple: xAI needs money to win the AI race, so the best option is to attach to Elon's money bank (SpaceX) to get cash without dilution.

> xAI needs money to win the AI race

Off on a tangent here but I'd love for anyone to seriously explain how they believe the "AI race" is economically winnable in any meaningful way.

Like what is the believed inflection point that changes us from the current situation (where all of the state-of-the-art models are roughly equal if you squint, and the open models are only like one release cycle behind) to one where someone achieves a clear advantage that won't be reproduced by everyone else in the "race" virtually immediately.


I _think_ the idea is that the first one to hit self improving AGI will, in a short period of time, pull _so_ far ahead that competition will quickly die out, no longer having any chance to compete economically.

At the same time, it'd give the country controlling it so much economic, political and military power that it becomes impossible to challenge.

I find that all to be a bit of a stretch, but I think that's roughly what people talking about "the AI race" have in mind.


Like any other mega-scaler, they're just playing Money Chicken.

Everyone is spending crazy amounts of money in the hopes that the competition will tap out because they can't afford it anymore.

Then they can cool down on their spending and increase prices to a sustainable level because they have an effective monopoly.


Money Chicken is the best term I've seen for this!

They ultimately want to own everyone's business processes, is my guess. You can only jack up the subscription prices on coding models and chatbots by so much, as everyone has already noted... but if OpenAI runs your "smart" CRM and ERP flows, they can really tighten the screws.

If you have the greatest coding agent under your thumb, eventually you orient it toward eating everything else instead of letting everybody else use your agent to build software & make money. Go forward ten years, it's highly likely GPT, Gemini, maybe Claude - they'll have consumed a very large amount of the software ecosystem. Why should MS Office exist at all as a separate piece of software? The various pieces of Office will be trivial for the GPT (etc) of ten years out to fully recreate & maintain internally for OpenAI. There's no scenario where they don't do what the platforms always do: eat the ecosystem, anything they can. If a platform can consume a thing that touches it, it will.

Office? Dead. Box? Dead. DropBox? Dead. And so on. They'll move on anything that touches users (from productivity software to storage). You're not going to pay $20-$30 for GPT and then pay for DropBox too, OpenAI will just do an Amazon Prime maneuver and stack more onto what you get to try to kill everyone else.

Google of course has a huge lead on this move already with their various prominent apps.


Dropbox is actually a great example of why this isn't likely to happen. Deeper pocketed competition with tons of cloud storage and the ability to build easy upload workflows (including directly into software with massive install base) exists, and showed an active interest in competing with them. Still doing OK

Office's moat is much bigger (and its competition already free). "New vibe coded features every week" isn't an obvious reason for Office users to switch away from the platform their financial models and all their clients rely on to a new upstart software suite


> Off on a tangent here but I'd love for anyone to seriously explain how they believe the "AI race" is economically winnable in any meaningful way.

Because the first company to have a fully functioning AGI will most likely be the most valuable in the world. So it is worth all the effort to be the first.


> Because the first company to have a fully functioning AGI will most likely be the most valuable in the world.

This may be what they are going for, but there are two effectively religious beliefs with this line of thinking, IMO.

The first is that LLMs lead to AGI.

The second is that, even if the first did turn out to be true, the labs wouldn't all stumble into AGI at roughly the same time. Given how relatively lockstep all of the models have been for the past couple of years, simultaneous arrival seems far more likely to me than any single company having a breakthrough the others don't immediately reproduce.


Remember how he argued for Tesla's SolarCity acquisition because solar roofs?

Data centers in space are the same kind of justification imo.


Solar roofs are much more practical to be honest.

Putting solar roofs on a building? For a car company?

There's a synergy effect here - Tesla sells you a solar roof and car bundle, the roof comes without a battery (making it cheaper) and the car now gets a free recharge whenever you're home (making it cheaper in the long term).

Of course that didn't work out with this specific acquisition, but overall it's at least a somewhat reasonable idea.


In comparison to datacenters in space, yes. Solar roofs are already a profitable business, just not likely to be high-growth. Datacenters in space are unlikely to ever make financial sense, and even if they did, they are very unlikely to show high growth due to the ongoing high capital expenses inherent in the model.

I think a better critique of space-based data centres is not that they never become high-growth; it's that if they ever do, it implies the economy is radically different from the one we live in, to the degree that all our current ideas about wealth and nations and ownership and morality and crime & punishment seem quaint and outdated.

The "put 500 to 1000 TW/year of AI satellites into deep space" scenario, for example, is as far ahead of the entire planet Earth today as the entire planet today is ahead of just Europe right after the fall of Rome. Multiplicatively, not additively.

There's no reason to expect any current business (or nation, or any given asset) to survive that kind of transition intact.


For an electrification company.

It's obviously a pretty weird thing for a car company to do, and is probably just a silly idea in general (it has little obvious benefit over normal solar panels, and is vastly more expensive and messy to install), but in principle it could at least work, for some value of "work". The space datacenter thing is a nonsensical fantasy.

> win the AI race

I keep seeing that term, but if it does not mean "AI arms race" or "AI surveillance race", what does it mean?

Those are the only explanations that I have found, and neither is any race that I would like to see anyone win.


Big tech businesses are convinced that there must be some profitable business model for AI, and are undeterred by the fact that none has yet been found. They want to be the first to get there, raking in that sweet sweet money (even though there's no evidence yet that there is money to be made here). It's industry-wide FOMO, nothing more.

Typically in capitalism, if there is any profit, the race is towards zero profit. The alternative is a race to bankrupt all competitors at enormous cost in order to jack up prices and recoup the losses as a monopoly (or duopoly, or some other stable arrangement). I assume the latter is the goal, but that means burning through like 50%+ of American GDP growth just to be undercut by China.

IMO I would be extremely angry if I owned any SpaceX equity. At least Nvidia might be selling to China in the short term... what's the upside for SpaceX?


> The alternative is a race to bankrupt all competitors at enormous cost in order to jack up prices and recoup the losses as a monopoly

I don't know of an instance of this happening successfully.


Walmart? It's certainly more successful in physical markets

See Amazon

Are you saying that Amazon is a successful monopoly, or that Amazon is even with massive expenses still not a full monopoly?

Walmart competes with Amazon.

Different markets entirely—I can't walk into amazon, and I don't order online from Walmart.

You can order online from Walmart:

https://www.walmart.com/

Amazon can ship it to a location near you.


Again, different markets, because I'm not going to do either of those things—if I'm ordering online amazon has better selection, and if I want to walk somewhere to pick something up I'm not going to wait for shipping.

taxi apps, delivery apps, social media apps—all of these require a market that's extremely expensive to build but is also extremely lucrative to exploit and difficult to unseat. You see this same model with big-box stores displacing local stores. The secret to making a lot of money under capitalism is to have a lot of money to begin with.

Taxis are a government created monopoly.

None of the big-box stores have created a monopoly.

Amazon unseated behemoth Walmart with a mere $300,000 startup capital.

Musk founded his empire with $28,000.


> Taxis are a government created monopoly.

Taxi apps—uber & lyft. They moved into an area (often illegally); spent a shit-ton of money to displace local legal taxis, and then jacked up prices when the competition ceased to exist. Now I can't hail a taxi anymore if I don't have a phone.

> None of the big-box stores have created a monopoly.

They do in my region. Mom and pop shops are gone.

> Amazon unseated behemoth Walmart with a mere $300,000 startup capital.

We've been over this—they occupy different markets.

> Musk founded his empire with $28,000.

Sure. It would have been far easier to do with more capital.


Uber and Lyft compete with each other. The higher prices resulted from government mandates on pay for the drivers.

Amazon and Walmart do compete with each other. Neither has a monopoly. Nor have I noticed jacked up prices from them.


Amazon

See Walmart

People keep saying this but it's simply untrue. AI inference is profitable. OpenAI and Anthropic have 40-60% gross margins. If they stopped training and building out future capacity they would already be raking in cash.

They're losing money now because they're making massive bets on future capacity needs. If those bets are wrong, they're going to be in very big trouble when demand levels off lower than expected. But that's not the same as demand being zero.


Those gross margins aren't that useful a signal, since training a model of fixed capability is continually getting cheaper, so there's a treadmill effect where staying in business requires constantly training new models to not fall behind. If the big companies stop training models, they only have a year before someone else catches up with way less debt and puts them out of business.

Only if training new models leads to better models. If the newly trained models are just a bit cheaper but not better, most users won't switch. Then the entrenched labs can stop training so much and focus on profitable inference.

If they really have 40-60% gross margins, as training costs go down, the newly trained models could offer the same product at half the price.

Well, that's why the labs are building these app-level products like Claude Code/Codex: to lock their users in. Most of the money here is in business subscriptions, I think; how much savings would be required for businesses to switch to products that aren't better, just cheaper?

I think the real lock-in is in "CLAUDE.md" and similar rulesets, which are heavily AI specific.

Why would they be heavily "AI specific", when we're being told these things are approaching AGI and can just read arbitrary work documents?

> Openai and Anthropic have 40-60% gross margins.

Stop this trope please. We (1) don't really know what their margins are and (2) because of the hard tie-in to GPU costs/maintenance we don't know (yet) what the useful life (and therefore associated OPEX) is of GPUs.

> If they stopped training and building out future capacity they would already be raking in cash.

That's like saying "if car companies stopped researching how to make their cars more efficient, safer, more reliable they'd be more profitable"


It will be genuinely interesting to see what happens first, the discovery of such a model, or the bubble bursting.

A significant number of AI companies and investors are hoping to build a machine god. This is batshit insane, but I suppose it might be possible. Which wouldn't make it any more sane.

But when they say, "Win the AI race," they mean, "Build the machine god first." Make of this what you will.


On the edge of my seat waiting to see what hits us first, a massive economic collapse when the hype runs out, or the Torment Nexus.

It really seems like the market has locked in on one of those two things being a guaranteed outcome at this point.

It’s a graft to keep people distracted and allow for positioning as we fall off the end of the fossil energy boom.

It’s a framing device to justify the money, the idea being the first company (to what?) will own the market.

Being too far ahead for competitors to catch up, similar to how Google won browsers, Amazon won distribution, etc.

I’m not certain SpaceX is generating much cash right now?

Starship development is consuming billions. F9 & Starlink are probably profitable?

I’d say this is more shifting of the future burden of xAI to one of his companies he knows will be a hit stonk when it goes public, where enthusiasm is unlikely to be dampened by another massive cash drain on the books.


That may be the plan, but this is also a great way for GDPR's maximum fine, based on global revenue, to bite on SpaceX's much higher revenue. And without any real room for argument.

Knowing what users want and need is more the essence of a product manager, not a software engineer.

Software engineering is solving problems given a set of requirements, and determining the value, need and natural constraints of those requirements in a given system. Understanding users is a task that interfaces with software engineering but sits more on the "find any way to get this done" axis of value rather than the "here is how we will get it done" one.

I'd say what OP is referencing is that LLMs are increasingly adept at writing software that fulfills a set of requirements, with the prompter acting as a product manager. This devalues software engineers in that many somewhat difficult technical tasks, once the sole domain of SWEs, are now commodified via agentic coding tools.


That's a dangerous distinction in the AI era. If you reduce your work to solving problems given a set of requirements, you put yourself in direct competition with agents. LLMs are perfect for taking a clear spec and outputting code. A "pure" engineer who refuses to understand the product and the user risks becoming just middleware between the PM and the AI. In the future, the lines between PM and Tech Lead will blur, and the engineers who survive will be those who can not only "do as told" but propose "how to do it better for the business"


> Software engineering is solving problems given a set of requirements, and determining the value, need and natural constraints of those requirements in a given system

That’s the description of a mid-level code monkey according to every tech company with leveling guidelines, and that work was easily outsourced and commoditized before the age of AI.


If it was easily outsourced and commoditized before AI, how come mid-level code monkeys were making $200k+ at FAANG?


And most of the 3 million developers working in the US aren’t working for a FAANG and will never make over $200K inflation adjusted. If you look at the comp of most “senior developers” outside of FAANG and equivalent, you’ll see that comp has been stagnant and hasn’t kept up with inflation for a decade.

I have personally given the thumbs down to two developers who came from a FAANG when it was clear that they were just code monkeys who had to have everything handed to them.

Have you looked at how hard it is for mid-level code monkeys, even from a FAANG, to get a job these days? Just being able to reverse a binary tree isn’t enough anymore.

FWIW, I did a 3.5-year stint at AWS Professional Services until late 2023 (full time, with the same 4-year comp structure software devs get), but made about 20% less, and it was remote the whole time I was there. And I’m very well aware of what software developers make.

I still work full time at a consulting company (cloud + app dev). And no, FAANG doesn’t pay enough of a difference over what I make now to give up remote work in state-income-tax-free, relatively low-cost-of-living central Florida at 50 years old with grown (step)kids.


Great, then we can use AI to solve the problems given a set of requirements, and spend more time thinking about what the requirements are by understanding the users.

PM and software development will converge more and more as AI gets better.

The best PMs will be the ones who can understand customers and create low-fidelity prototypes or even "good enough" vibe-coded solutions for customers.

The best engineers will be the ones who use their fleet of subagents to work on the "correct" requirements, by understanding their customers

At the end of the day, we are using software to solve people's problems. Those who understand that, and have skills around diving in and navigating people's problems will come out ahead


The US's entire economy depends on tech. They won't do anything that would compromise the integrity and viability of the international tech industrial complex.

In the US you also are not arrested for social media posts like you are in the UK or other parts of Europe.


At the moment, you can’t GET INTO the US if you have a social media post that even criticises the administration. Is that the “free speech” people in the US are so obsessed with?


The general conceit of this article, which is something that many frontier labs seem to be beginning to realize, is that the average human is no longer smart enough to provide sufficient signal to improve AI models.


No, it's that the average unpaid human doesn't care to read closely enough to provide signal to improve AI models. Not that they couldn't if they put in even the slightest amount of effort.


Firstly, paying is not at all the correct incentive for the desired outcome. When the incentive is payment, people will optimize for maximum payout not for the quality goals of the system.

Secondly, it doesn't fix stupidity. A participant who earnestly takes the quality goals of the system to heart instead of focusing on maximizing their take (thus, obviously stupid) will still make bad classifications, simply because of that stupidity.


> Firstly, paying is not at all the correct incentive for the desired outcome. When the incentive is payment, people will optimize for maximum payout not for the quality goals of the system.

1. I would expect any paid arrangement to include a quality-control mechanism. With the possible exception of one designed from scratch by complete ignoramuses.

2. Do you have a proposal for a better incentive?


1. Goodhart's law suggests that you will end up with quality control mechanisms which work at ensuring that the measure is being measured, but not that it is measuring anything useful

2. Criticism of a method does not require that there is a viable alternative. Perhaps the better idea is just to not incentivize people to do tasks they are not qualified for


> Secondly, it doesn't fix stupidity.

Agreed, and would add that it doesn’t fix other things like lack of skill, focus, time, etc.

An example is the output of the Amazon Turk “Sheep Market” experiment:

https://docubase.mit.edu/project/the-sheep-market/

Some of those sheep were really ba-aaa-ad.


I don't think there is any correct incentive for "do unpaid labour for someone's proprietary model but please be diligent about it"

edit: ugh. it's even worse, lmarena itself is a proprietary system, so the users presumably don't even get the benefit of an open dataset out of all this


Why would an unpaid human want to do that?


Exactly — they wouldn't.


Therein lies the problem.


But when you're a moron how can you distinguish?

I'm being (mostly) serious: suppose you're a stuffed shirt trying to boost your valuation, how can you work out who's smart enough to train your LLM? (Never mind how to get them to work for you!)


I do a lot of human evaluations. There are lots of Bayesian/statistical models that can infer rater quality without ground-truth labels. The other thing about preference data you have to worry about (which this article gets at) is: preferences of _whom_? Human raters are a significantly biased population of people; different ages, genders, religions, cultures, etc. all inform preferences. Lots of work being done to leverage and model this.
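
To give a concrete flavor of that first point, here is a toy, hand-rolled sketch (made-up data, a simplified one-accuracy-per-rater EM in the spirit of Dawid-Skene, not any production pipeline):

    # Toy sketch, not production code: infer rater quality with no ground
    # truth by alternating between soft "true labels" and rater accuracies.
    # The ratings matrix below is made up purely for illustration.
    import numpy as np

    # ratings[i, j] = rater j's binary label for item i, or -1 if unrated
    ratings = np.array([
        [1, 1, 0],
        [1, 1, 1],
        [0, 0, 1],
        [0, 1, 0],
        [1, 1, -1],
    ])

    n_items, n_raters = ratings.shape
    rated = ratings >= 0

    # Start from majority vote as the soft "true label" estimate.
    p_true = np.where(rated, ratings, 0).sum(1) / rated.sum(1)

    for _ in range(50):
        # M-step: a rater's accuracy = how often they agree with the
        # current soft labels, averaged over the items they rated.
        agree = np.where(ratings == 1, p_true[:, None], 1 - p_true[:, None])
        accuracy = np.nanmean(np.where(rated, agree, np.nan), axis=0)
        accuracy = np.clip(accuracy, 0.01, 0.99)

        # E-step: re-estimate soft labels, weighting votes by rater accuracy.
        log_odds = np.zeros(n_items)
        for j in range(n_raters):
            has = rated[:, j]
            vote = np.where(ratings[has, j] == 1, 1.0, -1.0)
            log_odds[has] += vote * np.log(accuracy[j] / (1 - accuracy[j]))
        p_true = 1.0 / (1.0 + np.exp(-log_odds))

    print("estimated rater accuracies:", np.round(accuracy, 2))
    print("estimated labels:", np.round(p_true, 2))

Real pipelines use full confusion matrices, priors over rater quality, item difficulty, and so on, but even this toy version separates careful raters from noisy ones without ever seeing a gold label.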

Then for LMArena there is a host of other biases / construct-validity issues: people are easily fooled, even PhD experts; in many cases it’s easier for a model to learn how to persuade than to actually learn the right answers.

But there are a lot of dismissive comments here, as if frontier labs don’t know this; they have some of the best talent in the world. They aren’t perfect, but they by and large know what they’re doing and what the tradeoffs of various approaches are.

Human annotations are an absolute nightmare for quality, which is why coding agents are so nice: they’re verifiable, so you can train them in a way closer to e.g. AlphaGo, without the ceiling of human performance.


> in many cases it’s easier for a model to learn how to persuade than actually learn the right answers

So we should expect the models to eventually tend toward the same behaviors that politicians exhibit?


Maybe a happy-to-deceive marketing/sales role would be more accurate.


100% (am a Bayesian statistician).

Isn’t it fascinating how it comes down to quality of judgement (and the descriptions thereof)?

We need an LMArena rated by experts.


As a statistician, do you think you could, given access to the data, identify the subset of LMArena users that are experts?


Yes, for sure! I can think of a few ways.


They always know; they just have non-AGI incentives and asymmetric upside to play along...


Sure, on the surface judging the judge is just as hard as being the judge

But at least the two examples of judging AI provided in the article can be solved by any moron by expending enough effort. Any moron can tell you what Dorothy says to Toto when entering Oz by just watching the first thirty minutes of the movie. And while validating answer B in the pan question takes some ninth-grade math (or a short trip to Wikipedia), figuring out that a nine-inch-diameter circle is in fact not the same area as a 9x13-inch rectangle is not rocket science. And with a bit of craft paper you could evaluate both answers even without math knowledge.

So the short answer is: with effort. You spend lots of effort on finding a good evaluator, so the evaluator can judge the LLM for you. Or take "average humans" and force them to spend more effort on evaluating each answer.
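
For the record, the ninth-grade math spelled out (my own back-of-the-envelope sketch, using the pan dimensions mentioned above):

    # Back-of-the-envelope check: area of a 9-inch-diameter round pan
    # vs. a 9x13-inch rectangular pan, in square inches.
    import math

    round_pan = math.pi * (9 / 2) ** 2   # ~63.6 sq in
    rect_pan = 9 * 13                    # 117 sq in

    print(round_pan, rect_pan, rect_pan / round_pan)  # rectangle is ~1.8x larger

The 9x13 rectangle has roughly 1.8x the area of the 9-inch round, so the two are clearly not interchangeable.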


Maybe you need to have people rate others' ratings to remove at least the worst idiots.


Your social bubble is making you biased. An average human is quite dumb.


that’s why Mercor is worth $2 billion


Popularity has never been a meaningful signal of quality, no matter how many tech companies try to make it so, with their star ratings, up/down voting, and crowdsourcing schemes.


Different strokes for different folks: I mean who is to say if Bleach or Backstabbed in a Backwater Dungeon: My Trusted Companions Tried to Kill Me, but Thanks to the Gift of an Unlimited Gacha I Got LVL 9999 Friends and Am Out for Revenge on My Former Party Members and the World is better?


If these frontier models were open source, the market of downstream consumers would figure out how to optimize them.

By being closed, they'll never be optimal.


Yep, it's like getting a commoner from the street to evaluate a literature PhD in their native language. Sure, both know the language, but the depth difference between a specialist and a generalist is too large. And we can't use AI to automatically evaluate this literature genius either, because real AI doesn't exist (yet), hence the programs can't understand the contents of the text they output or input. Whoops. :)


They need to spend money on actual experts to curate their data to improve.

Instead, finance bros are convinced by the argument that number goes up.


Is that not exactly what https://www.mercor.com/ does?


Wait, you know that frontier labs actually do this, right?


Sometimes it feels like:

    def is_it_true(question): 
        return profit_if_true(question) > profit_if_false(question)
AI will make it cheaper, faster, better, no problem. You can eat the cake now and save it for later.


It is glaringly obvious that the average human is not smart enough that their decision making should be replicated and adopted at scale.

People hold falsehoods to be true, and cannot calculate a 10% tip.


The average human is a moron you wouldn't trust to watch your hamster. If you watched them outside of the narrow range of tasks they have been trained to perform by rote you would probably conclude they should qualify for benefits by virtue of mental disability.

We give them WAY too much credit by watching mostly the things they have been trained specifically to do and pretending this indicates a general mental competence that just doesn't exist.


The inevitable outcome of regulation on building data centers in the US is that they will be built in the Gulf states, China, or wherever else it is cheaper and better.


AI will get better


They should do 95% and 99% versions of the graphs; otherwise it's hard to ascertain whether the failure cases will remain in the elusive "stuff humans can do easily but LLMs trip up on despite scaling" category.

