More

xeeeeeeeeeeenu · 2026-01-18T05:15:14 1768713314

> no prior solutions found.

This is no longer true, a prior solution has just been found[1], so the LLM proof has been moved to the Section 2 of Terence Tao's wiki[2].

[1] - https://www.erdosproblems.com/forum/thread/281#post-3325

[2] - https://github.com/teorth/erdosproblems/wiki/AI-contribution...

nl · 2026-01-18T05:55:45 1768715745

Interesting that in Terrance Tao's words: "though the new proof is still rather different from the literature proof)"

And even odder that the proof was by Erdos himself and yet he listed it as an open problem!

pfdietz · 2026-01-20T18:54:50 1768935290

The theorem is implied by an older result of Erdos, but is not a result of Erdos. Apparently this is because the connection is something called "Roger's Theorem" that was quite obscure.

https://terrytao.wordpress.com/2026/01/19/rogers-theorem-on-...

"This theorem is somewhat obscure: its only appearance in print is in pages 242-244 of this 1966 text of Halberstam and Roth, where the authors write in a footnote that the result is “unpublished; communicated to the authors by Professor Rogers”. I have only been able to find it cited in three places in the literature: in this 1996 paper of Lewis, in this 2007 paper of Filaseta, Ford, Konyagin, Pomerance, and Yu (where they credit Tenenbaum for bringing the reference to their attention), and is also briefly mentioned in this 2008 paper of Ford. As far as I can tell, the result is not available online, which could explain why it is rarely cited (and also not known to AI tools). This became relevant recently with regards to Erdös problem 281, posed by Erdös and Graham in 1980, which was solved recently by Neel Somani through an AI query by an elegant ergodic theory argument. However, shortly after this solution was located, it was discovered by KoishiChan that Rogers’ theorem reduced this problem immediately to a very old result of Davenport and Erdös from 1936. Apparently, Rogers’ theorem was so obscure that even Erdös was unaware of it when posing the problem!"

TZubiri · 2026-01-18T06:00:47 1768716047

Maybe it was in the training set.

magneticnorth · 2026-01-18T06:10:43 1768716643

I think that was Tao's point, that the new proof was not just read out of the training set.

rzmmm · 2026-01-18T07:30:47 1768721447

The model has multiple layers of mechanisms to prevent carbon copy output of the training data.

TZubiri · 2026-01-18T07:50:16 1768722616

forgive the skepticism, but this translates directly to "we asked the model pretty please not to do it in the system prompt"

ffsm8 · 2026-01-18T08:18:49 1768724329

It's mind boggling if you think about the fact they're essential "just" statistical models

It really contextualizes the old wisdom of Pythagoras that everything can be represented as numbers / math is the ultimate truth

glemion43 · 2026-01-18T09:33:50 1768728830

They are not just statistical models

They create concepts in latent space which is basically compression which forces this

jrmg · 2026-01-18T15:20:07 1768749607

You’re describing a complex statistical model.

glemion43 · 2026-01-18T19:06:41 1768763201

Debatable I would argue. It's definitely not 'just a statistical model's and I would argue that the compression into this space fixes potential issues differently than just statistics.

But I'm not a mathematics expert if this is the real official definition I'm fine with it. But are you though?

inimino · 2026-01-19T00:30:42 1768782642

I am, and yes, that's what a statistical model is.

mmooss · 2026-01-18T16:13:34 1768752814

What is "latent space"? I'm wary of metamagical descriptions of technology that's in a hype cycle.

AIorNot · 2026-01-18T17:44:32 1768758272

See this video

https://youtu.be/D8GOeCFFby4?si=AtqH6cmkOLvqKdr0

DoctorOetker · 2026-01-18T17:56:12 1768758972

its a statistical term, a latent variable is one that is either known to exist, or believed to exist, and then estimated.

consider estimating the position of an object from noisy readings. One presumes that position to exist in some sense, and then one can estimate it by combining multiple measurements, increasing positioning resolution.

its any variable that is postulated or known to exist, and for which you run some fitting procedure

glemion43 · 2026-01-18T19:03:58 1768763038

I'm disappointed that you had to add the 'metamagical' to your question tbh

It doesn't matter if ai is in a hype cycle or not it doesn't change how a technology works.

Check out the yt videos from 1blue3brown he explains LLMs quite well. .your first step is the word embedding this vector space represents the relationship between words. Father - grandfather. The vector which makes a father a grandfather is the same vector as mother to grandmother.

You the use these word vectors in the attention layer to create a n dimensional space aka latent space which basically reflects a 'world' the LLM walks through. This makes the 'magic' of LLMs.

Basically a form of compression by having higher dimensions reflecting kind a meaning.

Your brain does the same thing. It can't store pixels so when you go back to some childhood environment like your old room, you remember it in some efficient (brain efficient) way. Like the 'feeling' of it.

That's also the reason why an LLM is not just some statistical parrot.

mmooss · 2026-01-18T20:27:08 1768768028

> It doesn't matter if ai is in a hype cycle or not it doesn't change how a technology works.

It does change what people say about it. Our words are not reality itself; the map is not the territory.

Are you saying people should take everything said about LLMs at face value?

glemion43 · 2026-01-18T22:09:40 1768774180

Being dismissive of technical terms on hn because something seems to be a hype is really weird.

It's the reason why I'm here because we discuss more technically about technology

mmooss · 2026-01-19T03:33:18 1768793598

I wasn't dismissive, just wary. As a new account, it's odd to be lecturing people on behavior. You're the one diverting the conversation.

glemion43 · 2026-01-19T07:12:29 1768806749

I'm in hn for 10 years.

I spend too much time here and decided to delete my account to interact less.

It's partially working though

GrowingSideways · 2026-01-18T09:06:02 1768727162

How so? Truth is naturally an apriori concept; you don't need a chatbot to reach this conclusion.

mikaraento · 2026-01-18T08:30:06 1768725006

That might be somewhat ungenerous unless you have more detail to provide.

I know that at least some LLM products explicitly check output for similarity to training data to prevent direct reproduction.

TZubiri · 2026-01-18T21:24:05 1768771445

So it would be able to produce the training data but with sufficient changes or added magic dust to be able to claim it as one's own.

Legally I think it works, but evidence in a court works differently than in science. It's the same word but don't let that confuse you and don't mix them both.

guenthert · 2026-01-18T16:19:04 1768753144

Should they though? If the answer to a question^Wprompt happens to be in the training set, wouldn't it be disingenuous to not provide that?

ttctciyf · 2026-01-18T17:13:55 1768756435

Maybe it's intended to avoid legal liability resulting from reproducing copyright material not licensed for training?

TZubiri · 2026-01-18T21:18:56 1768771136

Ding!

It's great business to minimally modify valuable stuff and then take credit for it. As was explained to me by bar-certified counsel "if you take a recipe and add, remove or change just one thing, it's now your recipe"

The new trend in this is asking Claude Code to create a software on some type, like a Browser or a DICOM viewer, and then publishing that it's managed to do this very expensive thing (but if you check source code, which is never published, it probably imports a lot of open source dependencies that actually do the thing)

Now this is especially useful in business, but it seems that some people are repurposing this for proving math theorems. The Terence Tao effort which later checks for previous material is great! But the fact that the Section 2 (for such cases) is filled to the brim, and section 1 is mostly documented failed attempts (except for 1 proof, congratulations to the authors), mostly confirms my hypothesis, claiming that the model has guards that prevent it is a deus ex machina cope against the evidence.

ComplexSystems · 2026-01-18T16:37:57 1768754277

The model doesn't know what its training data is, nor does it know what sequences of tokens appeared verbatim in there, so this kind of thing doesn't work.

efskap · 2026-01-18T08:34:13 1768725253

Would it really be infeasible to take a sample and do a search over an indexed training set? Maybe a bloom filter can be adapted

hexaga · 2026-01-18T09:26:56 1768728416

It's not the searching that's infeasible. Efficient algorithms for massive scale full text search are available.

The infeasibility is searching for the (unknown) set of translations that the LLM would put that data through. Even if you posit only basic symbolic LUT mappings in the weights (it's not), there's no good way to enumerate them anyway. The model might as well be a learned hash function that maintains semantic identity while utterly eradicating literal symbolic equivalence.

glemion43 · 2026-01-18T09:32:00 1768728720

Do you have a source for this?

Carbon copy would mean over fitting

fweimer · 2026-01-18T15:52:22 1768751542

I saw weird results with Gemini 2.5 Pro when I asked it to provide concrete source code examples matching certain criteria, and to quote the source code it found verbatim. It said it in its response quoted the sources verbatim, but that wasn't true at all—they had been rewritten, still in the style of the project it was quoting from, but otherwise quite different, and without a match in the Git history.

It looked a bit like someone at Google subscribed to a legal theory under which you can avoid copyright infringement if you take a derivative work and apply a mechanical obfuscation to it.

Workaccount2 · 2026-01-18T18:29:09 1768760949

LLM's are not archives of information.

People seem to have this belief, or perhaps just general intuition, that LLMs are a google search on a training set with a fancy language engine on the front end. That's not what they are. The models (almost) self avoid copyright, because they never copy anything in the first place, hence why the model is a dense web of weight connections rather than an orderly bookshelf of copied training data.

Picture yourself contorting your hands under a spotlight to generate a shadow in the shape of a bird. The bird is not in your fingers, despite the shadow of the bird, and the shadow of your hand, looking very similar. Furthermore, your hand-shadow has no idea what a bird is.

fweimer · 2026-01-18T19:07:10 1768763230

For a task like this, I expect the tool to use web searches and sift through the results, similar to what a human would do. Based on progress indicators shown during the process, this is what happens. It's not an offline synthesis purely from training data, something you would get from running a model locally. (At least if we can believe the progress indicators, but who knows.)

int_19h · 2026-01-19T08:25:17 1768811117

While true in general, they do know many things verbatim. For instance, GPT-4 can reproduce the Navy SEAL copypasta word for word with all the misspellings.

Workaccount2 · 2026-01-19T14:49:05 1768834145

I'd imagine more than a few basement dwellers could as well.

NewsaHackO · 2026-01-18T19:20:03 1768764003

It is the classic "He made it up"

Der_Einzige · 2026-01-18T16:34:20 1768754060

Source is just read the definition of what "temperature" is.

But honestly source = "a knuckle sandwich" would be appropriate here.

dang · 2026-01-19T03:47:05 1768794425

Threatening violence*, even in this virtual way and encased in quotation marks, is not allowed here.

Edit: you've been breaking the site guidelines badly in other threads as well. (To pick one example of many: https://news.ycombinator.com/item?id=46601932.) We've asked you many times not to.

I don't want to ban your account because your good contributions are good and I do believe you're well-intentioned. But really, can you please take the intended spirit of this site more to heart and fix this? Because at some point the damage caused by poisonous comments is worse.

https://news.ycombinator.com/showhn.html

* it would be more accurate to say "using violent language as a trope in an argument" - I don't believe in taking comments like this literally, as if they're really threatening violence. Nonetheless you can't post this way to HN.

Den_VR · 2026-01-18T08:43:34 1768725814

Unfortunately.

GeoAtreides · 2026-01-18T15:00:07 1768748407

does it?

this is a verbatim quote from gemini 3 pro from a chat couple of days ago:

"Because I have done this exact project on a hot water tank, I can tell you exactly [...]"

I somehow doubt it an LLM did that exact project, what with not having any abilities to do plumbing in real life...

retsibsi · 2026-01-18T15:09:33 1768748973

Isn't that easily explicable as hallucination, rather than regurgitation?

ttctciyf · 2026-01-18T17:15:00 1768756500

Those are not mutually exclusive in this instance, it seems.

cma · 2026-01-18T14:11:15 1768745475

I don't think it is dispositive, just that it likely didn't copy the proof we know was in the training set.

A) It is still possible a proof from someone else with a similar method was in the training set.

B) something similar to erdos's proof was in the training set for a different problem and had a similar alternate solution to chatgpt, and was also in the training set, which would be more impressive than A)

CamperBob2 · 2026-01-18T15:51:02 1768751462

It is still possible a proof from someone else with a similar method was in the training set.

A proof that Terence Tao and his colleagues have never heard of? If he says the LLM solved the problem with a novel approach, different from what the existing literature describes, I'm certainly not able to argue with him.

mmooss · 2026-01-18T16:12:11 1768752731

> A proof that Terence Tao and his colleagues have never heard of?

Tao et al. didn't know of the literature proof that started this subthread.

pvab3 · 2026-01-18T22:53:54 1768776834

there is an immense amount of stuff out there on ArXiv that no one has ever looked at

CamperBob2 · 2026-01-18T16:38:08 1768754288

Right, but someone else did ("colleagues.")

habinero · 2026-01-18T18:30:05 1768761005

No, they searched for it. There's a lot of math literature out there, not even an expert is going to know all of it.

CamperBob2 · 2026-01-18T18:43:38 1768761818

Point being, it's not the same proof.

mmooss · 2026-01-18T20:25:29 1768767929

Your point seemed to be, if Tao et al. haven't heard of it then it must not exist. The now known literature proof contradicts that claim.

nl · 2026-01-18T22:37:33 1768775853

There's an update from Tao after emailing Tenenbaum (the paper author) about this:

> He speculated that "the formulation [of the problem] has been altered in some way"....

[snip]

> More broadly, I think what has happened is that Rogers' nice result (which, incidentally, can also be proven using the method of compressions) simply has not had the dissemination it deserves. (I for one was unaware of it until KoishiChan unearthed it.) The result appears only in the Halberstam-Roth book, without any separate published reference, and is only cited a handful of times in the literature. (Amusingly, the main purpose of Rogers' theorem in that book is to simplify the proof of another theorem of Erdos.) Filaseta, Ford, Konyagin, Pomerance, and Yu - all highly regarded experts in the field - were unaware of this result when writing their celebrated 2007 solution to #2, and only included a mention of Rogers' theorem after being alerted to it by Tenenbaum. So it is perhaps not inconceivable that even Erdos did not recall Rogers' theorem when preparing his long paper of open questions with Graham in 1980.

(emphasis mine)

I think the value of LLM guided literature searches is pretty clear!

casey2 · 2026-01-19T02:15:46 1768788946

This whole thread is pretty funny. Either it can demo some pretty clever, but still limited, features resulting in math skills OR it's literally the best search engine ever invented. My guess is the former, it's pretty whatever at web search and I'd expect to see something similar to the easily retrievable, more visible proof method from Rogers' (as opposed to some alleged proof hidden in some dataset).

CamperBob2 · 2026-01-19T16:42:52 1768840972

Either it can demo some pretty clever, but still limited, features resulting in math skills OR it's literally the best search engine ever invented.

Both are precisely true. It is a better search engine than anything else -- which, while true, is something you won't realize unless you've used the non-free 'pro research' features from Google and/or OpenAI. And it can perform limited but increasingly-capable reasoning about what it finds before presenting the results to the user.

Note that no online Web search or tool usage at all was involved in the recent IMO results. I think a lot of people missed that little detail.

heliumtera · 2026-01-18T15:02:54 1768748574

Does it matter if it copied or not? How the hell would one even define if it is a copy or original at this point?

At this point the only conclusion here is: The original proof was on the training set. The author and Terence did not care enough to find the publication by erdos himself

davidhs · 2026-01-18T14:17:01 1768745821

It looks like these models work pretty well as natural language search engines and at connecting together dots of disparate things humans haven't done.

pfdietz · 2026-01-18T15:21:19 1768749679

They're finding them very effective at literature search, and at autoformalization of human-written proofs.

Pretty soon, this is going to mean the entire historical math literature will be formalized (or, in some cases, found to be in error). Consider the implications of that for training theorem provers.

mlpoknbji · 2026-01-18T16:24:39 1768753479

I think "pretty soon" is a serious overstatement. This does not take into account the difficulty in formalizing definitions and theorem statements. This cannot be done autonomously (or, it can, but there will be serious errors) since there is no way to formalize the "text to lean" process.

What's more, there's almost surely going to turn out to be a large amount of human generated mathematics that's "basically" correct, in the sense that there exists a formal proof that morally fits the arc of the human proof, but there's informal/vague reasoning used (e.g. diagram arguments, etc) that are hard to really formalize, but an expert can use consistently without making a mistake. This will take a long time to formalize, and I expect will require a large amount of human and AI effort.

pfdietz · 2026-01-18T21:06:25 1768770385

It's all up for debate, but personally I feel you're being too pessimistic there. The advances being made are faster than I had expected. The area is one where success will build upon and accelerate success, so I expect the rate of advance to increase and continue increasing.

This particular field seems ideal for AI, since verification enables identification of failure at all levels. If the definitions are wrong the theorems won't work and applications elsewhere won't work.

p-e-w · 2026-01-18T15:59:00 1768751940

Every time this topic comes up people compare the LLM to a search engine of some kind.

But as far as we know, the proof it wrote is original. Tao himself noted that it’s very different from the other proof (which was only found now).

That’s so far removed from a “search engine” that the term is essentially nonsense in this context.

theptip · 2026-01-18T16:35:45 1768754145

Hassabis put forth a nice taxonomy of innovation: interpolation, extrapolation, and paradigm shifts.

AI is currently great at interpolation, and in some fields (like biology) there seems to be low-hanging fruit for this kind of connect-the-dots exercise. A human would still be considered smart for connecting these dots IMO.

AI clearly struggles with extrapolation, at least if the new datum is fully outside the training set.

And we will have AGI (if not ASI) if/when AI systems can reliably form new paradigms. It’s a high bar.

davidhs · 2026-01-20T12:12:10 1768911130

Maybe if Terence Tao had memorized the entire Internet (and pretty much all media), then maybe he would find bits and pieces of the problem remind him of certain known solutions and be able to connect the dots himself.

But, I don't know. I tend to view these (reasoning) LLMs as alien minds and my intuition of what is perhaps happening under the hood is not good.

I just know that people have been using these LLMs as search engines (including Stephen Wolfram), browsing through what these LLMs perhaps know and have connected together.

cubefox · 2026-01-18T07:47:41 1768722461

This illustrates how unimportant this problem is. A prior solution did exist, but apparently nobody knew because people didn't really care about it. If progress can be had by simply searching for old solutions in the literature, then that's good evidence the supposed progress is imaginary. And this is not the first time this has happened with an Erdős problem.

A lot of pure mathematics seems to consist in solving neat logic puzzles without any intrinsic importance. Recreational puzzles for very intelligent people. Or LLMs.

glemion43 · 2026-01-18T09:35:33 1768728933

It shows that a 'llm' can now work on issues like this today and tomorrow it can do even more.

Don't be so ignorant. A few years ago NO ONE could have come up with something so generic as an LLM which will help you to solve this kind of problems and also create text adventures and java code.

danielbln · 2026-01-18T10:43:15 1768732995

The goal posts are strapped to skateboards these days, and the WD40 is applied to the wheels generously.

sampullman · 2026-01-18T12:51:47 1768740707

Regular WD40 should not be used as bearing lubricant!

danielbln · 2026-01-18T12:58:07 1768741087

Exactly!

glemion43 · 2026-01-18T20:03:33 1768766613

I don't get your pessimism...

Nothing of it was even imaginable and yes the progress is crazy fast.

How can you be so dismissive?

danielbln · 2026-01-18T20:06:29 1768766789

You misread my comment.

glemion43 · 2026-01-18T21:02:44 1768770164

You mean like a small rocket build? Okay :)

BoredPositron · 2026-01-18T11:48:21 1768736901

You can just wait and verify instead of the publishing, redacting cycles of the last year. It's embarrassing.

jojobas · 2026-01-18T12:13:13 1768738393

It's hard to predict which maths result from 100 years ago surfaces in say quantum mechanics or cryptography.

layer8 · 2026-01-18T23:04:42 1768777482

The likelihood for that is vanishingly low, though, for any given math result.

antonvs · 2026-01-18T18:12:04 1768759924

> "intrinsic importance"

"Intrinsic" in contexts like this is a word for people who are projecting what they consider important onto the world. You can't define it in any meaningful way that's not entirely subjective.

cubefox · 2026-01-19T09:04:23 1768813463

Mathematical theorems at least have objectively lower information content, because they merely rule out the impossible, while scientific knowledge also rules out the possible but non-actual.

antonvs · 2026-01-20T07:29:32 1768894172

You have it backwards. Mathematical theorems have objectively higher information content, because they rule out the impossible and model possibilities in all possible worlds that satisfy their preconditions. Scientific knowledge can never do more than inductive projections from observations in the single world we have physical access to.

The only thing that saves science from being nothing more than “huh, will you look at that,” is when it can make use of a mathematical model to provide insight into relationships between phenomena.

MattGaiser · 2026-01-18T08:06:47 1768723607

There is still enormous value in cleaning up the long tail of somewhat important stuff. One of the great benefits of Claude Code to me is that smaller issues no longer rot in backlogs, but can be at least attempted immediately.

cubefox · 2026-01-18T08:30:07 1768725007

The difference is that Claude Code actually solves practical problems, but pure (as opposed to applied) mathematics doesn't. Moreover, a lot of pure mathematics seems to be not just useless, but also without intrinsic epistemic value, unlike science. See https://news.ycombinator.com/item?id=46510353

drob518 · 2026-01-18T15:09:03 1768748943

I’m an engineer, not a mathematician, so I definitely appreciate applied math more than I do abstract math. That said, that’s my personal preference and one of the reasons that I became an engineer and not a mathematician. Working on nothing but theory would bore me to tears. But I appreciate that other people really love that and can approach pure math and see the beauty. And thank God that those people exist because they sometimes find amazing things that we engineers can use during the next turn of the technological crank. Instead of seeing pure math as useless, perhaps shift to seeing it as something wonderful for which we have not YET found a practical use.

Ar-Curunir · 2026-01-18T20:53:28 1768769608

Even if pure math is useless, that’s still okay. We do plenty of things that are useless. Not everything has to have a use.

drob518 · 2026-01-19T00:30:12 1768782612

I’m not sure I agree. Pure math is not useless because a lot of math is very useful. But we don’t know ahead of time what is going to be useless vs. useful. We need to do all of it and then sort it out later.

If we knew that it was all going to be useless, however, then it’s a hobby for someone, not something we should be paying people to do. Sure, if you enjoy doing something useless, knock yourself out… but on your own dime.

jstanley · 2026-01-18T08:41:05 1768725665

Applications for pure mathematics can't necessarily be known until the underlying mathematics is solved.

Just because we can't imagine applications today doesn't mean there won't be applications in the future which depend on discoveries that are made today.

cubefox · 2026-01-18T14:02:51 1768744971

Well, read the linked comment. The possible future applications of useless science can't be known either. I still argue that it has intrinsic value apart from that, unlike pure mathematics.

Thorrez · 2026-01-18T14:23:27 1768746207

There are many cases where pure mathematics became useful later.

https://www.reddit.com/r/math/comments/dfw3by/is_there_any_e...

cubefox · 2026-01-18T14:32:51 1768746771

So what? There are probably also many cases where seemingly useless science became useful later.

glenstein · 2026-01-18T15:55:02 1768751702

Exactly, you're almost getting it. Hence the value of "pure" research in both science and math.

cubefox · 2026-01-18T16:12:28 1768752748

You are not yet getting it I'm afraid. The point of the linked post was that, even assuming an equal degree of expected uselessness, scientific explanations have intrinsic epistemic value, while proving pure math theorems hasn't.

glenstein · 2026-01-18T16:52:13 1768755133

I think you lost track of what I was replying to. Thorrez noted that "There are many cases where pure mathematics became useful later." You replied by saying "So what? There are probably also many cases where seemingly useless science became useful later." You seemed to be treating the latter as if it negated the former which doesn't follow. The utility of pure math research isn't negated by noting there's also value in pure science research, any more than "hot dogs are tasty" is negated by replying "so what? hamburgers are also tasty". That's the point you made, and that's what I was responding to, and I'm not confused on this point despite your insistence to the contrary.

Instead of addressing any of that you're insisting I'm misunderstanding and pointing me back to a linked comment of yours drawing a distinction between epistemic value of science research vs math research. Epistemic value counts for many things, but one thing it can't do is negate the significance of pure math turning into applied research on account of pure science doing the same.

cubefox · 2026-01-18T17:29:41 1768757381

"You replied by saying "So what? There are probably also many cases where seemingly useless science became useful later." You seemed to be treating the latter as if it negated the former"

No, "so what" doesn't indicate disagreement, just that something isn't relevant.

Anyway, assume hot dogs taste not good at all, except in rare circumstances. It would then be wrong to say "hot dogs taste good", but it would be right to say "hot dogs don't taste good". Now substitute pure math for hot dogs. Pure math can be generally useless even if it isn't always useless. Men are taller than women. That's the difference between applied and pure math. The difference between math and science is something else: Even useless science has value, while most useless math (which consists of pure math) doesn't. (I would say the axiomatization of new theories, like probability theory, can also have inherent value, independent of any uselessness, insofar as it is conceptual progress, but that's different from proving pure math conjectures.)

glenstein · 2026-01-21T21:58:08 1769032688

So when you said "so what, hamburgers (science) taste good (is useful)", you were implicitly making a point about how bad (mostly not useful) the hot dogs (math research) was? And that's the thing that supposedly wasn't being followed on the first pass?

That brings us full circle, because you're now saying you were using one to negate the other, yet you were claiming that interpretation was a "failure to follow" what you were saying the first time around.

cwnyth · 2026-01-18T18:54:15 1768762455

It really speaks to the weakness of your original claim that you're applying this level of sophistry to your backpedaling.

cubefox · 2026-01-18T20:32:58 1768768378

There are 1135 Erdős problems. The solution to how many of them do you expect to be practically useless? 99%? More? 100%? Calling something useful merely because it might be in rare exceptions is the real sophistry.

teiferer · 2026-01-18T09:03:31 1768727011

It's hard to know beforehand. Like with most foundational research.

My favorite example is number theory. Before cyptography came along it was pure math, an esoteric branch for just number nerds. defund Turns out, super applicable later on.

baq · 2026-01-18T09:28:16 1768728496

You’re confusing immediately useful with eventually useful. Pure maths has found very practical applications over the millennia - unless you don’t consider it pure anymore, at which point you’re just moving goalposts.

cubefox · 2026-01-18T09:46:39 1768729599

No, I'm not confusing that. Read the linked comment if you're interested.

TheOtherHobbes · 2026-01-18T10:56:02 1768733762

You are confusing that. The biggest advancements in science are the result of the application of leading-edge pure math concepts to physical problems. Netwonian physics, relativistic physics, quantum field theory, Boolean computing, Turing notions of devices for computability, elliptic-curve cryptography, and electromagnetic theory all derived from the practical application of what was originally abstract math play.

Among others.

Of course you never know which math concept will turn out to be physically useful, but clearly enough do that it's worth buying conceptual lottery tickets with the rest.

glenstein · 2026-01-18T16:08:51 1768752531

Just to throw in another one, string theory was practically nothing but a basic research/pure research program unearthing new mathematical objects which drove physics research and vice versa. And unfortunately for the haters, string theory has borne real fruit with holography, producing tools for important predictions in plasma physics and black hole physics among other things. I feel like culture hasn't caught up to the fact that holography is now the gold rush frontier that has everyone excited that it might be our next big conceptual revolution in physics.

cubefox · 2026-01-18T15:54:57 1768751697

There is a difference between inventing/axiomatizing new mathematical theories and proving conjectures. Take the Riemann hypothesis (the big daddy among the pure math conjectures), and assume we (or an LLM) prove it tomorrow. How high do you estimate the expected practical usefulness of that proof?

glenstein · 2026-01-18T16:25:12 1768753512

That's an odd choice, because prime numbers routinely show up in important applications in cryptography. To actually solve RH would likely involve developing new mathematical tools which would then be brought to bear on deployment of more sophisticated cryptography. And solving it would be valuable in its own right, a kind of mathematical equivalent to discovering a fundamental law in physics which permanently changes what is known to be true about the structure of numbers.

Ironically this example turns out to be a great object lesson in not underestimating the utility of research based on an eyeball test. But it shouldn't even have to have any intuitively plausible payoff whatsoever in order to justify it. The whole point is that even if a given research paradigm completely failed the eyeball test, our attitude should still be that it very well could have practical utility, and there are so many historical examples to this effect (the other commenter already gave several examples, and the right thing to do would have been acknowledge them), and besides I would argue they still have the same intrinsic value that any and all knowledge has.

cubefox · 2026-01-18T17:43:03 1768758183

> To actually solve RH would likely involve developing new mathematical tools which would then be brought to bear on deployment of more sophisticated cryptography.

I doubt that this is true.

glenstein · 2026-01-18T18:30:41 1768761041

It already has! The progress that's been made thus far, involved the development of new ways to probabilistically estimate density of primes, which in turn have already been used in cryptography for secure key based on deeper understanding of how to quickly and efficiently find large prime numbers.

amazingman · 2026-01-18T08:59:38 1768726778

It's unclear to me what point you are making.

threethirtytwo · 2026-01-18T05:49:16 1768715356

This is a relief, honestly. A prior solution exists now, which means the model didn’t solve anything at all. It just regurgitated it from the internet, which we can retroactively assume contained the solution in spirit, if not in any searchable or known form. Mystery resolved.

This aligns nicely with the rest of the canon. LLMs are just stochastic parrots. Fancy autocomplete. A glorified Google search with worse footnotes. Any time they appear to do something novel, the correct explanation is that someone, somewhere, already did it, and the model merely vibes in that general direction. The fact that no human knew about it at the time is a coincidence best ignored.

The same logic applies to code. “Vibe coding” isn’t real programming. Real programming involves intuition, battle scars, and a sixth sense for bugs that can’t be articulated but somehow always validates whatever I already believe. When an LLM produces correct code, that’s not engineering, it’s cosplay. It didn’t understand the problem, because understanding is defined as something only humans possess, especially after the fact.

Naturally, only senior developers truly code. Juniors shuffle syntax. Seniors channel wisdom. Architecture decisions emerge from lived experience, not from reading millions of examples and compressing patterns into a model. If an LLM produces the same decisions, it’s obviously cargo-culting seniority without having earned the right to say “this feels wrong” in a code review.

Any success is easy to dismiss. Data leakage. Prompt hacking. Cherry-picking. Hidden humans in the loop. And if none of those apply, then it “won’t work on a real codebase,” where “real” is defined as the one place the model hasn’t touched yet. This definition will be updated as needed.

Hallucinations still settle everything. One wrong answer means the whole system is fundamentally broken. Human mistakes, meanwhile, are just learning moments, context switches, or coffee shortages. This is not a double standard. It’s experience.

Jobs are obviously safe too. Software engineering is mostly communication, domain expertise, and navigating ambiguity. If the model starts doing those things, that still doesn’t count, because it doesn’t sit in meetings, complain about product managers, or feel existential dread during sprint planning.

So yes, the Erdos situation is resolved. Nothing new happened. No reasoning occurred. Progress remains hype. The trendline is imaginary. And any discomfort you feel is probably just social media, not the ground shifting under your feet.

eru · 2026-01-18T07:04:31 1768719871

> This is a relief, honestly. A prior solution exists now, which means the model didn’t solve anything at all. It just regurgitated it from the internet, which we can retroactively assume contained the solution in spirit, if not in any searchable or known form. Mystery resolved.

Vs

> Interesting that in Terrance Tao's words: "though the new proof is still rather different from the literature proof)"

catoc · 2026-01-18T07:41:23 1768722083

I firmly believe @threethirtytwo’s reply was not produced by an LLM

mkarliner · 2026-01-18T08:28:03 1768724883

regardless of if this text was written by an LLM or a human, it is still slop,with a human behind it just trying to wind people up . If there is a valid point to be made , it should be made, briefly.

catoc · 2026-01-18T09:07:59 1768727279

If the point was triggering a reply, the length and sarcasm certainly worked.

I agree brevity is always preferred. Making a good point while keeping it brief is much harder than rambling on.

But length is just a measure, quality determines if I keep reading. If a comment is too long, I won’t finish reading it. If I kept reading, it wasn’t too long.

johnfn · 2026-01-18T06:06:00 1768716360

I suspect this is AI generated, but it’s quite high quality, and doesn’t have any of the telltale signs that most AI generated content does. How did you generate this? It’s great.

AstroBen · 2026-01-18T06:52:16 1768719136

Their comments are full of "it's not x, it's y" over and over. Short pithy sentences. I'm quite confident it's AI written, maybe with a more detailed prompt than the average

I guess this is the end of the human internet

prussia · 2026-01-18T07:45:54 1768722354

To give them the benefit of the doubt, people who talk to AI too much probably start mimicking its style.

4k93n2 · 2026-01-18T07:12:28 1768720348

yea, i was suspicious by the second paragraph but was sure once i got to "that’s not engineering, it’s cosplay"

AstroBen · 2026-01-18T07:23:42 1768721022

It's also the wording. The weird phrases

"Glorified Google search with worse footnotes" what on earth does that mean?

AI has a distinct feel to it

lxgr · 2026-01-18T07:44:53 1768722293

And with enough motivated reasoning, you can find AI vibes in almost every comment you don’t agree with.

For better or worse, I think we might have to settle on “human-written until proven otherwise”, if we don’t want to throw “assume positive intent” out the window entirely on this site.

testdelacc1 · 2026-01-18T07:52:07 1768722727

Dude is swearing up and down that they came up with the text on their own. I agree with you though, it reeks of LLMs. The only alternative explanation is that they use LLMs so much that they’ve copied the writing style.

plaguuuuuu · 2026-01-18T07:20:11 1768720811

I've had that exact phrase pop up from an LLM when I asked it for a more negative code review

threethirtytwo · 2026-01-18T06:25:16 1768717516

Your intuition on AI is out of date by about 6 months. Those telltale signs no longer exist.

It wasn't AI generated. But if it was, there is currently no way for anyone to tell the difference.

catlifeonmars · 2026-01-18T07:32:59 1768721579

I’m confused by this. I still see this kind of phrasing in LLM generated content, even as recent as last week (using Gemini, if that matters). Are you saying that LLMs do not generate text like this, or that it’s now possible to get text that doesn’t contain the telltale “its not X, it’s Y”?

comp_throw7 · 2026-01-18T06:34:46 1768718086

> But if it was there is currently no way for anyone to tell the difference.

This is false. There are many human-legible signs, and there do exist fairly reliable AI detection services (like Pangram).

int_19h · 2026-01-19T22:36:14 1768862174

There are no reliable AI detection services. At best they can reliably detect output from popular chatbots running with their default prompts. Beyond that reliability deteriorates rapidly so they either err on the side of many false positives, or on the side of many false negatives.

There's already been several scandals where students were accused of AI use on the basis of these services and successfully fought back.

threethirtytwo · 2026-01-18T06:43:43 1768718623

I've tested some of those services and they weren't very reliable.

CamperBob2 · 2026-01-18T15:29:27 1768750167

If such a thing did exist, it would exist only until people started training models to hide from it.

Negative feedback is the original "all you need."

velox_neb · 2026-01-18T06:43:33 1768718613

> It wasn't AI generated.

You're lying: https://www.pangram.com/history/94678f26-4898-496f-9559-8c4c...

Not that I needed pangram to tell me that, it's obvious slop.

threethirtytwo · 2026-01-18T06:50:43 1768719043

I wouldn't know how to prove to you otherwise other then to tell you that I have seen these tools show incorrect results for both AI generated text and human written text.

lxgr · 2026-01-18T07:52:19 1768722739

Good thing you had a stochastic model backing up (with “low confidence”, no less) your vague intuition of a comment you didn’t like being AI-written.

XenophileJKO · 2026-01-18T06:54:25 1768719265

I must be a bot because I love existential dread, that's a great phrase. I feel like they trigger a lot on literate prose.

lxgr · 2026-01-18T07:54:42 1768722882

Sad times when the only remaining way to convince LLM luddites of somebody’s humanity is bad writing.

CamperBob2 · 2026-01-18T06:11:45 1768716705

(edit: removed duplicate comment from above, not sure how that happened)

undeveloper · 2026-01-18T06:21:00 1768717260

the poster is in fact being very sarcastic. arguing in favor of emergent reasoning does in fact make sense

threethirtytwo · 2026-01-18T06:19:42 1768717182

It's a formal sarcasm piece.

CamperBob2 · 2026-01-18T06:11:20 1768716680

It's bizarre. The same account was previously arguing in favor of emergent reasoning abilities in another thread ( https://news.ycombinator.com/item?id=46453084 ) -- I voted it up, in fact! Turing test failed, I guess.

(edit: fixed link)

threethirtytwo · 2026-01-18T06:20:29 1768717229

I thought the mockery and sarcasm in my piece was rather obvious.

CamperBob2 · 2026-01-18T06:27:57 1768717677

Poe's Law is the real Bitter Lesson.

habinero · 2026-01-18T06:19:09 1768717149

We need a name for the much more trivial version of the Turing test that replaces "human" with "weird dude with rambling ideas he clearly thinks are very deep"

I'm pretty sure it's like "can it run DOOM" and someone could make an LLM that passes this that runs on an pregnancy test

magnio · 2026-01-18T06:08:04 1768716484

Pity that HN's ability to detect sarcasm is as robust as that of a sentiment analysis model using keyword-matching.

furyofantares · 2026-01-18T06:45:31 1768718731

The problem is more that it's an LLM-generated comment that's about 20x as long as it needed to be to get the point across.

cubefox · 2026-01-18T07:33:45 1768721625

It's obviously not LLM-generated.

kleene_op · 2026-01-18T07:39:05 1768721945

Phew. This is a relief, honestly!

threethirtytwo · 2026-01-18T06:55:41 1768719341

It's not.

Evidence shows otherwise: Despite the "20x" length, many people actually missed the point.

eru · 2026-01-18T07:04:57 1768719897

Despite or because?

furyofantares · 2026-01-18T19:38:21 1768765101

Oh yeah, there is also a problem with people not noticing they're reading LLM output, AND with people missing sarcasm on here. Actually, I'm OK with people missing sarcasm on here - I have plenty of places to go for sarcasm and wit and it's actually kind of nice to have a place where most posts are sincere, even if that sets people up to miss it when posts are sarcastic.

Which is also what makes it problematic that you're lying about your LLM use. I would honestly love to know your prompt and how you iterated on the post, how much you put into it and how much you edited or iterated. Although pretending there was no LLM involved at all is rather disappointing.

Unfortunately I think you might feel backed into a corner now that you've insisted otherwise but it's a genuinely interesting thing here that I wish you'd elaborate on.

_diyar · 2026-01-18T07:20:13 1768720813

I definitely missed the point because of the length, and only realized after I read replies to your comment.

threethirtytwo · 2026-01-18T07:31:57 1768721517

Next time I'll write something shorter, or if you don't believe I wrote it... then I'll tell the AI to write something shorter.

quinnjh · 2026-01-18T07:39:55 1768721995

Its not just verbose—it's almost a novel. Parent either cooked and capped, or has managed to perfectly emulate the patterns this parrot is stochastically known best for. I liked the pro human vibe if anything.

catlifeonmars · 2026-01-18T07:25:52 1768721152

That’s just the internet. Detecting sarcasm requires a lot of context external to the content of any text. In person some of that is mitigated by intonation, facial expressions, etc. Typically it also requires that the the reader is a native speaker of the language or at least extremely proficient.

dang · 2026-01-19T04:32:34 1768797154

I'm more worried that the best LLMs aren't yet good enough to classify satire reliably.

nurettin · 2026-01-18T06:00:45 1768716045

Why not plan for a future where a lot of non-trivial tasks are automated instead of living on the edge with all this anxiety?

threethirtytwo · 2026-01-18T06:06:28 1768716388

[flagged]

undeveloper · 2026-01-18T06:22:54 1768717374

come out of the irony layer for a second -- what do you believe about LLMs?

jorvi · 2026-01-18T08:24:34 1768724674

I mean.. LLMs have hit a pretty hard wall a while ago, with the only solution being throwing monstrous compute at eking out the remaining few percent improvement (real world, not benchmarks). That's not to mention hallucinations / false paths being a foundational problem.

LLMs will continue to get slightly better in the next few years, but mainly a lot more efficient. Which will also mean better and better local models. And grounding might get better, but that just means less wrong answers, not better right answers.

So no need for doomerism. The people saying LLMs are a few years away from eating the world are either in on the con or unaware.

7777332215 · 2026-01-18T06:10:45 1768716645

If all of it is going away and you should deny reality, what does everything else you wrote even mean?

habinero · 2026-01-18T06:12:35 1768716755

Yes, it is simply impossible that anyone could look at things and do your own evaluations and come to a different, much more skeptical conclusion.

The only possible explanation is people say things they don't believe out of FUD. Literally the only one.

rixed · 2026-01-18T07:43:08 1768722188

Are you expecting people who can't detect self-dellusions to be able to detect sarcasm, or are you just being cruel?

xeeeeeeeeeeenu · 2025-12-28T19:43:56 1766951036

>Reddit results are often force-translated instead of linking to the original English content.

It's very annoying. Put this in a search query to filter them out: -inurl:?tl=

theodric · 2025-12-28T19:50:02 1766951402

I built a Firefox extension to inject this sort of thing automatically, but I'm not sure if I'm allowed to shill it here. It's not like I'm getting paid for it...

Imustaskforhelp · 2025-12-28T19:58:08 1766951888

hey! Quick suggestion but if you create an firefox extension, please open source it. It immensely boosts my trust in an extension and I doubt that it would be considered a shill (atleast in my book) if you open source it and I don't think that you earn from the extension but if its open source and people like it, it opens up a pathway where people can donate if this extension helps their problems!

theodric · 2025-12-28T21:01:36 1766955696

It's public domain already!

Fine, let's see if I get banned https://addons.mozilla.org/en-US/firefox/addon/google-search...

xeeeeeeeeeeenu · 2025-12-16T09:42:41 1765878161

>What is their business model anyway?

They take a 5.5% fee whenever you buy credits. There's also a discount for opting-in to share your prompts for training.

xeeeeeeeeeeenu · 2025-11-24T16:40:38 1764002438

>avif is just better for typical web image quality,

What does "typical web image quality" even mean? I see lots of benchmarks with very low BPPs, like 0.5 or even lower, and that's where video-based image codecs shine.

However, I just visited CNN.com and these are the BPPs of the first 10 images my browser loaded: 1.40, 2.29, 1.88, 18.03 (PNG "CNN headlines" logo), 1.19, 2.01, 2.21, 2.32, 1.14, 2.45.

I believe people are underestimating the BPP values that are actually used on the web. I'm not saying that low-BPP images don't exist, but clearly it isn't hard to find examples of higher-quality images in the wild.

xeeeeeeeeeeenu · 2025-11-10T04:47:53 1762750073

Unfortunately, LLMs empower "contributors" who can't be bothered to put in any effort and who don't care about the negative impact of their actions on the maintainers.

The open-source community, generally speaking, is a high-trust society and I'm afraid that LLM abuse may turn it into a low-trust society. The end result will be worse than the status quo for everyone involved.

wilya7 · 2025-11-10T06:03:36 1762754616

Everything is collapsing toward a low-trust default. At the end of this trajectory, we rediscover that the analog world becomes valuable precisely because it can't be infinitely replicated.

Authenticity becomes the foundational currency.

But everyone must master AI tools to stay relevant. The brilliant engineer who refuses AI-generated PR by principle will get replaced. Every 18-24 months, as capabilities double, required skills shift. Specialization diminishes. Learning velocity becomes the only durable advantage. These people cannot learn new tricks.

Those who cannot question their assumptions cannot self-correct and will be replaced. The future belongs to the humble, the fluid, and the resilient. 60% of HN users is going toward a very tough time, and I am being very charitable with this assumption.

wombatpm · 2025-11-10T05:22:53 1762752173

AI seems like a rerun of the Eternal September problem that eventually killed internet news groups and a lot of email discussion lists.

xeeeeeeeeeeenu · 2025-09-30T18:26:37 1759256797

>IMO you can't tweak the TikTok/YouTube shorts format and make it a societal good all of a sudden, especially with exclusively AI content.

I agree. At best, short videos can be entertainment that destroys your attention span. Anything more is impossible. Even if there were no bad actors producing the content, you can't condense valuable information into this format.

xeeeeeeeeeeenu · 2025-09-09T03:53:33 1757390013

They improved this in later revisions of the standard. The behaviour of autoSpaceLikeWord95 is now actually described and there's an example.

You can see it for yourself here (in Part 4): https://ecma-international.org/publications-and-standards/st...

xeeeeeeeeeeenu · 2025-06-26T10:38:59 1750934339

>It's the first non opioid painkiller applicable for situations like post operative use.

Perhaps the first approved by FDA, I don't know. In many countries, metamizole is the first-line drug for postoperative pain.

(It should be noted that metamizole may very rarely cause agranulocytosis. It is suspected that the risk varies depending on the genetic makeup of the population, which would explain why it is banned in some countries but available OTC in others.)

arthur2e5 · 2025-06-26T13:37:53 1750945073

From my limited experience of metamizole it feels a bit stronger than paracetamol/acetaminophen. Neat little drug if your genetics can take it.

Tangential: China technically banned metamizole due to the agranulocytosis scare, but somehow small clinics always have fresh stocks of this stuff. And their stocks don't look like my metamizole for horses! It's pressed out of the usual magnesium stearate instead of whatever rock-hard thing they use for animal drugs in China.

xeeeeeeeeeeenu · 2025-06-06T14:03:33 1749218613

It seems that AMD/ATI reuses the Radeon 8000 name approximately every 10 years:

2001: Radeon 8000 series

2013: Radeon HD 8000 series

2025: Radeon 8000S series

_aavaa_ · 2025-06-06T17:00:43 1749229243

Checks out. Increase number by 1,000 every year, modulo 10,000.

xeeeeeeeeeeenu · 2025-05-19T14:16:57 1747664217

You can filter them out by adding this operator to the query:

    -inurl:?tl=