"It" does not know when it does not know. But it does know when it has uncertain...

gerdesj · on Dec 5, 2024

"The next token is with almost 100% certainty 4."

By using the word "almost" with regards 2 + 2 = 4, you have not exactly dispelled LLM "nonsense".

A human (with a modicum of maths knowledge) will know that 2 + 2 = 4 (pure integers - a fact by assertion). A maths worrier will get slightly uncomfortable about 2.0 + 2.0 = 4.0 unless they are ensured that decimal places and accuracy are the same thing and a few other things.

A LLM will almost certainly "know" something that is certain, if its training set is conclusive about that. However, it does not know why and if enough of the training set is suitably ambiguous then it (LLM) will drift off course and seem to spout bollocks - "hallucinate".

TZubiri · on Dec 5, 2024

You might be in the wrong thread. This is merely a comment about whether LLMs hold a concept of uncertainty, they do.

Also, the next token might be 2 and the next token might be ², the next token could also have been x, these are all valid statements and the LLM might have been uncertain because of them.

2+2=4

2+2=x

2+2=2x

2+2=2x2

2+2=2²

Are all valid statements.

glandium · on Dec 5, 2024

And somewhere in its training data, you can be sure there's also 2+2=5.

TZubiri · on Dec 5, 2024

Yes, but most likely it's marked as false or incorrect through fine tuning or some form of reinforcement.

The idea that the logprobs of any token is proportional to the amount of times it comes up in training data is not true.

For example, suppose that A is a common misconception and is repeated often in Reddit, but B appears in scholarly textbooks and papers, and higher reputation data sources. Then through reinforcement the logprobs of B can increase, and they can increase consistently when surrounded by contexts like "This is true" and conversely decrease in contexts of "this is not true".

So the presumptions and values of its trainers are also embedded into the LLM in addition to those of the authors of the text corpus.