Discussion about this post

AveragePCuser:

"This is best illustrated through the Reversal Curse, which showed that models that learn “A is B” don’t automatically generalize “B is A”. Another way of putting it: the input dictates the output"

Human intuition can often be like this; the similarities are quite striking.

"Well, which of these statements strikes you as more natural: "98 is approximately 100", or "100 is approximately 98"? If you're like most people, the first statement seems to make more sense. (Sadock 1977.) For similar reasons, people asked to rate how similar Mexico is to the United States, gave consistently higher ratings than people asked to rate how similar the United States is to Mexico. (Tversky and Gati 1978.)"

source: https://www.lesswrong.com/posts/4mEsPHqcbRWxnaE5b/typicality-and-asymmetrical-similarity

Alejandro Piad Morffis:

The term "hallucination" is also heavily loaded and another example of the prevalent problem of wishful mnemonics in AI. I think hallucinations are just the result of sampling OOD in a smooth approximation of language. For a language model to generalize (to generate sentences it has never seen in training) it has to come up with stuff. And stochastic language models come up wth stuff by building very similar sentences (low perplexity) to sentences in the training set. We can always take a truthful sentence in the training set and make a tiny tweak that makes it false. As long as you model language as Markov process conditioned on language itself, with no grounding in a world model, hallucinations are a feature, not even a bug.
