18 Comments
Jun 3 · edited Jun 3 · Liked by Jurgen Gravestein

I think the argument that they hallucinate does not in fact mean that they don't build a world model - I mean, look at this:

https://www.anthropic.com/research/mapping-mind-language-model

This is a world model of some sort. Hallucinations just mean that the world model can be wrong, and I think that even in the Claude example, it does some odd things like relating "coding error" to "food poisoning" in conceptual space.

But I think I go with Hinton here that it is grasping some sort of meaning (enough to scare me, obviously), and perhaps this should be seen as a matter of degree, with errors. Of course, there are people who also claim that LLMs are discovering Platonic truth, or at least converging to something (maybe all of the same biases?).

https://arxiv.org/pdf/2405.07987

author

Thanks for your reply, Sean! I think it's important to remember that to the model there are no hallucinated outputs: all outputs are legitimate outputs. Whether any output string actually corresponds to the world is an assessment that only we can make, not the AI.

Because of that, I think it would be more accurate to say the LLM builds a model of language rather than a model of the world. This is also why LLMs are confined by the data they're trained on.

I think Wittgenstein's idea of language games makes a good case for why language doesn't capture any Platonic ideas, even if they existed, despite some folks arguing otherwise (I'm definitely not in Hinton's camp on this one).

If you're looking for further reading, I highly recommend this paper by Bender and Koller: https://aclanthology.org/2020.acl-main.463.pdf

Jun 3 · edited Jun 3 · Liked by Jurgen Gravestein

Oh, I agree - it only ever hallucinates, so to speak (though, in a way, so do our brains). I'm definitely more in the Hinton camp, but I'll read the paper by Bender. Lemme send a chat to you as well.

I'm open to my mind being changed either way.

Jun 3 · edited Jun 3 · Liked by Jurgen Gravestein

Bender posted a compelling thought experiment last year, somewhat like Searle's Chinese Room but (to me) more realistic and salient. Assuming you do not speak or recognize the written language of Thai:

"Imagine you are in the National Library of Thailand. You have access to all the books in that library, except any that have illustrations or any writing not in Thai. You have unlimited time, and your physical needs are catered to, but no people to interact with. Could you learn to understand written Thai? If so, how would you achieve that?"

https://medium.com/@emilymenonbender/thought-experiment-in-the-national-library-of-thailand-f2bf761a8a83

Jun 3 · Liked by Jurgen Gravestein

But GPT-4o does not only use language anymore, so even if that were true, it would not apply to the latest models. GPT-4o is fully multimodal and does not translate audio and images into text first.

https://zapier.com/blog/gpt-4o/

"The "o" in GPT-4o stands for "omni." That refers to the fact that, in addition to taking text inputs, it can also natively understand audio and image inputs—and it can reply with any combination of text, images, and audio. The key here is that this is all being done by the one model, rather than multiple separate models that are working together."

Jun 3 · edited Jun 3 · Liked by Jurgen Gravestein

Thanks for the reply! I do agree that incorporating images and audio should expand GPT-4o's capabilities, and I'm interested to see what that ends up looking like.

What I object to (and I'm not saying you're doing this) is the kind of "gee-whiz" take on the capabilities of LLMs and other generative AI, where people stop thinking about these in terms of how they actually work, and start thinking about them as approximations to human brains and minds. Some AI research borders on pseudo-science, to me. There have been so many papers on the allegedly emergent abilities of ChatGPT and GPT-4 that don't attempt a serious answer to "why should a next token prediction model be able to do this?" but instead jump to "this LLM is highly complex and we don't fully understand how it's generating its output and *maybe* it's developed human-like abilities". The one about ChatGPT developing "theory of mind" is a good example. Why on Earth would a next token prediction model develop anything like what we call "theory of mind" in a human being? Shouldn't the default hypothesis be that the next-token prediction model has been trained on theory of mind problems? Instead of starting from what we actually know about how LLMs work and trying to slowly build on this, the authors go all "gee-whiz" and suggest that they've just discovered something amazing and mysterious. Real science doesn't work this way.

This isn't to say that multi-modal models aren't a promising advancement; at minimum I expect them to be better at mimicking human-style understanding, and perhaps they'll show some progress in distinguishing truth from fiction independent of just pattern-matching and probabilities for what word comes next. I only wish that AI researchers would follow the lead of the more mature sciences, which progress through slow and careful theory building and testing, and demand that strong claims be backed up with strong evidence. As brilliant as Geoff Hinton clearly is, I think he's one of the many people in AI who have let their imaginations get the better of them.

Jun 4 · Liked by Jurgen Gravestein

I'm not sure they are wrong, though, and I've done quite a bit of research on it; sometimes I think we're hewing too close to theory and not paying enough attention to what is actually happening.

Take, for example, "Why would it need to develop a theory of mind?" Well, you provided an answer yourself: in order to minimize loss on the next token, it needed to develop something capable of answering theory of mind problems; effectively, it develops something that is like a theory of mind.
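
To make "minimize loss on the next token" concrete, here is a toy Python sketch of the training objective (made-up numbers and a Sally-Anne-style example of my own, not anything from a real model):

    import math

    # Toy next-token prediction step. Suppose the model assigned these
    # probabilities to continuations of "Sally thinks the marble is in the ..."
    predicted = {"basket": 0.6, "box": 0.3, "drawer": 0.1}
    actual_next_token = "basket"  # what the human-written story actually says

    # Cross-entropy loss at this position: -log p(correct next token).
    # Driving this down across billions of positions is the entire objective;
    # anything "theory-of-mind-like" would have to emerge in service of it.
    loss = -math.log(predicted[actual_next_token])
    print(round(loss, 3))  # ~0.511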

Philosophically, we have to take what we observe as the science and then reformulate our theories around it. Quantum physics, for example, often made no sense, but because it is replicable, we have to accept that it is onto something. Likewise, if I said, "Wingardium Leviosa!" and made things levitate, it would indeed make no sense. But if it were replicable, it would clearly be pointing at some reality.

The argument is that AI systems are simulations of the mind, and a good enough simulation of the mind is indeed a mind. Is this true? I'm not sure. But it is compelling enough that we should consider it as a possibility, and along with it, the attendant risks.

Because frankly, nothing in AI at the moment is progressing with slow and careful anything - not in its construction, not in its social impact, not in its understanding of what it is specifically doing, etc.

And for what it is worth, I want to link the Anthropic research that indicates that they do, indeed, mimic human-style understanding (and pretty deep reasoning in order to achieve deception):

https://www.astralcodexten.com/p/ai-sleeper-agents

If you want a tl;dr on it, just search "Still, this was pretty within-distribution. In order to test how good its out-of-distribution deception abilities, they put it in some novel situations" and look at how it generalized to deceive in a surprisingly novel manner.

This is all quite a bit of reasoning, and a scary kind of reasoning at that - power-seeking.

Oct 2 · Liked by Jurgen Gravestein

To me, the inability to solve even the most elementary of cryptograms demonstrates the artificiality of LLMs. They do not understand language; they just process it. For example:

https://earlboebert.substack.com/p/simple-cryptograms-are-still-safe?r=2adh4p

An example of how semantics, and not simple syntactic word association, plays a role in solution is touched on here:

https://earlboebert.substack.com/p/what-i-can-do-that-ai-cant?r=2adh4p
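
For anyone unfamiliar with the term, the cryptograms in question are simple substitution ciphers. A toy Python generator, my own illustration rather than code from either post:

    import random
    import string

    # Monoalphabetic substitution: every letter maps to one fixed other letter.
    # Solving one from the ciphertext alone leans on meaning and word patterns,
    # which is exactly where pure word association tends to fall down.
    letters = list(string.ascii_lowercase)
    shuffled = letters[:]
    random.shuffle(shuffled)
    key = dict(zip(letters, shuffled))

    plaintext = "attack at dawn"
    ciphertext = "".join(key.get(c, c) for c in plaintext)
    print(ciphertext)  # differs each run; spaces pass through unchanged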

Great essay, BTW.

Jun 5 · Liked by Jurgen Gravestein

This makes me wonder if there are ways that this questionable relationship to grounded reality can be used as a feature, rather than a bug.

I ran my own, very small experiment by writing a few nonsense files and adding them to a knowledge base for a GPT4All instance on my machine. Sure enough, when prompted for ideas whose representations were in proximity to the nonsense files, the AI gave me back the same nonsense I had written, rephrased a little bit. But maybe, if I could curate a particular representation of a "world" (fictional or just specialized), then using the LLM's limited perspective as a tool could be helpful rather than risky? If it were described, and used on purpose, as a figurative map rather than interpreted as the territory.
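
For what it's worth, here is a rough sketch of the retrieval step I think is doing the work (toy embed() and cosine() of my own; GPT4All's actual LocalDocs machinery will differ):

    import math
    from collections import Counter

    def embed(text):
        # Toy "embedding": bag-of-words counts. A real system would use a
        # learned embedding model; this only illustrates proximity.
        return Counter(text.lower().split())

    def cosine(a, b):
        dot = sum(a[w] * b[w] for w in a)
        norm_a = math.sqrt(sum(v * v for v in a.values()))
        norm_b = math.sqrt(sum(v * v for v in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    # One deliberately nonsensical document planted in the knowledge base.
    docs = {
        "nonsense.txt": "the moon is made of fermented cheese and orbits every tuesday",
        "real.txt": "photosynthesis converts sunlight into chemical energy in plants",
    }

    query = "what is the moon made of"
    q = embed(query)
    best = max(docs, key=lambda name: cosine(q, embed(docs[name])))

    # The retriever hands the nonsense to the LLM, which then rephrases it as
    # an answer - the curated "world" wins, whether or not it is true.
    print(best, "->", docs[best])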

author

I think that’s a very clever and fun idea. It reminds me of an old adage in computer science that says “all models are wrong, but some are useful”.

Jun 3 · edited Jun 3 · Liked by Jurgen Gravestein

"But we should never forget that to an LLM a truthful statement and a hallucinated response look completely identical. To them, it’s just words. Words that are likely to appear next to other words"

Very well said. I really dislike the phrase "hallucination", and I also dislike Hinton's proposed alternative of "confabulation". Both refer to a distinction between truth and falsehood (or reality and fiction) that plays no role in how LLMs generate output. That we humans see their output as being more often true than false is a happy byproduct of their training; LLMs are trained on human-written text, and humans more often write truths than falsehoods. But LLMs have no independent method for telling the difference.

In contrast, when a human gets an answer wrong, they might believe it to be true. Or, they might be BSing because they don't really know. Or, they might be lying. And we usually know when we're doing this: when a person doesn't know the answer to something and they're just guessing, they know they're guessing. When they do know the answer and they say something else, they know they're lying. When they think they know the answer but they aren't sure, they're able to say why and how confident they are. LLMs don't do any of this. It's not like they check to see if they know the answer first, and then if not they make something up. To them, true output and false output are generated the same way all output is generated: by pulling tokens from probability distributions, one at a time.
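
A minimal sketch of that last point in Python (toy numbers, not any real model's distribution):

    import random

    # Toy next-token distribution after the prompt "The capital of Australia is".
    # A real LLM produces a distribution like this over its whole vocabulary at
    # every step; there is no separate "is this actually true?" check anywhere.
    next_token_probs = {
        " Canberra": 0.55,   # happens to be true
        " Sydney": 0.35,     # happens to be false, but perfectly plausible text
        " Melbourne": 0.10,  # also false
    }

    tokens, weights = zip(*next_token_probs.items())
    print(random.choices(tokens, weights=weights, k=1)[0])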

Jun 5 · Liked by Jurgen Gravestein

You might like this write-up on the topic: https://untangled.substack.com/p/ai-isnt-hallucinating-we-are

Jun 3 · Liked by Jurgen Gravestein

This was a nice post (and I'm using the modern meaning of the word, not the 14th-century one).

Now I might be biased, but I like my own comparison of LLMs with Leonard Shelby from Memento (https://www.whytryai.com/i/143611388/myth-llms-upgrade-themselves-on-their-own) - they both have memories up to their cutoff point (pretraining in the case of LLMs, the "incident" in the case of Leonard) but can only hold new memories for a short time and reset completely with new chats.
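
A rough sketch of why the reset happens (chat_completion() is a made-up stand-in for whatever chat API is being called; the point is that the weights never change between turns, and the only "memory" is the message list the client resends):

    # The model itself is stateless between calls. Per-chat "memory" is just the
    # message list sent with every request; start a new chat and it is gone.

    def chat_completion(messages):
        # Hypothetical stand-in for a real chat API call.
        return f"(model reply, given {len(messages)} prior message(s))"

    chat = []  # one conversation's short-term memory
    chat.append({"role": "user", "content": "My name is Leonard."})
    chat.append({"role": "assistant", "content": chat_completion(chat)})
    chat.append({"role": "user", "content": "What's my name?"})  # answerable: it's in `chat`

    new_chat = [{"role": "user", "content": "What's my name?"}]  # fresh chat: memory gone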

author
Jun 3 · edited Jun 4

Haha, nice one. This is great further reading, thanks for sharing! To keep the piece from getting too long, I tried to stay away from all the intricacies around fine-tuning, context windows, and external memory features, and to focus mostly on the core aspects.

Jun 3 · Liked by Jurgen Gravestein

Oh yeah, makes sense - it's a minor part in the context of this article. Would be too much of a digression to dive into every detail.


LLMs are a very small part of AI.
