I'm looking for an article showing that LLMs don't know how they work internally

lgsp@feddit.it@feddit.it · edit-2 2 months ago

I'm looking for an article showing that LLMs don't know how they work internally

AnneBonny@lemmy.dbzer0.com · 2 months ago

Who has claimed that LLMs have the capacity to reason?

theparadox@lemmy.world · 2 months ago

More than enough people who claim to know how it works think it might be “evolving” into a sentient being inside it’s little black box. Example from a conversation I gave up on… https://sh.itjust.works/comment/18759960

theunknownmuncher@lemmy.world · 2 months ago

I don’t want to brigade, so I’ll put my thoughts here. The linked comment is making the same mistake about self preservation that people make when they ask an LLM to “show it’s work” or explain it’s reasoning. The text response of an LLM cannot be taken at it’s word or used to confirm that kind of theory. It requires tracing the logic under the hood.

Just like how it’s not actually an AI assistant, but trained and prompted to output text that is expected to be what an AI assistant would respond with, if it is expected that it would pursue self preservation, then it will output text that matches that. It’s output is always “fake”

That doesn’t mean there isn’t a real potential element of self preservation, though, but you’d need to dig and trace through the network to show it, not use the text output.

AnneBonny@lemmy.dbzer0.com · 2 months ago

Maybe I should rephrase my question:

Outside of comment sections on the internet, who has claimed or is claiming that LLMs have the capacity to reason?

Em Adespoton@lemmy.ca · 2 months ago

The study being referenced explains in detail why they can’t. So I’d say it’s Anthropic who stated LLMs don’t have the capacity to reason, and that’s what we’re discussing.

The popular media tends to go on and on about conflating AI with AGI and synthetic reasoning.

theunknownmuncher@lemmy.world · 2 months ago

You’re confusing the confirmation that the LLM cannot explain it’s under-the-hood reasoning as text output, with a confirmation of not being able to reason at all. Anthropic is not claiming that it cannot reason. They actually find that it performs complex logic and behavior like planning ahead.

Em Adespoton@lemmy.ca · 2 months ago

No, they really don’t. It’s a large language model. Input cues instruct it as to which weighted path through the matrix to take. Those paths are complex enough that the human mind can’t hold all the branches and weights at the same time. But there’s no planning going on; the model can’t backtrack a few steps, consider different outcomes and run a meta analysis. Other reasoning models can do that, but not language models; language models are complex predictive translators.

theunknownmuncher@lemmy.world · 2 months ago

To write the second line, the model had to satisfy two constraints at the same time: the need to rhyme (with “grab it”), and the need to make sense (why did he grab the carrot?). Our guess was that Claude was writing word-by-word without much forethought until the end of the line, where it would make sure to pick a word that rhymes. We therefore expected to see a circuit with parallel paths, one for ensuring the final word made sense, and one for ensuring it rhymes.

Instead, we found that Claude plans ahead. Before starting the second line, it began “thinking” of potential on-topic words that would rhyme with “grab it”. Then, with these plans in mind, it writes a line to end with the planned word.

🙃 actually read the research?

glizzyguzzler@lemmy.blahaj.zone · 2 months ago

No, they’re right. The “research” is biased by the company that sells the product and wants to hype it. Many layers don’t make think or reason, but they’re glad to put them in quotes that they hope peeps will forget were there.