I found the article in a post on the fediverse, and now I can’t find it anymore.

The researchers asked an LLM a simple mathematical question (like 7+4) and were then able to see how it worked internally by tracing the paths it activated. What they found looked nothing like performing mathematical reasoning, even though the final answer was correct.

Then they asked the LLM to explain how it found the result, i.e. what its internal reasoning was. The answer was detailed, step-by-step mathematical logic, like a human explaining how to perform an addition.

This showed 2 things:

  • LLMs don’t “know” how they work

  • the second answer was just a rephrasing of text from the training data that explains how math works, so the LLM used that as its explanation

I think it was a very interesting and meaningful analysis.

Can anyone help me find this?

EDIT: thanks to @theunknownmuncher@lemmy.world, it’s this one: https://www.anthropic.com/research/tracing-thoughts-language-model

EDIT2: I’m aware LLMs don’t “know” anything and don’t reason, and that’s exactly why I wanted to find the article. Some more details here: https://feddit.it/post/18191686/13815095

  • JackGreenEarth@lemm.ee · 4 days ago

    By design, they don’t know how they work. It’s interesting to see this experimentally proven, but it was already known. In the same way the predictive text function on your phone keyboard doesn’t know how it works.

    • lgsp@feddit.it (OP) · 4 days ago

      I’m aware of this and agree but:

      • I see that asking an LLM how it got to its answer, as “proof” of sound reasoning, has become common

      • this new trend of “reasoning” models, where an internal conversation is shown in all its steps, seems to be based on this assumption of a trustworthy train of thought. Given the simple experiment I mentioned, that is extremely dangerous and misleading

      • take a look at this video: https://youtube.com/watch?v=Xx4Tpsk_fnM. Everything is based on observing and directing this internal reasoning, and these guys are computer scientists. How can they trust this?

      So having a well-written article at hand is a good idea, IMHO.

      • Blue_Morpho@lemmy.world · 4 days ago (edited)

        I only follow some YouTubers like Digital Spaceport, but there has been a lot of progress from years ago when LLMs were only predictive. They now have an inductive engine attached to the LLM to provide logic guard rails.