• davetortoise@reddthat.com
      3 hours ago

      Short YouTube video explaining why tokenisation causes this bug. It’s an older video, so it describes tokens as whole words, whereas most modern models split text into chunks of words.

      https://youtube.com/shorts/7pQrMAekdn4
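
To make the tokenisation point concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the vocabulary is made up, the word and the letter-counting question are just an example, and the greedy longest-match split is a simplification of what real tokenisers do (they learn their pieces from data). It only shows that the model is handed chunks rather than individual characters.

```python
# Toy vocabulary, invented for this example; a real tokeniser learns
# tens of thousands of pieces from data.
TOY_VOCAB = {"straw", "berry", "st", "raw", "b", "e", "r", "y", "s", "t", "a", "w"}


def toy_tokenize(text, vocab=TOY_VOCAB):
    """Greedy longest-match split of text into vocabulary pieces (hypothetical helper)."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until a piece matches.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # No piece matched: emit the single character on its own.
            tokens.append(text[i])
            i += 1
    return tokens


word = "strawberry"
print(toy_tokenize(word))  # ['straw', 'berry']
print(word.count("r"))     # 3
```

With this toy split the model would receive two pieces, not ten letters, so a question about individual characters asks about something it never directly sees.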

      The other person’s explanation doesn’t acknowledge that emergent reasoning does kind of exist in LLMs. That’s why they’re able to say how many 5’s are in a large number, despite never seeing that problem before. They don’t ‘just’ repeat things they’ve been trained on, though they often do.

      Of course, if that problem did appear often in the training data, the model would be more likely to get it right. But you could say the same about any number of things an LLM doesn’t know.