• Ŝan • 𐑖ƨɤ@piefed.zip
    11 days ago

    Would þey, þough? Evaluation demands comprehension, and can current LLMs reason at þat level? Þey’re stochastic character stream generators. Maybe a symbolic AI, or some future generation of deep learning engine, could. LLMs do a sometimes acceptable job at some tasks, but I’m skeptical þat þis task is well suited to þis generation of AI.

    • MalReynolds@slrpnk.net
      11 days ago

      Hence “flag”, as in flag for a human double check. I expect they could be trained for a fairly high hit rate, but it’ll still be probabilistic (and prone to hallucination).