This is the technology worth trillions of dollars huh

  • ilinamorato@lemmy.world
    28 minutes ago

    ✅ Colorado

    ✅ Connedicut

    ✅ Delaware

    ❌ District of Columbia (on a technicality)

    ✅ Florida

    But not

    ❌ I’aho

    ❌ Iniana

    ❌ Marylan

    ❌ Nevaa

    ❌ North Akota

    ❌ Rhoe Islan

    ❌ South Akota

    • boonhet@sopuli.xyz
      2 minutes ago

      Everyone knows it’s properly spelled “I, the ho” not Idaho. That’s why it didn’t make the list.

    • ilinamorato@lemmy.world
      35 minutes ago

      I would assume it uses a different random seed for every query. Probably fixed sometimes, not fixed other times.

    • MML@sh.itjust.works
      1 hour ago

      What about Our Kansas? Cause according to Google Arkansas has one o in it. Refreshing the page changes the answer though.

      • samus12345@sh.itjust.works
        1 hour ago

        Just checked, it sure does say that! AI spouting nonsense is nothing new, but it’s pretty ironic that a large language model can’t even parse what letters are in a word.

  • resipsaloquitur@lemmy.world
    6 hours ago

    Listen, we just have to boil the ocean five more times.

    Then it will hallucinate slightly less.

    Or more. There’s no way to be sure since it’s probabilistic.

  • BlueMagma@sh.itjust.works
    6 hours ago

    I get the sentiment behind this post, and it’s almost always funny when LLMs are such dumbasses. But this is not a good argument against the technology. It’s akin to a climate change denier arguing: “Look! It snowed today, climate change is so dumb, huh?”

    • Reygle@lemmy.world
      2 hours ago

      You do know that AI is fast approaching (if it isn’t already) being a leading CAUSE of climate change?

    • MangoCats@feddit.it
      5 hours ago

      AI writes code for me. It makes dumbass mistakes that compilers automatically catch. It takes three or four rounds to correct a lot of random problems that crop up. Above all else, it’s got limited capacity - projects beyond a couple thousand lines of code have to be carefully structured and spoonfed to it - a lot like working with junior developers. However: it’s significantly faster than Googling for the information needed to write the code like I have been doing for the last 20 years, it does produce good sample code (if you give it good prompts), and it’s way less frustrating and slow to work with than a room full of junior developers.

      That’s not saying we fire the junior developers, just that their learning specializations will probably be very different from the ones I was learning 20 years ago, just as those were very different from the ones programmers used 40 and 60 years ago.

      • BlueMagma@sh.itjust.works
        2 hours ago

        I agree, Cursor and other IDE integrations have been a game changer. They made a certain range of problems we used to have in software dev way easier. And for easy code, like prototyping or inconsequential testing, it’s so, so fast. What I’ve found is that it’s particularly efficient at helping you do stuff you would have been able to do alone and can check once it’s done. Be careful when asking about stuff you aren’t familiar with, though, because it will comfortably lead you toward a mistake that wastes your time.

        Though one thing I have to say: I’m very annoyed by its constant agreeing with whatever I say, and enabling me when I’m doing dumb shit. I wish it would challenge me more and tell me when I’m an idiot.

        “Yes you are totally right”, “This is a very common issue that everybody has”, “What a great and insightful question”… I’m so tired of this BS.

  • dude@lemmings.world
    8 hours ago

    Well, for anyone who knows a bit about how LLMs work, it’s pretty obvious why they struggle with identifying the letters in words.

      • JustTesting@lemmy.hogru.ch
        6 hours ago

        They don’t look at it letter by letter but in tokens, which are generated automatically based on frequency of occurrence. So while ‘z’ could be its own token, ‘ne’ or even ‘the’ could be treated as a single token vector. Of course, ‘e’ would still be a separate token when it occurs in isolation. You could even have ‘le’ and ‘let’ as separate tokens, afaik. And each token is just a vector of numbers, like 300 or 1000 numbers that represent that token in a vector space. So ‘de’ and ‘e’ could be completely different, dissimilar vectors.

        So ‘delaware’ could look to an LLM more like de-la-w-are or similar.

        Of course, you could train it to figure out letter counts based on those tokens with a lot of training data, though that could lower performance on other tasks, and counting letters just isn’t that important, I guess, compared to other stuff.
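The splitting described above can be sketched with a toy greedy longest-match tokenizer. The vocabulary here is invented purely for illustration (real BPE vocabularies are learned from corpus statistics), but it shows how ‘delaware’ can come out as de-la-w-are:

```python
# Toy greedy longest-match tokenizer. The vocabulary is made up for
# illustration; real subword vocabularies are learned from data.
vocab = {"de", "la", "are", "d", "e", "l", "a", "r", "w"}

def tokenize(word, vocab, max_len=3):
    tokens, i = [], 0
    while i < len(word):
        for size in range(max_len, 0, -1):  # prefer the longest match
            piece = word[i:i + size]
            if piece in vocab:
                tokens.append(piece)
                i += size
                break
        else:
            raise ValueError(f"no token for {word[i:]!r}")
    return tokens

print(tokenize("delaware", vocab))  # ['de', 'la', 'w', 'are']
```

A question like “which states contain the letter d” then has to be answered from those chunks, not from individual letters.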

        • MangoCats@feddit.it
          5 hours ago

          Of course, when the question asks “contains the letter _” you might think an intelligent algorithm would get off its tokens and do a little letter by letter analysis. Related: ChatGPT is really bad at chess, but there are plenty of algorithms that are super-human good at it.
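The letter-by-letter analysis the comment asks for is trivial outside the model; a sketch with a partial state list (not all 50):

```python
# Ordinary string handling gets this right every time, no model needed.
states = ["Colorado", "Connecticut", "Delaware", "Florida", "Idaho",
          "Indiana", "Maryland", "Nevada", "North Dakota", "Rhode Island"]

# Case-insensitive check for the letter 'd' in each name
with_d = [s for s in states if "d" in s.lower()]
print(with_d)  # Connecticut is the only name here that drops out
```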

        • fading_person@lemmy.zip
          5 hours ago

          Wouldn’t that only explain errors by omission? If you ask for a letter, let’s say D, it would omit words containing that same letter when in a token in conjunction with more letters, like Da, De, etc, but how would it return something where the letter D isn’t even in the word?

          • JustTesting@lemmy.hogru.ch
            2 hours ago

            Well, each token has a vector. So ‘co’ might be [0.8, 0.3, 0.7], just instead of 3 numbers it’s more like 100-1000 long. And each token has a different such vector. Initially, those are just randomly generated. But the training algorithm is allowed to slowly modify them during training, pulling them this way and that, whichever way yields better results. So while for us ‘th’ and ‘the’ are obviously related, for a model no such relation is given. It just sees random vectors, and training slowly reorganizes them to have some structure. So who’s to say whether for the model ‘d’, ‘da’ and ‘co’ are in the same general area (similar vectors) while ‘de’ is off in the opposite direction? Here’s an example of what this actually looks like. Tokens can be quite long, depending on how common they are; here it’s disease-related terms ending up close together, as similar things tend to cluster at this step. You might have a place where it’s just common town-name suffixes clustered close to each other.

            And all of this is just what gets input into the LLM, essentially a preprocessing step. So imagine someone gave you a picture like the above, but instead of each dot having some label, it just had a unique color. Then they give you lists of different colored dots and ask you what color the next dot should be. You have to figure out the rules yourself, coming up with more and more intricate rules that are correct most of the time. That’s kinda what an LLM does. To it, ‘da’ and ‘de’ could be identical dots in the same location, or completely different.

            Plus, of course, that’s on top of the LLM not actually knowing what a letter, a word, or counting is. But it does know that 5.6.1.5.4.3 is most likely followed by 7.7.2.9.7 (simplified representation), which, translated back, maps to ‘there are 3 r’s in strawberry’. It’s actually quite amazing that they can get it halfway right given how they work, just from ‘learning’ how text structure works.

            But so in this example, US-state-y tokens are probably close together, ‘d’ is somewhere else, the relation between ‘d’ and the different state-y tokens is not at all clear, plus the other tokens making up the full state names could be who knows where. And then there’s whatever the model does on top of that with the data.

            For a human it’s easy: just split by letters and count. For an LLM, it’s trying to correlate lots of different and somewhat unrelated things to their ‘d-ness’, so to speak.
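The “random vectors until training organizes them” point can be made concrete. This sketch (tokens and dimensions chosen arbitrarily) builds untrained embeddings and measures cosine similarity; nothing about the resulting numbers reflects shared letters:

```python
import math
import random

random.seed(0)
tokens = ["d", "da", "de", "co"]
# Untrained embeddings: just random numbers per token (8 dims here;
# real models use hundreds to thousands).
emb = {t: [random.gauss(0.0, 1.0) for _ in range(8)] for t in tokens}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# 'd' vs 'da' shares a letter and 'd' vs 'co' shares none, but before
# training the similarities are arbitrary either way.
print(cosine(emb["d"], emb["da"]), cosine(emb["d"], emb["co"]))
```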

  • Aceticon@lemmy.dbzer0.com
    8 hours ago

    “This is the technology worth trillions of dollars”

    You can make anything fly high in the sky with enough helium, just not for long.

    (Welcome to the present day Tech Stock Market)

    • MangoCats@feddit.it
      5 hours ago

      Bubbles and crashes aren’t a bug in the financial markets, they’re a feature. There are whole legions of investors and analysts who depend on them. Also, they have been a feature of financial markets since anything resembling a financial market was invented.

  • SaveTheTuaHawk@lemmy.ca
    8 hours ago

    We’re turfing out students by the tens for academic misconduct. They are handing in papers with references that clearly state “generated by ChatGPT”. Lazy idiots.

    • NateNate60@lemmy.world
      8 hours ago

      This is why invisible watermarking of AI-generated content is likely to be so effective. Even primitive watermarks like file metadata. It’s not hard for anyone with technical knowledge to remove, but the thing with AI-generated content is that anyone who dishonestly uses it when they are not supposed to is probably also too lazy to go through the motions of removing the watermarking.

        • chaospatterns@lemmy.world
          6 hours ago

          Depends on the watermark method used. Some people talk about watermarking by subtly adjusting the words used. Like if there are 5 synonyms and you pick the 1st synonym, for the next word you pick the 3rd synonym. To check the watermark you have to have access to the model and its probabilities to see if the text matches. The tricky part about this is that the model can change, and so can the probabilities and other things I don’t fully understand.
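A stripped-down sketch of that synonym-choice idea. The synonym table and keying scheme here are invented; real proposals bias the model’s token probabilities rather than using a fixed table:

```python
import hashlib

# Hypothetical synonym sets; a real checker would need the generator's
# vocabulary and probabilities, as noted above.
SYNONYMS = {
    "big": ["big", "large", "huge"],
    "fast": ["fast", "quick", "rapid"],
}

def pick(word, position, key):
    """Deterministically choose a synonym from the word's set using a secret key."""
    options = SYNONYMS.get(word)
    if not options:
        return word  # words without synonyms pass through unchanged
    digest = hashlib.sha256(f"{key}:{position}".encode()).digest()
    return options[digest[0] % len(options)]

def watermarked(words, key):
    return [pick(w, i, key) for i, w in enumerate(words)]

def matches_watermark(words, original, key):
    # A checker holding the same key recomputes the expected choices.
    return words == watermarked(original, key)

text = watermarked(["a", "big", "fast", "model"], key="secret")
print(text, matches_watermark(text, ["a", "big", "fast", "model"], "secret"))
```

It also shows the fragility mentioned above: if the synonym sets (or, in the real scheme, the model’s probabilities) change, the check stops matching.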

  • IngeniousRocks (They/She) @lemmy.dbzer0.com
    9 hours ago

    Hey look, the markov chain showed its biggest weakness (the markov chain)!

    Judging by this output, it can be assumed that in the training data, Connecticut usually follows Colorado in lists of two or more states containing Colorado. There is no other reason for this to occur, as far as I know.

    Markov chain based LLMs (I think that’s all of them?) are dice-roll systems constrained to probability maps.

    Edit: just to add, because I don’t want anyone crawling up my butt about the oversimplification: yes, I know, that’s not how they work. But when simplified to words so simple a child could understand them, it’s pretty close.
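A literal Markov chain over state names looks like this; the transition probabilities are invented to mirror the “Connecticut usually follows Colorado” guess (real LLMs condition on the whole context, not just the previous item):

```python
import random

# Invented transition probabilities: from each state, a dice roll over
# a probability map picks what comes next.
transitions = {
    "Colorado": {"Connecticut": 0.7, "Delaware": 0.3},
    "Connecticut": {"Delaware": 1.0},
    "Delaware": {"Florida": 1.0},
}

def next_state(current):
    options = transitions[current]
    return random.choices(list(options), weights=list(options.values()))[0]

random.seed(1)
print(next_state("Colorado"))  # usually 'Connecticut', sometimes 'Delaware'
```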

  • panda_abyss@lemmy.ca
    10 hours ago

    Yesterday I asked Claude Sonnet what was on my calendar (since they just sent a pop-up announcing that feature).

    It listed my work meetings on Sunday, so I tried to correct it…

    You’re absolutely right - I made an error! September 15th is a Sunday, not a weekend day as I implied. Let me correct that: This Week’s Remaining Schedule: Sunday, September 15

    Just today when I asked what’s on my calendar, it gave me today and my meetings on the next two Thursdays. Not the meetings in between, just Thursdays.

    Something is off in AI land.

    Edit: I asked again: it gave me meetings for Thursdays again. Plus it might think I’m driving in F1.

    • achance4cheese@sh.itjust.works
      8 hours ago

      Also, Sunday September 15th is a Monday… I’ve seen so many meeting invites with dates and days that don’t match lately…

      • panda_abyss@lemmy.ca
        7 hours ago

        Yeah, it said Sunday, I asked if it was sure, then it said I’m right and went back to Sunday.

        I assume the training data has the model thinking it’s a different year or something, but this feature is straight up not working for me at all. I don’t know if they even tested it.

        Sonnet seems to have gotten stupider somehow.

        Opus isn’t following instructions lately either.

    • FlashMobOfOne@lemmy.world
      10 hours ago

      A few weeks ago my Pixel wished me a Happy Birthday when I woke up, and it definitely was not my birthday. Google is definitely letting a shitty LLM write code for it now, but the important thing is they’re bypassing human validation.

      Stupid. Just stupid.