This is the technology worth trillions of dollars huh

  • Echo Dot@feddit.uk
    English
    arrow-up
    50
    arrow-down
    1
    ·
    10 hours ago

    You joke, but I bet you didn’t know that Connecticut contained a “d”

    I wonder what other words contain letters we don’t know about.

      • KubeRoot@discuss.tchncs.de
        English
        arrow-up
        16
        ·
        9 hours ago

        That actually sounds like a fun SCP - a word that doesn’t seem to contain a letter, but when you test for the presence of that letter using an algorithm that exclusively checks for that presence, it reports the letter is indeed present. Any attempt to check where in the word the letter is, or to get a list of all letters in that word, spuriously fails. Containment could be fun, probably involving amnestics and widespread societal influence. I also wonder if they could create an algorithm for checking letter presence that can be performed by hand without leaking any other information to the person performing it, reproducing the anomaly without computers.

        • leftzero@lemmy.dbzer0.com
          English
          arrow-up
          4
          ·
          edit-2
          5 hours ago

          No, LLMs produce the most statistically likely (in their training data) token to follow a certain list of tokens (there’s nothing remotely resembling reasoning going on in there, it’s pure hard statistics, with some error and randomness thrown in), and there are probably a lot more lists where Colorado is followed by Connecticut than ones where it’s followed by Delaware, so they’re obviously going to be more likely to produce the former.
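          A minimal sketch of the "most likely next token" idea, under loud assumptions: the tiny corpus is made up for illustration, and a real LLM is a neural network conditioning on the whole context rather than a table of pair counts. The names `build_bigram_counts` and `most_likely_next` are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus (an assumption, not real training data):
# alphabetical state lists outnumber lists of states containing "d".
corpus = [
    ["California", "Colorado", "Connecticut", "Delaware"],
    ["Colorado", "Connecticut", "Delaware", "Florida"],
    ["Arkansas", "California", "Colorado", "Connecticut"],
    ["Colorado", "Delaware", "Florida"],  # the rarer "states with a d" list
]


def build_bigram_counts(sequences):
    """Count how often each token is immediately followed by each other token."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return counts


def most_likely_next(counts, token):
    """Return the most frequent follower of `token`, or None if it was never seen."""
    followers = counts.get(token)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


counts = build_bigram_counts(corpus)
prediction = most_likely_next(counts, "Colorado")
print(prediction)
```

          In this toy data "Colorado" is followed by "Connecticut" three times and by "Delaware" once, so the frequency-based pick is "Connecticut" regardless of which states actually contain a "d".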

          Moreover, there aren’t going to be many texts listing the spelling of states (maybe transcripts of spelling bees?), so that information is unlikely to be in their training data, and they can’t extrapolate because it’s not really something they do and because they use words or parts of words as tokens, not letters, so they literally have no way of listing the letters of a word if said list is not in their training data (and, again, that’s not something we tend to write, and if we did we wouldn’t include d in Connecticut even if we were reading a misprint). Same with counting how many letters a word has, and stuff like that.
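          To illustrate the point about tokens versus letters, here is a rough sketch with a made-up vocabulary. The entries in `VOCAB` and the greedy longest-match rule in `tokenize` are assumptions for illustration only; real tokenizers (BPE and similar) learn their vocabularies from data and split words differently.

```python
# Hypothetical subword vocabulary (an assumption, not any real model's).
VOCAB = {"Conn": 0, "ect": 1, "icut": 2, "Del": 3, "aware": 4}


def tokenize(word, vocab):
    """Greedy longest-match split of `word` into token IDs from `vocab`."""
    ids = []
    i = 0
    while i < len(word):
        # Try the longest remaining piece first, then shrink it.
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            raise ValueError(f"no token covers {word[i:]!r}")
    return ids


token_ids = tokenize("Connecticut", VOCAB)
print(token_ids)  # the model-side view: opaque integers, no letters

# A plain character-level check, by contrast, is trivial.
has_d = "d" in "Connecticut".lower()
print(has_d)
```

          The model-side view of the word is a short list of integers, while the question "is there a d?" is about characters, which only the character-level check at the end looks at directly.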

      • I Cast Fist@programming.dev
        English
        arrow-up
        4
        ·
        9 hours ago

        SCP-00WTFDoC (lovingly called “where’s the fucking D of Connecticut” by the foundation workers, also “what the fuck, doc?”)

        People think it’s safe, because it’s “just an invisible D”, not even a dick, just the letter D, and it only manifests verbally when someone tries to say “connecticut” or write it down. Then, when you least expect it, everyone hears “Donnedtidut”, everyone reads that thing, and a portal to that fucking place opens and drags you in.

      • ripcord@lemmy.world
        English
        arrow-up
        1
        ·
        6 hours ago

        Words are full of mystery! Besides the invisible D, Connecticut has that inaudible C…

        • XeroxCool@lemmy.world
          English
          arrow-up
          2
          ·
          6 hours ago

          “Kinetic” with a hard “T”, like a posh Brit is saying it to the queen? Everyone I’ve ever heard speaking US English pronounces it with a rolled “t”, like “kinedic”, so the alternate pronunciation still reads like it’d have a “d” sound.

          • TipRing@lemmy.world
            English
            arrow-up
            3
            ·
            5 hours ago

            This phenomenon is called “T flapping” and it is common in North American English. I got into an argument with my dad, who insisted he pronounces the T’s in ‘butter’ when his dialect, like that of nearly all North Americans, pronounces the word as ‘budder’.

              • TipRing@lemmy.world
                English
                arrow-up
                2
                ·
                3 hours ago

                It’s an approximation, but the t is partially voiced, giving it a ‘d’ sound even if it’s not made exactly the same way.

                • BeeegScaaawyCripple@lemmy.world
                  English
                  arrow-up
                  1
                  ·
                  edit-2
                  3 hours ago

                  i just thought we were getting technical about the linguistics. i got and use both words frequently, thought the distinction might be appreciated. the difference is so subtle we sometimes have to ask each other which one we’re referring to. i’m willing to bet it shows up more on my face than in my voice.

                  • TipRing@lemmy.world
                    English
                    arrow-up
                    1
                    ·
                    edit-2
                    3 hours ago

                    I appreciate the discussion. I get out of my depth pretty quickly on the topic, being a linguistics hobbyist rather than someone with actual education and background.

        • Echo Dot@feddit.uk
          English
          arrow-up
          2
          ·
          9 hours ago

          That’s how I’ve always heard it pronounced on the rare occasions anybody ever mentions it. But I’ve never been to that part of the US, so maybe the accent’s different there?

      • Aneb@lemmy.world
        English
        arrow-up
        3
        ·
        7 hours ago

        I was going to make a joke that if you’re from Connedicut, you never pronounce the first d in the word: Conne-icut.