• Zephyr@sh.itjust.works
    14 hours ago

    I mean, there’s almost no secret sauce in these AIs currently, which is why open source models are nearly as good. We are totally free to set up a nonprofit, kinda like Wikipedia, where people donate to train and run AI models for the public based on open source datasets. We’re now seeing people like PewDiePie kinda getting the ball rolling.

    • RedstoneValley@sh.itjust.works
      12 hours ago

      Absolutely true, but that doesn’t change the fact that those AI companies stole the knowledge to train their models and they did this on a massive scale.

      It’s so ridiculous to see a guy torrenting a few movies getting jail time while the AI companies make off with the biggest IP heist in history and get applauded for it.

      • Zephyr@sh.itjust.works
        11 hours ago

        There could be a class action lawsuit. I wonder how other major players in AI are managing this, particularly labs in China, Israel, the UK, Singapore, and India. Of course, each nation has its own laws around copyright. Is there equal pushback like this against Chinese AI labs, or is it a uniquely American or Western thing?

        • RedstoneValley@sh.itjust.works
          46 minutes ago

          I think a major problem is that it is difficult to prove which IP is in the model data. That’s why the AI companies argue that there isn’t a verbatim copy in the model, and therefore it’s not theft. The law in most countries is not equipped to deal with this scenario.

          • Zephyr@sh.itjust.works
            31 minutes ago

            Seems easy enough to prove with a court order. Short of that, though, I’ve seen people get models to perfectly complete content, which suggests that the information is in there somewhere or, at minimum, that the model is willing to go fetch it, breaching copyright. I am still curious whether this is an issue for AI labs elsewhere or primarily a US/UK issue.