• grue@lemmy.world
    24 upvotes · 14 hours ago

    ELI5 why the AI companies can’t just clone the git repos and do all the slicing and dicing (running git blame etc.) locally instead of running expensive queries on the projects’ servers?
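
    For reference, the local workflow being asked about could look roughly like the sketch below. It is only an illustration under assumptions: the helper names (`clone_cmd`, `fetch_cmd`, `blame_cmd`, `run`) and the example URL are hypothetical, and it says nothing about how any scraper actually works. It just builds ordinary `git clone`, `git fetch`, and `git blame` invocations that run against a local copy.

```python
import subprocess
from pathlib import Path


def clone_cmd(url: str, dest: Path) -> list[str]:
    """Build the one-time clone command; this is the only full download."""
    return ["git", "clone", "--quiet", url, str(dest)]


def fetch_cmd(repo: Path) -> list[str]:
    """Build an incremental update command; only new objects are transferred."""
    return ["git", "-C", str(repo), "fetch", "--prune", "--quiet"]


def blame_cmd(repo: Path, path: str, rev: str = "HEAD") -> list[str]:
    """Build a blame command that reads only the local object store.

    Pass a remote-tracking ref such as "origin/HEAD" as `rev` to blame
    the most recently fetched state instead of the local checkout.
    """
    return ["git", "-C", str(repo), "blame", "--line-porcelain", rev, "--", path]


def run(cmd: list[str]) -> str:
    """Run a command and return its stdout; raises CalledProcessError on failure."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


# Hypothetical usage (the URL and paths are placeholders, not a real project):
#   repo = Path("project")
#   run(clone_cmd("https://example.org/project.git", repo))
#   run(fetch_cmd(repo))                        # later, to pick up new commits
#   print(run(blame_cmd(repo, "README.md", "origin/HEAD")))
```

    After the initial clone, the blame step does not contact the project's server at all; only the fetch does.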

    • green@feddit.nl
      5 upvotes · 2 hours ago

      Too many people overestimate the actual capabilities of these companies.

      I really do not like saying this because it lacks a lot of nuance, but 90% of programmers are not skilled in their profession. This is not to say they are stupid (though they likely are, see cat-v/harmful), but they care about neither efficiency nor gracefulness, as long as the job gets done.

      You assume they are using source control (which is unironically unlikely), you assume they know that they can run a server locally (which I pray they do), and you assume their deadlines allow them to think about actual solutions to problems (which they probably don’t).

      Yes, they get paid a lot of money. But that does not say much about skill in an age of apathy and lawlessness.

      • turmacar@lemmy.world
        3 upvotes · 55 minutes ago

        Also, everyone’s solution to a problem is stupid if they’re only given 5 minutes to work on it.

        Combine that with it being “free” for them to query the website and expensive to have enough local storage to replicate, even temporarily, all the stuff they want to scrape, and it’s kind of a no-brainer to ‘just not do that’. The only thing stopping them is morals / whether they want to keep paying rent.

    • zovits@lemmy.world
      15 upvotes · 12 hours ago

      Cloning takes more effort and results in a static snapshot, without being able to track the evolution of the project. (Disclaimer: I don’t work with AI, but I’d bet this is the reason. I don’t intend to defend those scraping twatwaffles in any way, only to offer a possible explanation.)