• FreedomAdvocate
    English
    arrow-up
    11
    ·
    7 hours ago

    “It’s another move to protect against AI scraping.”

    Not because they’re against AI getting their data, oh no - because they SELL their data to Google to use for its AI.

  • magnetosphere@fedia.io
    arrow-up
    25
    ·
    11 hours ago

    “…we’re limiting some of their access to Reddit data to protect redditors,”

    This is the funniest, most transparent piece of PR bullshit I’ve seen today.

    Reddit doesn’t want the Internet Archive to give away information that Reddit wants to sell. That’s ALL. Privacy never enters the equation.

  • reddig33@lemmy.world
    English
    arrow-up
    20
    ·
    12 hours ago

    Ha! That just means all the content deleted by users who left Reddit is actually inaccessible.

    • halcyoncmdr@lemmy.world
      English
      arrow-up
      11
      ·
      edit-2
      12 hours ago

      No, they reverted a lot of that. They bulk-restored even “overwritten” post data weeks or months after the fact, once most people had stopped checking.

      Plus the change only applies going forward, so anything already in the Archive is still there.

      • Ace@feddit.uk
        English
        arrow-up
        6
        arrow-down
        1
        ·
        edit-2
        11 hours ago

        I don’t think they did. Unless you have evidence otherwise, I think this is a rumour which comes from a misunderstanding of how deletion tools worked.

        Until recently, the API only provided access to 1000 posts per feed, i.e. the 1000 most recent comments in your /comments/new feed, the 1000 “top” submissions, etc. So if you try to mass delete “everything”, you really only delete the 1k posts from each feed, which can leave a lot of your posts/comments unfindable if they’ve fallen off the bottom of all feeds. So if you run a mass deletion, your profile feeds will look empty, via the UI and the API, but you haven’t necessarily deleted everything.
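
        A toy model of that cap, as a rough sketch only: the `FEED_CAP` value comes from the description above, but the comment dicts, the two feeds, and the function names are made up for illustration and are not Reddit's actual API. It shows how a tool that walks capped feeds can leave most of a long history untouched.

```python
# Toy model of a per-feed listing cap. Hypothetical data and names throughout;
# this does not call or reproduce Reddit's real API.
import random

FEED_CAP = 1000  # items exposed per feed, per the description above


def build_feeds(comments, cap=FEED_CAP):
    """Return capped 'new' and 'top' listings for a list of comment dicts."""
    newest = sorted(comments, key=lambda c: c["created"], reverse=True)[:cap]
    top = sorted(comments, key=lambda c: c["score"], reverse=True)[:cap]
    return {"new": newest, "top": top}


def mass_delete(comments, cap=FEED_CAP):
    """Delete only what the capped feeds expose; return what survives."""
    feeds = build_feeds(comments, cap)
    reachable = {c["id"] for feed in feeds.values() for c in feed}
    return [c for c in comments if c["id"] not in reachable]


rng = random.Random(0)  # fixed seed so the run is repeatable
history = [
    {"id": i, "created": i, "score": rng.randint(0, 500)}
    for i in range(5000)
]
survivors = mass_delete(history)
print(f"{len(history)} comments, {len(survivors)} never reached by the tool")
```

        With 5,000 simulated comments, a single pass can reach at most 2,000 of them (1,000 per feed), so at least 3,000 survive even though every feed was fully walked.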

        This has been a commonly-complained-about limitation of the API for 15 years. “How am I supposed to find all of my comments if they’re not in the 1000 top/new/hot/controversial feeds?” Their answer has always been, “sorry, you can’t.” So people would run a “mass deletion” tool, think they’d deleted everything, then later find a comment that wasn’t deleted, and they’d get all conspiratorial and claim Reddit is “secretly” undeleting stuff and “pretending stuff is deleted” by not showing it on your profile. I seriously doubt Reddit cares about your comment that much, much as we love to hate them. They aren’t undeleting anything as far as I’ve ever seen.

        I personally wrote a script to scrape search engine results for a “myusername site:reddit.com” search, looking for my comments to delete. After running various mass deletion tools, which claimed/seemed to have completely deleted everything (and made my profile look empty, since all feeds had been exhausted), I was then able to delete tens of thousands more comments via my search engine method which weren’t findable via the API. I… used Reddit a lot.
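
        A minimal sketch of the extraction step in an approach like this, not the script described above: it assumes the search results have already been saved as text, and that comment permalinks follow the usual `/r/<sub>/comments/<post_id>/<slug>/<comment_id>` shape. The regex, function name, and sample HTML are illustrative assumptions.

```python
# Pull (post_id, comment_id) pairs out of saved search-result text.
# The permalink pattern below is an assumption about the URL layout.
import re

PERMALINK_RE = re.compile(
    r"https?://(?:www\.|old\.)?reddit\.com/r/\w+/comments/(\w+)/[^/\s\"'<>]*/(\w+)"
)


def extract_comment_ids(page_text):
    """Return unique (post_id, comment_id) pairs in first-seen order."""
    seen = {}
    for match in PERMALINK_RE.finditer(page_text):
        seen.setdefault(match.groups(), None)  # dict keeps insertion order
    return list(seen)


# Made-up sample: two links to the same comment and one link to a bare post.
sample = (
    '<a href="https://www.reddit.com/r/python/comments/abc123/some_title/def456/">x</a> '
    '<a href="https://old.reddit.com/r/python/comments/abc123/some_title/def456/?context=3">y</a> '
    '<a href="https://www.reddit.com/r/linux/comments/zzz999/another_post/">post only</a>'
)
print(extract_comment_ids(sample))
```

        Links to a post without a trailing comment ID are skipped, and duplicates collapse to one entry, so the result is a list of candidate comments to check and delete by hand or with a separate tool.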

        The API has recently been updated to allow more (?) posts to be visible. Since that change, a lot of old posts appeared in some feeds on my profile, so a couple thousand more comments surfaced which even my search engine script missed, and I’ve now deleted those too. If you have that volume of posts it’s really just a case of trying to find them all however you can.

        Annoying technical limitation? Yes. Conspiracy to undelete data? No.

        • FreedomAdvocate
          English
          arrow-up
          2
          ·
          7 hours ago

          I don’t think they did. Unless you have evidence otherwise, I think this is a rumour which comes from a misunderstanding of how deletion tools worked.

          They did it to my comments. I had a roughly 15-year-old account with hundreds of thousands of karma, and I deleted all of the comments that I could view in my profile’s history and then deleted my account. Days later I found not only was my account undeleted, all of the comments that I deleted were back as if nothing ever happened - and my account was banned.

        • dhhyfddehhfyy4673@fedia.io
          arrow-up
          3
          arrow-down
          1
          ·
          10 hours ago

          I’ve seen no evidence of this either. Recently made a comment on this as well:

          Since apparently people still don’t know this, it is unlikely that Reddit has been restoring deleted posts & comments. Historically, there was a limit to how much could appear on a user’s profile, and even deleting stuff to get back below the limit would not restore the visibility of items already pushed off.

          They did change this relatively recently though, so if you still have access to your account you can see and nuke the rest (although with the API lockdown and rate limiting shit nowadays it’s not as quick & simple).

          Another reason this whole idea got started was that during the API fiasco, a fuckload of subs went private, so anything in said subs did not show on profiles during that time. As mods capitulated or were removed, subs went public again and hidden content showed back up; people who nuked their accounts via their user profile pages in this period assumed Reddit was restoring their deleted stuff.

  • irotsoma@lemmy.blahaj.zone
    English
    arrow-up
    5
    ·
    12 hours ago

    The problem is that scraper bots are way more aggressive and harder to block. If those bots were ignoring Reddit because they could take the content from IA, and IA is willing to obey robots.txt whereas scraper bots are not, then Reddit has just shifted onto itself the load of serving the bots or playing whack-a-mole with their block-evading mechanisms. They aren’t going to stop the bots. It may result in being able to negotiate a license with the bigger guys, but that’s likely not going to make up for the money they spend on dealing with the bots in the long run. Of course, companies like this don’t really think long term; it just looks good to investors this quarter.
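
    To illustrate why robots.txt only restrains crawlers that choose to consult it, here is a small sketch using Python's standard `urllib.robotparser`. The robots.txt text and both user-agent names are invented for the example and are not Reddit's real rules.

```python
# robots.txt is advisory: a crawler has to ask before it is ever told "no".
# The rules and user-agent names below are hypothetical.
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: polite-archiver
Disallow: /

User-agent: *
Allow: /
"""


def allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return what the rules say for this agent; nothing here enforces it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


url = "https://www.reddit.com/r/test/"
print("polite-archiver:", allowed("polite-archiver", url))
print("some-scraper:", allowed("some-scraper", url))
```

    A well-behaved crawler runs a check like this and backs off when told to. A scraper that never runs it, or that lies about its user agent, is unaffected, which is where the whack-a-mole comes in.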