• General_Effort@lemmy.world
    5 hours ago

    Don’t really see the problem. If you pick up the content while web crawling, you will end up with a lot of duplicates, because the same post is mirrored across federated instances, but that’s normal for crawled data. If you wanted to scrape the Fediverse in particular, you’d know the structure of the data and could account for that.
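
To make that concrete, here is a rough sketch of deduplicating crawled posts. It assumes each record carries the ActivityPub object `id` (the URI of the original post), so mirrored copies share one key. The record layout, the `fetched_from` field, and the function name are made up for illustration, not taken from any real crawler.

```python
def dedupe_by_object_id(records):
    """Keep the first copy of each record, keyed on its ActivityPub object id."""
    seen = set()
    unique = []
    for record in records:
        key = record.get("id")
        if key is None:
            # No canonical id available, so we cannot tell if it is a copy; keep it.
            unique.append(record)
            continue
        if key in seen:
            # Same original post, fetched again from another instance.
            continue
        seen.add(key)
        unique.append(record)
    return unique


# Hypothetical crawl output: the first post was picked up on two instances.
crawled = [
    {"id": "https://lemmy.world/post/1", "fetched_from": "lemmy.world", "content": "hello"},
    {"id": "https://lemmy.world/post/1", "fetched_from": "example.social", "content": "hello"},
    {"id": "https://example.social/post/7", "fetched_from": "example.social", "content": "other"},
]

unique_posts = dedupe_by_object_id(crawled)
```

A generic web crawl has no such key, so it would have to fall back on something like content hashing, which is the usual dedup step for crawled data anyway.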