Bluesky users debate plans around user data and AI training

xc2215x@lemmy.world · 8 hours ago


Telorand@reddthat.com · 7 hours ago

It’s that interoperability of unique instances that makes the Fediverse resistant to scraping. The posts are all public, but crawling it all and categorizing everything is probably like untangling a cotton ball.

General_Effort@lemmy.world · 1 hour ago

Don’t really see the problem. If you pick up the content while web crawling, you will end up with a lot of duplicates, but that’s normal. If you wanted to scrape the Fediverse in particular, you’d know the structure of the data.
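To illustrate the point about duplicates: the same federated post shows up on every instance that mirrors it, so a crawler can collapse the copies by keying on the object's canonical ID. Below is a minimal Python sketch, assuming each crawled record is a dict that carries the ActivityPub-style `id` URI of the original object. The record layout, the `fetched_from` field, and the example hosts are hypothetical, not taken from any real crawler.

```python
def dedupe_by_canonical_id(records):
    """Keep the first copy of each post, keyed on its canonical `id` URI.

    Assumption: every record is a dict with an "id" field holding the
    URI of the original object, identical across all mirrored copies.
    """
    seen = set()
    unique = []
    for record in records:
        canonical = record["id"]
        if canonical in seen:
            continue  # same post, already picked up from another instance
        seen.add(canonical)
        unique.append(record)
    return unique


# Hypothetical crawl output: one post seen on two instances, plus a second post.
crawled = [
    {"id": "https://lemmy.example/post/1", "fetched_from": "lemmy.example"},
    {"id": "https://lemmy.example/post/1", "fetched_from": "other.example"},
    {"id": "https://other.example/post/7", "fetched_from": "other.example"},
]

unique = dedupe_by_canonical_id(crawled)
print(len(unique))
```

A general web crawler that does not know the structure would more likely fall back on content hashing or near-duplicate detection; keying on the ID only works when the crawler parses the federated objects themselves.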

unalivejoy@lemm.ee · 6 hours ago

Or you can host your own instance and let the other servers send you all their data (though instances can still defederate).

Bluesky users debate plans around user data and AI training | TechCrunch