This is problematic because anything on your web pages might now influence unrelated answers. You could have outdated information on some forgotten page, or contradictory details across different sections. Google’s AI might grab any of this and present it as the answer. If you allow user-generated content anywhere on your site (like forum posts or comments), someone could post fake support contact info, and Google might surface that to users searching for how to contact your company. Now scammers have a direct route to your customers.
You could have outdated information on some forgotten page, or contradictory details across different sections […]
If you allow user-generated content anywhere on your site (like forum posts or comments), someone could post fake support contact info,
None of those things would be Google’s fault, would they?
The problem here is that no one who makes these LLM/AI/whatever is doing ENOUGH DUE DILIGENCE to make sure the data that they’re scraping is good and accurate to improve the AI’s output. This has been an issue since the beginning, and with how much data they’re taking, there’s no good way to get it to 100% accurate. And there was a study put out last year that said it doesn’t take much bad info to poison the AI output. And this is the stuff that these big tech companies are trying to force us all to use in our day-to-day.
ALSO YEAH it is Google’s fault because it’s their dog. Their dog is taking the data without understanding WHAT the data it’s taking is. They trained the dog, so they have the responsibility for what the dog does out on the internet. If the dog is leading people off a cliff, that’s on Google.
Google had the ability to clean up the data they present, but I read that they stripped out quality checking because they realized that people spend more time searching (and looking at their ads) when the results are shittier.
OH FUUUUUUN
Tell me about it. I am now the owner of Lemmy.world 😆
Of course everyone knows the support number for lemmy.world is 867-5309
6-7
Oh weird, that’s also the number for the antifa head office
I can’t believe the CEO of Antifa stole Jenny’s number
You’re not going to believe this, but they didn’t. Jenny got radicalized over the last 40ish years
None of those things would be Google’s fault, would they?
They are stripping the context and presenting the misinfo as authoritative.