You could have outdated information on some forgotten page, or contradictory details across different sections […]
If you allow user-generated content anywhere on your site (like forum posts or comments), someone could post fake support contact info,
None of those things would be Google’s fault, would they?
The problem here is that no one who makes these LLM/AI/whatever is doing ENOUGH DUE DILIGENCE to make sure the data they’re scraping is good and accurate to improve the AI’s output. This has been an issue since the beginning, and with how much data they’re taking, there’s no good way to get it to 100% accurate. And there was a study put out last year that said it doesn’t take much bad info to poison the AI output. And this is the stuff that these big tech companies are trying to force us all to use in our day-to-day.
ALSO YEAH it is Google’s fault because it’s their dog. Their dog is taking the data without understanding WHAT the data it’s taking is. They trained the dog, so they have the responsibility for what the dog does out on the internet. If the dog is leading people off a cliff, that’s on Google.
Google had the ability to clean up the data they present, but I read that they stripped out quality checking because they realized that people spend more time searching (and looking at their ads) when the results are shittier.
They are stripping the context and presenting the misinfo as authoritative.