- cross-posted to:
- technology@lemmy.zip
Archive: https://archive.is/lP0lT
If this AI stuff weren’t a bubble and the companies dumping billions into it were capable of any long term planning they’d call up wikipedia and say “how much do you need? we’ll write you a cheque”
They’re trying to figure out nefarious ways of getting data from people and wikipedia literally has people doing work to try to create high quality data for a relatively small amount of money that’s very valuable to these AI companies.
But nah, they’ll just shove AI into everything and blow the equivalent of Wikipedia’s annual budget in a week on just electricity to shove unwanted AI slop into people’s faces.
Because they already ate through every piece of content on wikipedia years and years ago. They’re at the stage where they’ve trawled nearly the entire internet and are running out of content to find.
(pasting a Mastodon post I wrote a few days ago about StackOverflow, but IMHO it applies to Wikipedia too)
"AI, as in the current LLM hype, is not just pointless but rather harmful epistemologically speaking.
It’s a big word so let me unpack the idea with 1 example :
- StackOverflow, or SO for short.
So SO is cratering in popularity. Maybe it’s related to the LLM craze, maybe not, but in practice fewer and fewer people are using SO.
SO is basically a software developer social network that goes like this :
- hey I have this problem, I tried this and it didn’t work, what can I do?
- well (sometimes condescendingly) it works like this so that worked for me and here is why
then people discuss via comments, answers, vote, etc until, hopefully the most appropriate (which does not mean “correct”) answer rises to the top.
The next person with the same, or similar enough, problem gets to try right away what might work.
SO is very efficient in that sense but sometimes the tone itself can be negative, even toxic.
Sometimes the person asking did not bother to search much, sometimes they clearly have no grasp of the problem, so replies can be terse, if not worse.
Yet the content itself is often correct in the sense that it does solve the problem.
So SO in a way is the pinnacle of “technically right” yet being an ass about it.
Meanwhile what if you could get roughly the same mapping between a problem and its solution but in a nice, even sycophantic, manner?
Of course the switch will happen.
That’s nice, right?.. right?!
It is. For a bit.
It’s actually REALLY nice.
Until the KPI of the “thing” you “discuss” with is maybe keeping you engaged (as its owner gets paid per interaction) regardless of how usable (let’s not even say true or correct) its answer is.
That’s a deep problem because that thing does not learn.
It has no learning capability. It’s not just “a bit slow” or “dumb” but rather it does not learn, at all.
It gets updated with a new dataset, fine tuned, etc… but there is no action that leads to the invalidation of a hypothesis, the generation of a novel one, and then … the setup of a safe environment to test it within (that’s basically what learning is).
So… you sit there until the LLM gets updated but… with what? Now that fewer and fewer people bother updating your source (namely SO) how is your “thing” going to learn, sorry to get updated, without new contributions?
Now if we step back not at the individual level but at the collective level we can see how short-termist the whole endeavor is.
Yes, it might help some, even a lot, of people to “vile code” sorry I mean “vibe code”, their way out of a problem, but if :
- they, the individual
- it, the model
- we, society, do not contribute back to the dataset to upgrade from…
well I guess we are going faster right now, for some, but overall we will inexorably slow down.
So yes epistemologically we are slowing down, if not worse.
Anyway, I’m back on SO, trying to actually understand a problem. Trying to actually learn from my “bad” situation and rather than randomly try the statistically most likely solution, genuinely understand WHY I got there in the first place.
I’ll share my answer back on SO hoping to help others.
Don’t just “use” a tool, think, genuinely, it’s not just fun, it’s also liberating.
Literally.
Don’t give away your autonomy for a quick fix, you’ll get stuck."
originally on https://mastodon.pirateparty.be/@utopiah/115315866570543792
Most importantly, the pipeline from finding a question on SO that you also have, to answering that question after doing some more research is now completely derailed because if you ask an AI a question and it doesn’t have a good answer you have no way to contribute your eventual solution to the problem.
I honestly think that LLMs will result in no progress ever being made in computer science.
Most past inventions and improvements were made out of necessity, because of how sucky computers are and how unpleasant it is to work with them (we call it “abstraction layers”). And it was mostly done on companies’ dime.
Now companies will prefer to produce slop (even more) because they hope to automate slop production.
As an expert in my engineering field I would agree. LLMs have been a great tool for my job, helping me get better at technical writing or get over the hump of coding something every now and then. That’s where I see the future for ChatGPT/AI LLMs: providing a tool that can help people broaden their skills.
There is no future for the expertise in fields and the depth of understanding that would be required to make progress in any field unless specifically trained and guided. I do not trust it with anything that is highly advanced or technical as I feel I start to teach it.
Maybe SO should run everyone’s answers through a LLM and revoke any points a person gets for a condescending answer even if accepted.
Give a warning and suggestions to better meet community guidelines.
It can be very toxic there.
Edit: I love the downvotes here. OP - AI is going to destroy the sources of truth and knowledge, in part because people stopped going to those sources because people were toxic at the sources. People: But I’ll downvote suggestions that could maybe reduce toxicity, while having no actual impact on the answers given.
Maybe the humans are going outside and the library?
Oh I hope not.
I do not want us to return to the days of people getting limited information from outdated books from a state-run facility.
They would be better if they were funded better.
Every time someone visits Wikipedia they make exactly $0. In fact, it costs them money. Are people still contributing and/or donating? These seem like more important questions to me.
yeah, i drop a $20-25 donation yearly.
There are indirect benefits to visitors, though. Yes, most people are a drain on resources because they visit strictly to read and never to contribute. The minority that do contribute, though, are presumably people who used Wikipedia and liked it, or people who enjoy knowing that other people are benefiting from their contributions. I’m not sure people will donate or edit on Wikipedia if they believe no one is using it.
I’d make a cash donation right now if I could.
I got you fam. I’ve been making a decent monthly donation for years. Consider one of those on your behalf!
Oh I didn’t mean change the current setup. Create a standalone tool that better uses the wiki framework so people can access it in a different way, that’s all.
AI will inevitably kill all the sources of actual information. Then all we’re going to be left with is the fuzzy learned version of information plus a heap of hallucinations.
What a time to be alive.
AI just cuts and pastes from websites like Wikipedia. The problem is when it gets information that’s old or from a sketchy source. Hopefully people will still know how to check sources, which should probably be taught in schools. Who’s the author, how old is the article, is it a reputable website, is there a bias. I know I’m missing some pieces
You replied to OP while somehow missing the entire point of what he said lol
Much of the time, AI paraphrases, because it is generating plausible sentences not quoting factual material. Rarely do I see direct quotes that don’t involve some form of editorialising or restating of information, but perhaps I’m just not asking those sorts of questions much.
Man, we hardly did that shit 20 years ago. Ain’t no way the kids doing that now.
At best they’ll probably prompt AI into validating if the text is legit
Yet I still have to go to the page for the episode lists of my favorite TV shows because every time I ask AI which ones to watch it starts making up episodes that either don’t exist or it gives me the wrong number.
“With fewer visits to Wikipedia, fewer volunteers may grow and enrich the content, and fewer individual donors may support this work.”
I understand the donors aspect, but I don’t think anyone who is satisfied with AI slop would bother to improve wiki articles anyway.
The idea that there’s a certain type of person that’s immune to a social tide is not very sound, in my opinion. If more people use genAI, they may teach people who could have been editors in later years to use genAI instead.
That’s a good point, scary to think that there are people growing up now for whom LLMs are the default way of accessing knowledge.
Eh, people said the exact same thing about Wikipedia in the early 2000’s. A group of randos on the internet is going to “crowd source” truth? Absurd! And the answer to that was always, “You can check the source to make sure it says what they say it says.” If you’re still checking Wikipedia sources, then you’re going to check the sources AI provides as well. All that changes about the process is how you get the list of primary sources. I don’t mind AI as a method of finding sources.
The greater issue is that people rarely check primary sources. And even when they do, the general level of education needed to read and understand those sources is a somewhat high bar. And the even greater issue is that AI-generated half-truths are currently mucking up primary sources. Add to that intentional falsehoods from governments and corporations, and it already seems significantly more difficult to get to the real data on anything post-2020.
But Wikipedia actually is crowd sourced data verification. Every AI prompt response is made up on the fly and there’s no way to audit what other people are seeing for accuracy.
Hey! An excuse to quote my namesake.
Hackworth got all the news that was appropriate to his situation in life, plus a few optional services: the latest from his favorite cartoonists and columnists around the world; the clippings on various peculiar crackpot subjects forwarded to him by his father […] A gentleman of higher rank and more far-reaching responsibilities would probably get different information written in a different way, and the top stratum of New Chusan actually got the Times on paper, printed out by a big antique press […] Now nanotechnology had made nearly anything possible, and so the cultural role in deciding what should be done with it had become far more important than imagining what could be done with it. One of the insights of the Victorian Revival was that it was not necessarily a good thing for everyone to read a completely different newspaper in the morning; so the higher one rose in society, the more similar one’s Times became to one’s peers’. - The Diamond Age by Neal Stephenson (1995)
That is to say, I agree that everyone getting different answers is an issue, and it’s been a growing problem for decades. AI’s turbo-charged it, for sure. If I want, I can just have it yes-man me all day long.
Not me. I value Wikipedia content over AI slop.
It used to be that the first result for a lot of queries was a link to the relevant Wikipedia article. But that first result has now been replaced by an AI summary of the relevant Wikipedia article. If people don’t need more info than that summary, they don’t click through. That AI summary is a layer of abstraction that wouldn’t be able to exist without the source material that it’s now making less viable to exist. Kinda like a parasite.
It’s a layer of dependency and a barrier to entry. AI is not a servant to our interests but a censor, preacher, teacher and cult speaker that works for psychopaths who would happily re-enslave the human race.
all websites should block ai and bot traffic on principle.
all websites should block ai and bot traffic on principle.
Increasing numbers do.
But there is no proof that the LLM trawling bots are willing to respect those blocks.
FWIW:
Wikipedia:Bot policy#Bot requirements
https://en.wikipedia.org/wiki/Wikipedia:Bot_policy#Bot_requirements
RationalWiki:Bots
The problem is many no longer identify as bots and come from hundreds if not thousands of IPs.
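For anyone wondering why these blocks depend on the bot's goodwill, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `example.org` URL and the `is_allowed` helper are made up for illustration, and `GPTBot` is just used as an example crawler name. The point is that the check keys off the name the client reports, so a crawler that reports a different name, or never checks at all, is not stopped by it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an example site (not any real site's policy).
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.org/wiki/Some_Page"
    # A crawler that identifies itself honestly is told to stay out.
    print(is_allowed("GPTBot", page))
    # The same crawler reporting a browser-like name passes the check.
    print(is_allowed("Mozilla/5.0", page))
```

Nothing here enforces anything on the server side; actual enforcement would need something like rate limiting or IP-level blocking, which gets harder when requests come from thousands of addresses.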
Voight-Kampff them.
I sympathize with Wikipedia here because I really like the platform. That being said, modernize and get yourself a new front end. People don’t like AI because of its intrusiveness. They want convenience. Create “Knowledge-bot” or something similar that is focused on answering questions in a more meaningful way.
The last thing Wikipedia should do is change the look. Modernizing is a waste of resources when it works just fine, all just to give idiots a new dopamine hit.
Capitalism is the problem, not Wikipedia. Plus the reference desk exists, it’s just not instant.
Wikipedia Says AI Is Causing a Dangerous Decline in Human Visitors
FWIW:
Wikipedia:Reference desk
Interesting. I had never heard about this. Could still use a lot of sprucing up.
IIRC, they expect people to first try to find the answers themselves—perhaps they could check out a few WP articles—no “Who’s the Secretary of the Department of Interior” or similar questions;
though my big (perhaps only) problem is that a question only stands for a while—maybe a few days or a week or so—before it’s archived. In some of the forums (non-WP) of about 20 years ago, one could answer questions asked months, maybe years, earlier that were still relevant.
Still, I’ve gotten a few good answers to the few questions I posted on it.
I’m part of the problem. I now use Le Chat instead of search engines because AI destroyed search engines, thanks to all the content mills that make slop. I wish search engines just worked, and it’s a classic example of capitalism creating problems to justify new technology.
And I wonder if it’s just AI. I know some people moved to backing up pre-2025 versions of Wikipedia via Kiwix out of fear that the site gets censored. I know now that I’ve done that, it’s a no-brainer to just do my Wikipedia research without using bandwidth.
Search engines will still give Wikipedia results at the top for relevant searches. Heck, you can search Wikipedia itself directly!
Both Ecosia and DuckDuckGo support some form of “bangs”, if I tack
!w
onto my search it’ll immediately go through to Wikipedia.
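If you want to script that, a rough sketch is below. It assumes DuckDuckGo's usual `https://duckduckgo.com/?q=...` search URL and simply puts the bang at the front of the query; the `bang_search_url` helper name is made up for this example.

```python
from urllib.parse import urlencode


def bang_search_url(query: str, bang: str = "!w") -> str:
    """Build a DuckDuckGo search URL with a bang prepended to the query."""
    # urlencode percent-encodes the "!" and turns spaces into "+".
    return "https://duckduckgo.com/?" + urlencode({"q": f"{bang} {query}"})


if __name__ == "__main__":
    print(bang_search_url("Ada Lovelace"))
```

Opening the resulting URL in a browser should behave the same as typing the bang into the search box.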
DuckDuckGo has even introduced an AI image filter, which is not perfect but still pretty good.
Bangs are helpful, but my problem is that I previously used search engines to find informative articles and product suggestions beyond the scope of Wikipedia, and so much of that is AI slop now. And if it’s not that, Reddit shows up disproportionately in search results and Google is dominated by promoted posts.
Search engines used to be really good at connecting people to reliable resources, even if you didn’t have a specific website in mind, if you were good with keywords/boolean and had a discerning eye for reliable content, but now the slop-to-valuable-content ratio is too disproportionate. So you either need to have pre-memorized a list of good websites, rely on Chatbots, or take significantly longer wading through the muck.
I asked a chatbot for scenarios of AI wiping out humanity and the most believable one is where it makes humans so dependent on it and so infantilized that we just eventually die out.
Tbh, I’d say that’s not a bad scenario all in all, and much more preferable than scenarios with world war, epidemics, starvation etc.
So we get the Wall-e future…
Mudd explains that he broke out of prison, stole a spaceship, crashed on this planet, and was taken in by the androids. He says they are accommodating, but refuse to let him go unless he provides them with other humans to serve and study. Mudd informs Kirk that he and his crew are to serve this purpose and can expect to spend the rest of their lives there.
I am kinda a big hater of AI and the danger it represents to the future of humanity.
But as a hobby programmer, I was surprised at how well these LLMs can answer very technical questions and provide conceptual insight and suggestions about how to glue different pieces of software together and what the limitations of each one are. I know that if AI knows about this stuff it must have been produced by a human. But considering the shitty state of the internet, where copycat websites are competing to outrank each other with garbage blocks of text that never answer what you are looking for while the honest blog post is buried on page 99 of Google search, I can’t see how old school search will win.
Add to that, I have found forums and platforms like Stack Overflow to be not always very helpful. I have many unanswered questions on Stack Overflow that have piled up over many years, things that LLMs can answer in detail in just seconds without ever being annoyed at me or making passive-aggressive comments.
I know that if AI knows about this stuff it must have been produced by a human.
For now. Maybe.
It won’t be long before these LLMs will start ingesting the output from other LLMs, biases, confidently wrong answers, hallucinations and all.
Hobby programmer here as well. I know I’ve spent a lot of time searching for solutions or hints, especially when it’s about edge cases. So using AI as an alternative to a search engine has saved me sooo much time!
Another thing with the approach. I read somewhere that it requires about 10 times as much energy to ask an AI instead of doing a web search and spending a little time looking through the results. So it’s something I try to keep in mind to motivate myself to do as many usual web searches as possible, saving AI queries for when it matters more.
I would say it’s more like 1000 times more energy. Trillions of matrix math computations for a handful of tokens at max speed and CPU/GPU usage, compared to a 10 millisecond database query (or in wiki’s case, probably mostly just easy direct edge node cache with no processing involved.)
Alright, yea sounds fair enough, even better motivation to prioritize search engines!