How could an artificial intelligence (as in large language model based generative AI) be better for information access and retrieval than an encyclopedia with a clean classification model and a search engine?
If we add a processing step, where a genAI “digests” perfectly structured data and tries, however badly, to regurgitate things it doesn’t understand, aren’t we just adding noise?
I’m talking about the specific use case of “draw me a picture explaining how a pressure regulator works”, or “can you explain to me how to code a recursive pattern matching algorithm, please” (an example of the kind of answer I’d expect is sketched below).
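For the second one, the answer I’d hope for is something like this: a minimal sketch of a recursive matcher in the spirit of Rob Pike’s classic C version, supporting only “.” (any one character) and “*” (zero or more of the preceding character). A toy for illustration, not a full regex engine:

```python
def match(pattern: str, text: str) -> bool:
    """Return True if pattern matches the whole text."""
    if not pattern:
        return not text
    # Does the first character of text match the first pattern char?
    first = bool(text) and pattern[0] in (text[0], ".")
    if len(pattern) >= 2 and pattern[1] == "*":
        # 'x*' branch: either skip 'x*' entirely, or consume one
        # matching character and retry the same pattern.
        return match(pattern[2:], text) or (first and match(pattern, text[1:]))
    # Otherwise consume one character from both and recurse.
    return first and match(pattern[1:], text[1:])

assert match("ab*c", "ac")
assert match("a.c", "abc")
assert not match("ab*c", "abd")
```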
I also understand how it can help people who cannot, or do not want to, make the effort to learn an encyclopedia’s classification plan, or how a search engine’s syntax works.
But on a fundamental level, aren’t we just adding an uncontrollable noise-injection step to a decent, time-tested information flow?
Hopefully, it told you that’s not a sign of a worn clutch. Assuming no computer interference and purely mechanical effects, that’s a sign the clutch is dragging. A worn clutch would leave more of an air gap with the pedal depressed than a fresh clutch would. If you want a partial list of potential causes, see my reply to the other comment that replied to you.
Your questions are still not proof that LLMs are filling some void. If you think of a traditional encyclopedia, of course it’s not going to know what the colors of one manufacturer’s sandpapers mean. I’m sure that’s answered somewhere on their website, or wherever you came across the two colors in the same grit and format. Chances are, if one is more expensive and there’s no defined difference in abrasive material, the pricier one is going to last longer by way of stronger backing paper, better abrasive adhesive, and better resistance to clogging. Whether or not the price is justified for your project is a different story. ChatGPT is reading the same info that’s available to you. But if you don’t understand the facts presented on the package, then how can you trust the LLM to tokenize them correctly for you?
Similarly, a traditional encyclopedia isn’t going to have a direct answer to your clutch question, but if it has thorough mechanical entries (with automotive specifics), you might be able to piece it together. You’d learn that the engine spins in unison up to the flywheel, that the flywheel is the mating surface for the clutch, that the clutch pedal disengages the clutch from the flywheel, and that holding the pedal down for 5+ seconds should let the transmission input components spin down to a stop (even in neutral). You’re trusting the LLM here to have a proper understanding of those linked mechanical devices. It doesn’t. It’s aggregating internet sources, BuzzFeed-style, and presenting whatever it finds in a corrupted stream of tokens. Again, if you’re not brought up to speed on how those components interact, then how do you know what it’s saying is correct?
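To make the “stream of tokens” point concrete, here’s a deliberately crude toy: a bigram generator. It’s a drastic oversimplification of an LLM (real models are vastly more sophisticated), but the core loop is the same idea: pick the next word from statistics over training text, with no model of clutches, flywheels, or anything else the words refer to.

```python
import random
from collections import defaultdict

corpus = (
    "the clutch pedal disengages the clutch from the flywheel "
    "the flywheel is the mating surface for the clutch "
    "the engine spins in unison up to the flywheel"
).split()

# Count which word follows which in the "training" text.
next_words = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    next_words[prev].append(nxt)

def generate(start: str, length: int = 12) -> str:
    word, out = start, [start]
    for _ in range(length):
        candidates = next_words.get(word)
        if not candidates:
            break
        word = random.choice(candidates)  # plausible, not understood
        out.append(word)
    return " ".join(out)

print(generate("the"))
# e.g. "the clutch pedal disengages the flywheel is the mating surface ..."
# Fluent-looking, locally plausible, and mechanically meaningless.
```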
Obviously, the rebuttal is: how can you trust anyone’s answer if you’re not already knowledgeable? Peer review works well for forums, social sites, and wikis, in the way of people correcting other comments. But beyond that, for formal informational sites, you have to vet each place as a source, a skill being actively eroded by Google and ChatGPT “giving” answers. Neither is actually answering your questions. They’re regurgitating things they found elsewhere. Remember, Google was happy to take reddit answers as fact and tell you Elmer's glue will hold cheese to pizza and cockroaches live in cocks. If you saw those answers with their high upvote counts, you’d understand the nuance that reddit loves shitty sarcastic answers for entertainment value. LLMs don’t, because they literally don’t understand anything. It’s up to you to figure out whether you should trust an algorithm-promoted Facebook page called “car hacks and facts” filled with bullshit videos. It’s up to you to figure out whether everythingcar.com is untrustworthy because it has vague, expansive wording and more ad space than information.