Against the Quantification of Integrity
When the measure of language becomes its target, it ceases to be good language.
💡Nerd Rating: 1/5. I discuss the origins of certain linguistic tics in LLMs and what it means for writing, student assessment, and thinking.
"It's not x, it's y."
Large Language
Basically, we’re now forcing people to avoid natural writing patterns because of the whole hysteria over whether something was written by an LLM or not. Ironically, companies wasted no time monetizing the hysteria by using LLMs to decide whether something is legitimately human or not, cashing in on the whole thing and becoming the arbiters of what acceptable writing is. The author correctly points out that if people stopped being lazy and started engaging with the content instead of the style, the whole problem would go away.
All that is, is a reason to introduce character limits for new members. There is no way engaging with the contents of some rando’s 5000-word AI post is worth anyone’s time.
That’s one way, but for things like papers, looking for fabricated data would be the best approach in my opinion. You can extract facts from the paper and then validate that they’re not hallucinated. And that can be done in a largely automated fashion.
Writing and reading texts has a primary social function, communication. Letting the LLM expand your idea into an essay goes against that social function.
I disagree. What matters to me is that language stimulates parts of my brain, which then translates into having new ideas. I absolutely do not care whether the text was written by a human or not here. The only question is whether it helped me understand something or framed it in a useful way.
Treating an essay written by an expert, an essay by a crank, and 100 essays written by LLMs as equivalent does not lead to an intellectual community. It leads to self-delusion if you just pick whatever tickles your fancy.
Nowhere did I suggest anything of the sort. In fact, I was quite clear that focus should be on the quality of the content rather than style. Sounds like you’re arguing with a mirror here.
The hysteria is a sad part of the whole situation. I think the core tension is just how serious an infraction plagiarism is in academia.
It’s like if you had a medical machine that first did harm to produce a more accurate diagnosis. You’d be compelled to find the oath-breaking physicians, and you’d have to do some weird statistical analysis like “are you getting too many accurate diagnoses?” Then you’ll have arguments about correct diagnoses being the point. But the principle of your institution and its teaching since its inception is to first do no harm, and you don’t have any institutional capacity, or likely any will, to change it so that you can do a little bit of harm.
I feel a lot better about code than I do about art or writing or anything else. Before Bill Gates did all his patent fraud and paywall schemes, I believe open source was the default mode. Information wants to propagate after all. Even before LLMs you were just copying and pasting stuff from Stack Overflow. Code should be iterative and available; who cares if it’s easy to come by? I think it takes us closer to how code was envisioned when your SaaS nonsense gets one-shot by Claude, which is the opposite of a plagiarism machine removing us from the envisioned principles of education.
So the origin of the hysteria is good and just. Obviously, if you send someone out into the world with your endorsement and they rely on a tool instead of their own capacity, that’s against the point of an education. On the flip side, I also absolutely get the desire to instrumentalize the education you receive. A diploma is a piece of paper with an ROI, and the context around that education is you being instrumentalized. Not to mention the way to avoid death by exposure is to be capable of being profited off of. So why would students care about the principles of academia? You’d care about the principles of medicine because you don’t want to receive or give harm. You care about the principles of code because you don’t want to be paywalled out of the shared knowledge pool. But academia? You’re subject to punishment if you do or if you don’t (10% of the time per the article). You’re seldom hired or useful by virtue of your capacity without a tool otherwise. I was waxing poetic earlier, but you are seldom praised, acknowledged, or rewarded for unassisted general intelligence in culture and media.
So an educator has this existential threat to the model of how the institution works, and tools that are insufficient to ward against it. You hollow out and undermine the people you’re looking after if you turn them into instruments of the LLM (e.g. people who are good at checking LLM outputs for accuracy). But, like the article mentioned, you begin to police thought and the most common ways of resolving ambiguity by using language as a conduit. The very act of policing incentivizes engaging with the form of reasoning rather than the reasoning itself, and it puts people who don’t engage with any of it into the crossfire.
That sucks and that’s grim.
Yeah, the whole closed source thing with code only started when corps realized they could make money off software. People writing code and sharing it was the default mode before that. So, completely agree that LLMs can help make open source the default, because it’s just not going to be worth hoarding the code going forward. I’ve also found they’re pretty capable at reverse engineering closed code as well. My Nikon camera uses a weird-ass proprietary format for its RAW files, and I was able to reverse engineer it by a combination of decompiling and instrumenting a proprietary library from the app I’ve been using to read the files. This is just something that wouldn’t have been practical for me to even attempt before LLMs.
I think for science, we need to see more of what arXiv started doing: banning people for a year if they have hallucinated references in their papers. That’s the kind of thing that makes sense to focus on instead of style. If a paper makes stuff up, then you know it’s bad quality regardless of how it was made, and you can deal with the offender. You can even have an automated process do the initial survey of papers to validate them before they get to human review. This stuff can be done fairly deterministically, since a citation either exists or it does not.
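To make the "a citation either exists or it does not" point concrete, here is a rough sketch of what that first automated pass could look like. Everything in it is an assumption for illustration: the `Citation` type, the `find_missing` helper, and the placeholder DOIs are made up, and the local set stands in for whatever bibliographic database a real pipeline would query.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Citation:
    key: str  # label used in the paper, e.g. "[12]" (hypothetical format)
    doi: str  # identifier to look up


def normalize_doi(doi: str) -> str:
    """Lowercase and strip common prefixes so lookups compare like with like."""
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi


def find_missing(
    citations: Iterable[Citation],
    exists: Callable[[str], bool],
) -> list[Citation]:
    """Return the citations whose DOI the lookup function cannot find."""
    return [c for c in citations if not exists(normalize_doi(c.doi))]


# Stand-in index with placeholder DOIs; a real check would query a
# bibliographic database instead (assumption, not shown here).
KNOWN_DOIS = {"10.1000/example.1", "10.1000/example.2"}


def in_local_index(doi: str) -> bool:
    return doi in KNOWN_DOIS


paper = [
    Citation("[1]", "https://doi.org/10.1000/EXAMPLE.1"),
    Citation("[2]", "doi:10.1000/made.up"),
]

flagged = find_missing(paper, in_local_index)
for citation in flagged:
    print(f"{citation.key}: no record found for {citation.doi}")
```

The lookup is passed in as a function so the same check can run against a local cache or a remote service. Not every legitimate reference has a DOI, so a flag here is a reason to send the paper to a human, not a verdict on its own.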
Ultimately, the focus should be on the substance, and as you note, the problem was already there long before LLMs. Now it’s just a lot easier for people to produce garbage.