Against the Quantification of Integrity
When the measure of language becomes its target, it ceases to be good language.
💡Nerd Rating: 1/5. I discuss the origins of certain linguistic tics in LLMs and what it means for writing, student assessment, and thinking.
"It's not x, it's y."
Large Language
Yeah, the whole closed source thing with code only started when corps realized they could make money off software. People writing code and sharing it was the default mode before that. So, completely agree that LLMs can help make open source the default again, because it’s just not going to be worth hoarding the code going forward. I’ve also found they’re pretty capable at reverse engineering closed code as well. My Nikon camera uses a weird ass proprietary format for its RAW files, and I was able to reverse engineer it by decompiling and instrumenting a proprietary library from the app I’ve been using to read the files. This is just something that wouldn’t have been practical for me to even attempt before LLMs.
I think for science, we need to see more of what arXiv started doing: banning people for a year if they have hallucinated references in their papers. That’s the kind of thing that makes sense to focus on instead of style. If a paper makes stuff up, then you know it’s bad quality regardless of how it was made, and you can deal with the offender. You can even have an automated process do the initial pass over papers to validate them before they get to human review. This stuff can be done fairly deterministically, since a citation either exists or it does not.
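To make that concrete, here is a rough sketch of what a first automated pass could look like. Everything in it is an assumption for illustration: the function `find_unverified`, the `KNOWN_IDS` set standing in for a real registry lookup, and the idea that references carry an `arXiv:` identifier at all. The second and third references are deliberately made up so that they fail the check. A real pipeline would also need to handle DOIs and confirm that the title and authors match the record.

```python
import re
from typing import Callable, Iterable

# New-style arXiv identifiers look like 1706.03762, optionally followed by a version (v2).
ARXIV_ID = re.compile(r"arXiv:(\d{4}\.\d{4,5})(?:v\d+)?", re.IGNORECASE)


def find_unverified(references: Iterable[str], exists: Callable[[str], bool]) -> list[str]:
    """Return the references that have no identifier or whose identifier fails the lookup."""
    flagged = []
    for ref in references:
        match = ARXIV_ID.search(ref)
        # No identifier, or an identifier the registry does not know: send to a human.
        if match is None or not exists(match.group(1)):
            flagged.append(ref)
    return flagged


# Hypothetical stand-in for a real registry query (for example an HTTP lookup).
KNOWN_IDS = {"1706.03762"}

refs = [
    "Vaswani et al. Attention Is All You Need. arXiv:1706.03762",
    "Placeholder entry with an identifier that is not in the registry. arXiv:9999.99999",
    "Placeholder entry with no identifier at all.",
]

flagged = find_unverified(refs, KNOWN_IDS.__contains__)
for ref in flagged:
    print("needs human review:", ref)
```

Passing the lookup in as a function keeps the check itself deterministic and easy to test, and the part that talks to an external registry can be swapped out separately.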
Ultimately, the focus should be on the substance, and as you note, the problem was already there long before LLMs. Now it’s just a lot easier for people to produce garbage.