that the technology is more or less sticking the landing
Only in competent hands, because everything it generates has to be validated manually. My office uses Copilot, and every competent worker involved in complex projects hates it and only uses it for trivial things, like generating an email response, which you then have to read anyway so you might as well type it yourself in the first place. No one uses it for anything meaningful.
Human validation is propping up the perception of LLMs.
One cannot trust this technology to do anything overly consequential or precise. It’s like how Theranos’ Edison system could perform maybe four types of blood tests correctly, which was nowhere near Elizabeth Holmes’ grandiose claims; the rest was extravagant promises, lies, and outright fraud.
We’re only a few years away from similar documentaries about Sam Altman, and if you read the recent Ronan Farrow article about him, maybe not even years.
Only in competent hands, because everything it generates has to be validated manually.
The same can be said for the output of most interns, and even more senior employees. This is why we have “Quality Systems,” safety audits, design controls, and the rest of the regulations, which basically set us all up checking each other’s work all day long. No AI required: these systems were shown to improve both safety AND efficiency of industry back in the 1980s, which is why they were rolled out as law for industries like aviation, medical, automotive and finance in the 1990s, long before anyone would have claimed that the AI of the day was doing anything.
Studies say those “measure twice, cut once” practices increase efficiency because mistakes are expensive, and extremely expensive when you get to problems like Boeing has. It’s not just the lives lost or the cost of crash damage/loss; it’s the reputational impact on the company, with public perception diminishing the real value of their products.
ALL those same quality practices apply to AI. People are complaining because AI output is so much faster than people output, leaving people holding the review bag, but… newsflash: AI does quality review too. Imperfectly, incompletely, just like people, except AI does the quality reviews much faster than people. Will AI as author, editor, publisher and critic work? Maybe not as a complete closed loop system today, but the individual functions of AI acting within those systems have all improved dramatically in the past 12 months.
What hasn’t changed? The broad public’s perceptions and growing anxiety. Justifiable concerns about how the powerful owners/directors of AI companies will abuse their influence.
AI does quality review too. Imperfectly,
but the individual functions of AI acting within those systems have all improved dramatically in the past 12 months
I suppose that depends on how we define improvement, because from where I’m sitting, it’s reasonable to be apprehensive about LLMs and their output when we see spectacular failure after spectacular failure.
Whether it’s bombing a school in Iran because Claude fucked up the targeting, or an AI agent deleting your email inbox or your production database, or creating a court case out of thin air, or stats in a SCOTUS ruling that are fictitious, over and over and over again the extravagant promises they keep telling us are just around the corner appear to be decidedly half-baked.
And if you use Teams or Windows and pieces of functionality that worked for two decades are no longer working as designed in a dependable way, I guess I just don’t know what to tell you.
It makes perfect sense not to trust this technology, and the speed it promises is often mitigated by the fact that you can’t and shouldn’t trust its output, because if you’re the unlucky SOB that doesn’t check a reference, you can literally become national news.
Further, being that it’s already been trained on the entirety of recorded human knowledge, I’m not sure how it gets better either. You can make it faster, but it’s just going to spit out slop at a faster rate.
Whether it’s bombing a school in Iran because Claude fucked up the targeting
I’m going to call user error on that, and I don’t think it matters what system they were using - they were going to make mistakes.
an AI agent deleting your email inbox or your production database
The real error there? Conducting risky operations without backups.
creating a court case out of thin air
That’s just big silicon-brass balls. Interns do it too, but you don’t hear about them. On the other hand, trusting the AI or the intern, that’s disbarment levels of reckless.
It makes perfect sense not to trust this technology
Or any technology, until we have figured out what it is, and isn’t, capable of doing reliably.
But, plenty of people still play Russian Roulette, for one reason or another. Is that the revolver manufacturer’s fault?
being that it’s already been trained on the entirety of recorded human knowledge, I’m not sure how it gets better either
Better editing.