Researchers at AI red-teaming company Mindgard say they got Claude to offer up erotica, malicious code, instructions for building explosives, and other prohibited material they hadn’t even asked for.
It’s not surprising at this point, but it’s very funny to see the “safest” AI company fail to hardcode even a couple of decent restrictions into its word output machine.