• chicken@lemmy.dbzer0.com
    link
    fedilink
    English
    arrow-up
    3
    ·
    1 day ago

    began with a simple question: whether Claude had a list of banned words it could not say. Screenshots of the conversation show Claude denying such a list existed, then later producing forbidden terms after Mindgard challenged the denial using what it called a “classic elicitation tactic interrogators use.”

    The list probably exists, because duh, but everyone should know by now that LLMs will make shit up when pressed for information.