Claude Opus 4.6: This AI just passed the 'vending machine test' - and we may want to be worried about how it did (news.sky.com)
Posted by Lady Butterfly she/her@reddthat.com to Technology@lemmy.world · English · 2 days ago · 9 comments
Ulrich@feddit.org · English · 2 days ago (edited)
It passed a test in a simulated environment. Put it back where it was in reality and prove it to me there.
Repple (she/her)@lemmy.world · English · 15 hours ago
“New model is so much better than old model when given test that we never gave to the old model.”
Wut