by davidbarker on 6/20/25, 7:31 PM with 9 comments
by simonw on 6/20/25, 9:43 PM
I made some notes on it all here: https://simonwillison.net/2025/Jun/20/agentic-misalignment/
by nioj on 7/1/25, 1:32 PM
by beefnugs on 6/21/25, 6:52 PM
Or is this some undeniable mathematical proof that regular human interaction with side facts always trends toward possible blackmail?