from Hacker News

More capable models are better at in-context scheming

by miles on 6/20/25, 9:28 PM with 1 comment

  • by chiph2o on 6/20/25, 10:03 PM

    in-context scheming = alignment red flag

    More capability + low clarity on intent = low trust