by headalgorithm on 2/3/25, 5:22 PM with 2 comments
by t1amat on 2/3/25, 7:06 PM
by althea_tx on 2/3/25, 9:17 PM
My jaw dropped a tiny bit when I read that “the model discovers on its own the most optimal Chain-of-Thought-like behavior, including advanced reasoning capabilities such as self-reflection and self-verification.”