by ed on 10/20/24, 8:36 PM with 1 comments
The results are impressive: Llama 3 8B performs almost on par with GPT-4o across a wide range of tasks, not just logic and math.
Interestingly, the post-training process significantly improves model performance even without "thoughts" (the "direct baseline" case in the paper).