from Hacker News

Absolute Zero: Reinforced Self-Play Reasoning with Zero Data

by dave1010uk on 5/7/25, 11:12 AM with 2 comments