from Hacker News
Muon Is Scalable for LLM Training
by renonce on 2/25/25, 4:50 AM with 1 comment
by yorwba on 2/25/25, 5:40 AM
For people who want to know more about the Muon optimizer:
https://kellerjordan.github.io/posts/muon/