from Hacker News
MegaScale: Scaling Large Language Model Training to More Than 10k GPUs [pdf]
by yankcrime on 11/4/24, 7:46 PM with 0 comments