from Hacker News

MegaScale: Scaling Large Language Model Training to More Than 10k GPUs [pdf]

by yankcrime on 11/4/24, 7:46 PM with 0 comments