from Hacker News

Model training diary/journal for LLMs?

by nalzok on 11/15/23, 2:42 AM with 1 comment

About half a year ago, some big tech company released an open-source LLM. What makes that model special is that they made available a model training diary/journal recording everything their engineers did to babysit the training process, e.g. "on day 143, the training loss plateaued, so we decreased the learning rate further". I think it was in a shared Google Doc.

Can you remind me of the name of the company/model?

by nalzok on 11/15/23, 2:54 AM
Never mind, I figured it out: https://github.com/facebookresearch/metaseq/blob/main/projec...