by helloericsf on 11/5/24, 6:52 PM with 103 comments
by mrob on 11/5/24, 8:25 PM
EDIT: The HN title, which previously made the claim, has since been changed. But as HN user swyx pointed out, Tencent itself is also claiming this is open source, e.g.: "The currently unveiled Hunyuan-Large (Hunyuan-MoE-A52B) model is the largest open-source Transformer-based MoE model in the industry".
by a_wild_dandan on 11/5/24, 8:45 PM
We've come so astonishingly far in like two years. I have no idea what AI will do in another year, and it's thrilling.
by 1R053 on 11/5/24, 8:31 PM
They use:
- 16 experts, of which only one is activated per token
- 1 shared expert that is always active
In summary, that makes around 52B active parameters per token, instead of the 405B of Llama 3.1.
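For readers unfamiliar with this routing pattern, here is a minimal sketch of a mixture-of-experts layer with one always-active shared expert plus top-1 routing over 16 specialized experts, as described above. The class name, dimensions, and gating details are illustrative assumptions, not Hunyuan-Large's actual implementation.

    # Hedged sketch: shared expert (always active) + top-1 routing over 16 experts.
    # All sizes and module choices are illustrative, not Hunyuan-Large's real config.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SharedPlusTop1MoE(nn.Module):
        def __init__(self, d_model=1024, d_ff=4096, n_experts=16):
            super().__init__()
            def ffn():
                return nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                                     nn.Linear(d_ff, d_model))
            self.shared = ffn()                            # always-active shared expert
            self.experts = nn.ModuleList([ffn() for _ in range(n_experts)])
            self.router = nn.Linear(d_model, n_experts)    # per-token gating scores

        def forward(self, x):                              # x: (batch, seq, d_model)
            tokens = x.reshape(-1, x.shape[-1])
            probs = F.softmax(self.router(tokens), dim=-1)
            top_p, top_idx = probs.max(dim=-1)             # top-1: one expert per token
            routed = torch.zeros_like(tokens)
            for i, expert in enumerate(self.experts):
                mask = top_idx == i
                if mask.any():                             # only chosen tokens visit expert i
                    routed[mask] = top_p[mask].unsqueeze(-1) * expert(tokens[mask])
            return (self.shared(tokens) + routed).reshape_as(x)

    moe = SharedPlusTop1MoE()
    print(moe(torch.randn(2, 8, 1024)).shape)              # torch.Size([2, 8, 1024])

Because only the shared expert and one routed expert run for each token, per-token compute scales with the active parameter count (~52B here) rather than the total parameter count.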
by the_duke on 11/5/24, 8:33 PM
Anyone have some background on this?
by helloericsf on 11/5/24, 6:52 PM
by eptcyka on 11/5/24, 8:21 PM
by Tepix on 11/5/24, 10:50 PM
by adt on 11/5/24, 9:51 PM
by iqandjoke on 11/6/24, 2:45 AM