from Hacker News
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient LMs
by panabee on 3/15/24, 7:44 PM with 0 comments