from Hacker News
FlashAttention – optimizing GPU memory for more scalable transformers
by mpaepper on 2/14/25, 8:33 AM with 0 comments