from Hacker News

FlashAttention – optimizing GPU memory for more scalable transformers

by mpaepper on 2/14/25, 8:33 AM with 0 comments