from Hacker News
Squeeze more out of your GPU for LLM inference–Accelerate and DeepSpeed
by EntICOnc on 11/2/23, 6:03 PM with 0 comments