from Hacker News
Accelerating LLM Inference with Parallel Draft Models (PARD)
by dhruvdh on 4/11/25, 6:10 PM with 0 comments