from Hacker News

Accelerating LLM Inference with Parallel Draft Models (PARD)

by dhruvdh on 4/11/25, 6:10 PM with 0 comments