from Hacker News

Speculative sampling: LLMs writing a lot faster using smaller LLMs

by spolu on 6/9/23, 3:33 PM with 0 comments