from Hacker News

vLLM v0.6.0: 2.7x Throughput Improvement and 5x Latency Reduction

by xmo on 9/5/24, 5:17 PM with 0 comments