- A collection of reproducible LLM inference engine benchmarks: SGLang vs. vLLM
by zhwu on 4/21/25, 10:28 PM, with comments
- Efficient GPU Resource Management for ML Workloads Using SkyPilot, Kueue on GKE
by zhwu on 2/10/25, 7:26 PM, with comments
- New Recipe: Serving Llama-2 with vLLM's OpenAI-Compatible API Server
by zhwu on 8/22/23, 4:20 PM, with comments
- Train Your Own Vicuna on Llama-2
by zhwu on 8/10/23, 4:34 PM, with comments
- Guide on fine-tuning your own Vicuna on Llama-2
by zhwu on 8/3/23, 6:18 PM, with comments
- Serving LLMs 24x Faster on the Cloud with vLLM and SkyPilot
by zhwu on 6/29/23, 5:11 PM, with comments
- Biologists are moving to the clouds with SkyPilot from UC Berkeley
by zhwu on 5/1/23, 5:15 PM, with comments
- Vicuna releases its secret to finding available A100s on the cloud for training
by zhwu on 4/13/23, 9:48 PM, with comments