- A collection of reproducible LLM inference engine benchmarks: SGLang vs. vLLM
by zhwu on 4/21/25, 10:28 PM, with comments
- Efficient GPU Resource Management for ML Workloads Using SkyPilot, Kueue on GKE
by zhwu on 2/10/25, 7:26 PM, with comments
- New Recipe: Serving Llama-2 with vLLM's OpenAI-Compatible API Server
by zhwu on 8/22/23, 4:20 PM, with comments
- Train Your Own Vicuna on Llama-2
by zhwu on 8/10/23, 4:34 PM, with comments
- Guide on fine-tuning your own Vicuna on Llama-2
by zhwu on 8/3/23, 6:18 PM, with comments
- Serving LLMs 24x Faster on the Cloud with vLLM and SkyPilot
by zhwu on 6/29/23, 5:11 PM, with comments
- Biologists are moving to the clouds with SkyPilot from UC Berkeley
by zhwu on 5/1/23, 5:15 PM, with comments
- Vicuna releases its secret to finding available A100s on the cloud for training
by zhwu on 4/13/23, 9:48 PM, with comments