from Hacker News

zhwu

joined 10/5/22, 7:51 PM; has 31 karma


  • A collection of reproducible LLM inference engine benchmarks: SGLang vs. vLLM

    by zhwu on 4/21/25, 10:28 PM, with 0 comments

  • Efficient GPU Resource Management for ML Workloads Using SkyPilot, Kueue on GKE

    by zhwu on 2/10/25, 7:26 PM, with 0 comments

  • New Recipe: Serving Llama-2 with VLLM's OpenAI-Compatible API Server

    by zhwu on 8/22/23, 4:20 PM, with 0 comments

  • Train Your Own Vicuna on Llama-2

    by zhwu on 8/10/23, 4:34 PM, with 0 comments

  • Guide on fine-tuning your own Vicuna on Llama-2

    by zhwu on 8/3/23, 6:18 PM, with 0 comments

  • Serving LLM 24x Faster on the Cloud with VLLM and SkyPilot

    by zhwu on 6/29/23, 5:11 PM, with 1 comment

  • Biologists are moving to the clouds with SkyPilot from UC Berkeley

    by zhwu on 5/1/23, 5:15 PM, with 0 comments

  • Vicuna releases its secret of finding available A100s on the cloud to train it

    by zhwu on 4/13/23, 9:48 PM, with 2 comments