from Hacker News

TensorRT-LLM runtime now open-source

by mmoskal on 3/11/25, 9:56 PM with 1 comment

  • by mmoskal on 3/11/25, 9:56 PM

    Previously, the "Executor" runtime was shipped as binary blobs. It is the part that schedules requests and manages the KV cache (similar to the vLLM or SGLang server).
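
For a rough idea of what "schedules requests and manages KV cache" involves, here is a toy sketch in Python. It is not TensorRT-LLM's Executor API and does not reflect its actual design: `ToyExecutor`, `Request`, and `BLOCK_TOKENS` are made-up names, the forward pass is replaced by a counter, and the FIFO admission policy is an assumption chosen for simplicity.

```python
from collections import deque
from dataclasses import dataclass, field

BLOCK_TOKENS = 16  # tokens per KV cache block (illustrative value, not from the source)


@dataclass
class Request:
    rid: int
    prompt_len: int
    max_new_tokens: int
    generated: int = 0
    blocks: list = field(default_factory=list)  # ids of KV cache blocks held

    def blocks_needed(self) -> int:
        # Room for the prompt, the tokens generated so far, and the next token.
        total = self.prompt_len + self.generated + 1
        return -(-total // BLOCK_TOKENS)  # ceiling division


class ToyExecutor:
    """Hypothetical scheduler: admits requests while KV cache blocks are free."""

    def __init__(self, num_blocks: int):
        self.free_blocks = list(range(num_blocks))
        self.waiting = deque()
        self.running = []

    def submit(self, req: Request) -> None:
        self.waiting.append(req)

    def _grow(self, req: Request) -> bool:
        # Allocate enough blocks for the next token; fail if the cache is full.
        need = req.blocks_needed() - len(req.blocks)
        if need > len(self.free_blocks):
            return False
        for _ in range(need):
            req.blocks.append(self.free_blocks.pop())
        return True

    def step(self) -> list:
        # Admit waiting requests in FIFO order while the cache has room.
        while self.waiting and self._grow(self.waiting[0]):
            self.running.append(self.waiting.popleft())

        finished, still_running = [], []
        for req in self.running:
            if not self._grow(req):
                # Cache full: skip this request for now (real systems may preempt).
                still_running.append(req)
                continue
            req.generated += 1  # stand-in for one decoding step of the model
            if req.generated >= req.max_new_tokens:
                self.free_blocks.extend(req.blocks)  # release its KV cache
                req.blocks = []
                finished.append(req.rid)
            else:
                still_running.append(req)
        self.running = still_running
        return finished
```

Each call to `step()` plays the role of one batched decoding iteration: it returns the ids of requests that completed, and their blocks go back to the free pool so waiting requests can be admitted on the next call.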