TensorRT-LLM runtime now open-source
by
mmoskal
on 3/11/25, 9:56 PM with 1 comment
by
mmoskal
on 3/11/25, 9:56 PM
Previously, the "Executor" runtime was shipped only as binary blobs. This is the bit that schedules requests and manages the KV cache (similar to the vLLM or SGLang servers).
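
For anyone unsure what "schedules requests and manages the KV cache" means in practice, here is a toy sketch of the idea. It is not the TensorRT-LLM Executor API: `ToyExecutor`, `Request`, `BLOCK_TOKENS`, and the FIFO admission policy are all made up for illustration, and the real runtime is far more involved. It assumes each request reserves enough fixed-size KV-cache blocks for its prompt plus its full generation budget, and that one "step" decodes one token per running request.

```python
from collections import deque
from dataclasses import dataclass

BLOCK_TOKENS = 16  # hypothetical: number of tokens one KV-cache block can hold


@dataclass
class Request:
    req_id: str
    prompt_tokens: int
    max_new_tokens: int
    generated: int = 0

    def blocks_needed(self) -> int:
        # Reserve blocks for the prompt plus the whole generation budget.
        total = self.prompt_tokens + self.max_new_tokens
        return -(-total // BLOCK_TOKENS)  # ceiling division


class ToyExecutor:
    """Illustrative scheduler: FIFO admission gated by free KV-cache blocks."""

    def __init__(self, total_blocks: int):
        self.free_blocks = total_blocks
        self.waiting = deque()
        self.running = []
        self.finished = []

    def submit(self, req: Request) -> None:
        self.waiting.append(req)

    def step(self) -> None:
        # Admit waiting requests in order while the cache has room for the head.
        while self.waiting and self.waiting[0].blocks_needed() <= self.free_blocks:
            req = self.waiting.popleft()
            self.free_blocks -= req.blocks_needed()
            self.running.append(req)

        # Stand-in for one batched forward pass: one new token per running request.
        still_running = []
        for req in self.running:
            req.generated += 1
            if req.generated >= req.max_new_tokens:
                # Finished: return its blocks to the pool.
                self.free_blocks += req.blocks_needed()
                self.finished.append(req.req_id)
            else:
                still_running.append(req)
        self.running = still_running


def run_demo() -> list:
    ex = ToyExecutor(total_blocks=4)
    ex.submit(Request("a", prompt_tokens=20, max_new_tokens=2))  # needs 2 blocks
    ex.submit(Request("b", prompt_tokens=40, max_new_tokens=3))  # needs 3 blocks
    ex.submit(Request("c", prompt_tokens=10, max_new_tokens=1))  # needs 1 block
    while ex.waiting or ex.running:
        ex.step()
    return ex.finished


if __name__ == "__main__":
    print(run_demo())
```

In this sketch "b" has to wait until "a" frees its blocks, and "c" is stuck behind "b" because admission is strictly FIFO. A request that needs more blocks than the cache has would never be admitted, so a real scheduler needs to reject or handle that case.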