Benchmarking LLM Inference Backends: vLLM, LMDeploy, MLC-LLM, TensorRT-LLM, and TGI

📅 June 6, 2024 ⏱️ 1 min read

Compare the Llama 3 serving performance of vLLM, LMDeploy, MLC-LLM, TensorRT-LLM, and Hugging Face TGI on BentoCloud.