Benchmark popular LLM inference engines
We believe in giving back to the community, so today we are introducing Prem Benchmarks. It is a fully open-source project whose primary objective is to benchmark popular LLM inference engines (currently 13+ engines), such as vLLM, TensorRT-LLM, and HuggingFace Transformers, across different precisions: float32, float16, int8, and int4.
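To make the idea concrete, here is a minimal sketch (not taken from the Prem Benchmarks code itself) of the core measurement such a benchmark performs: tokens per second for a single engine, HuggingFace Transformers, at a single precision. The model name, prompt, and generation length are illustrative assumptions only.

```python
import time
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "meta-llama/Llama-2-7b-hf"  # hypothetical choice; any causal LM works

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,  # swap for torch.float32, or int8/int4 via quantization
    device_map="auto",
)

prompt = "Explain the difference between float16 and int8 inference."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Time a single generation and report decoding throughput.
start = time.perf_counter()
output = model.generate(**inputs, max_new_tokens=256)
elapsed = time.perf_counter() - start

new_tokens = output.shape[-1] - inputs["input_ids"].shape[-1]
print(f"{new_tokens / elapsed:.1f} tokens/sec at float16")
```

Repeating this measurement across engines and precisions (and averaging over several runs after a warm-up pass) is essentially what the benchmark automates.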
Read more here: External Link