TPI-LLM: Serving 70B-Scale LLMs Efficiently on Low-Resource Edge Devices
TPI-LLM: A High-Performance Tensor Parallelism Inference System for Edge LLM Services (GitHub repository: Lizonghang/TPI-LLM)