Techniques for more efficient LLM serving (up to 10x)

📅 May 2, 2024 ⏱️ 1 min read

Drawing on our years of experience across the inference stack, we have built a number of leading-edge optimization technologies into the OctoAI systems stack.