JetStream: New LLM Inference Engine

JetStream is a throughput- and memory-optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in the future -- PRs welcome). - google/JetStream

Read more here: https://github.com/google/JetStream