LLM Paper on Mamba MoE: Jamba Technical Report from AI21 Labs

📅 April 1, 2024 ⏱️ 1 min read

We present Jamba, a new base large language model based on a novel hybrid Transformer-Mamba mixture-of-experts (MoE) architecture. Specifically, Jamba interleaves blocks of Transformer and Mamba layers, enjoying the benefits of both model families. MoE is added in some of these layers to increase mo
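To make the interleaving idea concrete, here is a minimal sketch of how a hybrid layer schedule could be laid out. It is not the authors' implementation: the names `LayerSpec`, `build_schedule`, `attn_period`, and `moe_period` are hypothetical, and the numeric values are illustrative placeholders rather than figures from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayerSpec:
    mixer: str  # "attention" (Transformer layer) or "mamba"
    mlp: str    # "moe" or "dense"


def build_schedule(n_layers: int, attn_period: int, moe_period: int) -> list[LayerSpec]:
    """Lay out a hybrid stack (hypothetical helper, for illustration only).

    Every `attn_period`-th layer uses attention and the rest use Mamba;
    every `moe_period`-th layer swaps its dense MLP for an MoE MLP.
    """
    schedule = []
    for i in range(n_layers):
        mixer = "attention" if i % attn_period == attn_period - 1 else "mamba"
        mlp = "moe" if i % moe_period == moe_period - 1 else "dense"
        schedule.append(LayerSpec(mixer, mlp))
    return schedule


# Illustrative values only (assumed, not taken from the report).
schedule = build_schedule(n_layers=8, attn_period=4, moe_period=2)
for i, spec in enumerate(schedule):
    print(i, spec.mixer, spec.mlp)
```

With these placeholder settings, two of the eight layers use attention and four use MoE, which shows how the two choices (mixer type and MLP type) can be scheduled independently.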