LLM Leaderboard with explanations of what each score means

The Holistic Evaluation of Language Models (HELM) serves as a living benchmark for transparency in language models. It is built on three principles: broad coverage with recognition of incompleteness, multi-metric measurement, and standardization. All data and analysis are freely accessible on the website for exploration and study.

Read more here: External Link