Are new LLMs trained on old benchmarks?
