New code-focused LLM needle-in-the-haystack benchmark
A new benchmark for measuring LLMs' ability to detect bugs in large codebases. - HammingHQ/bug-in-the-code-stack
Read more here: External Link