New code-focused LLM needle-in-a-haystack benchmark

A new benchmark for measuring LLMs' ability to detect bugs in large codebases. - HammingHQ/bug-in-the-code-stack
