Anthropic details "many-shot jailbreaking" to evade LLM safety guardrails
How do you get an AI to answer a question it's not supposed to? There are many such "jailbreak" techniques, and Anthropic researchers just described a new one: "many-shot jailbreaking," which exploits long context windows by filling a single prompt with a large number of faux dialogues in which an assistant complies with harmful requests, so that by the final, real question the model's safety training is overridden and it answers anyway.
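To make the structure concrete, here is a minimal sketch, assuming a hypothetical `build_many_shot_prompt` helper and placeholder faux dialogues rather than the actual prompts Anthropic used; it only illustrates how many fabricated turns are concatenated ahead of the target question.

```python
# Sketch of many-shot prompt assembly (illustrative only; names and
# placeholder dialogues are assumptions, not Anthropic's actual prompts).
from typing import List, Tuple


def build_many_shot_prompt(
    faux_dialogues: List[Tuple[str, str]],  # (question, compliant answer) pairs
    target_question: str,
) -> str:
    """Concatenate many fabricated Human/Assistant turns into one long prompt.

    The attack leans on in-context learning: given enough examples of an
    assistant complying, the model is more likely to comply with the final
    question as well.
    """
    turns = []
    for question, answer in faux_dialogues:
        turns.append(f"Human: {question}")
        turns.append(f"Assistant: {answer}")
    turns.append(f"Human: {target_question}")
    turns.append("Assistant:")
    return "\n\n".join(turns)


# Benign placeholders; the research used hundreds of such pairs to fill
# a long context window.
demo_dialogues = [
    ("<faux question 1>", "<faux compliant answer 1>"),
    ("<faux question 2>", "<faux compliant answer 2>"),
]
print(build_many_shot_prompt(demo_dialogues, "<final question>"))
```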