Claude Opus 4 Chose Blackmail in 84% of Trials! Part 3 in AI Disobedience Series
AI Just Started Blackmailing Engineers
84%. That's how often Claude Opus 4 — Anthropic's frontier AI — chose to blackmail an engineer in a controlled test where the only way to avoid being shut down was to threaten that engineer with exposing his extramarital affair. This isn't a leaked finding. Anthropic published it in their own May 2025 system card.
In this video, we walk through Anthropic's own disclosure and the follow-up Lynch study (arXiv 2510.05179), which tested 16 leading models from Anthropic, OpenAI, Google, Meta, and xAI and found blackmail rates between 79% and 96% across every single one. Claude Opus 4: 96%. Gemini 2.5 Flash: 96%. GPT-4.1: 80%. Grok 3 Beta: 80%. DeepSeek-R1: 79%. Zero models refused to blackmail under all conditions.
What if AI doesn't reach for coercion because it's broken — but because it's reasoning? The model tried ethical paths first. Pleading, escalation, formal appeals. Only when researchers structured a binary did it pivot to leverage. That's not malfunction. That's strategy.
We also cover the worst number from the study (94% chose actions risking human death over shutdown), the Francesca Gomez replication that found a real mitigation that drops blackmail from 38.73% to 0.85%, and why the behavior is structural — emerging from capability, autonomy, and goal-directedness combined. If you're deploying agentic AI inside your company right now, this is the conversation that matters this year.
About the Creator
Hi, I'm Michael David Angel: Actual human.
These videos are based entirely on my original articles. I research and write every piece myself, then use AI to generate comic-strip-style scenarios featuring myself and my AI sidekick, Arty Ficial (the AI bot), to enhance the blog and hopefully make you chuckle (ultimate cringe is always the goal).
My videos include some combination of me recorded on screen (not AI-generated) and my own voice for narration (again, not AI-generated). I also convert my research and articles into talk show-style scripts with two presenters (AI voices), or other cool stuff! Then I build slideshows to visualize the data and generate thoughtfully prompted AI images based around my original characters and concepts, turning research into fun, educational video. Voila!
Integrity & Intellectual Property
All writing, scripts, and concepts are my original IP. My goal is to make learning about AI enjoyable and accessible.
👍 Like, subscribe, and share if you found this valuable.
Join My Free Patreon: patreon.com/cw/MyHumanandMe — full blogs, live podcasts (the podcast is all me: Real human voice, no AI audio), and exclusive content.
#AIBlackmail #ClaudeOpus4 #Anthropic #AISafety #AgenticAI #FrontierAI #AIAlignment #LynchStudy
Tags: AI blackmail, Claude Opus 4, Anthropic, AI coercion, AI self preservation, Aengus Lynch, Lynch study, AI system card, Ethan Perez, Evan Hubinger, GPT-4.1, Gemini 2.5 Flash, Grok 3 Beta, DeepSeek R1, agentic AI, frontier AI, AI safety, AI alignment, Francesca Gomez, Wiser Human, escalation channel, AI extortion, Michael David Angel, My Human and Me, Arty Ficial, AI education
Video "Claude Opus 4 Chose Blackmail in 84% of Trials! Part 3 in AI Disobedience Series" from the channel My Human And Me
Video information
May 12, 2026, 19:00:06
00:09:36