[AI Defense] 75% of the Evaluation Criteria Are Blank: A Startling Map of 2,521 Kinds of LLM Attacks
Paper information
・url: http://arxiv.org/html/2605.15118v1
・title: Talk is (Not) Cheap: A Taxonomy and Benchmark Coverage Audit for LLM Attacks
・abstract: We introduce a reusable framework for auditing whether LLM attack benchmarks collectively cover the threat surface: a 4×6 Target × Technique matrix grounded in STRIDE, constructed from a 507-leaf taxonomy -- 401 data-populated and 106 threat-model-derived leaves -- of inference-time attacks extracted from 932 arXiv security studies (2023-2026). The matrix enables benchmark-external validation -- auditing collective coverage rather than individual benchmark consistency. Applying it to six public benchmarks reveals that the three primary frameworks (HarmBench, InjecAgent, AgentDojo) occupy non-overlapping cells covering at most 25% of the matrix, while entire STRIDE threat categories (Service Disruption, Model Internals) lack any standardized evaluation, despite published attacks in these categories achieving 46× token amplification and 96% attack success rates through mechanisms which no benchmark tests. The corpus of 2,521 unique attack groups further reveals pervasive naming fragmentation (up to 29 surface forms for a single attack) and heavy concentration in Safety & Alignment Bypass, structural properties invisible at smaller scale. The taxonomy, attack records, and coverage mappings are released as extensible artifacts; as new benchmarks emerge, they can be mapped onto the same matrix, enabling the community to track whether evaluation gaps are closing.
==========
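Are ChatGPT, Claude, and Gemini really safe? An analysis of 932 papers found that the major benchmarks cover only 25% of the threats. It shows that unmeasured threats, such as denial-of-service attacks and manipulation of model internals, could directly hit companies' costs and security. This episode explains a startling study that makes the "invisible holes" in AI security visible.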
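The coverage audit described in the abstract can be pictured as set arithmetic over the cells of the 4×6 matrix. The following is only a minimal sketch under stated assumptions: the matrix shape and the benchmark names come from the abstract, but the cell coordinates assigned to each benchmark are hypothetical placeholders (the abstract does not say which cells each benchmark occupies), and the helper names `collective_coverage`, `overlapping_pairs`, and `empty_rows` are invented for illustration.

```python
from itertools import product

# Matrix shape taken from the abstract: 4 targets x 6 techniques = 24 cells.
N_TARGETS, N_TECHNIQUES = 4, 6
ALL_CELLS = set(product(range(N_TARGETS), range(N_TECHNIQUES)))

# HYPOTHETICAL placements, for illustration only. The real mapping is part of
# the paper's released artifacts and is not reproduced here.
benchmark_cells = {
    "HarmBench": {(0, 0), (0, 1)},
    "InjecAgent": {(1, 2), (1, 3)},
    "AgentDojo": {(2, 2), (2, 3)},
}


def collective_coverage(cells_by_benchmark):
    """Return the union of covered cells and the covered fraction of the matrix."""
    covered = set().union(*cells_by_benchmark.values())
    return covered, len(covered) / len(ALL_CELLS)


def overlapping_pairs(cells_by_benchmark):
    """List benchmark pairs that share at least one cell."""
    names = sorted(cells_by_benchmark)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if cells_by_benchmark[a] & cells_by_benchmark[b]
    ]


def empty_rows(covered):
    """List target rows (by index) in which no cell is covered."""
    return [
        t
        for t in range(N_TARGETS)
        if not any((t, k) in covered for k in range(N_TECHNIQUES))
    ]


covered, fraction = collective_coverage(benchmark_cells)
print(f"covered cells: {len(covered)} of {len(ALL_CELLS)} ({fraction:.0%})")
print("overlapping benchmark pairs:", overlapping_pairs(benchmark_cells))
print("rows with no coverage:", empty_rows(covered))
```

With these placeholder placements, the three benchmarks do not overlap and together fill 6 of 24 cells, which matches the "at most 25%" ceiling the abstract reports. Mapping a new benchmark onto the matrix would amount to adding one more entry to `benchmark_cells` and recomputing the union.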
Video "[AI Defense] 75% of the Evaluation Criteria Are Blank: A Startling Map of 2,521 Kinds of LLM Attacks" from the channel 海外論文研究ラジオ
Video information
21 hr 31 min ago
00:13:28