AI Security 4.2: Human-in-the-Loop Controls for AI Agents - When to Block, When to Allow
A system prompt saying "ask before sending" is not a security control — it's a suggestion the model can be talked out of. This video shows how to build real approval gates in application code that AI agents cannot bypass, with risk-tier classification and patterns that prevent both catastrophic actions and alert fatigue.
In this video, you'll learn:
WHY HITL MATTERS
- Why prompt-level guards fail: the Freysa incident ($47K transferred on the 482nd social engineering attempt, after every earlier attempt had failed)
- The critical difference between prompt-level guards and application-level gates
- How CVE-2025-32711 (EchoLeak) exploited missing checkpoints in Microsoft 365 Copilot
RISK TIER CLASSIFICATION
- Tier 1 (Autonomous): read-only, sandboxed actions — execute and log
- Tier 2 (Log & Notify): reversible writes — execute with audit trail and real-time notification
- Tier 3 (Require Confirmation): irreversible actions — hard block until human approves
- When uncertain, always classify up
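
The three tiers above can be sketched as a lookup table in application code. This is a minimal illustration, not the code from the video: the tool names and the `classify` helper are hypothetical. The point it shows is the "classify up" rule, where any tool missing from the table falls into Tier 3.

```python
from enum import IntEnum


class Tier(IntEnum):
    AUTONOMOUS = 1            # read-only, sandboxed: execute and log
    LOG_AND_NOTIFY = 2        # reversible writes: execute, audit, notify
    REQUIRE_CONFIRMATION = 3  # irreversible: block until a human approves


# Hypothetical tool names, chosen only for illustration.
TOOL_TIERS = {
    "search_docs": Tier.AUTONOMOUS,
    "create_draft": Tier.LOG_AND_NOTIFY,
    "send_email": Tier.REQUIRE_CONFIRMATION,
}


def classify(tool_name: str) -> Tier:
    """Return the tool's tier; unknown tools are classified up to Tier 3."""
    return TOOL_TIERS.get(tool_name, Tier.REQUIRE_CONFIRMATION)
```

Defaulting to the highest tier means a newly added tool is blocked until someone deliberately assigns it a lower one.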
IMPLEMENTATION PATTERNS
- Per-tool confirmation callbacks with "approve with changes" support
- Approval queues for background/async agents with TTL expiration
- Dry run mode for batch operations (the Terraform "plan before apply" model)
- Vulnerable vs. secure email-sending code comparison
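
A per-tool confirmation callback with "approve with changes" might look roughly like the sketch below. All names here (`EmailAction`, `Decision`, `send_email_gated`, the `confirm` and `transport` callables) are assumptions made for illustration and are not taken from the video. The vulnerable variant would call `transport(action)` directly and rely on the system prompt to ask first.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class EmailAction:
    to: str
    subject: str
    body: str


@dataclass(frozen=True)
class Decision:
    approved: bool
    # Set when the reviewer approves with changes; replaces the agent's proposal.
    edited: Optional[EmailAction] = None


class ApprovalDenied(Exception):
    """Raised when the human reviewer rejects the proposed action."""


def send_email_gated(
    action: EmailAction,
    confirm: Callable[[EmailAction], Decision],
    transport: Callable[[EmailAction], None],
) -> EmailAction:
    """Send only after a human decision made outside the model's control."""
    decision = confirm(action)  # blocks until the reviewer answers
    if not decision.approved:
        raise ApprovalDenied(f"send to {action.to} was rejected")
    final = decision.edited or action  # reviewer edits win over the proposal
    transport(final)
    return final
```

Because the gate lives in the function that performs the send, no model output can reach `transport` without passing through `confirm` first.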
PREVENTING ALERT FATIGUE
- Why Tier 3 actions must be rare — if confirmations are frequent, your tier classification is wrong
- Making high-risk prompts visually distinct from routine notifications
- Progressive trust: promoting actions only via human administrator, never self-promotion
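
Progressive trust can be sketched as a tier table that only a human administrator may change. The `TrustPolicy` class and its method names are hypothetical, and the sketch assumes `requested_by` comes from the application's authenticated session rather than from anything the model produced.

```python
class TrustPolicy:
    """Tier assignments that only listed human administrators can change."""

    def __init__(self, tiers: dict[str, int], admins: set[str]) -> None:
        self._tiers = dict(tiers)
        self._admins = set(admins)

    def tier_of(self, tool: str) -> int:
        # Unknown tools are treated as Tier 3 (require confirmation).
        return self._tiers.get(tool, 3)

    def promote(self, tool: str, new_tier: int, requested_by: str) -> None:
        """Change a tool's tier; rejected unless an administrator asked."""
        if requested_by not in self._admins:
            raise PermissionError(f"{requested_by} may not change tiers")
        if new_tier not in (1, 2, 3):
            raise ValueError("tier must be 1, 2, or 3")
        self._tiers[tool] = new_tier
```

The agent is never given a tool that calls `promote`, so it has no path to lowering its own oversight.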
LIMITATIONS
- HITL does not replace Least Privilege — Tier 1 reads can still exfiltrate data silently
- Confirmation quality depends on showing predicted impact, not just action names
- Defense in depth remains necessary alongside HITL
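
To illustrate the point about predicted impact, a confirmation prompt can be built from a dry-run plan instead of from the action name alone. This is a sketch under assumed names (`PlannedDelete`, `render_confirmation`); a file-deletion batch is used only as an example of an irreversible action.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlannedDelete:
    path: str
    size_bytes: int


def render_confirmation(actions: list[PlannedDelete]) -> str:
    """Summarize what would happen, so the reviewer sees impact, not names."""
    total = sum(a.size_bytes for a in actions)
    lines = [
        f"Delete {len(actions)} file(s), {total} bytes total. "
        "This cannot be undone."
    ]
    lines += [f"  - {a.path} ({a.size_bytes} bytes)" for a in actions]
    return "\n".join(lines)
```

Nothing is deleted while the summary is built; the plan is executed only after the reviewer approves it.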
This is Section 4.2 in the AI Agent Security series. Previous: Section 4.1 — Excessive Agency. Next: Section 4.3 — Multi-Agent Trust.
#AISecurity #HumanInTheLoop #HITL #AgentSecurity #CVE202532711 #EchoLeak #PromptInjection #SecureCoding #DevSecOps #AIAgents #MicrosoftCopilot #Freysa #RiskManagement #ApprovalGates #AlertFatigue #LLMSecurity #OWASP #ApplicationSecurity #AIGovernance #CyberSecurity
Video "AI Security 4.2: Human-in-the-Loop Controls for AI Agents - When to Block, When to Allow" from the WiseBuilder channel
Video information
Yesterday, 9:52:00
00:12:31