Mission-Critical Evals at Scale (Learnings from 100k medical decisions)
So you've built your LLM product, you have paying customers, and your LLM throughput is increasing. Great! But scale introduces its own problems: it uncovers new edge-case user inputs and failure cases that your current evaluations don't capture.
And what if you just can't afford to make mistakes? (At Anterior, our product helps health insurers make decisions about approving medical treatment. This is mission-critical, with no room for error!)
The solution? A scalable and self-auditing reference-free evaluation system (rolls off the tongue, right?).
In this talk, we'll explain how to build one, why it should run in real time, and how building this system provides company defensibility.
For further details and discussion, see: https://chrislovejoy.me/mission-critical-evals
Video: Mission-Critical Evals at Scale (Learnings from 100k medical decisions), from the AI Engineer channel
Video information
February 23, 2025, 0:00:06
00:12:15