Continuous Data Pipeline for Real time Benchmarking & Data Set Augmentation | Teleskope

ABOUT THE TALK:
Building and curating representative datasets is crucial for accurate ML systems, and monitoring metrics after deployment helps improve the model. Models that process unstructured language may face data shifts, leading to unpredictable inferences. Open-source APIs and annotation tools streamline annotation and reduce analyst workload.

This talk discusses generating datasets and real-time precision/recall splits to detect data shifts, prioritize data collection, and retrain models.
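
As a rough illustration of what "precision/recall splits" could look like in practice, the minimal sketch below groups labeled predictions by a split key (here a date string) and flags splits whose precision or recall falls well below a baseline. It is not taken from the talk: the function names, the date-based split key, the baseline values, and the tolerance are all assumptions chosen for illustration.

```python
from collections import defaultdict


def precision_recall(records):
    """Compute (precision, recall) from (predicted, actual) boolean pairs."""
    tp = fp = fn = 0
    for predicted, actual in records:
        if predicted and actual:
            tp += 1
        elif predicted and not actual:
            fp += 1
        elif actual and not predicted:
            fn += 1
    # Convention assumed here: an undefined ratio is reported as 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def metrics_by_split(samples):
    """Group (split_key, predicted, actual) samples and score each split."""
    groups = defaultdict(list)
    for split_key, predicted, actual in samples:
        groups[split_key].append((predicted, actual))
    return {key: precision_recall(rows) for key, rows in groups.items()}


def flag_shifted_splits(metrics, baseline_precision, baseline_recall, tolerance=0.1):
    """Return split keys whose precision or recall drops more than
    `tolerance` below the (hypothetical) baseline values."""
    return sorted(
        key
        for key, (precision, recall) in metrics.items()
        if precision < baseline_precision - tolerance
        or recall < baseline_recall - tolerance
    )


# Hypothetical annotated samples: (split key, model prediction, analyst label).
samples = [
    ("2023-05-01", True, True),
    ("2023-05-01", True, True),
    ("2023-05-01", False, False),
    ("2023-05-02", True, False),
    ("2023-05-02", False, True),
    ("2023-05-02", True, True),
]

metrics = metrics_by_split(samples)
shifted = flag_shifted_splits(metrics, baseline_precision=0.9, baseline_recall=0.9)
print(metrics)
print(shifted)
```

Splits flagged this way would be natural candidates for prioritized data collection and annotation before retraining. In a real pipeline the split key might just as well be a data source, customer, or entity type rather than a date.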

ABOUT THE SPEAKER:
Ivan Aguilar is a data scientist at Teleskope focused on building scalable models for detecting PII/PHI/Secrets and other compliance-related entities within customers' clouds. Prior to joining Teleskope, Ivan was an ML Engineer at Forge.AI, a Boston-based shop working on information extraction, content extraction, and other NLP-related tasks.

ABOUT DATA COUNCIL:
Data Council (https://www.datacouncil.ai/) is a community and conference series that provides data professionals with the learning and networking opportunities they need to grow their careers.

Make sure to subscribe to our channel for the most up-to-date talks from technical professionals on data-related topics, including data infrastructure, data engineering, ML systems, analytics, and AI, from top startups and tech companies.

FOLLOW DATA COUNCIL:
Twitter: https://twitter.com/DataCouncilAI
LinkedIn: https://www.linkedin.com/company/datacouncil-ai/

Video: Continuous Data Pipeline for Real time Benchmarking & Data Set Augmentation | Teleskope, from the Data Council channel
Video information
May 12, 2023, 1:50:22
00:15:05