Continuous Data Pipeline for Real time Benchmarking & Data Set Augmentation | Teleskope
ABOUT THE TALK:
Building and curating representative datasets is crucial for accurate ML systems. Monitoring metrics post-deployment helps improve the model. Unstructured language models may face data shifts, leading to unpredictable inferences. Open-source APIs and annotation tools streamline annotation and reduce analyst workload.
This talk discusses generating datasets and real-time precision/recall splits to detect data shifts, prioritize data collection, and retrain models.
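As a rough illustration of what a windowed precision/recall split could look like, here is a minimal sketch. It is not the speaker's pipeline: the function names, the `(timestamp, predicted, actual)` event shape, the fixed-size windows, and the `max_drop` threshold are all assumptions made for this example.

```python
from collections import defaultdict


def precision_recall(records):
    """Compute (precision, recall) from (predicted, actual) boolean pairs."""
    tp = sum(1 for p, a in records if p and a)
    fp = sum(1 for p, a in records if p and not a)
    fn = sum(1 for p, a in records if not p and a)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def windowed_metrics(events, window_size):
    """Group (timestamp, predicted, actual) events into fixed-size
    time windows and compute precision/recall for each window."""
    buckets = defaultdict(list)
    for ts, predicted, actual in events:
        buckets[ts // window_size].append((predicted, actual))
    return {w: precision_recall(buckets[w]) for w in sorted(buckets)}


def flag_shift(metrics, baseline_window, max_drop):
    """Return windows whose precision or recall fell more than
    max_drop below the baseline window (hypothetical shift rule)."""
    base_p, base_r = metrics[baseline_window]
    return [
        w
        for w, (p, r) in metrics.items()
        if w != baseline_window
        and (base_p - p > max_drop or base_r - r > max_drop)
    ]


# Hypothetical annotated predictions: (timestamp, predicted, actual).
events = [
    (0, True, True), (1, True, True), (2, False, False), (3, True, True),
    (10, True, False), (11, False, True), (12, True, True), (13, False, True),
]
metrics = windowed_metrics(events, window_size=10)
shifted = flag_shift(metrics, baseline_window=0, max_drop=0.2)
```

Windows flagged this way could then be prioritized for annotation and fed back into retraining, which is the loop the abstract describes.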
ABOUT THE SPEAKER:
Ivan Aguilar is a data scientist at Teleskope focused on building scalable models for detecting PII, PHI, secrets, and other compliance-related entities within customers' clouds. Prior to joining Teleskope, Ivan was an ML Engineer at Forge.AI, a Boston-based shop working on information extraction, content extraction, and other NLP-related tasks.
ABOUT DATA COUNCIL:
Data Council (https://www.datacouncil.ai/) is a community and conference series that provides data professionals with the learning and networking opportunities they need to grow their careers.
Make sure to subscribe to our channel for the most up-to-date talks from technical professionals at top startups and tech companies on data-related topics, including data infrastructure, data engineering, ML systems, analytics, and AI.
FOLLOW DATA COUNCIL:
Twitter: https://twitter.com/DataCouncilAI
LinkedIn: https://www.linkedin.com/company/datacouncil-ai/