Sergey Feldman: You Should Probably Be Doing Nested Cross-Validation | PyData Miami 2019
It is common to perform model selection while also trying to estimate accuracy on a held-out set. The traditional solution is to split a dataset into training, validation, and test subsets. On small datasets, however, this strategy suffers from high variance. A common way to reuse a small number of samples for model selection is cross-validation, which is typically applied across an entire dataset. The best model is then evaluated on the test set. This approach has a fundamental flaw: if the test set is small, the performance estimate has high variance. The solution is double (or nested) cross-validation, in which an inner cross-validation loop selects the model and an outer loop estimates its performance. This talk explains the approach.
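The nested procedure described above can be sketched with the Python standard library alone. This is a minimal illustration, not the speaker's code: the toy data, the one-dimensional k-nearest-neighbour classifier, and every function name below are assumptions made for the example.

```python
import random


def make_toy_data(n=60, seed=1):
    """Hypothetical 1-D binary classification data with label noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        label = 1 if x + rng.gauss(0.0, 0.3) > 0 else 0
        data.append((x, label))
    return data


def k_fold_indices(n, k, rng):
    """Shuffle range(n) and deal the indices into k roughly equal folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def split(data, fold):
    """Return (train, test), where test holds the rows indexed by fold."""
    held = set(fold)
    train = [d for i, d in enumerate(data) if i not in held]
    test = [data[i] for i in fold]
    return train, test


def knn_predict(train, x, k):
    """Majority vote among the k training points nearest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0


def accuracy(train, test, k):
    """Fraction of test points that k-NN fitted on train labels correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)


def cv_score(data, folds, k):
    """Mean accuracy of one hyperparameter value over the given folds."""
    return sum(accuracy(*split(data, f), k) for f in folds) / len(folds)


def nested_cv(data, candidates, outer_folds=5, inner_folds=3, seed=0):
    """Return one held-out accuracy per outer fold."""
    rng = random.Random(seed)
    outer_scores = []
    for outer_fold in k_fold_indices(len(data), outer_folds, rng):
        outer_train, outer_test = split(data, outer_fold)
        # Inner loop: model selection sees only the outer-training portion.
        inner = k_fold_indices(len(outer_train), inner_folds, rng)
        best_k = max(candidates, key=lambda k: cv_score(outer_train, inner, k))
        # Outer loop: score the selected setting on rows it never touched.
        outer_scores.append(accuracy(outer_train, outer_test, best_k))
    return outer_scores


data = make_toy_data()
scores = nested_cv(data, candidates=[1, 3, 5, 7])
print("per-fold accuracy:", scores)
print("nested CV estimate:", sum(scores) / len(scores))
```

Every sample is used for testing exactly once, so the final estimate averages over the whole dataset instead of relying on one small test split. With scikit-learn, the same pattern is commonly written by passing a GridSearchCV estimator to cross_val_score.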
www.pydata.org
PyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.
PyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.
00:00 Welcome!
00:10 Help us add time stamps or captions to this video! See the description for details.
Want to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVideoTimestamps