4-17. Robust Z-score & Winsorization: Advanced Methods for Univariate Outlier Detection Part 2
Previous: https://youtu.be/ScrDxkdQ-ik
Next: https://youtu.be/GNYp4OOQwN0
Playlist: https://www.youtube.com/playlist?list=PLnccJ9vccaGlC77PmUoPh2cUziN2k6yrs
In this video, we dive into advanced methods for univariate outlier detection—essential techniques in data preprocessing for machine learning and statistics. Following the first part, we now explore two powerful approaches: Robust Z-Score and Winsorization.
The story begins with a challenge—how do we identify outliers when our data is filled with extreme values that distort the mean and standard deviation? Enter the Robust Z-Score method, which uses the median and MAD (Median Absolute Deviation) instead of traditional measures. By calculating robust Z-scores and setting a threshold (like ±3), we reveal outliers without being misled by the data’s extremes.
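As a rough illustration of the robust Z-score idea described above, here is a minimal sketch in NumPy. The function name `robust_z_scores`, the sample data, and the scaling constant 0.6745 (a common convention that makes the MAD comparable to the standard deviation for normally distributed data) are assumptions for this example and are not taken from the video.

```python
import numpy as np

def robust_z_scores(values):
    """Return robust Z-scores based on the median and the MAD.

    Hypothetical helper for illustration; the 0.6745 constant is an
    assumed convention, not necessarily the one used in the video.
    """
    x = np.asarray(values, dtype=float)
    median = np.median(x)
    # MAD: median of the absolute deviations from the median
    mad = np.median(np.abs(x - median))
    if mad == 0:
        # More than half of the values equal the median, so the score is undefined
        raise ValueError("MAD is zero; robust Z-scores are undefined for this data")
    return 0.6745 * (x - median) / mad

# Made-up sample with one obvious extreme value
data = [10, 12, 11, 13, 12, 11, 95]
scores = robust_z_scores(data)

# Flag points whose robust Z-score falls outside the ±3 threshold
outlier_mask = np.abs(scores) > 3
print(np.asarray(data)[outlier_mask])
```

Because the median and MAD barely move when a single extreme value is added, the extreme point cannot mask itself the way it can with a mean and standard deviation.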
Next, we introduce a practical way to tame extreme values without removing them: Winsorization. This technique replaces the values below a chosen lower percentile and above a chosen upper percentile with the values at those boundaries, which limits distortion in the IQR and standard deviation. Through Python code examples, we show how to apply winsorization using scipy.stats.mstats.
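A minimal sketch of winsorization with `scipy.stats.mstats.winsorize` might look like the following. The sample data and the 10% limits on each tail are assumptions chosen for illustration; the video may use different data and cut-offs.

```python
import numpy as np
from scipy.stats.mstats import winsorize

# Made-up sample with one extreme value at the top
data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])

# Clip the lowest 10% and the highest 10% of values to the nearest
# remaining value; limits=(0.1, 0.1) is an assumed choice for this example
winsorized = winsorize(data, limits=(0.1, 0.1))

print(winsorized)
print("std before:", data.std(), "std after:", winsorized.std())
```

Note that `winsorize` returns a masked array and, by default, leaves the input array unchanged, so the original values remain available for comparison.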
Finally, we compare five popular outlier detection techniques: IQR-based iterative method, trimmed mean ± 3σ, median ± k*MAD, robust Z-scores, and Winsorization. Each has strengths and trade-offs, depending on your data distribution and whether you want to remove or retain outliers.
【Predictive Analytics by Machine Learning】
This course "Predictive Analytics by Machine Learning" explicates essential concepts and techniques ranging from foundational to advanced. It covers not only machine learning algorithms but also various concepts and methods for data preprocessing. This course will guide you step-by-step, equipping you with the skills to confidently apply machine learning to real-world predictive analytics.
Instructor: Takuma Kimura (木村 琢磨), Ph.D.
Scientist of Organizational Behavior and Analytics
https://orcid.org/0000-0001-7126-188X
https://www.linkedin.com/in/takuma-kimura-ba6242104/
#machinelearning #datascience #outlier #outliers #outlierdetection #robustzscore #winsorization
Video "4-17. Robust Z-score & Winsorization: Advanced Methods for Univariate Outlier Detection Part 2" from the channel Takuma Organizational & Data Analytics
Video information
April 21, 2025, 18:56:29
00:08:03