Handling Imbalanced Data in machine learning classification (Python) - 2
Welcome to our Handling Imbalanced Data in machine learning classification series. You'll work on a highly imbalanced example dataset in Python.
In this Part 2 video, we'll learn 6 popular techniques to deal with the imbalanced data problem in Python.
00:00 Overview
01:21 Collecting a bigger sample
02:15 Oversampling (e.g., random, SMOTE)
09:55 Undersampling (e.g., random, K-Means, Tomek links)
15:05 Combining over and undersampling
16:42 Weighing classes differently
19:07 Changing algorithms
GitHub Repo with code: https://github.com/liannewriting/YouTube-videos-public/tree/main/imbalanced-data-machine-learning-abalone19
Source of the dataset: https://sci2s.ugr.es/keel/dataset.php?cod=115 (please download it from the GitHub repo instead, since we've made minor changes to the original dataset)
Technologies that will be used:
☑️ JupyterLab (Notebook)
☑️ pandas
☑️ sklearn
☑️ imbalanced-learn (imblearn)
Links mentioned in the video
►Logistic Regression Example in Python: Step-by-Step Guide: https://www.justintodata.com/logistic-regression-example-in-python/
►Shrinkage effect: https://imbalanced-learn.org/stable/auto_examples/over-sampling/plot_shrinkage_effect.html#
►SMOTE: Synthetic Minority Over-sampling Technique: https://arxiv.org/abs/1106.1813
►Decision Tree Model in Machine Learning: Practical Tutorial with Python: https://www.justintodata.com/decision-tree-model-in-machine-learning-tutorial-python/
►Unlocking Random Forest in Machine Learning: https://www.justintodata.com/random-forest-machine-learning/
►Paper with comparisons (Survey of Imbalanced Data Methodologies): https://arxiv.org/pdf/2104.02240.pdf
There's also an article version of the same content. If you prefer reading, please check it out. How to handle Imbalanced Data in machine learning classification: https://www.justintodata.com/imbalanced-data-machine-learning-classification/
For more data science materials, check out our website, Just into Data: https://justintodata.com/
Video: Handling Imbalanced Data in machine learning classification (Python) - 2, from the channel Lianne and Justin
Video information
November 4, 2021, 20:05:30
00:21:07