Handling Imbalanced Data in machine learning classification (Python) - 2

Welcome to our Handling Imbalanced Data in machine learning classification series. You'll work on a highly imbalanced example dataset in Python.

In this Part 2 video, we'll learn 6 popular techniques to deal with the imbalanced data problem in Python.

00:00 Overview
01:21 Collecting a bigger sample
02:15 Oversampling (e.g., random, SMOTE)
09:55 Undersampling (e.g., random, K-Means, Tomek links)
15:05 Combining over and undersampling
16:42 Weighting classes differently
19:07 Changing algorithms
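
As a rough illustration of the random oversampling and random undersampling ideas named in the chapters above, here is a minimal standard-library sketch. It is not the code from the video: the function names `random_oversample` and `random_undersample` and the toy data are hypothetical, chosen only for this example. In practice, the imbalanced-learn package provides ready-made resamplers such as `RandomOverSampler` and `RandomUnderSampler`.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows of the smaller classes
    until every class is as large as the largest one."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        rows = [x for x, lab in zip(X, y) if lab == label]
        # Draw with replacement from the existing rows of this class.
        for _ in range(target - count):
            X_out.append(rng.choice(rows))
            y_out.append(label)
    return X_out, y_out


def random_undersample(X, y, seed=0):
    """Keep a random subset of each class so that every class
    is as small as the smallest one."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = min(counts.values())
    X_out, y_out = [], []
    for label in counts:
        rows = [x for x, lab in zip(X, y) if lab == label]
        # Draw without replacement, so no row is repeated.
        for x in rng.sample(rows, target):
            X_out.append(x)
            y_out.append(label)
    return X_out, y_out


# Hypothetical toy data: 95 majority rows (label 0), 5 minority rows (label 1).
X = [[i] for i in range(100)]
y = [0] * 95 + [1] * 5

X_over, y_over = random_oversample(X, y)
X_under, y_under = random_undersample(X, y)

print(Counter(y_over))   # class counts after oversampling
print(Counter(y_under))  # class counts after undersampling
```

Resampling like this is normally applied to the training split only, so that the test split keeps the original class distribution.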

GitHub Repo with code: https://github.com/liannewriting/YouTube-videos-public/tree/main/imbalanced-data-machine-learning-abalone19

Source of the dataset: https://sci2s.ugr.es/keel/dataset.php?cod=115
Please download the data from the GitHub repo instead, since we've made minor changes to the original dataset.

Technologies that will be used:
☑️ JupyterLab (Notebook)
☑️ pandas
☑️ sklearn
☑️ imbalanced-learn (imblearn)
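
For the class-weighting chapter, the sketch below shows one common way to derive class weights: each class gets `n_samples / (n_classes * class_count)`, which is the heuristic scikit-learn documents for `class_weight="balanced"`. The helper name `balanced_class_weights` and the toy labels are hypothetical, and this is not necessarily the exact setup used in the video.

```python
from collections import Counter


def balanced_class_weights(y):
    """Return a weight per class that is inversely proportional
    to how often that class appears in y."""
    counts = Counter(y)
    n_samples = len(y)
    n_classes = len(counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in counts.items()
    }


# Hypothetical toy labels: 95 majority rows (label 0), 5 minority rows (label 1).
y = [0] * 95 + [1] * 5
weights = balanced_class_weights(y)
print(weights)
```

With these weights, the summed weight of each class is the same, so errors on the rare class count as much in total as errors on the common class. A dictionary of this shape can be passed as the `class_weight` argument of scikit-learn classifiers that accept it, such as `LogisticRegression`.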

Links mentioned in the video

►Logistic Regression Example in Python: Step-by-Step Guide: https://www.justintodata.com/logistic-regression-example-in-python/

►Shrinkage effect: https://imbalanced-learn.org/stable/auto_examples/over-sampling/plot_shrinkage_effect.html#

►SMOTE: Synthetic Minority Over-sampling Technique: https://arxiv.org/abs/1106.1813

►Decision Tree Model in Machine Learning: Practical Tutorial with Python: https://www.justintodata.com/decision-tree-model-in-machine-learning-tutorial-python/

►Unlocking Random Forest in Machine Learning: https://www.justintodata.com/random-forest-machine-learning/

►Paper with comparisons (Survey of Imbalanced Data Methodologies): https://arxiv.org/pdf/2104.02240.pdf

If you prefer reading, there's also an article version of the same content. How to handle Imbalanced Data in machine learning classification: https://www.justintodata.com/imbalanced-data-machine-learning-classification/

For more data science materials, check out our website, Just into Data: https://justintodata.com/

Video "Handling Imbalanced Data in machine learning classification (Python) - 2" from the channel Lianne and Justin