StatQuest: Random Forests Part 2: Missing data and clustering

NOTE: There is an updated version of this video!!! The new version corrects two small errors listed below. See: https://youtu.be/sQ870aTKqiM

Last time we talked about how to create, use and evaluate random forests. Now it's time to see how they can deal with missing data and how they can be used to cluster samples, even when the data comes from all kinds of crazy sources.

NOTE: At 10:22 I overlooked one step. In this case, you plug in the most common value (or the median value) from all observations in the training dataset that have the same category as the new copy that you created. For example, we created two new copies of the observation: one with heart disease and one without heart disease. Now, for the new copy with heart disease, we plug in the most common value from the observations in the training dataset that have heart disease. For the new copy without heart disease, we plug in the most common value from the observations in the training dataset that do not have heart disease. We can then use the iterative method to refine the guess if we want, or we can just run those two copies down the tree and use the classification from the copy that got the most correct votes.
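
Here is a rough Python sketch of the step described in the note above: make one copy of the new observation per category and fill in each missing value using only the training observations from that category. Everything in it is an assumption made for illustration: the function names, the use of None to mark a missing value, and the tiny made-up dataset. It uses the median for numeric features and the most common value for everything else.

```python
from collections import Counter
from statistics import median


def initial_guess(values):
    """Median for numeric values, most common value otherwise (None = missing)."""
    present = [v for v in values if v is not None]
    if not present:
        return None  # nothing to base a guess on
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in present):
        return median(present)
    return Counter(present).most_common(1)[0][0]


def make_class_copies(train_rows, train_labels, new_row):
    """Return one filled-in copy of new_row for each category in train_labels."""
    copies = {}
    for label in sorted(set(train_labels)):
        # Only training observations from this category inform the guess.
        same_class = [row for row, y in zip(train_rows, train_labels) if y == label]
        filled = dict(new_row)
        for feature, value in new_row.items():
            if value is None:
                filled[feature] = initial_guess([row[feature] for row in same_class])
        copies[label] = filled
    return copies


# Made-up toy data: the label is "has heart disease" (yes/no).
train_rows = [
    {"blocked_arteries": "yes", "weight": 180},
    {"blocked_arteries": "yes", "weight": 210},
    {"blocked_arteries": "no", "weight": 195},
    {"blocked_arteries": "no", "weight": 125},
    {"blocked_arteries": "no", "weight": 130},
    {"blocked_arteries": "yes", "weight": 140},
]
train_labels = ["yes", "yes", "yes", "no", "no", "no"]

# New observation with a missing categorical value.
copies = make_class_copies(train_rows, train_labels,
                           {"blocked_arteries": None, "weight": 168})

# New observation with a missing numeric value.
weight_copies = make_class_copies(train_rows, train_labels,
                                  {"blocked_arteries": "no", "weight": None})
```

The sketch stops at the initial guesses. Refining them with the iterative method, or running both copies down the trees and keeping the one with the most correct votes, needs a trained random forest and is not shown here.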

NOTE: This StatQuest is based on the website of Leo Breiman, one of the creators of Random Forests: https://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm

⭐ NOTE: When I code, I use Kite, a free AI-powered coding assistant that will help you code faster and smarter. The Kite plugin integrates with all the top editors and IDEs to give you smart completions and documentation while you’re typing. I love it! https://www.kite.com/get-kite/?utm_medium=referral&utm_source=youtube&utm_campaign=statquest&utm_content=description-only

For a complete index of all the StatQuest videos, check out:
https://statquest.org/video-index/

If you'd like to support StatQuest, please consider...
Patreon: https://www.patreon.com/statquest
...or...
YouTube Membership: https://www.youtube.com/channel/UCtYLUTtgS3k1Fg4y5tAhLbw/join

...a cool StatQuest t-shirt or sweatshirt (USA/Europe): https://teespring.com/stores/statquest
(everywhere):
https://www.redbubble.com/people/starmer/works/40421224-statquest-double-bam?asc=u&p=t-shirt

...buying one or two of my songs (or go large and get a whole album!)
https://joshuastarmer.bandcamp.com/

...or just donating to StatQuest!
https://www.paypal.me/statquest

Lastly, if you want to keep up with me as I research and create new StatQuests, follow me on Twitter:
https://twitter.com/joshuastarmer

#statquest #randomforest #ML

Video "StatQuest: Random Forests Part 2: Missing data and clustering" from the channel StatQuest with Josh Starmer
Video information
February 12, 2018, 21:10:20
00:11:24