
Gradient Boost Part 2 (of 4): Regression Details

Gradient Boost is one of the most popular Machine Learning algorithms in use. And get this, it's not that complicated! This video is the second part in a series that walks through it one step at a time. This video focuses on the original Gradient Boost algorithm used to predict a continuous value, like someone's weight. We call this "using Gradient Boost for Regression". In part 3, we'll walk through how Gradient Boost classifies samples into two different categories, and in part 4, we'll go through the math again, this time focusing on classification.

This StatQuest assumes that you have already watched Part 1:
https://youtu.be/3CC4N4z3GJc

...it also assumes that you know about Regression Trees:
https://youtu.be/g9c66TUylZ4

...and, while it isn't required, it might be useful if you understood Gradient Descent: https://youtu.be/sDv4f4s2SB8

For a complete index of all the StatQuest videos, check out:
https://statquest.org/video-index/

This StatQuest is based on the following sources:

A 1999 manuscript by Jerome Friedman that introduced Stochastic Gradient Boosting: https://statweb.stanford.edu/~jhf/ftp/stobst.pdf

The Wikipedia article on Gradient Boosting: https://en.wikipedia.org/wiki/Gradient_boosting
NOTE: The key to understanding how the Wikipedia article relates to this video is to keep reading past the "pseudo algorithm" section. The very next section in the article, called "Gradient Tree Boosting", shows how the algorithm works for trees (which are pretty much the only weak learner people ever use for Gradient Boost, which is why I focus on them in the video). In that section, you see how the equation is modified so that each leaf from a tree can have a different output value, rather than the entire "weak learner" having a single output value - and this is the exact same equation that I use in the video.
Later in the article, in the section called "Shrinkage", they show how the learning rate can be included. Since this is also pretty much always used with gradient boost, I simply included it in the base algorithm that I describe.
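The tree-based algorithm with shrinkage described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not StatQuest's or scikit-learn's actual implementation: it assumes squared-error loss, for which the residuals are the negative gradients (Step 2.A) and each leaf's average is already the optimal output value (Step 2.C).

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gradient_boost_fit(X, y, n_trees=100, learning_rate=0.1, max_depth=2):
    # Step 1: initialize with a constant value (the mean minimizes squared loss)
    f0 = y.mean()
    pred = np.full_like(y, f0, dtype=float)
    trees = []
    for _ in range(n_trees):
        # Step 2.A: residuals are the negative gradients of the squared loss
        residuals = y - pred
        # Step 2.B: fit a small regression tree to the residuals
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X, residuals)
        # Steps 2.C and 2.D: for squared loss the leaf averages are already
        # the optimal output values, so just update the predictions,
        # scaled by the learning rate (the "Shrinkage" from the article)
        pred += learning_rate * tree.predict(X)
        trees.append(tree)
    return f0, trees

def gradient_boost_predict(f0, trees, X, learning_rate=0.1):
    # Step 3: the constant plus the shrunken sum of every tree's output
    return f0 + learning_rate * sum(t.predict(X) for t in trees)
```

With a different loss function, Step 2.C would not collapse like this: each leaf's output would have to be replaced by the value that minimizes that loss for the samples in the leaf, which is exactly the per-leaf optimization the video walks through.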

The scikit-learn implementation of Gradient Boosting: https://scikit-learn.org/stable/modules/ensemble.html#gradient-boosting
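For comparison, scikit-learn's off-the-shelf estimator follows the same recipe. The height/weight numbers below are made up purely for illustration.

```python
from sklearn.ensemble import GradientBoostingRegressor

# Hypothetical toy data: predict weight (kg) from height (m); made-up values
X = [[1.5], [1.6], [1.7], [1.8], [1.9]]
y = [60, 65, 72, 80, 88]

model = GradientBoostingRegressor(
    n_estimators=100,   # M, the number of trees built in Step 2
    learning_rate=0.1,  # the shrinkage described above
    max_depth=3,        # keep each tree a shallow "weak learner"
)  # the default loss is squared error, matching the regression video
model.fit(X, y)
prediction = model.predict([[1.75]])
```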

If you'd like to support StatQuest, please consider...

Buying The StatQuest Illustrated Guide to Machine Learning!!!
PDF - https://statquest.gumroad.com/l/wvtmc
Paperback - https://www.amazon.com/dp/B09ZCKR4H6
Kindle eBook - https://www.amazon.com/dp/B09ZG79HXC

Patreon: https://www.patreon.com/statquest
...or...
YouTube Membership: https://www.youtube.com/channel/UCtYLUTtgS3k1Fg4y5tAhLbw/join

...a cool StatQuest t-shirt or sweatshirt:
https://shop.spreadshirt.com/statquest-with-josh-starmer/

...buying one or two of my songs (or go large and get a whole album!)
https://joshuastarmer.bandcamp.com/

...or just donating to StatQuest!
https://www.paypal.me/statquest

Lastly, if you want to keep up with me as I research and create new StatQuests, follow me on Twitter:
https://twitter.com/joshuastarmer

0:00 Awesome song and introduction
0:00 Step 0: The data and the loss function
6:30 Step 1: Initialize the model with a constant value
9:10 Step 2: Build M trees
10:01 Step 2.A: Calculate residuals
12:47 Step 2.B: Fit a regression tree to the residuals
14:50 Step 2.C: Optimize leaf output values
20:38 Step 2.D: Update predictions with the new tree
23:19 Step 2: Summary of step 2
24:59 Step 3: Output the final prediction

Corrections:
15:47: It should be R_jm, not R_ij.
16:18: The leaf in the script is R_1,2 and it should be R_2,1.
21:08: With regression trees, the sample will only go to a single leaf, and this summation simply isolates the one output value of interest from all of the others. However, when I first made this video I was thinking that, because Gradient Boost is supposed to work with any "weak learner", not just small regression trees, this summation was a way to add flexibility to the algorithm.
24:15: The header for the residual column should be r_i,2.

#statquest #gradientboost

Video "Gradient Boost Part 2 (of 4): Regression Details" from the StatQuest with Josh Starmer channel
Uploaded: April 2, 2019, 4:00:02
Duration: 00:26:46