
Vanishing/Exploding Gradients - An Old Problem results from backpropagation (Deep Learning) | NerdML

In this video we will understand what vanishing gradients and exploding gradients are and the problems they cause during training, and how you can fix the vanishing gradient problem and the exploding gradient problem in your network.
If deep neural networks are so powerful, why aren't they used more often? The reason is that they are very difficult to train due to issues known as the vanishing gradient and the exploding gradient. The vanishing gradient problem occurs when we train a neural network model using gradient-based optimization techniques. A decade ago it was a major obstacle to training deep neural network models, causing long training times and degraded model accuracy.
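As a taste of the exploding-gradient fix covered later in the video, here is a minimal sketch (not from the video; the function name and threshold are illustrative) of gradient norm clipping, which rescales gradients whenever their overall magnitude gets too large:

```python
import math

def clip_by_norm(grads, max_norm=1.0):
    # Rescale the gradient vector if its L2 norm exceeds max_norm,
    # so a single huge update cannot blow up the weights.
    norm = math.sqrt(sum(g * g for g in grads))
    if norm > max_norm:
        scale = max_norm / norm
        return [g * scale for g in grads]
    return list(grads)

clipped = clip_by_norm([3.0, 4.0], max_norm=1.0)  # norm 5.0 -> rescaled to norm 1.0
```

The direction of the gradient is preserved; only its length is capped.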
-------------
Timeline:
Start ( 0:00 )
1).​ What is a Neural Network? ( 1:19 )
2). Backpropagation Intuition ( 3:14 )
3). Derivation of Sigmoid Activation Function ( 5:14 )
4). Vanishing Gradient Problem & its Solution ( 8:14 )
5). Exploding Gradient Problem & its Solution ( 10:46 )

-------------
To train a neural network over a large set of labelled data, you must continuously compute the difference between the network’s predicted output and the actual output. This difference is called the cost, and the process for training a net is known as backpropagation, or backprop. During backprop, weights and biases are tweaked slightly until the lowest possible cost is achieved. An important aspect of this process is the gradient, which is a measure of how much the cost changes with respect to a change in a weight or bias value.
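The cost-gradient-tweak loop above can be sketched for a single sigmoid neuron; this is a minimal illustration (not code from the video), with made-up input, target, and learning-rate values:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# One training example for a single neuron: y_hat = sigmoid(w*x + b)
x, y = 1.0, 0.0      # input and target label (illustrative values)
w, b = 0.5, 0.1      # initial weight and bias
lr = 0.5             # learning rate

costs = []
for step in range(20):
    z = w * x + b
    y_hat = sigmoid(z)
    costs.append(0.5 * (y_hat - y) ** 2)   # squared-error cost
    # Chain rule: dC/dw = (y_hat - y) * sigmoid'(z) * x,
    # where sigmoid'(z) = y_hat * (1 - y_hat)
    delta = (y_hat - y) * y_hat * (1.0 - y_hat)
    w -= lr * delta * x                    # tweak weight against its gradient
    b -= lr * delta                        # tweak bias against its gradient
```

Each pass nudges the weight and bias a little in the direction that lowers the cost, which is exactly what backprop does layer by layer in a full network.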

Backprop suffers from a fundamental problem known as the vanishing gradient. During training, the gradient shrinks as it moves back through the net. Because higher gradient values lead to faster training, the layers closest to the input layer take the longest to train. Unfortunately, these initial layers are responsible for detecting the simple patterns in the data, while the later layers combine the simple patterns into complex ones. Without properly detected simple patterns, a deep net lacks the building blocks needed to handle the complexity. The problem is equivalent to trying to build a house without a proper foundation.

Have you ever had this difficulty while using backpropagation? Please comment and let me know your thoughts.

So what causes the gradient to decay back through the net? Backprop, as the name suggests, requires the gradient to be calculated first at the output layer, then backwards across the net to the first hidden layer. Each time the gradient is calculated at a layer, the net must take the product of all the gradients computed up to that point. With sigmoid activations, each local gradient is a fraction between 0 and 1 (the sigmoid's derivative never exceeds 0.25), and the product of fractions in this range is an even smaller fraction, so the gradient keeps shrinking.
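The shrinking product can be seen numerically. Below is a minimal sketch (not from the video) that multiplies ten layer-local sigmoid derivatives together, with ReLU, a common fix discussed in the video, shown for contrast, since its derivative is exactly 1 for any positive input:

```python
import math

def sigmoid_grad(z):
    # Derivative of the sigmoid: s * (1 - s), at most 0.25
    s = 1.0 / (1.0 + math.exp(-z))
    return s * (1.0 - s)

def relu_grad(z):
    # Derivative of ReLU: exactly 1 for positive inputs
    return 1.0 if z > 0 else 0.0

# Pre-activation value 0.5 at each of 10 layers (illustrative)
zs = [0.5] * 10
sig_product = 1.0
relu_product = 1.0
for z in zs:
    sig_product *= sigmoid_grad(z)    # shrinks toward zero
    relu_product *= relu_grad(z)      # stays at 1.0
```

After only ten layers, the chained sigmoid gradient is already smaller than one millionth, while the ReLU chain passes the gradient through unchanged.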

For example, if the first two gradients are one fourth and one third, their product is one twelfth. If the next gradient is one fourth, the running product becomes one forty-eighth, and so on. Since the layers near the input layer receive the smallest gradients, the net takes a very long time to train, and as a result the overall accuracy suffers.
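The arithmetic above can be reproduced exactly with Python's `fractions` module; this is just a check of the running product in the example:

```python
from fractions import Fraction

# Layer-local gradients, as in the example: 1/4, then 1/3, then 1/4
local_grads = [Fraction(1, 4), Fraction(1, 3), Fraction(1, 4)]

running = Fraction(1)
products = []
for g in local_grads:
    running *= g              # multiply in each layer's local gradient
    products.append(running)  # 1/4, then 1/12, then 1/48
```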
Do subscribe to my channel and hit the bell icon to never miss an update in the future:
https://www.youtube.com/channel/UC7tzG9dDMcp0-WfGROT9cYw/

Please find the previous Video link -
What is Forward Propagation & backpropagation calculus really doing in Deep learning? | Demystified | NerdML : https://youtu.be/MLGv2MijDbY

Machine Learning Tutorial Playlist: https://youtube.com/playlist?list=PLAH6DbJL1J2KroCzEWRF0xnrmet9esxFH

Deep Learning Tutorial Playlist : https://youtube.com/playlist?list=PLAH6DbJL1J2KK-5PlYCt3v2PaIGdvuLmB

Creator : Rahul Saini
Please write back to me at rahulsainipusa@gmail.com for more information

Instagram: https://www.instagram.com/96_saini
Facebook: https://www.facebook.com/rahulsainipusa
LinkedIn: https://www.linkedin.com/in/rahul-s-22ba1993

deep learning
gradient descent
image recognition
backpropagation
multilayer perceptron
deep learning tutorial
gradient
neural network
artificial intelligence
machine learning
deep neural networks
artificialintelligence
machinelearning
what solves vanishing gradient problem
vanishing gradient problem pdf
why residual block can avoid vanishing gradient problem
how does relu solve vanishing gradient
cs231n vanishing gradient
relu exploding gradient
exploding gradient wiki
exploding gradient sigmoid

#VanishingGradient, #ExplodingGradient, #NerdML

Video information
17 February 2021, 13:22:41
Duration: 00:14:14