
Recurrent Neural Networks (RNNs) and Vanishing Gradients

For one, the way a plain (vanilla) RNN models sequences, by recalling information from the immediate past, allows it to capture dependencies, at least to a certain degree. RNNs are also relatively lightweight compared to n-gram models, taking up less RAM and storage. But there are downsides: because the RNN architecture is optimized for recalling the immediate past, it struggles with longer sequences. In addition, the way an RNN propagates information is part of what creates vanishing and exploding gradients, both of which can cause model training to fail. These problems arise because the RNN propagates information from the beginning of the sequence through to the end: starting with the first word, it computes the first hidden values, then carries some of that computed information forward, takes the second word of the sequence, and computes new values, and so on through the whole sequence.
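A minimal sketch of this idea, not taken from the video itself: the vanilla RNN recurrence h_t = tanh(W_hh h_{t-1} + W_xh x_t) means that backpropagating a gradient from the end of the sequence to the beginning multiplies it by the tanh derivative and by the transpose of the recurrent weight matrix at every step. The weight scales, sequence length, and NumPy implementation below are illustrative assumptions chosen to make the vanishing effect visible.

```python
import numpy as np

rng = np.random.default_rng(0)
hidden, emb, T = 16, 8, 50                     # hidden size, input size, sequence length

W_hh = rng.normal(0, 0.1, (hidden, hidden))    # recurrent weights (small, assumed scale)
W_xh = rng.normal(0, 0.4, (hidden, emb))       # input-to-hidden weights
xs = rng.normal(0, 1.0, (T, emb))              # a random input sequence

# Forward pass, keeping pre-activations so the tanh derivative can be reused later.
h = np.zeros(hidden)
pre_acts = []
for x in xs:
    z = W_hh @ h + W_xh @ x
    h = np.tanh(z)
    pre_acts.append(z)

# Backward pass from a loss on the final hidden state only:
# at each step the gradient is multiplied by diag(1 - tanh(z_t)^2) and W_hh^T.
grad = np.ones(hidden)                         # pretend dLoss/dh_T = 1 everywhere
norms = []
for z in reversed(pre_acts):
    grad = W_hh.T @ (grad * (1.0 - np.tanh(z) ** 2))
    norms.append(np.linalg.norm(grad))

print("||dL/dh|| after one backprop step:   ", norms[0])
print("||dL/dh|| after all", T, "backprop steps:", norms[-1])
# With small recurrent weights the second norm is many orders of magnitude
# smaller (vanishing gradient); with large weights it explodes instead.
```

Running this shows the gradient norm collapsing toward zero as it is pushed back through the 50 steps, which is exactly why a vanilla RNN has trouble learning dependencies far back in the sequence.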

Video: "Recurrent Neural Networks (RNNs) and Vanishing Gradients", from the Machine Learning TV channel.
Video information: May 8, 2021, 1:56:50 · 00:05:43