Recurrent Neural Networks (RNNs) and Vanishing Gradients
Plain, or vanilla, RNNs model sequences by recalling information from the immediate past, which lets them capture dependencies to a certain degree. They are also relatively lightweight compared to n-gram models, taking up less RAM and storage. But there are downsides: because the architecture is optimized for recalling the immediate past, vanilla RNNs struggle with longer sequences. The way an RNN propagates information is also part of how vanishing and exploding gradients arise, both of which can cause model training to fail. These problems occur because an RNN propagates information from the beginning of the sequence through to the end. Starting with the first word, the hidden state at the far left is computed first. The network then propagates some of that computed information forward, takes the second word in the sequence, and computes new values, and so on through the rest of the sequence.
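The step-by-step propagation described above can be sketched in a few lines of NumPy. This is a minimal illustrative example, not production code: the layer sizes, weight scales, and sequence are made up. The hidden state is updated one word at a time, and the product of per-step Jacobians (the quantity backpropagation multiplies together) is tracked to show how the gradient signal shrinks over many steps, which is exactly the vanishing-gradient problem.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes for a toy vanilla RNN: h_t = tanh(W_hh @ h_{t-1} + W_xh @ x_t)
hidden, emb, steps = 16, 8, 50
W_hh = rng.normal(0, 0.1, (hidden, hidden))  # recurrent weights (small scale -> vanishing)
W_xh = rng.normal(0, 0.1, (hidden, emb))     # input weights

h = np.zeros(hidden)                         # initial hidden state
xs = rng.normal(size=(steps, emb))           # a made-up input sequence
jac = np.eye(hidden)                         # running product of per-step Jacobians
norms = []

for x in xs:
    h = np.tanh(W_hh @ h + W_xh @ x)         # propagate info from the previous step
    # Jacobian of this step w.r.t. the previous hidden state: diag(1 - h^2) @ W_hh
    jac = np.diag(1.0 - h**2) @ W_hh @ jac
    norms.append(np.linalg.norm(jac))

# The gradient contribution from early words decays rapidly with sequence length.
print(f"step 1 gradient norm: {norms[0]:.3e}, step {steps} gradient norm: {norms[-1]:.3e}")
```

Because each step multiplies the accumulated Jacobian by another matrix whose entries are small, the norm collapses toward zero; with large recurrent weights the same product would instead blow up, giving exploding gradients.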
Video "Recurrent Neural Networks (RNNs) and Vanishing Gradients" from the Machine Learning TV channel