Transformers vs Recurrent Neural Networks (RNN)!
Course link: https://www.coursera.org/learn/attention-models-in-nlp/lecture/glNgT/transformers-vs-rnns
Using an RNN, you have to take sequential steps to encode your input: you start from the beginning of your input, making computations at every step until you reach the end. At that point, you decode the information following a similar sequential procedure. As you can see here, you have to go through every word in your input, starting with the first word followed by the second word, one after another. The translation is started in a similar manner and is produced in a sequential way too. For that reason, there is not much room for parallel computation here. The more words you have in the input sequence, the more time it will take to process that sentence. Take a look at a more general sequence-to-sequence architecture. In this case, to propagate information from your first word to the last output, you have to go through T sequential steps.
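The contrast above can be sketched in code. Below is a minimal NumPy toy (all dimensions and weight matrices are hypothetical, chosen only for illustration): the RNN must run a Python loop of T dependent steps, while self-attention, the core of the Transformer, handles all T positions in a single matrix product.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (hypothetical, for illustration only).
T, d_in, d_h = 5, 4, 8          # sequence length, input dim, hidden dim
x = rng.standard_normal((T, d_in))

# RNN: each hidden state depends on the previous one, so the
# T time steps below cannot be parallelized.
W_xh = rng.standard_normal((d_in, d_h)) * 0.1
W_hh = rng.standard_normal((d_h, d_h)) * 0.1
h = np.zeros(d_h)
for t in range(T):               # T sequential steps
    h = np.tanh(x[t] @ W_xh + h @ W_hh)

# Self-attention: every position attends to every other position
# in one matrix product, so all T positions are computed at once.
W_q = rng.standard_normal((d_in, d_h)) * 0.1
W_k = rng.standard_normal((d_in, d_h)) * 0.1
W_v = rng.standard_normal((d_in, d_h)) * 0.1
Q, K, V = x @ W_q, x @ W_k, x @ W_v
scores = Q @ K.T / np.sqrt(d_h)                               # (T, T)
weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
out = weights @ V                # (T, d_h), no loop over time steps
```

Note how information from the first word reaches the last output in one attention step, whereas the RNN needed all T iterations of the loop to carry it there.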
Video "Transformers vs Recurrent Neural Networks (RNN)!" from the Machine Learning TV channel