Recurrent Neural Networks & Long Short-Term Memory - Andrej Karpathy, Research Scientist, OpenAI
Recurrent Neural Networks (RNNs), and specifically a variant with Long Short-Term Memory (LSTM), are enjoying renewed interest as a result of successful applications in a wide range of machine learning problems that involve sequential data. However, while LSTMs provide exceptional results in practice, the source of their performance and their limitations remain rather poorly understood. Using character-level language models as an interpretable testbed, we aim to bridge this gap by providing a comprehensive analysis of their representations, predictions and error types. In particular, our experiments reveal the existence of interpretable cells that keep track of long-range dependencies such as line lengths, quotes and brackets. Moreover, an extensive analysis with finite horizon n-gram models suggests that these dependencies are actively discovered and utilized by the networks. Finally, we provide a detailed error analysis that suggests areas for further study.
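To make the setup concrete, below is a minimal sketch of a character-level LSTM language model of the kind the abstract describes. The talk does not prescribe a framework or code; PyTorch, the toy corpus, the CharLSTM class, and all hyperparameters here are illustrative assumptions. The model reads one character at a time and is trained to predict the next one; the final lines trace a single hidden unit over the text, the kind of per-cell activation plot used to look for interpretable cells tracking line lengths, quotes, or brackets.

```python
# Minimal character-level LSTM language model sketch (illustrative, not the
# talk's code). Assumes PyTorch; corpus and hyperparameters are toy choices.
import torch
import torch.nn as nn

# Toy corpus with the kinds of long-range structure the talk discusses
# (quotes, brackets, line breaks).
text = '"hello" world (with brackets) and lines.\n' * 50
chars = sorted(set(text))
stoi = {c: i for i, c in enumerate(chars)}

class CharLSTM(nn.Module):
    def __init__(self, vocab_size, hidden_size=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.lstm = nn.LSTM(hidden_size, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, vocab_size)

    def forward(self, idx, state=None):
        x = self.embed(idx)               # (batch, time, hidden)
        out, state = self.lstm(x, state)  # hidden/cell states carry context
        return self.head(out), state      # logits over the next character

data = torch.tensor([stoi[c] for c in text]).unsqueeze(0)  # (1, T)
model = CharLSTM(len(chars))
opt = torch.optim.Adam(model.parameters(), lr=3e-3)

for step in range(300):  # train: predict character t+1 from characters <= t
    logits, _ = model(data[:, :-1])
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, len(chars)), data[:, 1:].reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()

# Trace one hidden unit's activation over the text -- the kind of per-cell
# plot used in the analysis to find cells tracking long-range structure.
with torch.no_grad():
    h, _ = model.lstm(model.embed(data))
    cell_trace = h[0, :, 0]  # unit 0's activation at each character position
```

If a unit's trace rises steadily within a line and resets at each newline, or flips state inside quoted spans, it is a candidate for the interpretable cells the abstract mentions.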
At the time of recording, Andrej was a 5th-year PhD student at Stanford University, studying Deep Learning and its applications to Computer Vision and Natural Language Processing. In particular, his recent work has focused on Image Captioning, Recurrent Neural Network Language Models and Reinforcement Learning. On the side, he enjoys implementing state-of-the-art Deep Learning models in JavaScript, competing against Convolutional Networks on the ImageNet challenge, and blogging. Before joining Stanford he completed an undergraduate degree in Computer Science and Physics at the University of Toronto and a Computer Science Master's degree at the University of British Columbia.
Video: Recurrent Neural Networks & Long Short-Term Memory - Andrej Karpathy, Research Scientist, OpenAI, from the RE•WORK channel
Other videos on the channel:
- What is Recurrent Neural Network (RNN)? Deep Learning Tutorial 33 (Tensorflow, Keras & Python)
- [CVPR'21 WAD] Keynote - Andrej Karpathy, Tesla
- Building the Software 2.0 Stack (Andrej Karpathy)
- How AI Powers Self-Driving Tesla with Elon Musk and Andrej Karpathy
- Ilya Sutskever: The Learning of Algorithms
- LSTM is dead. Long Live Transformers!
- Bjarne Stroustrup: Deep Learning, Software 2.0, and Fuzzy Programming
- AI for Full-Self Driving by Andrej Karpathy in 10 Minutes
- Simple Explanation of LSTM | Deep Learning Tutorial 36 (Tensorflow, Keras & Python)
- Multi-Agent Hide and Seek
- Ilya Sutskever & Lex Fridman - Fireside Chat: The Current State of AI
- MIT 6.S191 (2021): Recurrent Neural Networks
- Andrej Karpathy (Tesla): CVPR 2021 Workshop on Autonomous Vehicles
- Ilya Sutskever at AI Frontiers 2018: Recent Advances in Deep Learning and AI from OpenAI
- MIT 6.S191 (2020): Recurrent Neural Networks
- Ilya Sutskever: OpenAI Meta-Learning and Self-Play | MIT Artificial General Intelligence (AGI)
- [CVPR'20 Workshop on Scalability in Autonomous Driving] Keynote - Andrej Karpathy
- Recurrent Neural Networks | RNN LSTM Tutorial | Why use RNN | On Whiteboard | Compare ANN, CNN, RNN
- Callable Neural Networks - Linear Layers in Depth
- RNN Symposium 2016: Panel Discussion - The Future of Machines that Learn Algorithms