PyTorch LSTM Part 7
Today we take the LSTM we built this week and tweak a few settings to see if we can lower the loss. The model itself is tiny, about 19 lines; most of the work is in data prep: building a word dictionary, the tag set, and a mask (padding) token set to zero so that padding can be ignored. I also want to test why view(1, -1) is not the same as unsqueeze: the two look similar but behave differently.
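A minimal sketch of both points, assuming PyTorch is installed. The tensor shapes, the tag indices, and the choice of 0 as the padding tag are made up for illustration and are not taken from the video. On a 1-D tensor view(1, -1) and unsqueeze(0) give the same result, which is probably why they look interchangeable; on anything with more dimensions, view(1, -1) flattens everything into one row while unsqueeze(0) only adds a leading axis.

```python
import torch
import torch.nn as nn

# --- view(1, -1) vs unsqueeze(0) ---
x = torch.arange(12).reshape(3, 4)      # shape (3, 4)

flat = x.view(1, -1)                    # flattens everything: shape (1, 12)
batched = x.unsqueeze(0)                # adds a leading axis: shape (1, 3, 4)

v = torch.arange(5)                     # on a 1-D tensor the two agree
same_for_1d = torch.equal(v.view(1, -1), v.unsqueeze(0))

# --- ignoring padding in the loss (assumption: tag index 0 means "padding") ---
torch.manual_seed(0)
n_tags = 3
log_probs = torch.log_softmax(torch.randn(2, 4, n_tags), dim=-1)  # (batch, seq, tags)
targets = torch.tensor([[1, 2, 0, 0],
                        [2, 1, 1, 0]])                            # 0 = padded position

# NLLLoss expects (N, C) scores and (N,) targets, so flatten batch and sequence.
loss_fn = nn.NLLLoss(ignore_index=0)
loss = loss_fn(log_probs.view(-1, n_tags), targets.view(-1))

print(flat.shape, batched.shape, same_for_1d, loss.item())
```

With ignore_index=0 the padded positions contribute nothing to the loss, and the default mean is taken over the real tokens only.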
Then we try the next step from the tutorial: adding character-level features so the model can use endings such as "-ly", which often signal an adverb. That means character embeddings plus a second LSTM that reads characters, whose output is combined with the word-level LSTM to predict part-of-speech tags per word, not per letter. We ran into shape issues, especially with batching and NLLLoss, and we also saw that character sequences are longer than word tag sequences, so we need pooling or some other way to collapse the character outputs into one vector per word.
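One way the pieces could fit together is sketched below, assuming PyTorch. This is not the model from the video, which is still unfinished at this point. The class name CharWordTagger, all layer sizes, and the use of masked mean pooling are assumptions; taking the character LSTM's final hidden state would be another way to get one vector per word. Index 0 is assumed to be padding for both words and characters, and the sketch handles a single sentence rather than a batch.

```python
import torch
import torch.nn as nn

class CharWordTagger(nn.Module):
    """Hypothetical tagger: char LSTM -> pooled char vector -> word LSTM -> tags."""

    def __init__(self, n_words, n_chars, n_tags,
                 word_dim=16, char_dim=8, char_hidden=8, hidden=32):
        super().__init__()
        self.word_emb = nn.Embedding(n_words, word_dim, padding_idx=0)
        self.char_emb = nn.Embedding(n_chars, char_dim, padding_idx=0)
        self.char_lstm = nn.LSTM(char_dim, char_hidden, batch_first=True)
        self.word_lstm = nn.LSTM(word_dim + char_hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_tags)

    def forward(self, words, chars):
        # words: (T,) word indices; chars: (T, L) char indices, right-padded with 0
        char_out, _ = self.char_lstm(self.char_emb(chars))        # (T, L, char_hidden)

        # Masked mean pooling: average over real characters only,
        # collapsing L character outputs into one vector per word.
        mask = (chars != 0).unsqueeze(-1).float()                 # (T, L, 1)
        char_vec = (char_out * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1)

        # Concatenate word embedding and character summary for each word.
        feats = torch.cat([self.word_emb(words), char_vec], dim=-1)

        # The word LSTM sees the sentence as a batch of one sequence.
        word_out, _ = self.word_lstm(feats.unsqueeze(0))          # (1, T, hidden)
        return torch.log_softmax(self.out(word_out.squeeze(0)), dim=-1)  # (T, n_tags)

# Toy input: 3 words, up to 4 characters each (indices are arbitrary).
words = torch.tensor([1, 2, 3])
chars = torch.tensor([[1, 2, 3, 0],
                      [4, 5, 0, 0],
                      [6, 7, 8, 9]])
model = CharWordTagger(n_words=4, n_chars=10, n_tags=3)
scores = model(words, chars)            # one row of log-probabilities per word
loss = nn.NLLLoss()(scores, torch.tensor([1, 2, 1]))
```

Because the pooling step leaves exactly one vector per word, the output has one row per word and lines up with the word-level tags that NLLLoss expects.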
We started refactoring the tokenizer to work over characters with padding, but it is not finished yet. The plan for tomorrow is to wire up the two embeddings, the two LSTMs, and the pooling step, then get training stable and see whether the loss improves.
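A character-level tokenizer with padding could look roughly like the sketch below. It is plain Python, and the function names build_char_vocab and encode_chars, the "<pad>" entry, and the example sentence are all hypothetical, not the code from the video. Index 0 is reserved for padding so it can be masked out later.

```python
PAD = 0  # assumption: index 0 is reserved for padding

def build_char_vocab(sentences):
    """Map every character seen in the data to an integer index (0 = padding)."""
    vocab = {"<pad>": PAD}
    for sentence in sentences:
        for word in sentence:
            for ch in word:
                # Assign the next free index the first time a character appears.
                vocab.setdefault(ch, len(vocab))
    return vocab

def encode_chars(sentence, vocab):
    """Turn a list of words into equal-length rows of character indices."""
    max_len = max(len(word) for word in sentence)
    return [
        [vocab[ch] for ch in word] + [PAD] * (max_len - len(word))
        for word in sentence
    ]

sentences = [["the", "dog", "ran", "quickly"]]
vocab = build_char_vocab(sentences)
rows = encode_chars(sentences[0], vocab)   # 4 rows, each padded to len("quickly")
```

Each row has the same length, so the result can be passed to torch.tensor and fed to a character embedding. Characters that were not in the training data would raise a KeyError here; a real tokenizer would probably want an unknown-character index as well.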
Video "PyTorch LSTM Part 7" from the Stephen Blum channel
Video information
June 5, 2026, 12:00:01
00:27:22