
Knowledge Distillation - Keras Code Examples

This Keras Code Example shows you how to implement Knowledge Distillation! Knowledge Distillation has led to new advances in model compression, training state-of-the-art models, and stabilizing Transformers for Computer Vision. All you need to do to build on this is swap out the Teacher and Student architectures. I also think the example of how to override keras.Model and combine two loss functions weighted by an alpha hyperparameter is very useful (a condensed sketch follows below).
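
For reference, here is a condensed sketch of that pattern, loosely following the linked Keras example: a Distiller subclass of keras.Model holds a frozen teacher and a trainable student, and mixes the hard-label student loss with a temperature-softened KL-divergence distillation loss via alpha. Treat it as an illustrative sketch rather than the exact code from the video; details such as alpha=0.1 and temperature=3 are assumed defaults.

import tensorflow as tf
from tensorflow import keras


class Distiller(keras.Model):
    """Wraps a frozen teacher and a trainable student in one keras.Model."""

    def __init__(self, student, teacher):
        super().__init__()
        self.teacher = teacher
        self.student = student

    def compile(self, optimizer, metrics, student_loss_fn,
                distillation_loss_fn, alpha=0.1, temperature=3):
        # alpha weights the hard-label student loss against the
        # soft-label distillation loss; temperature softens the logits.
        super().compile(optimizer=optimizer, metrics=metrics)
        self.student_loss_fn = student_loss_fn
        self.distillation_loss_fn = distillation_loss_fn
        self.alpha = alpha
        self.temperature = temperature

    def train_step(self, data):
        x, y = data
        # The teacher runs in inference mode; only the student is updated.
        teacher_predictions = self.teacher(x, training=False)
        with tf.GradientTape() as tape:
            student_predictions = self.student(x, training=True)
            # Loss 1: ordinary cross-entropy against the ground-truth labels.
            student_loss = self.student_loss_fn(y, student_predictions)
            # Loss 2: divergence between temperature-softened teacher and
            # student output distributions.
            distillation_loss = self.distillation_loss_fn(
                tf.nn.softmax(teacher_predictions / self.temperature, axis=1),
                tf.nn.softmax(student_predictions / self.temperature, axis=1),
            )
            loss = self.alpha * student_loss + (1 - self.alpha) * distillation_loss
        gradients = tape.gradient(loss, self.student.trainable_variables)
        self.optimizer.apply_gradients(zip(gradients, self.student.trainable_variables))
        self.compiled_metrics.update_state(y, student_predictions)
        results = {m.name: m.result() for m in self.metrics}
        results.update({"student_loss": student_loss,
                        "distillation_loss": distillation_loss})
        return results

    def test_step(self, data):
        # Evaluation only looks at the student against the hard labels.
        x, y = data
        y_prediction = self.student(x, training=False)
        student_loss = self.student_loss_fn(y, y_prediction)
        self.compiled_metrics.update_state(y, y_prediction)
        results = {m.name: m.result() for m in self.metrics}
        results.update({"student_loss": student_loss})
        return results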

Content Links
Knowledge Distillation (Keras Code Examples): https://keras.io/examples/vision/knowledge_distillation/
DistilBERT: https://arxiv.org/pdf/1910.01108.pdf
Self-Training with Noisy Student: https://arxiv.org/pdf/1911.04252.pdf
Data-efficient Image Transformers: https://ai.facebook.com/blog/data-efficient-image-transformers-a-promising-new-technique-for-image-classification/
KL Divergence: https://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence

0:00 Beginning
0:44 Motivation, Success Stories
2:47 Custom keras.Model
11:18 Teacher and Student models
12:17 Data Loading, Train the Teacher
14:05 Distill Teacher to Student
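
For reference, a rough end-to-end sketch of what those chapters cover, reusing the Distiller subclass sketched above (MNIST data, as in the linked Keras example). The layer widths, epoch counts, and hyperparameter values below are illustrative assumptions, not necessarily the values used in the video.

import numpy as np
from tensorflow import keras

# Teacher: a larger CNN; Student: a smaller one with the same output shape.
teacher = keras.Sequential(
    [
        keras.Input(shape=(28, 28, 1)),
        keras.layers.Conv2D(256, 3, strides=2, padding="same", activation="relu"),
        keras.layers.Conv2D(512, 3, strides=2, padding="same", activation="relu"),
        keras.layers.Flatten(),
        keras.layers.Dense(10),  # raw logits; softmax is applied inside the losses
    ],
    name="teacher",
)
student = keras.Sequential(
    [
        keras.Input(shape=(28, 28, 1)),
        keras.layers.Conv2D(16, 3, strides=2, padding="same", activation="relu"),
        keras.layers.Conv2D(32, 3, strides=2, padding="same", activation="relu"),
        keras.layers.Flatten(),
        keras.layers.Dense(10),
    ],
    name="student",
)

# Data loading: scale to [0, 1] and add a channel dimension.
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train = np.reshape(x_train.astype("float32") / 255.0, (-1, 28, 28, 1))
x_test = np.reshape(x_test.astype("float32") / 255.0, (-1, 28, 28, 1))

# Train the teacher on hard labels only.
teacher.compile(
    optimizer=keras.optimizers.Adam(),
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=[keras.metrics.SparseCategoricalAccuracy()],
)
teacher.fit(x_train, y_train, epochs=5)

# Distill the teacher into the student using the Distiller sketched earlier.
distiller = Distiller(student=student, teacher=teacher)
distiller.compile(
    optimizer=keras.optimizers.Adam(),
    metrics=[keras.metrics.SparseCategoricalAccuracy()],
    student_loss_fn=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    distillation_loss_fn=keras.losses.KLDivergence(),
    alpha=0.1,
    temperature=10,
)
distiller.fit(x_train, y_train, epochs=3)
distiller.evaluate(x_test, y_test)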

Video: Knowledge Distillation - Keras Code Examples, from the Henry AI Labs channel.
Video information: uploaded February 28, 2021, 23:00:44; duration 00:16:54.