
Contrastive Language-Image Pretraining (CLIP)

GitHub repository: https://github.com/andandandand/practical-computer-vision

0:00 CLIP: Contrastive Language-Image Pretraining
0:08 Learning goals
0:30 CLIP: ‘Contrastive Language-Image Pretraining’
1:37 Aligning text and image embeddings
2:58 Text encoders
4:12 CLIP’s architecture
5:11 Maximizing cosine similarity of matching text and image embeddings
5:45 Training algorithm
6:18 Zero-shot classification with CLIP
7:21 Producing embeddings with CLIP (1/2)
7:49 Producing embeddings with CLIP (2/2)
8:47 Zero-shot classification with CLIP (code and softmax details)
9:53 Transferable representations: CLIP against a ResNet101 pretrained on ImageNet
11:29 Limitations against fully supervised models
12:31 Semantic search with CLIP
13:09 CLIP guides image generation of diffusion models
13:38 Summary
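
Code sketches for the hands-on chapters follow. For the embedding chapters (7:21, 7:49) and the cosine-similarity slide (5:11), here is a minimal sketch assuming the Hugging Face transformers implementation and the openai/clip-vit-base-patch32 checkpoint; the video and the linked repository may use a different library, and the image path and captions are placeholders.

import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("example.jpg")  # placeholder image path
captions = ["a photo of a cat", "a photo of a dog"]

inputs = processor(text=captions, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

image_emb = outputs.image_embeds   # shape (1, 512), already L2-normalized
text_emb = outputs.text_embeds     # shape (2, 512), already L2-normalized

# Cosine similarity between the image and each caption
print(image_emb @ text_emb.T)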
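
For the training-algorithm chapter (5:45): CLIP trains on batches of N matched image-text pairs, maximizing the cosine similarity of the N correct pairings and minimizing it for the N² − N mismatched ones via a symmetric cross-entropy over the scaled similarity matrix. The sketch below omits the encoders and uses a fixed temperature in place of the learned one.

import torch
import torch.nn.functional as F

def clip_loss(image_emb, text_emb, temperature=0.07):
    # Normalize so dot products are cosine similarities
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)

    # (N, N) similarity matrix scaled by the temperature
    logits = image_emb @ text_emb.T / temperature

    # The i-th image matches the i-th text: targets are the diagonal indices
    targets = torch.arange(logits.size(0), device=logits.device)

    loss_images = F.cross_entropy(logits, targets)   # image -> text
    loss_texts = F.cross_entropy(logits.T, targets)  # text -> image
    return (loss_images + loss_texts) / 2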
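
For the zero-shot classification chapters (6:18, 8:47): each class name is wrapped in a text prompt, and a softmax over the image-text similarity scores gives class probabilities. The class names and prompt template below are illustrative, not the ones used in the video.

import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

class_names = ["cat", "dog", "bird"]
prompts = [f"a photo of a {name}" for name in class_names]

image = Image.open("example.jpg")  # placeholder image path
inputs = processor(text=prompts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# logits_per_image are cosine similarities scaled by the learned temperature
probs = outputs.logits_per_image.softmax(dim=-1)
for name, p in zip(class_names, probs[0]):
    print(f"{name}: {p:.3f}")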
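
For the semantic-search chapter (12:31), a sketch that embeds a small image gallery, embeds a free-text query, and ranks the images by cosine similarity; the file names and query are placeholders.

import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image_paths = ["img_0.jpg", "img_1.jpg", "img_2.jpg"]  # placeholder gallery
images = [Image.open(p) for p in image_paths]

with torch.no_grad():
    image_inputs = processor(images=images, return_tensors="pt")
    image_emb = model.get_image_features(**image_inputs)
    image_emb = image_emb / image_emb.norm(dim=-1, keepdim=True)

    query_inputs = processor(text=["a dog playing in the snow"],
                             return_tensors="pt", padding=True)
    text_emb = model.get_text_features(**query_inputs)
    text_emb = text_emb / text_emb.norm(dim=-1, keepdim=True)

# Rank gallery images by cosine similarity to the query
scores = (text_emb @ image_emb.T).squeeze(0)
for idx in scores.argsort(descending=True):
    print(image_paths[int(idx)], float(scores[idx]))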

Video: Contrastive Language-Image Pretraining (CLIP), from the channel Antonio Rueda-Toicen