Supervised Contrastive Learning
The cross-entropy loss has been the default choice for supervised deep learning in recent years. This paper proposes a new loss, the supervised contrastive loss, and uses it to pre-train the network in a supervised fashion. The resulting model, fine-tuned on ImageNet, achieves a new state of the art.
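Below is a minimal sketch of what such a supervised contrastive loss can look like in PyTorch, following the paper's "sum over positives outside the log" formulation; the function name, argument layout, and masking details here are my own assumptions, not the authors' reference code.

```python
import torch

def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Sketch of a supervised contrastive (SupCon-style) loss.

    features: (batch, dim) L2-normalized embeddings; typically both augmented
              views of each image are stacked along the batch dimension.
    labels:   (batch,) integer class labels, repeated for each view.
    """
    batch_size = features.shape[0]
    device = features.device

    # Pairwise cosine similarities scaled by the temperature.
    logits = features @ features.T / temperature
    # Subtract the per-row max for numerical stability (a constant shift per anchor).
    logits = logits - logits.max(dim=1, keepdim=True).values.detach()

    # Exclude each anchor from its own denominator and positive set.
    self_mask = torch.eye(batch_size, dtype=torch.bool, device=device)
    # Positives: all other samples that share the anchor's label.
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask

    # Log-probability of each candidate given the anchor, over all non-anchor samples.
    exp_logits = torch.exp(logits).masked_fill(self_mask, 0.0)
    log_prob = logits - torch.log(exp_logits.sum(dim=1, keepdim=True))

    # Average over each anchor's positives, skipping anchors whose class
    # appears only once in the batch, then average over anchors.
    pos_counts = pos_mask.sum(dim=1)
    valid = pos_counts > 0
    mean_log_prob_pos = (log_prob * pos_mask.float()).sum(dim=1)[valid] / pos_counts[valid]
    return -mean_log_prob_pos.mean()

# Toy usage: 8 embeddings (two views of 4 images), 3 classes.
feats = torch.nn.functional.normalize(torch.randn(8, 128), dim=1)
labels = torch.tensor([0, 1, 2, 0, 0, 1, 2, 0])
loss = supervised_contrastive_loss(feats, labels)
```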
https://arxiv.org/abs/2004.11362
Abstract:
Cross entropy is the most widely used loss function for supervised training of image classification models. In this paper, we propose a novel training methodology that consistently outperforms cross entropy on supervised learning tasks across different architectures and data augmentations. We modify the batch contrastive loss, which has recently been shown to be very effective at learning powerful representations in the self-supervised setting. We are thus able to leverage label information more effectively than cross entropy. Clusters of points belonging to the same class are pulled together in embedding space, while simultaneously pushing apart clusters of samples from different classes. In addition to this, we leverage key ingredients such as large batch sizes and normalized embeddings, which have been shown to benefit self-supervised learning. On both ResNet-50 and ResNet-200, we outperform cross entropy by over 1%, setting a new state of the art number of 78.8% among methods that use AutoAugment data augmentation. The loss also shows clear benefits for robustness to natural corruptions on standard benchmarks on both calibration and accuracy. Compared to cross entropy, our supervised contrastive loss is more stable to hyperparameter settings such as optimizers or data augmentations.
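As a rough sketch of the two-stage training methodology, reusing the loss sketched above: stage one trains an encoder plus a small projection head on normalized embeddings with the supervised contrastive loss, and stage two trains a plain linear classifier with cross-entropy on top of the frozen encoder. The encoder/projection-head/classifier setup and all names below are assumptions for illustration, not the paper's released code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet50

# Stage 1: train a backbone plus a small projection head with the
# supervised contrastive loss on labeled, augmented images.
encoder = resnet50()
encoder.fc = nn.Identity()            # expose the 2048-d pooled features
projection_head = nn.Sequential(
    nn.Linear(2048, 2048), nn.ReLU(), nn.Linear(2048, 128))

def pretrain_step(images, labels, temperature=0.1):
    z = F.normalize(projection_head(encoder(images)), dim=1)
    return supervised_contrastive_loss(z, labels, temperature)  # sketched above

# Stage 2: discard the projection head, freeze the encoder, and train a
# linear classifier on top with ordinary cross-entropy.
classifier = nn.Linear(2048, 1000)

def classifier_step(images, labels):
    with torch.no_grad():
        features = encoder(images)
    return F.cross_entropy(classifier(features), labels)
```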
Authors: Prannay Khosla, Piotr Teterwak, Chen Wang, Aaron Sarna, Yonglong Tian, Phillip Isola, Aaron Maschinot, Ce Liu, Dilip Krishnan
Links:
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
BitChute: https://www.bitchute.com/channel/yannic-kilcher
Minds: https://www.minds.com/ykilcher