Manifold Mixup: Better Representations by Interpolating Hidden States
Standard neural networks suffer from problems such as non-smooth classification boundaries and overconfidence. Manifold Mixup is a simple regularization technique that addresses these problems. It works by interpolating the hidden representations of different data points and training the network to predict the equally interpolated labels.
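The core interpolation step can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' reference code: it mixes two hidden activations and their one-hot labels with a coefficient drawn from a Beta distribution, as Manifold Mixup does at a randomly chosen layer during training. The function name and the default `alpha` value are illustrative assumptions.

```python
import numpy as np

def manifold_mixup(h_a, h_b, y_a, y_b, alpha=2.0, rng=None):
    """Interpolate two hidden representations and their one-hot labels.

    h_a, h_b: hidden activations of two examples (same shape).
    y_a, y_b: one-hot label vectors.
    alpha:    Beta distribution parameter (illustrative default).
    """
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)           # mixing coefficient lambda ~ Beta(alpha, alpha)
    h_mix = lam * h_a + (1 - lam) * h_b    # interpolated hidden state
    y_mix = lam * y_a + (1 - lam) * y_b    # equally interpolated label
    return h_mix, y_mix
```

In a full training loop, one would forward both inputs up to a randomly selected layer, mix the activations there with `manifold_mixup`, continue the forward pass on the mixed activations, and compute the loss against the mixed labels.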
https://arxiv.org/abs/1806.05236
Abstract:
Deep neural networks excel at learning the training data, but often provide incorrect and confident predictions when evaluated on slightly different test examples. This includes distribution shifts, outliers, and adversarial examples. To address these issues, we propose Manifold Mixup, a simple regularizer that encourages neural networks to predict less confidently on interpolations of hidden representations. Manifold Mixup leverages semantic interpolations as additional training signal, obtaining neural networks with smoother decision boundaries at multiple levels of representation. As a result, neural networks trained with Manifold Mixup learn class-representations with fewer directions of variance. We prove theory on why this flattening happens under ideal conditions, validate it on practical situations, and connect it to previous works on information theory and generalization. In spite of incurring no significant computation and being implemented in a few lines of code, Manifold Mixup improves strong baselines in supervised learning, robustness to single-step adversarial attacks, and test log-likelihood.
Authors:
Vikas Verma, Alex Lamb, Christopher Beckham, Amir Najafi, Ioannis Mitliagkas, Aaron Courville, David Lopez-Paz, Yoshua Bengio
Video "Manifold Mixup: Better Representations by Interpolating Hidden States" from the Yannic Kilcher channel