Stefano Soatto: "Invariance and disentanglement in deep representations"
New Deep Learning Techniques 2018
"Invariance and disentanglement in deep representations"
Stefano Soatto, University of California, Los Angeles (UCLA)
Abstract: Theories of deep learning are like anatomical parts best not named explicitly in an abstract: everyone seems to have one. That is why it is important for a theory to be inclusive: it has to be compatible with all known results and, at the very least, explain known empirical phenomena. I will describe the basic elements of the Emergence Theory of Deep Learning, which started as a general theory of representations and comprises three parts: (1) a formalization of the desirable properties a representation should possess, based on classical principles of statistical decision and information theory: sufficiency, invariance, minimality, and independence. This has nothing to do with deep learning per se, but is closely tied to the notions of the Information Bottleneck and variational inference. (2) A description of common empirical losses employed in deep learning (e.g., empirical cross-entropy) and of implicit or explicit regularization practices, including dropout and pooling, as well as recently proven additive entropic components of the loss computed by SGD. Finally, (3) theorems and bounds showing that minimizing suitably (implicitly or explicitly) regularized losses with SGD with respect to the weights implies optimization of the loss described in (1) with respect to the activations of a deep network, and therefore achievement of the desirable properties of the resulting representation formalized in (1). The link between the two is specific to the architecture of deep networks. The theory is related to the Information Bottleneck, but not the one described in recent theories: it is instead a new Information Bottleneck for the weights of a network, rather than for the activations. It is also related to PAC-Bayes, and could be derived through that lens, providing independent validation. It is also related to Kolmogorov complexity.
It is also related to "flat minima," in the sense that the crucial regularizing quantity, the information in the weights, bounds the nuclear norm of the Hessian around critical points. It also shows that there is no need to rethink regularization, and that, unlike the Hessian, information is invariant to reparametrization.
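The abstract describes a loss of the form "empirical cross-entropy plus a regularizer on the information in the weights, I(w; D)". A minimal sketch of that structure, assuming a diagonal-Gaussian posterior over the weights and a standard-normal prior so that the KL divergence upper-bounds I(w; D) (this is an illustrative reconstruction, not the authors' code; all function names are hypothetical):

```python
import numpy as np

def kl_gaussian(mu, log_sigma2):
    """KL( N(mu, sigma^2) || N(0, 1) ), summed over weights.

    Serves as a tractable upper bound on the information in the
    weights, I(w; D), under the mean-field Gaussian assumption."""
    sigma2 = np.exp(log_sigma2)
    return 0.5 * np.sum(sigma2 + mu ** 2 - 1.0 - log_sigma2)

def cross_entropy(logits, labels):
    """Mean softmax cross-entropy for integer class labels."""
    z = logits - logits.max(axis=1, keepdims=True)      # numerical stability
    log_p = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -np.mean(log_p[np.arange(len(labels)), labels])

def ib_weights_loss(logits, labels, mu, log_sigma2, beta=1e-3):
    """Information Bottleneck for the weights (sketch):
    empirical cross-entropy + beta * bound on I(w; D)."""
    return cross_entropy(logits, labels) + beta * kl_gaussian(mu, log_sigma2)
```

With beta = 0 this reduces to the usual empirical cross-entropy; increasing beta trades training fit for less information stored in the weights, which is the quantity the theory links to flat minima and generalization.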
Joint work with Alessandro Achille and Pratik Chaudhari.
References: https://arxiv.org/pdf/1706.01350.pdf and https://arxiv.org/abs/1710.11029
Institute for Pure and Applied Mathematics, UCLA
February 8, 2018
For more information: http://www.ipam.ucla.edu/programs/workshops/new-deep-learning-techniques/?tab=overview
Video: Stefano Soatto: "Invariance and disentanglement in deep representations", from the Institute for Pure & Applied Mathematics (IPAM) channel
Video information: uploaded February 17, 2018, 4:00:45. Duration: 00:36:01