Why Deep Learning Works: Self Regularization in Neural Networks
In Collaboration with Michael Mahoney, UC Berkeley
https://www.slideshare.net/charlesmartin141/why-deep-learning-works-self-regularization-in-deep-neural-networks-101447737
Empirical results, using the machinery of Random Matrix Theory (RMT), are presented that are aimed at clarifying and resolving some of the puzzling and seemingly contradictory aspects of deep neural networks (DNNs). We apply RMT to several well-known pre-trained models: LeNet5, AlexNet, and Inception V3, as well as 2 small, toy models. We show that the DNN training process itself implicitly implements a form of self-regularization associated with the entropy collapse / information bottleneck. We find that the self-regularization in small models like LeNet5 resembles the familiar Tikhonov regularization, whereas large, modern deep networks display a new kind of heavy-tailed self-regularization. We characterize self-regularization using RMT by identifying a taxonomy of the 5+1 phases of training. Then, with our toy models, we show that even in the absence of any explicit regularization mechanism, the DNN training process itself leads to more and more capacity-controlled models. Importantly, this phenomenon is strongly affected by the many knobs that are used to optimize DNN training. In particular, we can induce heavy-tailed self-regularization by adjusting the batch size in training, thereby exploiting the generalization gap phenomenon unique to DNNs. We argue that this heavy-tailed self-regularization has practical implications for designing better DNNs and deep theoretical implications for understanding the complex DNN energy landscape / optimization problem.
Video "Why Deep Learning Works: Self Regularization in Neural Networks" from the Calculation Consulting channel.
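The RMT analysis described in the abstract rests on the empirical spectral density (ESD) of each layer's weight matrix: a randomly initialized layer should match the Marchenko-Pastur law, while trained layers in the "heavy-tailed" phases deviate from it. A minimal sketch (using NumPy; the matrix shapes and variable names here are illustrative, not taken from the talk) of computing an ESD and comparing it to the Marchenko-Pastur bulk edges:

```python
import numpy as np

def esd(W):
    """Empirical spectral density: eigenvalues of the correlation
    matrix X = W^T W / N for an N x M weight matrix W."""
    N, _ = W.shape
    X = W.T @ W / N
    return np.linalg.eigvalsh(X)

# A random (untrained) layer: entries i.i.d. Gaussian with variance sigma^2 = 1.
rng = np.random.default_rng(0)
N, M = 1000, 500
W = rng.normal(0.0, 1.0, size=(N, M))
evals = esd(W)

# Marchenko-Pastur bulk edges for aspect ratio Q = N / M >= 1 and sigma = 1:
#   lambda_plus/minus = sigma^2 * (1 +/- 1/sqrt(Q))^2
Q = N / M
lam_plus = (1 + 1 / np.sqrt(Q)) ** 2
lam_minus = (1 - 1 / np.sqrt(Q)) ** 2

# For a random matrix, the ESD fills [lam_minus, lam_plus]; eigenvalues
# escaping well past lam_plus (or a heavy power-law tail) would signal
# the self-regularization phases discussed in the talk.
print(f"MP bulk: [{lam_minus:.3f}, {lam_plus:.3f}]")
print(f"ESD range: [{evals.min():.3f}, {evals.max():.3f}]")
```

For a trained network one would run the same computation on each layer's actual weight matrix and inspect the tail of the ESD, e.g. by fitting a power-law exponent to the largest eigenvalues.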