Learned Optimizers - Jascha Sohl-Dickstein
Learned Optimizers: Why They're The Future, Why They're Hard, and What They Can Do Now - Jascha Sohl-Dickstein
The success of deep learning has hinged on learned functions dramatically outperforming hand-designed functions for many tasks. However, we still train models using hand-designed optimizers acting on hand-designed loss functions. Jascha will argue that these hand-designed components are typically mismatched to the desired behavior, and that we can expect meta-learned optimizers to perform much better. He will discuss the challenges and pathologies that make meta-training learned optimizers difficult, including: chaotic and high-variance meta-loss landscapes; extreme computational costs for meta-training; the lack of comprehensive meta-training datasets; the difficulty of designing learned optimizers with the right inductive biases; and the difficulty of interpreting the method of action of learned optimizers. He will share solutions to some of these challenges, and show experimental results where learned optimizers outperform hand-designed optimizers in several contexts. Jascha will also discuss novel capabilities that can be achieved by meta-training learned optimizers to target downstream performance rather than training loss. He will end with a demo of an open-source JAX library for training, testing, and applying learned optimizers.
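To make the idea of meta-training concrete, here is a minimal toy sketch (not taken from the talk or its library): a tiny parameterized "optimizer" maps each parameter's gradient to an update, and its meta-parameters are meta-trained by differentiating through a short unrolled inner optimization. All names and the quadratic inner task are illustrative assumptions; real learned optimizers use far richer update rules and task distributions.

```python
import jax
import jax.numpy as jnp

def inner_loss(theta):
    # Toy inner task: a quadratic with its minimum at 3.0.
    return jnp.sum((theta - 3.0) ** 2)

def learned_update(meta_params, grad):
    # Per-parameter learned update: a scaled gradient plus a signed step.
    # Real learned optimizers feed richer features (momentum, etc.) into
    # a small neural network; this linear rule is the simplest stand-in.
    a, b = meta_params
    return -(a * grad + b * jnp.sign(grad))

def unrolled_meta_loss(meta_params, theta0, steps=10):
    # Meta-loss: the inner-task loss after `steps` applications of the
    # learned update, differentiable with respect to the meta-parameters.
    theta = theta0
    for _ in range(steps):
        g = jax.grad(inner_loss)(theta)
        theta = theta + learned_update(meta_params, g)
    return inner_loss(theta)

meta_params = jnp.array([0.01, 0.0])  # initial (scale, signed-step) meta-params
theta0 = jnp.zeros(5)
meta_grad_fn = jax.jit(jax.grad(unrolled_meta_loss))

initial = unrolled_meta_loss(meta_params, theta0)
for _ in range(200):
    mg = meta_grad_fn(meta_params, theta0)
    meta_params = meta_params - 1e-4 * mg  # plain meta-gradient descent

final = unrolled_meta_loss(meta_params, theta0)
print(float(initial), float(final))  # meta-loss before vs. after meta-training
```

Even this toy setup hints at the pathologies the talk covers: the meta-loss is a function of a long unrolled computation, so its landscape can become chaotic and its gradients high-variance as the number of inner steps grows.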
______
Jascha Sohl-Dickstein is a senior staff research scientist in Google Brain, and leads a research team with interests spanning machine learning, physics, and neuroscience. Recent projects have focused on theory of overparameterized neural networks, meta-training of learned optimizers, and understanding the capabilities of large language models. Jascha was previously a visiting scholar in Surya Ganguli's lab at Stanford, and an academic resident at Khan Academy. He earned his PhD in 2012 in Bruno Olshausen's lab in the Redwood Center for Theoretical Neuroscience at UC Berkeley. Prior to his PhD, he spent several years working for NASA on the Mars Exploration Rover mission.
Learn more here: https://www.iarai.ac.at/events/learned-optimizers-why-theyre-the-future-why-theyre-hard-and-what-they-can-do-now/
Subscribe to our newsletter and stay in the know:
https://www.iarai.ac.at/event-type/seminars/
___________________________________________________________________
IARAI | Institute of Advanced Research in Artificial Intelligence
www.iarai.ac.at
Video: Learned Optimizers - Jascha Sohl-Dickstein, from the IARAI Research channel