NeurIPS 2019 Test of Time Award - Lin Xiao
Dual Averaging Method for Regularized Stochastic Learning and Online Optimization
Slides: https://imgur.com/a/b2AiEUI
Paper: https://papers.nips.cc/paper/3882-dual-averaging-method-for-regularized-stochastic-learning-and-online-optimization.pdf
Abstract:
We consider regularized stochastic learning and online optimization problems, where the objective function is the sum of two convex terms: one is the loss function of the learning task, and the other is a simple regularization term such as the L1-norm for sparsity. We develop a new online algorithm, the regularized dual averaging method, that can explicitly exploit the regularization structure in an online setting. In particular, at each iteration, the learning variables are adjusted by solving a simple optimization problem that involves the running average of all past subgradients of the loss functions and the whole regularization term, not just its subgradient. This method achieves the optimal convergence rate and often enjoys a low complexity per iteration similar to that of the standard stochastic gradient method. Computational experiments are presented for the special case of sparse online learning using L1-regularization.
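As a rough illustration of the update described in the abstract, here is a minimal Python sketch of the L1 special case of regularized dual averaging: keep a running average of all past subgradients of the loss, then recompute the weights in closed form by soft-thresholding that average. The logistic loss and the values of lam (L1 weight) and gamma (step-size parameter) are illustrative assumptions, not taken from the talk or paper.

```python
import numpy as np

def l1_rda(data, labels, lam=0.1, gamma=50.0):
    """Sketch of L1-regularized dual averaging (RDA) with a logistic loss.

    At step t the weights minimize
        <g_bar, x> + lam * ||x||_1 + (gamma / (2*sqrt(t))) * ||x||_2^2,
    where g_bar is the running average of all past loss subgradients.
    This has the entrywise soft-thresholding solution used below.
    """
    n, d = data.shape
    x = np.zeros(d)        # current weights
    g_bar = np.zeros(d)    # running average of subgradients

    for t in range(1, n + 1):
        a, y = data[t - 1], labels[t - 1]          # one (features, +/-1 label) pair
        # subgradient of the logistic loss log(1 + exp(-y * <a, x>))
        g = -y * a / (1.0 + np.exp(y * a.dot(x)))
        g_bar = ((t - 1) * g_bar + g) / t          # update running average

        # closed-form update: entries with |g_bar| <= lam are set exactly to zero,
        # the rest are shrunk -- this is where the explicit sparsity comes from
        shrunk = np.sign(g_bar) * np.maximum(np.abs(g_bar) - lam, 0.0)
        x = -(np.sqrt(t) / gamma) * shrunk
    return x

# usage on synthetic data (illustrative)
rng = np.random.default_rng(0)
A = rng.normal(size=(200, 50))
w_true = np.zeros(50)
w_true[:5] = 1.0
y = np.sign(A @ w_true + 0.1 * rng.normal(size=200))
w = l1_rda(A, y, lam=0.05, gamma=50.0)
print("nonzero weights:", np.count_nonzero(w))
```

Note the contrast with plain stochastic subgradient descent on the combined objective: because the whole L1 term (not just its subgradient) appears in each per-step minimization, small averaged gradients are truncated exactly to zero, which is what produces sparse iterates.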
Video: NeurIPS 2019 Test of Time Award - Lin Xiao, from the Preserve Knowledge channel