
Peter Richtarik - The Resolution of a Question Related to Local Training in Federated Learning

Presentation given by Peter Richtarik on 5 October 2022 at the One World Seminar on the Mathematics of Machine Learning, on the topic "On the Resolution of a Theoretical Question Related to the Nature of Local Training in Federated Learning".

Abstract: We study distributed optimization methods based on the local training (LT) paradigm - achieving improved communication efficiency by performing richer local gradient-based training on the clients before parameter averaging - which is of key importance in federated learning. Looking back at the progress of the field in the last decade, we identify 5 generations of LT methods: 1) heuristic, 2) homogeneous, 3) sublinear, 4) linear, and 5) accelerated. The 5th generation, initiated by the ProxSkip method of Mishchenko et al. (2022) and its analysis, is characterized by the first theoretical confirmation that LT is a communication acceleration mechanism. In this talk, I will explain the problem, its solution, and some subsequent work generalizing, extending and improving the ProxSkip method in various ways.
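
To make the local-training mechanism concrete, below is a minimal sketch (Python with NumPy) of the ProxSkip recursion on a toy federated problem. It is an illustration under assumptions, not the authors' implementation: the quadratic client objectives, the step size gamma, and the communication probability p are hypothetical choices for the example. Communication corresponds to the proximal step (averaging across clients), which is taken only with probability p, so most rounds are purely local gradient steps.

```python
# Minimal, illustrative sketch of ProxSkip-style local training (not the
# authors' code). Assumption: n clients each hold a quadratic objective
# f_i(x) = 0.5 * ||A_i x - b_i||^2, and the prox of the consensus
# constraint amounts to averaging the client models.
import numpy as np

rng = np.random.default_rng(0)
n_clients, dim = 5, 10
A = [rng.standard_normal((20, dim)) for _ in range(n_clients)]
b = [rng.standard_normal(20) for _ in range(n_clients)]

def grad(i, x):
    # Gradient of the i-th client's quadratic objective.
    return A[i].T @ (A[i] @ x - b[i])

# Smoothness constant of the worst client (block-diagonal Hessian), used for the step size.
L = max(np.linalg.norm(A[i].T @ A[i], 2) for i in range(n_clients))
gamma = 1.0 / L   # local step size (assumed choice)
p = 0.2           # probability of a communication (averaging) round (assumed choice)

x = [np.zeros(dim) for _ in range(n_clients)]  # local models
h = [np.zeros(dim) for _ in range(n_clients)]  # control variates

for t in range(3000):
    # Local gradient step, shifted by the control variate.
    x_hat = [x[i] - gamma * (grad(i, x[i]) - h[i]) for i in range(n_clients)]
    if rng.random() < p:
        # Communication round: prox of the consensus constraint = projection onto
        # consensus, i.e. averaging. (With h initialized to zero the h-terms sum
        # to zero, so this reduces to plain averaging of the local iterates.)
        avg = sum(x_hat[i] - (gamma / p) * h[i] for i in range(n_clients)) / n_clients
        x = [avg.copy() for _ in range(n_clients)]
    else:
        # Skipped prox: keep training locally, no communication.
        x = [xi.copy() for xi in x_hat]
    # Control-variate update of the ProxSkip recursion.
    h = [h[i] + (p / gamma) * (x[i] - x_hat[i]) for i in range(n_clients)]

x_bar = sum(x) / n_clients
print("consensus error:", max(np.linalg.norm(xi - x_bar) for xi in x))
print("gradient norm at the average:", np.linalg.norm(sum(grad(i, x_bar) for i in range(n_clients))))
```

Setting p = 1 recovers a method that communicates every round (no local training); in the ProxSkip analysis, taking p on the order of the inverse square root of the condition number yields the accelerated communication complexity referred to in the abstract. The values above are arbitrary choices for the demo.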

References:

1. Konstantin Mishchenko, Grigory Malinovsky, Sebastian Stich and Peter Richtárik. ProxSkip: Yes! Local gradient steps provably lead to communication acceleration! Finally! Proceedings of the 39th International Conference on Machine Learning, 2022.
2. Grigory Malinovsky, Kai Yi and Peter Richtárik. Variance reduced ProxSkip: Algorithm, theory and application to federated learning. arXiv:2207.04338, 2022.
3. Laurent Condat and Peter Richtárik. RandProx: Primal-dual optimization algorithms with randomized proximal updates. arXiv:2207.12891, 2022.
4. Abdurakhmon Sadiev, Dmitry Kovalev and Peter Richtárik. Communication acceleration of local gradient methods via an accelerated primal-dual algorithm with inexact prox. arXiv:2207.03957, 2022.

Video from the channel One world theoretical machine learning.
Video information
Uploaded: 6 October 2022, 14:13:29
Duration: 01:04:35
Other videos from the channel:
Lukasz Szpruch - Mean-Field Neural ODEs, Relaxed Control and Generalization Errors
Matthew Colbrook - Smale’s 18th Problem and the Barriers of Deep Learning
Yu Bai - How Important is the Train-Validation Split in Meta-Learning?
Anna Korba - Kernel Stein Discrepancy Descent
Anirbit Mukherjee - Provable Training of Neural Nets With One Layer of Activation
Kevin Miller - Ensuring Exploration and Exploitation in Graph-Based Active Learning
Theo Bourdais - Computational Hypergraph Discovery, a Gaussian Process framework
Yaoqing Yang - Predicting & improving generalization by measuring loss landscapes & weight matrices
Konstantinos Spiliopoulos - Mean field limits of neural networks: typical behavior and fluctuations
Nadia Drenska - A PDE Interpretation of Prediction with Expert Advice
Matthias Ehrhardt - Bilevel Learning for Inverse Problems
Marcus Hutter - Testing Independence of Exchangeable Random Variables
Yury Korolev - Approximation properties of two-layer neural networks with values in a Banach space
Sophie Langer - Circumventing the curse of dimensionality with deep neural networks
Stephan Mandt - Compressing Variational Bayes: From neural data compression to video prediction
Derek Driggs - Barriers to Deploying Deep Learning Models During the COVID-19 Pandemic
Gal Vardi - Implications of the implicit bias in neural networks
Ziwei Ji - The dual of the margin: improved analyses and rates for gradient descent’s implicit bias
Qi Lei - Predicting What You Already Know Helps: Provable Self-Supervised Learning
Alessandro Scagliotti - Deep Learning Approximation of Diffeomorphisms via Linear-Control Systems