Загрузка страницы

Yu Bai - How Important is the Train-Validation Split in Meta-Learning?

Presentation given by Yu Bai on October 22nd 2020 in the one world seminar on the mathematics of machine learning on the topic "How Important is the Train-Validation Split in Meta-Learning?".

Abstract: Meta-learning aims to perform fast adaptation on a new task through learning a “prior” from multiple existing tasks. A common practice in meta-learning is to perform a train-validation split where the prior adapts to the task on one split of the data, and the resulting predictor is evaluated on another split. Despite its prevalence, the importance of the train-validation split is not well understood either in theory or in practice, particularly in comparison to the more direct non-splitting method, which uses all the per-task data for both training and evaluation. We provide a detailed theoretical study on whether and when the train-validation split is helpful on the linear centroid meta-learning problem, in the asymptotic setting where the number of tasks goes to infinity. We show that the splitting method converges to the optimal prior as expected, whereas the non-splitting method does not in general without structural assumptions on the data. In contrast, if the data are generated from linear models (the realizable regime), we show that both the splitting and non-splitting methods converge to the optimal prior. Further, perhaps surprisingly, our main result shows that the non-splitting method achieves a strictly better asymptotic excess risk under this data distribution, even when the regularization parameter and split ratio are optimally tuned for both methods. Our results highlight that data splitting may not always be preferable, especially when the data is realizable by the model. We validate our theories by experimentally showing that the non-splitting method can indeed outperform the splitting method, on both simulations and real meta-learning tasks.

Видео Yu Bai - How Important is the Train-Validation Split in Meta-Learning? канала One world theoretical machine learning
Показать
Комментарии отсутствуют
Введите заголовок:

Введите адрес ссылки:

Введите адрес видео с YouTube:

Зарегистрируйтесь или войдите с
Информация о видео
28 октября 2020 г. 22:57:18
00:53:14
Другие видео канала
Lukasz Szpruch - Mean-Field Neural ODEs, Relaxed Control and Generalization ErrorsLukasz Szpruch - Mean-Field Neural ODEs, Relaxed Control and Generalization ErrorsMatthew Colbrook - Smale’s 18th Problem and the Barriers of Deep LearningMatthew Colbrook - Smale’s 18th Problem and the Barriers of Deep LearningAnna Korba - Kernel Stein Discrepancy DescentAnna Korba - Kernel Stein Discrepancy DescentAnirbit Mukherjee - Provable Training of Neural Nets With One Layer of ActivationAnirbit Mukherjee - Provable Training of Neural Nets With One Layer of ActivationKevin Miller - Ensuring Exploration and Exploitation in Graph-Based Active LearningKevin Miller - Ensuring Exploration and Exploitation in Graph-Based Active LearningTheo Bourdais - Computational Hypergraph Discovery, a Gaussian Process frameworkTheo Bourdais - Computational Hypergraph Discovery, a Gaussian Process frameworkYaoqing Yang - Predicting & improving generalization by measuring loss landscapes & weight matricesYaoqing Yang - Predicting & improving generalization by measuring loss landscapes & weight matricesKonstantinos Spiliopoulos - Mean field limits of neural networks: typical behavior and fluctuationsKonstantinos Spiliopoulos - Mean field limits of neural networks: typical behavior and fluctuationsNadia Drenska - A PDE Interpretation of Prediction with Expert AdviceNadia Drenska - A PDE Interpretation of Prediction with Expert AdviceMatthias Ehrhardt - Bilevel Learning for Inverse ProblemsMatthias Ehrhardt - Bilevel Learning for Inverse ProblemsPeter Richtarik - The Resolution of a Question Related to Local Training in Federated LearningPeter Richtarik - The Resolution of a Question Related to Local Training in Federated LearningMarcus Hutter - Testing Independence of Exchangeable Random VariablesMarcus Hutter - Testing Independence of Exchangeable Random VariablesYury Korolev - Approximation properties of two-layer neural networks with values in a Banach spaceYury Korolev - Approximation properties of two-layer neural networks with values in a Banach spaceSophie Langer - Circumventing the curse of dimensionality with deep neural networksSophie Langer - Circumventing the curse of dimensionality with deep neural networksStephan Mandt - Compressing Variational Bayes: From neural data compression to video predictionStephan Mandt - Compressing Variational Bayes: From neural data compression to video predictionDerek Driggs - Barriers to Deploying Deep Learning Models During the COVID-19 PandemicDerek Driggs - Barriers to Deploying Deep Learning Models During the COVID-19 PandemicGal Vardi - Implications of the implicit bias in neural networksGal Vardi - Implications of the implicit bias in neural networksZiwei Ji - The dual of the margin: improved analyses and rates for gradient descent’s implicit biasZiwei Ji - The dual of the margin: improved analyses and rates for gradient descent’s implicit biasQi Lei - Predicting What You Already Know Helps: Provable Self-Supervised LearningQi Lei - Predicting What You Already Know Helps: Provable Self-Supervised LearningAlessandro Scagliotti - Deep Learning Approximation of Diffeomorphisms via Linear-Control SystemsAlessandro Scagliotti - Deep Learning Approximation of Diffeomorphisms via Linear-Control Systems
Яндекс.Метрика