Andrew Rowan - Bayesian Deep Learning with Edward (and a trick using Dropout)
Filmed at PyData London 2017
Description
Bayesian neural networks have seen a resurgence of interest as a way of generating model uncertainty estimates. I use Edward, a new probabilistic programming framework extending Python and TensorFlow, for inference on deep neural nets for several benchmark data sets. This is compared with dropout training, which has recently been shown to be formally equivalent to approximate Bayesian inference.
Abstract
Deep learning methods represent the state of the art for many applications such as speech recognition, computer vision and natural language processing. Conventional approaches generate point estimates of deep neural network weights and hence make predictions that can be overconfident, since they do not account well for uncertainty in model parameters. However, having some means of quantifying the uncertainty of our predictions is often a critical requirement in fields such as medicine, engineering and finance. One natural response is to consider Bayesian methods, which offer a principled way of estimating predictive uncertainty while also showing robustness to overfitting.
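To make the contrast between a point estimate and a predictive distribution concrete, here is a minimal sketch using Bayesian linear regression, where the posterior over weights is available in closed form. It is an illustration only and is not taken from the talk: the function names, the prior precision `alpha`, the noise precision `beta` and the toy data are all assumptions chosen for the example.

```python
import numpy as np

def fit_bayesian_linear(X, y, alpha=1.0, beta=25.0):
    """Gaussian posterior over weights for y = X w + noise.

    Assumed prior: w ~ N(0, I / alpha); assumed noise precision: beta.
    Returns the posterior mean m and covariance S.
    """
    d = X.shape[1]
    S = np.linalg.inv(alpha * np.eye(d) + beta * X.T @ X)
    m = beta * S @ X.T @ y
    return m, S

def predict(X_new, m, S, beta=25.0):
    """Predictive mean and standard deviation at each row of X_new."""
    mean = X_new @ m
    # Noise variance plus the variance due to uncertainty in the weights.
    var = 1.0 / beta + np.sum((X_new @ S) * X_new, axis=1)
    return mean, np.sqrt(var)

# Toy data (hypothetical): a noisy line observed only on [-1, 1].
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=30)
X = np.column_stack([np.ones_like(x), x])
y = 0.5 + 2.0 * x + rng.normal(scale=0.2, size=x.shape)

m, S = fit_bayesian_linear(X, y)

# One query inside the training range, one far outside it.
X_new = np.array([[1.0, 0.0], [1.0, 5.0]])
mean, std = predict(X_new, m, S)
print(mean, std)
```

A point estimate would report only `mean`. The predictive standard deviation `std` is larger for the query far from the training data, which is the kind of signal a point-estimate model cannot provide.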
Bayesian neural networks have a long history. Exact Bayesian inference on network weights is generally intractable and much work in the 1990s focused on variational and Monte Carlo based approximations [1-3]. However, these suffered from a lack of scalability for modern applications. Recently the field has seen a resurgence of interest, with the aim of constructing practical, scalable techniques for approximate Bayesian inference on more complex models, deep architectures and larger data sets [4-10].
Edward is a new, Turing-complete probabilistic programming language built on Python [11]. Probabilistic programming frameworks typically face a trade-off between the range of models that can be expressed and the efficiency of inference engines. Edward can leverage graph frameworks such as TensorFlow to enable fast distributed training, parallelism, vectorisation, and GPU support, while also allowing composition of both models and inference methods for a greater degree of flexibility.
In this talk I will give a brief overview of developments in Bayesian deep learning and demonstrate results of Bayesian inference on deep architectures implemented in Edward for a range of publicly available data sets. Dropout is an empirical technique which has been very successfully applied to reduce overfitting in deep learning models [12]. Recent work by Gal and Ghahramani [13] has demonstrated a surprising formal equivalence between dropout and approximate Bayesian inference in neural networks. I will compare some results of inference via the machinery of Edward with model averaging over neural nets with dropout training.
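The idea of model averaging over networks with dropout can be sketched as follows: keep dropout switched on at prediction time, run many stochastic forward passes, and summarise their spread. This is a minimal NumPy illustration under stated assumptions, not the speaker's code. The network uses random, untrained weights, and the function name `mc_dropout_predict`, the layer sizes, the dropout rate and the number of samples are all hypothetical. It also reports only the spread across passes and leaves out any observation-noise term.

```python
import numpy as np

def mc_dropout_predict(x, weights, rng, p_drop=0.5, n_samples=200):
    """Average n_samples stochastic forward passes with dropout left on.

    weights is a tuple (W1, b1, W2, b2) for a one-hidden-layer ReLU network.
    Returns the mean and standard deviation of the sampled predictions.
    """
    W1, b1, W2, b2 = weights
    keep = 1.0 - p_drop
    preds = []
    for _ in range(n_samples):
        h = np.maximum(0.0, x @ W1 + b1)      # hidden ReLU layer
        mask = rng.random(h.shape) < keep     # fresh dropout mask per pass
        h = h * mask / keep                   # inverted-dropout scaling
        preds.append(h @ W2 + b2)
    preds = np.stack(preds)                   # (n_samples, n_points, 1)
    return preds.mean(axis=0), preds.std(axis=0)

rng = np.random.default_rng(0)

# Assumption: random weights stand in for a network trained with dropout.
weights = (
    rng.normal(size=(1, 50)),
    rng.normal(size=50),
    rng.normal(size=(50, 1)) / np.sqrt(50),
    np.zeros(1),
)

x = np.linspace(-2, 2, 5).reshape(-1, 1)
mean, std = mc_dropout_predict(x, weights, rng)
print(mean.ravel(), std.ravel())
```

In practice the weights would come from a network trained with the same dropout rate, and the spread `std` would be compared against the uncertainty obtained from explicit approximate inference.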
www.pydata.org
PyData is an educational program of NumFOCUS, a 501(c)(3) non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.
We aim to be an accessible, community-driven conference, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.