MLconf Online 2020: Mathematical Approaches to Clustering by Joseph Ross
Clustering is a fundamental operation in many data science workflows. This talk will approach clustering from the mathematical point of view.
First we will review and compare different clustering methods (e.g. k-means and spectral clustering), with an emphasis on deciding when methods succeed or fail based on understanding their mathematical properties. We will analyze several examples.
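As one illustration of a failure that follows from a method's mathematical properties, the sketch below (not taken from the talk) runs a plain Lloyd-style 2-means on two concentric rings. Each point is assigned to its nearest centroid, so the two clusters are always separated by a straight line and the rings cannot be recovered. The data, the initial centroids, and the helper names `ring`, `kmeans`, and `same_partition` are all assumptions chosen for this example.

```python
import math

def ring(radius, n):
    """Return n evenly spaced points on a circle of the given radius."""
    return [(radius * math.cos(2 * math.pi * i / n),
             radius * math.sin(2 * math.pi * i / n)) for i in range(n)]

def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm with fixed initial centroids (hypothetical helper)."""
    centroids = list(centroids)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [min(range(len(centroids)),
                          key=lambda c: math.dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = (sum(x for x, _ in members) / len(members),
                                sum(y for _, y in members) / len(members))
    return labels

def same_partition(a, b):
    """True if two labelings induce the same partition, up to renaming labels."""
    return len(set(zip(a, b))) == len(set(a)) == len(set(b))

inner, outer = ring(1.0, 20), ring(5.0, 40)
truth = [0] * len(inner) + [1] * len(outer)
labels = kmeans(inner + outer, [inner[0], outer[0]])
recovered = same_partition(labels, truth)
print("rings recovered by 2-means:", recovered)
```

A method that works on a neighborhood graph, such as spectral clustering, is the usual candidate for data of this shape; it is not implemented here.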
This leads naturally to a more theoretical discussion about clustering algorithms. To this end, we will discuss Kleinberg's impossibility theorem, covering his axioms and a sketch of the proof. After explaining some ideas from category theory, we will examine the functorial approach to clustering developed by Carlsson and Mémoli. We will compare the functorial approach to Kleinberg's axioms, and the role of density in the functorial approach will emerge.
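For orientation, Kleinberg's three axioms are scale-invariance, richness, and consistency, and his theorem says no clustering function satisfies all three. The sketch below is an illustration under stated assumptions, not material from the talk: it implements single linkage with a fixed distance threshold `r` (one of the stopping rules Kleinberg analyzes) and shows on a toy one-dimensional data set that rescaling the distances changes the output, so this rule is not scale-invariant. The function name `threshold_clusters` and the sample points are hypothetical.

```python
from itertools import combinations

def threshold_clusters(points, r):
    """Single linkage with a distance-r stopping rule (hypothetical helper).

    Clusters are the connected components of the graph that joins
    two points whenever their distance is at most r.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        if abs(points[i] - points[j]) <= r:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

points = [0.0, 1.0, 10.0, 11.0]           # toy data on the real line
original = threshold_clusters(points, r=2.0)
rescaled = threshold_clusters([0.1 * x for x in points], r=2.0)

print(original)   # clusters as lists of point indices
print(rescaled)   # same points with all distances multiplied by 0.1
```

The two results differ even though the second input is only a rescaling of the first, which is exactly what scale-invariance forbids.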
One of the insights of Carlsson and Mémoli is that clustering is the statistical counterpart to taking the connected components of a topological space. We conclude by discussing generalizations of clustering (persistent homology) suggested by this motto.
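To make the connected-components motto concrete, here is a minimal sketch, assuming a finite point set and the convention that two points are joined once the scale reaches their distance. It records the scales at which components merge as the scale grows, which is the degree-zero part of persistent homology for this filtration. The names `merge_scales` and `components_at` and the sample data are assumptions made for this example, not from the talk.

```python
from itertools import combinations

def merge_scales(points, dist):
    """Scales at which connected components merge (hypothetical helper).

    Processes pairwise distances in increasing order, Kruskal-style,
    and records a scale each time two distinct components are joined.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted((dist(points[i], points[j]), i, j)
                   for i, j in combinations(range(n), 2))
    deaths = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            deaths.append(d)
    return deaths

def components_at(deaths, n, scale):
    """Number of connected components once all merges up to `scale` happen."""
    return n - sum(1 for d in deaths if d <= scale)

points = [0.0, 1.0, 3.0, 10.0]            # toy data on the real line
deaths = merge_scales(points, lambda a, b: abs(a - b))
print(deaths)
print([components_at(deaths, len(points), s) for s in (0.5, 2.5, 7.0)])
```

Reading the merge scales from smallest to largest reproduces the single-linkage hierarchy, and a long gap between consecutive scales suggests a cluster structure that persists over a wide range of thresholds.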
This talk will be mostly expository. The main hope is that practitioners will be able to better apply and reason about clustering algorithms.
Video: MLconf Online 2020: Mathematical Approaches to Clustering by Joseph Ross, from the MLconf channel