The Feature Store - Jim Dowling
PyData London Meetup #56
Tuesday, May 7, 2019
Machine Learning (ML) pipelines are the fundamental building blocks for productionizing ML code. However, existing tutorials and educational material in Python for Data Scientists emphasize ad-hoc feature engineering and training pipelines to experiment with ML models. Such pipelines tend to become complex over time and do not allow features to be easily reused across different ML pipelines. Features used for training and serving may also have separate implementations that are inconsistent with each other.
In this talk, we will show how ML pipelines can be programmed, end-to-end, in Python. We will show how a Feature Store can provide a natural interface between Data Engineers, who create reusable features from diverse data sources, and Data Scientists, who experiment with predictive models built from the same features. In an example end-to-end pipeline in Python, we will show how Python reduces the impedance mismatch between Data Engineering and Data Science, enabling Python to become the only language needed for ML pipelines.
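As a rough illustration of the idea in the abstract, the sketch below shows one way a feature store could sit between the two roles: features are registered once under a name, and both training and serving fetch them by that name, so the two paths cannot drift apart. This is not the speaker's implementation or any real library's API. The `FeatureStore` class, its `register` and `get_vector` methods, and the feature names and sample rows are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, float]


@dataclass
class FeatureStore:
    """Hypothetical in-memory registry mapping feature names to functions."""

    _features: Dict[str, Callable[[Row], float]] = field(default_factory=dict)

    def register(self, name: str, fn: Callable[[Row], float]) -> None:
        # Data Engineers publish a feature once, under a stable name.
        if name in self._features:
            raise ValueError(f"feature {name!r} is already registered")
        self._features[name] = fn

    def get_vector(self, names: List[str], row: Row) -> List[float]:
        # Training and serving both call this, so they share one implementation.
        return [self._features[n](row) for n in names]


store = FeatureStore()
store.register("spend_per_visit", lambda r: r["total_spend"] / max(r["visits"], 1))
store.register("is_frequent", lambda r: float(r["visits"] >= 10))

feature_names = ["spend_per_visit", "is_frequent"]

# Training: build a feature matrix from (made-up) historical rows.
history = [
    {"total_spend": 120.0, "visits": 12},
    {"total_spend": 30.0, "visits": 3},
]
training_matrix = [store.get_vector(feature_names, row) for row in history]

# Serving: the same names trigger the same computation for a live request.
serving_vector = store.get_vector(feature_names, {"total_spend": 120.0, "visits": 12})
```

A production feature store would also persist the computed values and handle versioning and point-in-time lookups; the sketch covers only the shared-definition aspect described above.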
Sponsored & Hosted by Man AHL
****
www.pydata.org
PyData is an educational program of NumFOCUS, a 501(c)(3) non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.
PyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.
00:00 Welcome!
00:10 Help us add time stamps or captions to this video! See the description for details.
Want to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVideoTimestamps