Using Spark MLlib Models in a Production Training and Serving Platform: Experiences and Extensions
Overview
Uber's Michelangelo is a machine learning platform that supports training and serving thousands of models in production. Most Michelangelo customer models are based on Spark MLlib. In this talk, we will describe Michelangelo's experiences with and evolving use of Spark MLlib, particularly in the areas of model persistence and online serving.

Extended Description
Michelangelo [https://eng.uber.com/michelangelo/] was originally developed to support scalable machine learning for production models. Its end-to-end support for scheduled Spark-based data ingestion and model training, along with model evaluation and deployment for batch and online model serving, has gained wide acceptance across Uber.
More recently, Michelangelo has been evolving to handle more use cases, including evaluating and serving models trained outside of core Michelangelo, e.g., on a distributed TensorFlow platform providing Horovod [https://eng.uber.com/horovod/] or using PySpark in a Jupyter notebook on Data Science Workbench [https://eng.uber.com/dsw/].
To support evaluation and serving of models trained outside of Michelangelo, Michelangelo's use of Spark MLlib needed to be updated to generalize its mechanisms for model persistence and online serving. In this talk, we will describe these mechanisms and explore possible avenues for open-sourcing them.
About: Databricks provides a unified data analytics platform, powered by Apache Spark™, that accelerates innovation by unifying data science, engineering and business.
Read more here: https://databricks.com/product/unified-data-analytics-platform
Connect with us:
Website: https://databricks.com
Facebook: https://www.facebook.com/databricksinc
Twitter: https://twitter.com/databricks
LinkedIn: https://www.linkedin.com/company/databricks
Instagram: https://www.instagram.com/databricksinc/
Databricks is proud to announce that Gartner has named us a Leader in both the 2021 Magic Quadrant for Cloud Database Management Systems and the 2021 Magic Quadrant for Data Science and Machine Learning Platforms. Download the reports here: https://databricks.com/databricks-named-leader-by-gartner
Video: Using Spark MLlib Models in a Production Training and Serving Platform: Experiences and Extensions, from the Databricks channel