Загрузка страницы

DASK and Apache SparkGurpreet Singh Microsoft Corporation

For a Python driven Data Science team, DASK presents a very obvious logical next step for distributed analysis. However, today the de-facto standard choice for exact same purpose is Apache Spark. DASK is a pure Python framework, which does more of same i.e. it allows one to run the same Pandas or NumPy code either locally or on a cluster. Whereas, Apache Spark brings about a learning curve involving a new API and execution model although with a Python wrapper. Given the above statement, do we even need to compare and contrast to make a choice? Shouldn't DASK be the default choice? Well, that's what this session is about. It goes in detail explaining the various viewpoints and dimensions that need to be considered to pick one over other.

About: Databricks provides a unified data analytics platform, powered by Apache Spark™, that accelerates innovation by unifying data science, engineering and business.
Read more here: https://databricks.com/product/unified-data-analytics-platform

Connect with us:
Website: https://databricks.com
Facebook: https://www.facebook.com/databricksinc
Twitter: https://twitter.com/databricks
LinkedIn: https://www.linkedin.com/company/databricks
Instagram: https://www.instagram.com/databricksinc/ Databricks is proud to announce that Gartner has named us a Leader in both the 2021 Magic Quadrant for Cloud Database Management Systems and the 2021 Magic Quadrant for Data Science and Machine Learning Platforms. Download the reports here. https://databricks.com/databricks-named-leader-by-gartner

Видео DASK and Apache SparkGurpreet Singh Microsoft Corporation канала Databricks
Показать
Комментарии отсутствуют
Введите заголовок:

Введите адрес ссылки:

Введите адрес видео с YouTube:

Зарегистрируйтесь или войдите с
Информация о видео
7 мая 2019 г. 3:14:34
00:37:25
Яндекс.Метрика