DASK and Apache Spark
Gurpreet Singh, Microsoft Corporation
For a Python-driven data science team, DASK looks like the obvious next step toward distributed analysis. Today, however, the de facto standard for that same purpose is Apache Spark. DASK is a pure Python framework that offers more of the same: it lets one run the same Pandas or NumPy code either locally or on a cluster. Apache Spark, by contrast, brings a learning curve involving a new API and execution model, albeit with a Python wrapper. Given that, do we even need to compare and contrast the two to make a choice? Shouldn't DASK be the default? That is what this session is about. It explains in detail the viewpoints and dimensions to consider when picking one over the other.
About: Databricks provides a unified data analytics platform, powered by Apache Spark™, that accelerates innovation by unifying data science, engineering and business.
Read more here: https://databricks.com/product/unified-data-analytics-platform
Connect with us:
Website: https://databricks.com
Facebook: https://www.facebook.com/databricksinc
Twitter: https://twitter.com/databricks
LinkedIn: https://www.linkedin.com/company/databricks
Instagram: https://www.instagram.com/databricksinc/
Databricks is proud to announce that Gartner has named us a Leader in both the 2021 Magic Quadrant for Cloud Database Management Systems and the 2021 Magic Quadrant for Data Science and Machine Learning Platforms. Download the reports here: https://databricks.com/databricks-named-leader-by-gartner