Everyday I'm Shuffling - Tips for Writing Better Apache Spark Programs

Want to learn how to write faster and more efficient programs for Apache Spark? Two Spark experts from Databricks, Vida Ha and Holden Karau, provide some performance tuning and testing tips for your Spark applications.

Overview:
Understanding the Shuffle in Spark
- Common causes of inefficiency
Understanding when code runs on the driver vs. the workers
- Common causes of errors
How to factor your code
- For reuse between batch and streaming

View slides at: http://www.slideshare.net/databricks/strata-sj-everyday-im-shuffling-tips-for-writing-better-spark-programs

Additional reading:

7 Tips to Debug Apache Spark Code Faster with Databricks
https://databricks.com/blog/2016/10/18/7-tips-to-debug-apache-spark-code-faster-with-databricks.html

Databricks Best Practices and Tips
https://docs.databricks.com/user-guide/clusters/best-practices.html

About: Databricks provides a unified data analytics platform, powered by Apache Spark™, that accelerates innovation by unifying data science, engineering and business.
Read more here: https://databricks.com/product/unified-data-analytics-platform

Connect with us:
Website: https://databricks.com
Facebook: https://www.facebook.com/databricksinc
Twitter: https://twitter.com/databricks
LinkedIn: https://www.linkedin.com/company/databricks
Instagram: https://www.instagram.com/databricksinc/

Video "Everyday I'm Shuffling - Tips for Writing Better Apache Spark Programs" from the Databricks channel
Video information
March 23, 2015, 22:05:36
00:36:25