Running Apache Spark on Kubernetes: Best Practices and Pitfalls

Since initial support was added in Apache Spark 2.3, running Spark on Kubernetes has been growing in popularity. Reasons include improved isolation and resource sharing among concurrent Spark applications on Kubernetes, as well as the benefit of using a homogeneous, cloud-native infrastructure for a company's entire tech stack. But running Spark on Kubernetes in a stable, performant, cost-efficient, and secure manner also presents specific challenges. In this talk, JY and Julien will go over lessons learned while building Data Mechanics, a serverless Spark platform powered by Kubernetes.

Topics include:

- Core concepts and setup of Spark on Kubernetes
- Configuration tips for performance and efficient resource sharing
- Spark application-level dynamic allocation and cluster-level autoscaling
- Kubernetes-specific considerations for data I/O performance
- Monitoring and security best practices
- Limitations and planned future work

About:
Databricks provides a unified data analytics platform, powered by Apache Spark™, that accelerates innovation by unifying data science, engineering and business.
Read more here: https://databricks.com/product/unifie...

Connect with us:
Website: https://databricks.com
Facebook: https://www.facebook.com/databricksinc
Twitter: https://twitter.com/databricks
LinkedIn: https://www.linkedin.com/company/databricks
Instagram: https://www.instagram.com/databricksinc/

Video "Running Apache Spark on Kubernetes: Best Practices and Pitfalls" from the Databricks channel
Video information:
July 7, 2020, 20:28:00
00:24:48