Apache Spark Based Reliable Data Ingestion in Datalake with Gagan Agrawal (Paytm)

Quby is the creator and provider of Toon, a leading European smart home platform. We enable Toon users to control and monitor their homes using both an in-home display and an app. As a data-driven company, we use machine learning algorithms to generate actionable insights for our end users. We have developed data-driven services that help users avoid wasting energy and send them real-time alerts about problems with their heating system. In this talk, Erni will describe our journey of productionizing data science algorithms. We'll take a deep dive into our pipeline and describe our streamlined development and deployment workflow. We'll explain how we define and manage dependencies between jobs in multiple environments (test, acceptance and production) and how we schedule the pipeline computation. We'll delve into scale challenges, metrics, monitoring and data quality. We will also reflect on the lessons learned while building high-volume infrastructure that offers multiple data-driven services to hundreds of thousands of users.

About: Databricks provides a unified data analytics platform, powered by Apache Spark™, that accelerates innovation by unifying data science, engineering and business.
Read more here: https://databricks.com/product/unified-data-analytics-platform

Connect with us:
Website: https://databricks.com
Facebook: https://www.facebook.com/databricksinc
Twitter: https://twitter.com/databricks
LinkedIn: https://www.linkedin.com/company/databricks
Instagram: https://www.instagram.com/databricksinc/

Video: Apache Spark Based Reliable Data Ingestion in Datalake with Gagan Agrawal (Paytm), from the Databricks channel
Video information
October 10, 2018, 21:39:05
00:32:59