Delta Lake: Reliability and Data Quality for Data Lakes and Apache Spark by Michael Armbrust
Delta Lake is an open-source storage layer that brings reliability to data lakes. It provides ACID transactions and scalable metadata handling, and it unifies streaming and batch data processing. Delta Lake runs on top of your existing data lake and is fully compatible with Apache Spark APIs.
#BIGTH19 #BigData #MachineLearning
Session presented at Big Things Conference 2019 by Michael Armbrust, Principal Engineer at Databricks
20th November 2019
Kinépolis, Madrid
Do you want to know more? https://www.bigthingsconference.com/