Achieving Lakehouse Models with Spark 3.0
It’s very easy to be distracted by the latest and greatest approaches in technology, but sometimes there’s a reason old approaches stand the test of time. Kimball-style star schemas are one of those things that aren’t going anywhere, but as we move towards the “Data Lakehouse” paradigm, how appropriate is this modelling technique, and how can we harness the Delta Engine and Spark 3.0 to maximise its performance?
This session looks through the historical problems of attempting to build star-schemas in a lake and steps through a series of technical examples using features such as Delta file formats, Dynamic Partition Pruning and Adaptive Query Execution to tackle these problems.
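As a rough illustration of the Spark 3.0 features named above, the sketch below lists the standard Spark SQL configuration keys for Adaptive Query Execution and Dynamic Partition Pruning, together with the shape of star-schema query they are meant to help. It is not taken from the session itself: the table and column names (`fact_sales`, `dim_date`, `date_key`, `calendar_year`, `sales_amount`) and the helper `apply_settings` are hypothetical, and the query assumes the fact table is a Delta table partitioned by `date_key`.

```python
# Spark SQL settings relevant to star-schema queries on Spark 3.0.
# In Spark 3.0, Adaptive Query Execution is off by default, while
# dynamic partition pruning is on by default; both are set explicitly here.
STAR_SCHEMA_SETTINGS = {
    "spark.sql.adaptive.enabled": "true",
    "spark.sql.adaptive.coalescePartitions.enabled": "true",
    "spark.sql.adaptive.skewJoin.enabled": "true",
    "spark.sql.optimizer.dynamicPartitionPruning.enabled": "true",
}


def apply_settings(set_conf, settings=STAR_SCHEMA_SETTINGS):
    """Call set_conf(key, value) for every setting and return what was applied.

    Hypothetical helper: on a live session, pass `spark.conf.set` as set_conf.
    """
    for key, value in settings.items():
        set_conf(key, value)
    return dict(settings)


# Hypothetical star-schema query: a selective filter on the small dimension
# joined to a large fact table. Assuming fact_sales is partitioned by
# date_key, this is the join shape dynamic partition pruning targets.
STAR_QUERY = """
SELECT d.calendar_year, SUM(f.sales_amount) AS total_sales
FROM fact_sales AS f
JOIN dim_date AS d
  ON f.date_key = d.date_key
WHERE d.calendar_year = 2020
GROUP BY d.calendar_year
"""

if __name__ == "__main__":
    # Without a cluster, collect the settings into a plain dict to inspect them.
    applied = {}
    apply_settings(applied.__setitem__)
    for key, value in applied.items():
        print(f"{key} = {value}")
```

On a cluster with an active `SparkSession` named `spark`, the same helper would be called as `apply_settings(spark.conf.set)` followed by `spark.sql(STAR_QUERY)`. Whether the optimiser actually prunes partitions depends on table sizes and statistics, so the physical plan is worth checking with `EXPLAIN`.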
About:
Databricks provides a unified data analytics platform, powered by Apache Spark™, that accelerates innovation by unifying data science, engineering and business.
Read more here: https://databricks.com/product/unified-data-analytics-platform
See all the previous Summit sessions: https://databricks.com/sparkaisummit/north-america/sessions
Connect with us:
Website: https://databricks.com
Facebook: https://www.facebook.com/databricksinc
Twitter: https://twitter.com/databricks
LinkedIn: https://www.linkedin.com/company/databricks/
Instagram: https://www.instagram.com/databricksinc/
Databricks is proud to announce that Gartner has named us a Leader in both the 2021 Magic Quadrant for Cloud Database Management Systems and the 2021 Magic Quadrant for Data Science and Machine Learning Platforms. Download the reports here: https://databricks.com/databricks-named-leader-by-gartner
Video: Achieving Lakehouse Models with Spark 3.0, from the Databricks channel