Generating Surrogate Keys for your Data Lakehouse with Spark SQL and Delta Lake
For this tech chat, we will discuss a popular data warehousing fundamental: surrogate keys. As we have discussed in other Delta Lake tech talks, the reliability that Delta Lake brings to data lakes has led to a resurgence of data warehousing fundamentals, such as Change Data Capture, in data lakes. Surrogate keys are unique and carry no business context, so they can stand the test of time when joining domain (or dimensional) data and fact data. Generating them can be difficult in single-node systems and even more complex in distributed systems. In this session, we will discuss the history and value of surrogate keys and the requirements for good strategies to implement this data warehousing fundamental in your Delta Lake.
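To make the idea concrete, here is a minimal sketch of one common strategy: deriving a surrogate key by hashing the business-key columns. It is an illustration only and is not taken from the talk or its notebooks. The helper `surrogate_key`, the column names, and the sample rows are all hypothetical, and plain Python stands in for Spark SQL so the example runs on its own.

```python
import hashlib


def surrogate_key(*business_values, sep="||"):
    """Derive a deterministic surrogate key from business-key values.

    Hypothetical helper for illustration; not an API of Spark or Delta Lake.
    """
    # Normalize each value so trivially different spellings map to the same key.
    normalized = sep.join(
        "" if value is None else str(value).strip().upper()
        for value in business_values
    )
    # The hex digest carries no readable business meaning.
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Hypothetical dimension rows identified by (source_system, customer_number).
customers = [
    {"source_system": "crm", "customer_number": "C-001", "name": "Ada"},
    {"source_system": "erp", "customer_number": "C-001", "name": "Ada L."},
]

for row in customers:
    row["customer_sk"] = surrogate_key(row["source_system"], row["customer_number"])
```

Because the key depends only on its inputs, any worker in a distributed job can compute it without coordinating with the others. The trade-off is that uniqueness rests on the hash being collision-free in practice, whereas sequence-style keys guarantee uniqueness but need some form of coordination.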
You can find the notebooks for this video at: https://github.com/databricks/tech-talks/tree/master/2020-08-25%20%7C%20Generating%20Surrogate%20Keys%20for%20your%20Data%20Lakehouse%20with%20Spark%20SQL%20and%20Delta%20Lake

Databricks is proud to announce that Gartner has named us a Leader in both the 2021 Magic Quadrant for Cloud Database Management Systems and the 2021 Magic Quadrant for Data Science and Machine Learning Platforms. Download the reports here: https://databricks.com/databricks-named-leader-by-gartner
Video: Generating Surrogate Keys for your Data Lakehouse with Spark SQL and Delta Lake, from the Databricks channel