Spark 3.0 Features | Dynamic Partition Pruning (DPP) | Avoid Scanning Irrelevant Data

Spark 3.0 introduced several optimization features. Dynamic Partition Pruning (DPP) is one of them: an optimization for star-schema queries (the data warehouse modeling pattern in which a large fact table is joined to smaller dimension tables). DPP is implemented using broadcast hashing: the result of the filter subquery on the dimension table is passed to the fact table scan, so partitions that cannot match are skipped before the complete data is loaded into memory.
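
The idea can be sketched in plain Python. This is a conceptual simulation only, not Spark code and not how Spark implements the feature internally: the tables, the names (FACT_SALES, DIM_DATE, broadcast_keys, pruned_scan) and the data are all hypothetical. In Spark 3.0 itself the behavior is governed by the spark.sql.optimizer.dynamicPartitionPruning.enabled setting, which is on by default.

```python
from typing import Callable, Dict, List, Set, Tuple

FactRow = Tuple[str, float]  # (date_key, amount)
DimRow = Tuple[str, bool]    # (date_key, is_weekend)

# Hypothetical fact table, stored as one list of rows per partition value.
FACT_SALES: Dict[str, List[FactRow]] = {
    "2020-07-24": [("2020-07-24", 10.0), ("2020-07-24", 5.0)],
    "2020-07-25": [("2020-07-25", 7.5)],
    "2020-07-26": [("2020-07-26", 2.5), ("2020-07-26", 4.0)],
    "2020-07-27": [("2020-07-27", 9.0)],
}

# Hypothetical small dimension table that carries the filter column.
DIM_DATE: List[DimRow] = [
    ("2020-07-24", False),
    ("2020-07-25", True),
    ("2020-07-26", True),
    ("2020-07-27", False),
]


def broadcast_keys(dim_rows: List[DimRow],
                   predicate: Callable[[DimRow], bool]) -> Set[str]:
    """Run the dimension-side filter and collect the surviving join keys.

    This set stands in for the broadcast result of the dimension subquery.
    """
    return {row[0] for row in dim_rows if predicate(row)}


def pruned_scan(fact: Dict[str, List[FactRow]],
                keys: Set[str]) -> Tuple[List[str], List[FactRow]]:
    """Read only the fact partitions whose partition value is in `keys`."""
    scanned = sorted(p for p in fact if p in keys)
    rows = [row for p in scanned for row in fact[p]]
    return scanned, rows


# Rough equivalent of: SELECT sum(amount) FROM sales JOIN dates USING (date_key)
#                      WHERE dates.is_weekend
weekend_keys = broadcast_keys(DIM_DATE, lambda row: row[1])
scanned_partitions, weekend_sales = pruned_scan(FACT_SALES, weekend_keys)
total = sum(amount for _, amount in weekend_sales)

print("partitions scanned:", scanned_partitions)
print("weekend total:", total)
```

Only two of the four fact partitions are read here. Without pruning, all four would be scanned and the non-weekend rows discarded later, at join time.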
Watch this video to learn more about the DPP feature in Spark 3.0.
Medium Blog - https://medium.com/@prabhakaran.electric/spark-3-0-feature-dynamic-partition-pruning-dpp-to-avoid-scanning-irrelevant-data-1a7bbd006a89

Watch this video to learn more about the AQE feature in Spark 3.0:
https://youtu.be/PiZcQKbomDU

Content By - Prabhakaran Vijayanagulu [LinkedIn - https://www.linkedin.com/in/prabhakaran-vijayanagulu-248ba2118/]
Editing By - Sivaraman Ravi [LinkedIn - https://www.linkedin.com/in/sivaraman-ravi-791838114/]
Facebook Page - https://www.facebook.com/Tech-Island-113793100393638/?modal=admin_todo_tour

Please SUBSCRIBE to our channel :)

Share your feedback with us.
techieeisland@gmail.com

Video: Spark 3.0 Features | Dynamic Partition Pruning (DPP) | Avoid Scanning Irrelevant Data, from the Tech Island channel
Video information
Published: July 28, 2020, 19:58:51
Duration: 00:07:24