Spark 3.0 Features | Dynamic Partition Pruning (DPP) | Avoid Scanning Irrelevant Data
Spark 3.0 introduced several optimization features. Dynamic Partition Pruning (DPP) is one of them: an optimization for star-schema queries (the data warehouse model in which a large fact table joins to smaller dimension tables). DPP uses a broadcast hash join to pass the results of the dimension table subquery to the fact table scan, so that irrelevant fact table partitions are skipped instead of being loaded into memory.
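To make the idea concrete, here is a minimal conceptual sketch in plain Python. It is not Spark code and not how Spark implements DPP internally; it only imitates the behaviour described above. The table names (fact_sales_partitions, dim_date), the is_holiday column, and all values are hypothetical and chosen purely for illustration.

```python
# Conceptual sketch only: plain Python standing in for a query planner.
# All table names, columns, and values below are hypothetical.

# Fact table stored as partitions keyed by date_id
# (think of one directory per date_id in a partitioned table).
fact_sales_partitions = {
    20200101: [("widget", 10.0), ("gadget", 4.5)],
    20200102: [("widget", 7.0)],
    20200103: [("gizmo", 3.0), ("widget", 1.5)],
    20200104: [("gadget", 9.0)],
}

# Small dimension table that the query filters on.
dim_date = [
    {"date_id": 20200101, "is_holiday": True},
    {"date_id": 20200102, "is_holiday": False},
    {"date_id": 20200103, "is_holiday": False},
    {"date_id": 20200104, "is_holiday": True},
]


def scan_without_pruning(partitions, dim_rows, predicate):
    """Read every fact partition, then keep rows that match the dimension filter."""
    wanted = {row["date_id"] for row in dim_rows if predicate(row)}
    scanned, result = [], []
    for key, part in partitions.items():
        scanned.append(key)  # every partition is read
        if key in wanted:
            result.extend((key, *row) for row in part)
    return result, scanned


def scan_with_pruning(partitions, dim_rows, predicate):
    """Evaluate the dimension filter first, then read only matching partitions."""
    # The small set of surviving keys plays the role of the broadcast result.
    wanted = {row["date_id"] for row in dim_rows if predicate(row)}
    scanned, result = [], []
    for key in sorted(wanted):
        if key in partitions:
            scanned.append(key)  # only relevant partitions are read
            result.extend((key, *row) for row in partitions[key])
    return result, scanned


def is_holiday(row):
    return row["is_holiday"]


full_rows, full_scanned = scan_without_pruning(fact_sales_partitions, dim_date, is_holiday)
pruned_rows, pruned_scanned = scan_with_pruning(fact_sales_partitions, dim_date, is_holiday)

print("partitions read without pruning:", len(full_scanned))
print("partitions read with pruning:", len(pruned_scanned))
print("same result:", full_rows == pruned_rows)
```

Both functions return the same joined rows; the difference is only in how many fact partitions are touched. In Spark 3.0 itself, this behaviour is controlled by the spark.sql.optimizer.dynamicPartitionPruning.enabled setting, which is on by default.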
Watch this video to learn more about the DPP feature in Spark 3.0.
Medium Blog - https://medium.com/@prabhakaran.electric/spark-3-0-feature-dynamic-partition-pruning-dpp-to-avoid-scanning-irrelevant-data-1a7bbd006a89
Watch this video to learn more about the Adaptive Query Execution (AQE) feature in Spark 3.0:
https://youtu.be/PiZcQKbomDU
Content By - Prabhakaran Vijayanagulu [LinkedIn - https://www.linkedin.com/in/prabhakaran-vijayanagulu-248ba2118/]
Editing By - Sivaraman Ravi [LinkedIn - https://www.linkedin.com/in/sivaraman-ravi-791838114/]
Facebook Page - https://www.facebook.com/Tech-Island-113793100393638/?modal=admin_todo_tour
Please SUBSCRIBE to our channel :)
Share your feedback with us.
techieeisland@gmail.com
Video: Spark 3.0 Features | Dynamic Partition Pruning (DPP) | Avoid Scanning Irrelevant Data, from the Tech Island channel