Apache Spark : Commonly used Transformations : Map, Filter, Flatmap Transformations
Official Website: http://bigdataelearning.com
Learning Objectives :: In this module, you will learn some of the commonly used transformations. You will learn some of the basic RDD transformations like Map, Filter, and Flatmap transformations.
This video also shows how to apply the Map, Filter, and Flatmap transformations using Scala & Python.
Topics :: Commonly used Transformations, Basic RDD Transformations, Map, Filter, Flatmap Transformations
Commonly used Transformations
=============================
Basic RDD Transformations
=========================
map, filter, and flatmap are some of the basic RDD transformations.
Map
====
Map is a transformation that takes a function and applies it to each element of the input RDD.
The result of the function becomes the value of the corresponding element in the resultant RDD.
Say, if inputRDD contains the values 1 to 4, then a map transformation that squares each value will return {1, 4, 9, 16} as the resultant RDD.
Here the square function is applied to each element of the inputRDD.
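The squaring example above can be sketched in Python. The PySpark call in the comment assumes a SparkContext named `sc` already exists; the plain-Python lines below show the same element-wise semantics and can be run anywhere.

```python
# PySpark sketch (assumes an existing SparkContext `sc`):
#   squared = sc.parallelize([1, 2, 3, 4]).map(lambda x: x * x)
#   squared.collect()  # [1, 4, 9, 16]

# The same element-wise semantics with plain Python:
input_rdd = [1, 2, 3, 4]
squared = [x * x for x in input_rdd]   # one output element per input element
print(squared)  # [1, 4, 9, 16]
```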
Filter
======
Filter is a transformation that returns a new RDD containing only the elements that pass the filter condition.
Say, if inputRDD contains the values 1 to 3, then applying a filter transformation that keeps elements that are not '1' will return only '2' and '3' as the resultant RDD.
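The filter example can be sketched the same way. Again the PySpark call in the comment assumes an existing SparkContext `sc`; the runnable plain-Python lines apply the same predicate.

```python
# PySpark sketch (assumes an existing SparkContext `sc`):
#   kept = sc.parallelize([1, 2, 3]).filter(lambda x: x != 1)
#   kept.collect()  # [2, 3]

# The same predicate applied with plain Python:
input_rdd = [1, 2, 3]
kept = [x for x in input_rdd if x != 1]  # keep only elements that pass the condition
print(kept)  # [2, 3]
```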
Flatmap
=======
Flatmap() is a transformation that takes a function and applies it to each element of the RDD, as the map() function does.
The difference is that flatmap can return multiple values for each element in the source RDD.
Say, if inputRDD contains the values {"hello world", "how are you"}, then applying a split function with the flatmap transformation will
return the individual words {"hello", "world", "how", "are", "you"}. Since the flatmap transformation returns multiple values for each element, there are 5 elements in the resultantRDD, whereas the inputRDD has only 2 elements.
To recollect: if we apply the split function with the Map transformation instead of the Flatmap transformation, we get {["hello", "world"], ["how", "are", "you"]} instead.
Here each string is still split into multiple words, but the words belonging to one element of the source RDD remain a single element in the resultant RDD.
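The flatmap-versus-map contrast above can be sketched in Python. The PySpark calls in the comments assume an existing SparkContext `sc`; the plain-Python lines reproduce the flattening behavior.

```python
lines = ["hello world", "how are you"]

# PySpark sketch (assumes an existing SparkContext `sc`):
#   sc.parallelize(lines).flatMap(lambda line: line.split(" ")).collect()
#     -> ['hello', 'world', 'how', 'are', 'you']
#   sc.parallelize(lines).map(lambda line: line.split(" ")).collect()
#     -> [['hello', 'world'], ['how', 'are', 'you']]

# flatmap semantics: results are flattened into one sequence of words
words = [w for line in lines for w in line.split(" ")]
print(words)   # ['hello', 'world', 'how', 'are', 'you'] -- 5 elements

# map semantics: each source element stays one (nested) element
nested = [line.split(" ") for line in lines]
print(nested)  # [['hello', 'world'], ['how', 'are', 'you']] -- 2 elements
```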
Video "Apache Spark : Commonly used Transformations : Map, Filter, Flatmap Transformations" from the BigDataElearning channel.