
Spark with Python. Operations Supported by Spark RDD API

Spark RDD API Operations introduced in the video:
- Map Transformations.
- Reduce Actions.
- Key-value Pairs.
- Join Transformations.
- Set Operations.

For this, I will use Python 3 and the Enthought Canopy framework.

More about each Spark RDD API Operation in the context of the video:

MAP TRANSFORMATIONS: A map transformation applies a function to every element of an RDD and returns a new RDD; for example, mapping a words RDD to uppercase.
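
A minimal sketch of this transformation, assuming PySpark running in local mode; the sample words, app name, and variable names are illustrative, not taken from the video:

    from pyspark import SparkContext

    sc = SparkContext("local", "map-example")
    words = sc.parallelize(["spark", "python", "rdd"])
    # map() is a transformation: it lazily builds a new RDD
    upper = words.map(lambda w: w.upper())
    # collect() is an action that brings the results back to the driver
    print(upper.collect())  # ['SPARK', 'PYTHON', 'RDD']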

REDUCE ACTIONS: An action is a computation that returns a value after running one or more operations on the dataset. An example of an action is the reduce function, which takes two elements from the dataset at a time and combines them into a single value.
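
A minimal sketch of a reduce action, again assuming PySpark in local mode with made-up sample numbers:

    from pyspark import SparkContext

    sc = SparkContext("local", "reduce-example")
    numbers = sc.parallelize([1, 2, 3, 4, 5])
    # reduce() repeatedly takes two elements and combines them into one value
    total = numbers.reduce(lambda a, b: a + b)
    print(total)  # 15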

KEY-VALUE PAIRS: We can define a key-value pair by using a tuple in the format (key, value).
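
A minimal sketch of working with key-value pairs; the fruit counts below are assumed sample data, not from the video:

    from pyspark import SparkContext

    sc = SparkContext("local", "pairs-example")
    pairs = sc.parallelize([("apple", 2), ("banana", 1), ("apple", 3)])
    # reduceByKey() combines the values that share the same key
    counts = pairs.reduceByKey(lambda a, b: a + b)
    print(counts.collect())  # [('apple', 5), ('banana', 1)] (order may vary)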

JOIN TRANSFORMATIONS: A join transformation takes two datasets and creates another one by joining them by key. We can use leftOuterJoin, rightOuterJoin, and fullOuterJoin to perform specific types of joins; these correspond to the standard SQL join types.
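
A minimal sketch of joining two key-value RDDs; the names, ages, and cities are assumed sample data:

    from pyspark import SparkContext

    sc = SparkContext("local", "join-example")
    ages = sc.parallelize([("alice", 30), ("bob", 25)])
    cities = sc.parallelize([("alice", "Vilnius"), ("carol", "Kaunas")])

    print(ages.join(cities).collect())           # inner join: [('alice', (30, 'Vilnius'))]
    print(ages.leftOuterJoin(cities).collect())  # keeps 'bob', paired with None
    print(ages.fullOuterJoin(cities).collect())  # keeps keys from both sides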

SET OPERATIONS: We can perform common set operations, such as union and intersection, between RDDs. I introduce the intersection and union operations on very simple datasets.
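
A minimal sketch of union and intersection on two small assumed datasets:

    from pyspark import SparkContext

    sc = SparkContext("local", "set-example")
    a = sc.parallelize([1, 2, 3, 4])
    b = sc.parallelize([3, 4, 5, 6])

    print(a.union(b).collect())         # [1, 2, 3, 4, 3, 4, 5, 6] -- union keeps duplicates
    print(a.intersection(b).collect())  # [3, 4] (order may vary)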

These RDD API operations can be used in the same way as standard MapReduce commands on Big Data datasets.

Vytautas Bielinskas

Video "Spark with Python. Operations Supported by Spark RDD API" from the Data Science Garage channel.
Video info: published December 24, 2018, 4:55:41; duration 00:14:31.