
04 Configure AWS S3 with PySpark Delta Lake

This video explains how to configure AWS S3 with PySpark and Delta Lake, and how to configure boto3 on a PySpark cluster. It also covers the configuration settings that Delta Lake requires and how to store AWS credentials securely on the Spark cluster.

We use AWS cloud services to design the Data Lakehouse, and the processing power of Spark to load and process the data.
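As a rough illustration of the kind of Spark settings involved, here is a minimal sketch. It is not taken from the video: the function names `delta_s3_spark_conf` and `build_session` and the app name are hypothetical, and reading the keys from environment variables is an assumption. The configuration keys themselves are the standard Delta Lake and Hadoop S3A ones. Creating the session also assumes that `pyspark` is installed and that the Delta Lake and `hadoop-aws` jars are available to Spark.

```python
import os


def delta_s3_spark_conf(access_key: str, secret_key: str) -> dict:
    """Return Spark settings that enable Delta Lake and S3A access."""
    return {
        # Register Delta Lake's SQL extension and catalog implementation.
        "spark.sql.extensions": "io.delta.sql.DeltaSparkSessionExtension",
        "spark.sql.catalog.spark_catalog": "org.apache.spark.sql.delta.catalog.DeltaCatalog",
        # Use the S3A connector for s3a:// paths and pass it the credentials.
        "spark.hadoop.fs.s3a.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem",
        "spark.hadoop.fs.s3a.access.key": access_key,
        "spark.hadoop.fs.s3a.secret.key": secret_key,
    }


def build_session(app_name: str = "delta-s3-demo"):
    """Create a SparkSession with the settings above (hypothetical helper)."""
    # Imported lazily so the config helper works without pyspark installed.
    from pyspark.sql import SparkSession

    # Assumption: credentials are exported as environment variables
    # rather than hard-coded in the notebook or script.
    conf = delta_s3_spark_conf(
        os.environ["AWS_ACCESS_KEY_ID"],
        os.environ["AWS_SECRET_ACCESS_KEY"],
    )
    builder = SparkSession.builder.appName(app_name)
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

Keeping the settings in a plain dictionary makes them easy to inspect or log (with the secret masked) before a session is created.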

Resources:
Ease With Data Github URL - https://github.com/subhamkharwal/ease-with-data
DW with PySpark Github URL - https://github.com/subhamkharwal/ease-with-data/tree/master/dw-with-pyspark
Docker Github URL - https://github.com/subhamkharwal/docker-images

Chapters:
00:00 - Introduction
00:23 - AWS S3 Setup for Data Lakehouse
01:03 - AWS User Group Configuration for Delta Lake S3 Access
01:30 - AWS User Configuration for Delta Lake S3 Access
02:06 - Generate Security Credentials for S3 User
02:40 - Data Lakehouse PySpark Configuration
03:11 - AWS Credentials configuration for boto3
03:57 - Spark Configuration for AWS and Delta Lake
06:01 - AWS Credentials configuration for PySpark Cluster
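For the boto3 credentials step, one common approach (an assumption here, not necessarily what the video does) is to keep the access key in an INI-style credentials file such as `~/.aws/credentials`, which boto3 reads by default. The sketch below parses that format with the standard library. The function name `parse_aws_profile` and the sample values are hypothetical.

```python
import configparser


def parse_aws_profile(text: str, profile: str = "default") -> dict:
    """Extract one profile's keys from AWS credentials-file text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser[profile]  # raises KeyError if the profile is missing
    return {
        "aws_access_key_id": section["aws_access_key_id"],
        "aws_secret_access_key": section["aws_secret_access_key"],
    }


# Placeholder content in the same layout as ~/.aws/credentials.
sample = """
[default]
aws_access_key_id = AKIAEXAMPLE
aws_secret_access_key = example-secret
"""

creds = parse_aws_profile(sample)
# With boto3 installed, the result could be passed on as:
#   boto3.Session(**creds)
```

Storing the keys in a file or in environment variables keeps them out of notebooks and source control.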

If you are new to Data Warehousing, check out our playlist on YouTube - https://www.youtube.com/watch?v=HrFMKGGb1gM&list=PL2IsFZBGM_IE-EvpN9gaZZukj-ysFudag

A new video is uploaded every 3 days. Stay tuned, and make sure to like and subscribe.

#datalakehouse #pyspark #deltalake #datawarehousing #dw #datawarehouse

Video "04 Configure AWS S3 with PySpark Delta Lake" from the Ease With Data channel