
Data Engineering Course For Beginners - #2 TRANSFORM

This is the second part of the Free Data Engineering Course for Beginners that I've decided to create for you! Over the course of four videos, we are going to cover the entire ETL process (extract, transform, load), and at the end we are also going to talk about job scheduling.

In this course you will build your first data feed (or data pipeline) using the Spotify API. This feed will run daily: it will download the data about the songs that you listened to that day and save that data in a SQLite database on your local machine.
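As a rough illustration of the last step of such a feed, the sketch below stores a couple of played-track records in a SQLite database using Python's built-in sqlite3 module. The table name, the column names, and the sample rows are assumptions made for illustration, not taken from the course repository, and an in-memory database stands in for a file on your local machine.

```python
import sqlite3

# Hypothetical sample records; in the real feed these would come from the Spotify API.
SAMPLE_TRACKS = [
    ("Song A", "Artist A", "2020-08-26T10:15:00Z"),
    ("Song B", "Artist B", "2020-08-26T10:19:30Z"),
]


def save_tracks(conn, tracks):
    """Create the (hypothetical) table if needed and insert the given rows."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS played_tracks (
            song_name   TEXT,
            artist_name TEXT,
            played_at   TEXT
        )
        """
    )
    conn.executemany(
        "INSERT INTO played_tracks (song_name, artist_name, played_at) "
        "VALUES (?, ?, ?)",
        tracks,
    )
    conn.commit()


# ":memory:" keeps the example self-contained; pass a file path to persist the data.
conn = sqlite3.connect(":memory:")
save_tracks(conn, SAMPLE_TRACKS)
row_count = conn.execute("SELECT COUNT(*) FROM played_tracks").fetchone()[0]
print(row_count)
```

Replacing ":memory:" with a file name would make the data survive between daily runs.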

In this video we are going to cover the Transform stage of the ETL process, which means that we will be learning how to validate the data that we received from a data vendor (Spotify in this case). We'll check for empty files, null values, stale data and duplicates! Along the way I will also explain some basic data engineering concepts, such as the primary key constraint and the "garbage in, garbage out" principle.
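The four checks mentioned above can be sketched in plain Python roughly as follows. This is a minimal sketch, not the code from the video: the function name, the record layout (a list of dictionaries), the field names, and the choice of played_at as the primary key are all assumptions made for illustration.

```python
from datetime import date


def check_if_valid_data(records, expected_day):
    """Validate a batch of played-track records (hypothetical layout).

    Returns False for an empty batch, True when every check passes,
    and raises ValueError when a check fails.
    """
    # Empty check: nothing was downloaded, so there is nothing to load.
    if not records:
        print("No songs downloaded. Finishing execution.")
        return False

    # Null check: every field of every record must be populated.
    for record in records:
        if any(value is None for value in record.values()):
            raise ValueError("Null values found")

    # Duplicate check: played_at is assumed to act as the primary key,
    # so each value may appear only once.
    played_at_values = [record["played_at"] for record in records]
    if len(set(played_at_values)) != len(played_at_values):
        raise ValueError("Primary key check is violated")

    # Stale data check: every record must belong to the expected day.
    for record in records:
        if record["timestamp"] != expected_day.isoformat():
            raise ValueError("At least one song is not from the expected day")

    return True


# Hypothetical sample batch for one day.
sample_records = [
    {"song_name": "Song A", "played_at": "2020-08-26T10:15:00Z", "timestamp": "2020-08-26"},
    {"song_name": "Song B", "played_at": "2020-08-26T10:19:30Z", "timestamp": "2020-08-26"},
]
is_valid = check_if_valid_data(sample_records, date(2020, 8, 26))
print(is_valid)
```

Raising an error on a failed check, rather than silently dropping bad rows, is one way to apply "garbage in, garbage out": the feed stops before questionable data reaches the database.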

Follow this link to generate your Spotify API token:
https://developer.spotify.com/console/get-recently-played/

Find the code with this data engineering project on GitHub:
https://github.com/karolina-sowinska/free-data-engineering-course-for-beginners/blob/master/main.py

Music:
What Now - Golden Age Radio

Connect with me on Instagram:
@karo_sowinska

And if you want to make my day with a cup of coffee... :)
https://ko-fi.com/karolina_sowinska

Video "Data Engineering Course For Beginners - #2 TRANSFORM" from the Karolina Sowinska channel
Video information
August 27, 2020, 22:00:08
00:07:28