DE Zoomcamp 2.2.3 - ETL Pipelines with Postgres in Kestra

In this video, we'll ingest the Yellow Taxi data from the NYC Taxi and Limousine Commission (TLC) and load it into a Postgres database. We'll extract the data from CSV files and load it into a local Postgres database running in a Docker container.

Check out the GitHub Repository here: https://go.kestra.io/de-zoomcamp

Check out the FAQ here: https://youtu.be/ywAPYNYFaB4

Kestra is an open-source, event-driven orchestration platform that makes both scheduled and event-driven workflows easy. By bringing Infrastructure as Code best practices to data, process, and microservice orchestration, you can build reliable workflows directly from the UI in just a few lines of YAML.

The course will cover the basics of workflow orchestration, why it's important, and how it can be used to build data engineering pipelines.

Chapters
0:00 - Introduction
0:58 - Create Workflow
1:46 - Add Inputs
2:58 - Create Dynamic Variables
3:59 - Set Labels at Execution
4:10 - Download and Uncompress CSV file
4:33 - Execute Extract Process
4:57 - Set up Postgres DB
5:46 - Create Postgres DB Table
9:55 - View DB with pgAdmin
10:47 - Load Data into Table
12:45 - Add Unique ID
15:53 - Truncate Staging Table
16:47 - Merge Data from Staging Table
19:31 - Purge Output Files
20:24 - Add If Statement to process Yellow or Green datasets
27:24 - Summary
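
The load steps named in the chapters (load into a staging table, add a unique ID, truncate staging, merge into the final table) can be sketched roughly as follows. This is a minimal illustration, not the flow built in the video: it uses Python's built-in sqlite3 as a stand-in for Postgres so it runs without a database server, and the table names, column subset, sample rows, file name, and the MD5-based ID are all assumptions made for the example.

```python
import csv
import hashlib
import io
import sqlite3

# Hypothetical sample standing in for one monthly Yellow Taxi CSV file.
SAMPLE_CSV = """VendorID,tpep_pickup_datetime,tpep_dropoff_datetime,total_amount
1,2019-01-01 00:46:40,2019-01-01 00:53:20,9.95
2,2019-01-01 00:59:47,2019-01-01 01:18:59,16.30
"""

COLUMNS = (
    "unique_row_id TEXT, filename TEXT, VendorID TEXT, "
    "tpep_pickup_datetime TEXT, tpep_dropoff_datetime TEXT, total_amount REAL"
)


def row_id(row):
    """Build a deterministic ID by hashing the columns assumed to identify a trip."""
    key = "|".join(
        [row["VendorID"], row["tpep_pickup_datetime"], row["tpep_dropoff_datetime"]]
    )
    return hashlib.md5(key.encode("utf-8")).hexdigest()


def load(conn, csv_text, source_file):
    """Load one CSV into staging, merge it into the final table, return the final row count."""
    cur = conn.cursor()
    # Final table: the unique ID is the primary key, so re-merged rows are skipped.
    cur.execute(
        "CREATE TABLE IF NOT EXISTS trips "
        f"({COLUMNS}, PRIMARY KEY (unique_row_id))"
    )
    cur.execute(f"CREATE TABLE IF NOT EXISTS trips_staging ({COLUMNS})")

    # Empty the staging table (SQLite has no TRUNCATE; Postgres would use TRUNCATE TABLE).
    cur.execute("DELETE FROM trips_staging")

    rows = [
        (
            row_id(r),
            source_file,
            r["VendorID"],
            r["tpep_pickup_datetime"],
            r["tpep_dropoff_datetime"],
            float(r["total_amount"]),
        )
        for r in csv.DictReader(io.StringIO(csv_text))
    ]
    cur.executemany("INSERT INTO trips_staging VALUES (?, ?, ?, ?, ?, ?)", rows)

    # Merge: copy only rows whose ID is not already in the final table.
    cur.execute("INSERT OR IGNORE INTO trips SELECT * FROM trips_staging")
    conn.commit()
    return cur.execute("SELECT COUNT(*) FROM trips").fetchone()[0]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(load(conn, SAMPLE_CSV, "yellow_tripdata_2019-01.csv"))
    # Loading the same file again adds no new rows, because the IDs already exist.
    print(load(conn, SAMPLE_CSV, "yellow_tripdata_2019-01.csv"))
```

Routing every file through a staging table and merging on a content-derived ID is what makes re-running the same month safe: the second load finds every ID already present and inserts nothing.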

----------

📖 Read the documentation: https://go.kestra.io/docs
⭐ Start your journey with Kestra: https://go.kestra.io/github
🚀 Join the Kestra Community: https://go.kestra.io/slack

For more information, visit Kestra's Website: https://go.kestra.io/

Video: DE Zoomcamp 2.2.3 - ETL Pipelines with Postgres in Kestra, from the Kestra channel