How to integrate Great Expectations Data Quality tests in Airflow? | Data pipeline | Data Quality

In this video, we cover how to integrate Great Expectations data quality tests in Apache Airflow. We use the Great Expectations (GE) provider for Airflow to run a Great Expectations suite. Our data asset is a PostgreSQL table.
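
As a rough illustration of what such an integration can look like, here is a minimal sketch of a DAG with a single validation task. It is not necessarily the DAG shown in the video (see the GitHub link below for that). The DAG id, task id, checkpoint name, and project path are hypothetical, and the sketch assumes a pre-1.0 release of airflow-provider-great-expectations, whose GreatExpectationsOperator accepts data_context_root_dir and checkpoint_name.

```python
from datetime import datetime

# Hypothetical values: adjust the path and names to your own GE project.
GE_ROOT_DIR = "/opt/airflow/great_expectations"
GE_TASK_KWARGS = {
    "task_id": "validate_product_dim",
    "data_context_root_dir": GE_ROOT_DIR,         # folder holding great_expectations.yml
    "checkpoint_name": "product_dim_checkpoint",  # assumed to exist in the GE project
    "fail_task_on_validation_failure": True,      # failed expectations fail the task
}


def build_dag():
    """Build a one-task DAG that runs a Great Expectations checkpoint."""
    # Imported inside the function so this module can be loaded and
    # inspected on a machine where Airflow is not installed.
    from airflow import DAG
    from great_expectations_provider.operators.great_expectations import (
        GreatExpectationsOperator,
    )

    with DAG(
        dag_id="product_data_quality",  # hypothetical DAG id
        start_date=datetime(2023, 1, 1),
        schedule=None,                  # manual trigger only; needs Airflow 2.4+
        catchup=False,
    ) as dag:
        GreatExpectationsOperator(**GE_TASK_KWARGS)
    return dag
```

In a real DAG file you would also add a module-level `dag = build_dag()` so the scheduler can discover it. Keeping the operator arguments in one dict makes it easy to see what the task depends on.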

In this tutorial, we will see how to test an ETL pipeline with Great Expectations using Python. It is essential to test the quality of data before it lands in our production systems. We focus on the Product dimension and employ various built-in GE data quality tests.
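
To give a feel for what a suite of built-in tests contains, the sketch below builds one as a plain Python dict, mirroring the layout of the suite JSON files written by pre-1.0 Great Expectations releases. The expectation types are built-in GE expectations, but the suite name, the column names (ProductKey, ListPrice), and the thresholds are assumptions for illustration, not the suite used in the video.

```python
import json

# Hypothetical suite name for a Product dimension table.
SUITE_NAME = "product_dim_suite"


def build_product_suite():
    """Return an expectation suite as a JSON-serialisable dict."""
    expectations = [
        # The table should not be empty.
        {"expectation_type": "expect_table_row_count_to_be_between",
         "kwargs": {"min_value": 1}, "meta": {}},
        # The key column (assumed name) must be populated and unique.
        {"expectation_type": "expect_column_values_to_not_be_null",
         "kwargs": {"column": "ProductKey"}, "meta": {}},
        {"expectation_type": "expect_column_values_to_be_unique",
         "kwargs": {"column": "ProductKey"}, "meta": {}},
        # Prices (assumed column) should never be negative.
        {"expectation_type": "expect_column_values_to_be_between",
         "kwargs": {"column": "ListPrice", "min_value": 0}, "meta": {}},
    ]
    return {
        "expectation_suite_name": SUITE_NAME,
        "expectations": expectations,
        "meta": {},
    }


suite = build_product_suite()
print(json.dumps(suite, indent=2))
```

A dict like this can be compared against the JSON suite that Great Expectations itself generates (linked below) to see which tests are configured and with which arguments.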

Links to related sessions.

Link to GitHub (Updated DAG):
https://github.com/hnawaz007/pythondataanalysis/tree/main/AirflowSession2/dags

Airflow Installation & Configuration with custom image:
https://www.youtube.com/watch?v=In7zwp0FDX4

In the custom image we add the following line to install GE provider:
&& pip install airflow-provider-great-expectations
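
After rebuilding the image, one quick way to confirm the provider is available is to check for its import package from inside the container. This is a minimal sketch; the helper name provider_installed is made up for illustration, and it relies on the pip package being importable as great_expectations_provider.

```python
import importlib.util


def provider_installed(module_name="great_expectations_provider"):
    """Return True if the given top-level module can be found for import."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    status = "installed" if provider_installed() else "missing"
    print(f"Great Expectations provider: {status}")
```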

Orchestrate SQL Data Pipelines with Airflow:
https://www.youtube.com/watch?v=glzj7p7Yrrs

How to test your Data Pipelines with Great Expectations:
https://www.youtube.com/watch?v=7UQ91Ib7PtU

How to create a Great Expectations suite?
https://www.youtube.com/watch?v=UTIvGxNbg5w

Link to GE Expectations notebook:
https://github.com/hnawaz007/pythondataanalysis/blob/main/ETL%20Pipeline/GreatExpectations/Great%20Expectations%20Data%20Quality%20Tests.ipynb

Link to GE suite used in the video:
https://github.com/hnawaz007/pythondataanalysis/tree/main/ETL%20Pipeline/GreatExpectations

Link to Channel's site:
https://hnawaz007.github.io/
--------------------------------------------------------------

💥Subscribe to our channel:
https://www.youtube.com/c/HaqNawaz

📌 Links
-----------------------------------------
#️⃣ Follow me on social media! #️⃣

🔗 GitHub: https://github.com/hnawaz007
📸 Instagram: https://www.instagram.com/bi_insights_inc
📝 LinkedIn: https://www.linkedin.com/in/haq-nawaz/
🔗 Medium: https://medium.com/@hnawaz100
🚀 Website: https://hnawaz007.github.io/

-----------------------------------------

#ETL #dataquality #Airflow

Topics in this video (click to jump around):
==================================
0:00 - Introduction to Great Expectations Data Quality
0:49 - Prerequisites
1:16 - Create Great Expectations suite
1:46 - Review Great Expectations Data Quality Tests
2:29 - Airflow DAG
2:49 - Integrate Great Expectations Data Quality in Airflow
3:34 - Airflow UI: DAG review & run
3:56 - DAG logs: review Data Quality test run

Video "How to integrate Great Expectations Data Quality tests in Airflow? | Data pipeline | Data Quality" from the BI Insights Inc channel