
Troubleshooting and Solving Data Join Pitfalls (must GCP lab for data engineering course)

Overview
BigQuery is Google's fully managed, NoOps, low-cost analytics database. With BigQuery you can query terabytes of data without managing any infrastructure or needing a database administrator. BigQuery uses SQL, and you can take advantage of its pay-as-you-go pricing model. This lets you focus on analyzing data to find meaningful insights.

Joining data tables can provide meaningful insight into your dataset. However, when you join your data there are common pitfalls that could corrupt your results. This lab focuses on avoiding those pitfalls. Types of joins:

Cross join: combines each row of the first dataset with each row of the second dataset, where every combination is represented in the output.
Inner join: requires that key values exist in both tables for the records to appear in the results table. Records appear in the merge only if there are matches in both tables for the key values.
Left join: each row in the left table appears in the results, regardless of whether there are matches in the right table.
Right join: the reverse of a left join. Each row in the right table appears in the results, regardless of whether there are matches in the left table.
For more information about joins, refer to the Join Page.
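
The four join types above can be compared side by side on a tiny example. The following is a minimal sketch, not the lab's own code: it uses Python's built-in `sqlite3` module as a local stand-in for BigQuery, and the `products` and `inventory` tables and their rows are hypothetical. Because older SQLite versions lack `RIGHT JOIN`, the right join is written as a left join with the tables swapped, which returns the same rows.

```python
import sqlite3

# In-memory database standing in for BigQuery (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (sku TEXT, name TEXT);
    CREATE TABLE inventory (sku TEXT, stock INTEGER);
    INSERT INTO products VALUES ('A1', 'Mug'), ('B2', 'Pen'), ('C3', 'Cap');
    INSERT INTO inventory VALUES ('A1', 10), ('B2', 0), ('D4', 5);
""")

def run(sql):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()

# Cross join: every product row paired with every inventory row (3 x 3).
cross_rows = run("SELECT p.sku, i.sku FROM products p CROSS JOIN inventory i")

# Inner join: only SKUs present in both tables.
inner_rows = run("""
    SELECT p.sku, p.name, i.stock
    FROM products p JOIN inventory i ON p.sku = i.sku
    ORDER BY p.sku
""")

# Left join: every product, with NULL stock when inventory has no match.
left_rows = run("""
    SELECT p.sku, p.name, i.stock
    FROM products p LEFT JOIN inventory i ON p.sku = i.sku
    ORDER BY p.sku
""")

# Right join, written as a left join with the tables swapped:
# every inventory row, with NULL name when products has no match.
right_rows = run("""
    SELECT i.sku, p.name, i.stock
    FROM inventory i LEFT JOIN products p ON p.sku = i.sku
    ORDER BY i.sku
""")

print(len(cross_rows), len(inner_rows), len(left_rows), len(right_rows))
```

In this sketch the unmatched product `C3` survives only the left join and the unmatched inventory row `D4` survives only the right join, while the inner join keeps just the two SKUs found in both tables.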

The dataset you'll use is an ecommerce dataset that has millions of Google Analytics records for the Google Merchandise Store loaded into BigQuery. You have a copy of that dataset for this lab and will explore the available fields and rows for insights.

For syntax information to help you follow and update the queries, see Standard SQL Query Syntax.

What you'll do
In this lab, you learn how to:

Use BigQuery to explore and troubleshoot duplicate rows in a dataset.
Create joins between data tables.
Choose between different join types.
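
One way duplicate rows corrupt a join can be sketched as follows. This is an illustrative example under stated assumptions, not the lab's actual queries: it again uses Python's built-in `sqlite3` in place of BigQuery, and the `products` and `orders` tables are hypothetical. A duplicated join key on one side multiplies the matching rows on the other side, so aggregates computed after the join are inflated.

```python
import sqlite3

# In-memory database standing in for BigQuery (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (sku TEXT, name TEXT);
    CREATE TABLE orders (sku TEXT, quantity INTEGER);
    -- 'A1' appears twice in products: this is the duplicate key.
    INSERT INTO products VALUES ('A1', 'Mug'), ('A1', 'Mug (blue)'), ('B2', 'Pen');
    INSERT INTO orders VALUES ('A1', 3), ('B2', 1);
""")

# Step 1: look for join keys that occur more than once.
duplicate_keys = conn.execute("""
    SELECT sku, COUNT(*) AS row_count
    FROM products
    GROUP BY sku
    HAVING COUNT(*) > 1
""").fetchall()

# Step 2: compare the true total with the total after a naive join.
true_total = conn.execute("SELECT SUM(quantity) FROM orders").fetchone()[0]
joined_total = conn.execute("""
    SELECT SUM(o.quantity)
    FROM orders o JOIN products p ON o.sku = p.sku
""").fetchone()[0]

# Step 3: join against a deduplicated key list instead.
deduped_total = conn.execute("""
    SELECT SUM(o.quantity)
    FROM orders o
    JOIN (SELECT DISTINCT sku FROM products) p ON o.sku = p.sku
""").fetchone()[0]

print(duplicate_keys, true_total, joined_total, deduped_total)
```

Here the order for `A1` is counted once per matching product row, so the naive join overstates the total; deduplicating the key before joining restores it. Whether `SELECT DISTINCT` is the right fix depends on why the duplicates exist, so treat it as one option rather than a general rule.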
Setup and requirements
#analytics #quiklab #google #courseraquizanswrs #courseracertificate #googlecertification #googlecertificationcourses #cloud #GSP412 #qwiklab #bigquery

Video "Troubleshooting and Solving Data Join Pitfalls (must GCP lab for data engineering course)" from the channel OneStop Insightful Tech Notes