6. PySpark Interview Task | Deloitte, KPMG, Accenture, PwC, Deutsche Bank Data Engineer Preparation
🚀 Are you preparing for Data Engineering interviews at Deloitte, KPMG, Accenture, PwC, Deutsche Bank, or other top MNCs? This video is a must-watch!

In this session, we cover a real-world PySpark coding task that is frequently asked in Data Engineer interviews. This is essential for candidates aiming for roles in top consulting and banking firms!

🔹 What You’ll Learn in This Video:

✅ Read a CSV file in PySpark
✅ Define Schema for structured data
✅ Create a new full_name column (first_name + last_name)
✅ Apply conditional logic for the address_new field (pin == 1111)
✅ Handle NULL values & update the address column
✅ Use PySpark functions like concat_ws(), when(), col()

📌 This is a common PySpark coding assignment in top MNCs like Deloitte, PwC, Accenture, KPMG, Deutsche Bank, EY, and more! Watch till the end to ace your interview! 💯

⏱️ Timestamps for Quick Navigation

0:00 – Introduction & Why This Is Important
2:35 – Reading a CSV File in PySpark
3:10 – Defining a Schema for the DataFrame
5:49 – Creating the PySpark DataFrame
7:13 – Creating the full_name Column
8:16 – Applying Conditional Logic on address_new
9:37 – Handling NULL Values in the address Column
10:56 – Final Output

🔥 Like, Subscribe, and Hit the Bell Icon 🔔 for more Data Engineering content!

📂 Resources, Code & Dataset:

from pyspark.sql.functions import concat_ws, col, when
from pyspark.sql.types import StructType, StructField, StringType, IntegerType

# Download the dataset from:
# https://github.com/Rushi21-kesh/YouTube-Question-Dataset/blob/main/data-1.csv
file_path = "dbfs:/FileStore/tables/data-1.csv"

schema = StructType([
    StructField("first_name", StringType(), True),
    StructField("last_name", StringType(), True),
    StructField("pin", IntegerType(), True),
    StructField("address", StringType(), True),
])

df = spark.read.csv(file_path, schema=schema, header=True)

result = (
    df.withColumn("full_name", concat_ws(" ", "first_name", "last_name"))
      # Keep the address only when pin == 1111; otherwise leave it NULL
      .withColumn("address_new", when(col("pin") == 1111, col("address")).otherwise(None))
      # Replace NULL addresses with "Unknown"
      .withColumn("address", when(col("address").isNotNull(), col("address")).otherwise("Unknown"))
)
result.display()  # Databricks-specific; use result.show() outside Databricks

📌 Full PySpark Playlist: https://youtube.com/playlist?list=PLP3N3nYQOEKu0A2-Zzt5c-C7DbdU8uDPe&feature=shared

#pyspark #dataengineering #dataengineeringessentials #bigdata #deloitte #deloittejobs
#kpmg #accentureinterview #deutschebank
#databricks #databrickstutorial #apachespark #etl #sparksql #azuredataengineer #datapipeline
#dataprocessing #python #pythonprogramming #sql #dataanalytics #cloudcomputing #ai
#MachineLearning #SparkStreaming #BigDataAnalytics #TechInterviews #CodingInterview
#PySparkTutorial #PySparkInterviewQuestions #FAANGInterviews #DataScience #GenAI

Video "6. PySpark Interview Task | Deloitte, KPMG, Accenture, PwC, Deutsche Bank Data Engineer Preparation" from the channel The Data Engineering Edge.