Developing PySpark Applications Best Practices ✅ How To Structure Your PySpark Jobs and Code

Developing production-ready PySpark applications is very similar to developing ordinary Python applications or packages. It is much like writing a command-line app: you simply execute the script against the cluster.

00:00 Writing Spark Applications with Python
01:04 PySpark App example
02:04 Create a virtual environment with Pipenv in the same project directory
03:29 Install PySpark
04:08 How to distribute files to the cluster together with the application
05:48 Passing the SparkSession at runtime
07:12 Using spark-submit to run the app
08:02 Summary
08:52 Thank you

To facilitate code reuse, it's common to package multiple Python files into a zip file. To include those files, use the --py-files argument of spark-submit, which accepts .py, .zip, or .egg files to be distributed with your application. I prefer .zip files, but you have options.
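
As a rough illustration of the packaging step, here is a minimal sketch that zips a package with Python's standard zipfile module and assembles the matching spark-submit argument list. The helper names (build_py_zip, spark_submit_command) and the example names jobs, jobs.zip and main.py are assumptions made for this sketch; they are not taken from the project template.

```python
import zipfile
from pathlib import Path


def build_py_zip(package_dir, zip_path):
    """Zip every .py file under package_dir (hypothetical helper).

    The package folder name is kept as the top-level entry in the archive,
    so that `import <package>` can resolve once the zip is on the path.
    """
    package_dir = Path(package_dir).resolve()
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(package_dir.rglob("*.py")):
            # Archive name is relative to the package's parent, e.g. jobs/etl.py
            zf.write(path, path.relative_to(package_dir.parent).as_posix())
    return zip_path


def spark_submit_command(zip_path, main_script, app_args=()):
    """Build the spark-submit argument list (hypothetical helper).

    --py-files ships the zip to the cluster together with the main script;
    anything after the main script is passed to the application itself.
    """
    return ["spark-submit", "--py-files", str(zip_path), str(main_script), *app_args]
```

For example, build_py_zip("jobs", "jobs.zip") followed by spark_submit_command("jobs.zip", "main.py", ["--input", "data.txt"]) yields a list that can be handed to subprocess.run or joined into a shell command.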

When it’s time to run your Spark code, you designate one script as the executable entry point that builds the SparkSession. This is the script we pass as the main argument to spark-submit, together with any arguments it needs.
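
A minimal sketch of such an entry script might look like the following. It assumes a hypothetical job function run_job and hypothetical --input and --app-name arguments; the only PySpark calls used are SparkSession.builder.appName(...).getOrCreate(), spark.read.text, DataFrame.count and spark.stop. The session is built once in main and passed to the job at runtime, so the job logic never creates its own session.

```python
import argparse


def parse_args(argv=None):
    """Parse the arguments that follow the script name in spark-submit."""
    parser = argparse.ArgumentParser(description="Hypothetical PySpark entry point")
    parser.add_argument("--input", required=True, help="path of the text file to read")
    parser.add_argument("--app-name", default="pyspark-app", help="Spark application name")
    return parser.parse_args(argv)


def run_job(spark, input_path):
    """Job logic (illustrative): receives the SparkSession instead of building it."""
    df = spark.read.text(input_path)
    return df.count()


def main(argv=None):
    # Imported here so the module can be imported without PySpark installed.
    from pyspark.sql import SparkSession

    args = parse_args(argv)
    spark = SparkSession.builder.appName(args.app_name).getOrCreate()
    try:
        print(run_job(spark, args.input))
    finally:
        spark.stop()


# In the real entry script, finish with the usual guard:
#     if __name__ == "__main__":
#         main()
```

Because run_job only depends on the object it is handed, it can be exercised in a unit test with a local session or a stand-in object, without going through spark-submit.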

You can get the PySpark App Project Template from here:
https://gitlab.com/radufotolescu/pyspark-app

🎁 1 MONTH FREE TRIAL! Financial and Alternative Datasets for today's Data Analysts & Scientists:
https://www.decisionforest.com/accounts/signup/

📚 RECOMMENDED DATA SCIENCE BOOKS:
https://www.amazon.com/shop/decisionforest

✅ Subscribe and support us:
https://www.youtube.com/decisionforest?sub_confirmation=1

💻 Data Science resources I strongly recommend:
https://radufotolescu.com/#resources

🌐 Let's connect:
https://radufotolescu.com/#contact

-

At DecisionForest we serve both retail and institutional investors by providing them with the data necessary to make better decisions:
https://www.decisionforest.com

#DecisionForest

Video: Developing PySpark Applications Best Practices ✅ How To Structure Your PySpark Jobs and Code, from the DecisionForest channel
Video information
May 11, 2020, 12:00:08
00:09:30