Process Excel files in Azure with Data Factory and Databricks | Tutorial
Excel files are one of the most commonly used file formats on the market. The tool's popularity among business users, business analysts and data engineers is driven by its flexibility, ease of use, powerful integration features and low price.
This is why every data engineer should understand the advantages and disadvantages of this format, the variety of internal formats such as XLS, XLSX, XLSB and XLSM, and which tools to use to process those files effectively in the cloud.
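As a rough illustration of how these internal formats differ on disk, the sketch below classifies a workbook by its leading bytes instead of its extension. Legacy XLS files are OLE2 compound documents, while XLSX, XLSM and XLSB are ZIP packages. The function names are hypothetical and not taken from the video or its code samples.

```python
# Leading bytes of the two container types used by Excel workbooks.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # legacy binary format, typically .xls
ZIP_MAGIC = b"PK\x03\x04"                        # ZIP package: .xlsx, .xlsm, .xlsb


def sniff_excel_container(header: bytes) -> str:
    """Classify a workbook by its first bytes, not by its file extension."""
    if header.startswith(OLE2_MAGIC):
        return "ole2"
    if header.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"


def sniff_excel_file(path: str) -> str:
    """Read only the first 8 bytes of the file at `path` and classify them."""
    with open(path, "rb") as handle:
        return sniff_excel_container(handle.read(8))
```

A check like this only identifies the container. Telling XLSX, XLSM and XLSB apart would require inspecting the contents of the ZIP package, which this sketch does not attempt.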
Today I bring you a quick introduction to building ETL solutions with Excel files in Azure using the Data Factory and Databricks services.
Code samples: https://github.com/MarczakIO/azure4everyone-samples/tree/master/azure-excel-file-processing-with-data-factory-and-databricks
Agenda
00:00 Introduction
00:25 Excel Business Justification
01:22 Excel Challenges
02:20 Supported Services
04:30 Data Factory Introduction
05:35 Demo Setup
07:13 Demo using Data Factory
13:36 Databricks Introduction
14:44 Databricks Setup
18:14 Databricks Demo - Reading Excels
20:55 Databricks Demo - Reading Excels using References
25:56 Databricks Demo - Workbook Metadata
28:05 Databricks Demo - Defining Schema
30:03 Databricks Demo - Defining Schema
32:53 Additional Options
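For orientation on the Data Factory part of the agenda: Data Factory reads Excel through a dataset of type `Excel`. The sketch below builds such a dataset definition as a Python dictionary and prints it as JSON. The property names follow the Excel format documentation. The dataset name, linked service name, container, file and sheet are hypothetical placeholders, and the demo in the video may be configured differently.

```python
import json


def excel_dataset(name, linked_service, container, file_name, sheet_name,
                  cell_range=None, first_row_as_header=True):
    """Build a Data Factory Excel dataset definition for a file in Blob Storage."""
    type_properties = {
        "location": {
            "type": "AzureBlobStorageLocation",
            "container": container,
            "fileName": file_name,
        },
        "sheetName": sheet_name,
        "firstRowAsHeader": first_row_as_header,
    }
    # "range" is optional; it is only emitted when a cell range is given.
    if cell_range is not None:
        type_properties["range"] = cell_range
    return {
        "name": name,
        "properties": {
            "type": "Excel",
            "linkedServiceName": {
                "referenceName": linked_service,
                "type": "LinkedServiceReference",
            },
            "typeProperties": type_properties,
        },
    }


# All values below are placeholders chosen for illustration.
dataset = excel_dataset("ExcelSales", "BlobStorageLS", "input",
                        "sales.xlsx", "Sales", cell_range="A1:C10")
print(json.dumps(dataset, indent=2))
```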
Next steps for you after watching the video
1. Excel format in Data Factory
- https://docs.microsoft.com/en-us/azure/data-factory/format-excel
2. Spark Excel by Crealytics documentation
- https://github.com/crealytics/spark-excel
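As a starting point for the Spark Excel documentation, here is a minimal PySpark sketch of reading a workbook with a cell reference and an optional schema. It assumes the spark-excel library is installed on the cluster and that `spark` is an existing SparkSession, as in a Databricks notebook. Option names have changed between library versions (for example, older releases used `useHeader` instead of `header`), so check them against the version you install. The helper names are hypothetical.

```python
def build_data_address(sheet=None, cell_range="A1"):
    """Return a spark-excel dataAddress such as "'Sales'!A1:C10"."""
    if sheet is None:
        return cell_range
    # The sheet name is wrapped in single quotes, followed by "!" and the range.
    return f"'{sheet}'!{cell_range}"


def excel_reader_options(sheet=None, cell_range="A1", header=True, infer_schema=False):
    """Collect reader options as strings, the form Spark expects."""
    return {
        "header": str(header).lower(),
        "inferSchema": str(infer_schema).lower(),
        "dataAddress": build_data_address(sheet, cell_range),
    }


def read_excel(spark, path, schema=None, **kwargs):
    """Load one workbook into a DataFrame; requires spark-excel on the cluster."""
    reader = (spark.read
              .format("com.crealytics.spark.excel")
              .options(**excel_reader_options(**kwargs)))
    # An explicit schema (StructType or DDL string) skips schema inference.
    if schema is not None:
        reader = reader.schema(schema)
    return reader.load(path)
```

In a notebook this could be called as `read_excel(spark, "/mnt/data/sales.xlsx", sheet="Sales", cell_range="A1:C10", schema="id INT, name STRING")`, where the path, sheet name and columns are placeholders.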
### Want to connect?
- Blog https://marczak.io/
- Twitter https://twitter.com/MarczakIO
- Facebook https://www.facebook.com/MarczakIO
- LinkedIn https://www.linkedin.com/in/adam-marczak/
- Site https://azure4everyone.com
Video "Process Excel files in Azure with Data Factory and Databricks | Tutorial" from the channel Adam Marczak - Azure for Everyone
Video information
July 21, 2020, 19:00:12
00:34:14