Columbia Migrates from Legacy Data Warehouse to an Open Data Platform with Delta Lake
Columbia is a data-driven enterprise that integrates data from all of its line-of-business systems to manage its wholesale and retail businesses. This includes integrating real-time and batch data to better manage purchase orders and generate accurate consumer demand forecasts, as well as analyzing product reviews to increase customer satisfaction. In this presentation, we'll walk through how we achieved a 70% reduction in pipeline creation time and cut ETL workload times from four hours on the previous data warehouses to minutes using Azure Databricks, enabling near real-time analytics. We migrated from multiple legacy data warehouses, each run by an individual line of business, to a single scalable, reliable, performant data lake built on Azure and Delta Lake.
About:
Databricks provides a unified data analytics platform, powered by Apache Spark™, that accelerates innovation by unifying data science, engineering and business.
Read more here: https://databricks.com/product/unifie...
Connect with us:
Website: https://databricks.com
Facebook: https://www.facebook.com/databricksinc
Twitter: https://twitter.com/databricks
LinkedIn: https://www.linkedin.com/company/databricks
Instagram: https://www.instagram.com/databricksinc/
Databricks is proud to announce that Gartner has named us a Leader in both the 2021 Magic Quadrant for Cloud Database Management Systems and the 2021 Magic Quadrant for Data Science and Machine Learning Platforms. Download the reports here: https://databricks.com/databricks-named-leader-by-gartner