Speeding Up Your DataFrames With Polars | Real Python Podcast #140
How can you get more performance from your existing data science infrastructure? What if a DataFrame library could take advantage of your machine's available cores and provide built-in methods for handling larger-than-RAM datasets? This week on the show, Liam Brannigan is here to discuss Polars.
👉 Links from the show: https://realpython.com/podcasts/rpp/140/
Liam is an experienced data scientist working in finance, technology, and environmental analysis. He's recently started contributing to the documentation for Polars and developing a training course for the library.
We talk about the library's overall speed and lack of additional dependencies. Liam explains the advantages of lazy vs eager mode and which to choose when performing data exploration or attempting to load a dataset larger than your RAM.
We also discuss potential barriers to switching to Polars from a pandas workflow. Across our conversation, we explore several other libraries and technologies, including Apache Arrow, DuckDB, query optimization, and the "rustification" of Python tools.
Show Topics:
- 00:00:00 -- Introduction
- 00:02:06 -- Liam's background and intro to Polars
- 00:03:37 -- Hurdles to switching to Polars
- 00:05:23 -- Creating training resources
- 00:08:15 -- No indexes
- 00:09:46 -- Data science 2025 predictions
- 00:12:02 -- Contributions to Polars
- 00:15:07 -- Eager vs lazy mode & query optimization
- 00:19:25 -- Sponsor: Anaconda Nucleus
- 00:20:00 -- Apache Arrow and Parquet
- 00:24:43 -- DuckDB and column orientation
- 00:29:27 -- The "rustification" of libraries
- 00:34:49 -- Video Course Spotlight
- 00:36:16 -- GPUs and memory requirements
- 00:45:49 -- No additional library requirements
- 00:47:37 -- Development of the ecosystem
- 00:51:33 -- Chaining operations
- 00:53:39 -- How can people follow your work?
- 00:54:51 -- What are you excited about in the world of Python?
- 00:56:09 -- What do you want to learn next?
- 00:56:58 -- Thanks and goodbye