Doris Lee - Scaling your data science workflows with Modin | SciPy 2024

pandas is one of the most commonly used data science libraries in Python, with a convenient set of APIs for data cleaning, preparation, analysis, and exploration. However, despite its widespread adoption, pandas suffers from severe memory and performance issues on even moderately sized datasets. Modin is an open-source project that serves as a fast, scalable drop-in replacement for pandas (https://github.com/modin-project/modin). By changing just a single line of code, the import statement, Modin seamlessly speeds up pandas workflows on a laptop or in a cluster. Originally developed at UC Berkeley, Modin has been downloaded more than 17 million times and is used by leading data science teams across industries.
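As a rough illustration of the single-line change mentioned above, the sketch below swaps the pandas import for `modin.pandas` and leaves the rest of the code untouched. The toy data, the column names, and the helper `mean_score_by_team` are hypothetical and not taken from the talk. The fallback to plain pandas is also an addition for this sketch, so that it still runs where Modin is not installed.

```python
# The "single line" in question: swap the pandas import for Modin's.
# The ImportError fallback is an assumption of this sketch, not part of
# Modin's documented usage.
try:
    import modin.pandas as pd  # instead of: import pandas as pd
except ImportError:
    import pandas as pd


def mean_score_by_team(frame):
    """Ordinary pandas-style code; nothing here is Modin-specific."""
    return frame.groupby("team")["score"].mean()


# Hypothetical toy data, for illustration only.
df = pd.DataFrame({"team": ["a", "a", "b"], "score": [1.0, 3.0, 5.0]})
result = mean_score_by_team(df)
```

On data this small no speedup should be expected. The point is only that existing pandas code keeps the same shape after the import changes.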

Video: Doris Lee - Scaling your data science workflows with Modin | SciPy 2024, from the SciPy channel