High Performance Data Processing in Python - Donald Whyte
PyData Warsaw 2018
NumPy and Numba are popular Python libraries for processing large quantities of data. This talk explains how NumPy and Numba work under the hood and how they use vectorisation to process large amounts of data quickly. We use these tools to reduce the processing time of a real 600 GB dataset from one month to 40 minutes, even when the code runs on a single MacBook Pro.
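As a rough illustration of what vectorisation means here, the sketch below computes the same statistic twice: once with an interpreted Python loop and once with whole-array NumPy operations. It is not taken from the talk; the function names, the synthetic data, and the statistic itself (mean absolute change between neighbouring values) are assumptions made purely for the example. Numba is not shown.

```python
import time

import numpy as np


def mean_abs_change_loop(values):
    """Pure-Python loop: one interpreted iteration per element."""
    total = 0.0
    for i in range(1, len(values)):
        total += abs(values[i] - values[i - 1])
    return total / (len(values) - 1)


def mean_abs_change_vectorised(values):
    """Vectorised: np.diff, np.abs and mean each process the whole array in compiled code."""
    return float(np.abs(np.diff(values)).mean())


# Synthetic data (hypothetical, stands in for a real dataset).
rng = np.random.default_rng(seed=0)
values = rng.random(200_000)

start = time.perf_counter()
loop_result = mean_abs_change_loop(values)
loop_seconds = time.perf_counter() - start

start = time.perf_counter()
vectorised_result = mean_abs_change_vectorised(values)
vectorised_seconds = time.perf_counter() - start

print(f"loop:       {loop_result:.6f} in {loop_seconds:.4f} s")
print(f"vectorised: {vectorised_result:.6f} in {vectorised_seconds:.4f} s")
```

Both functions return the same value up to floating-point rounding. The timings depend on the machine, so the sketch prints them rather than stating a speed-up.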
===
www.pydata.org
PyData is an educational program of NumFOCUS, a 501(c)(3) non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.
PyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.
Other videos from the channel
Vincent Warmerdam: Winning with Simple, even Linear, Models | PyData London 2018
Extending Pandas using Apache Arrow and Numba - Uwe L Korn
Losing your Loops Fast Numerical Computing with NumPy
High Performance Data Processing in Python || Donald Whyte
When Python Practices Go Wrong - Brandon Rhodes - code::dive 2019
Sebastian Witowski - Writing faster Python
Functional Programming in 40 Minutes • Russ Olsen • GOTO 2018
Eric Dill: Is Spark still relevant? Multi-node CPU and single-node GPU workloads.. | PyData NYC 2019
Cython: A First Look
CuPy A NumPy compatible Library for the GPU - Sean Farley
How to Accelerate an Existing Codebase with Numba | SciPy 2019 | Siu Kwan Lam, Stanley Seibert
"Your Escape Plan From Numpy + Cython" - Cheng-Lin Yang (PyConline AU 2020)
Daniel Chen: Cleaning and Tidying Data in Pandas | PyData DC 2018
CSV Files in Python || Python Tutorial || Learn Python Programming
Parallel Programming with (Py)OpenCL for Fun and Profit
Top To Down, Left To Right || James Powell
Talk: Itamar Turner-Trauring - Small Big Data: using NumPy and Pandas when your data doesn't fit ...
Raymond Hettinger - Modern solvers: Problems well-defined are problems solved - PyCon 2019
High Performance Data Streaming with Amazon Kinesis: Best Practices and Common Pitfalls
Carl Meyer - Type-checked Python in the real world - PyCon 2018