Neal Richardson | Bigger Data With Ease Using Apache Arrow | RStudio
The Apache Arrow project enables data scientists using R, Python, and other languages to work with large datasets efficiently and at interactive speed. Arrow is so fast for some workflows that it seems to defy reality, or at least the limits of R's capabilities. This talk examines the unique characteristics of the Arrow project that enable it to redefine what is possible in R. It also highlights some of the latest developments in the arrow R package, including how to query and manipulate multi-file datasets, and presents strategies for speeding up workflows by up to 100x.
About Neal:
Currently Director of Engineering at Ursa Labs / RStudio. Previously led product and engineering at Crunch.io. Ph.D. in Political Science from the University of California, Berkeley.
Video: Neal Richardson | Bigger Data With Ease Using Apache Arrow | RStudio, from the Posit PBC channel