Data Science with Rust - Arrow, DataFusion, and Ballista by Andy Grove
Andy Grove spoke at the Denver Rust meetup on 2020-10-20 about Data Science with Rust - Arrow, DataFusion, and Ballista.
https://www.meetup.com/Rust-Boulder-Denver/events/272996842/
Details
Andy will explain why Rust is ideally suited for building the next generation of distributed compute platforms that are necessary for modern data science and will give an update on the current status of the various related projects that he is involved in.
Apache Arrow (https://arrow.apache.org/) defines a language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware like CPUs and GPUs. The Arrow memory format also supports zero-copy reads for lightning-fast data access without serialization overhead.
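To make the columnar idea concrete, here is a minimal sketch in plain Rust using only the standard library. It is not the `arrow` crate's API: the `Int64Column` type and its methods are hypothetical names invented for illustration. It mirrors two features of the Arrow layout: values for one column sit in a single contiguous buffer, and nulls are tracked in a separate validity bitmap (one bit per row, least-significant bit first, 1 meaning the value is present).

```rust
/// Hypothetical illustration of an Arrow-style nullable 64-bit integer column.
/// Not part of any real crate.
struct Int64Column {
    /// One slot per row, stored contiguously; null rows hold a placeholder.
    values: Vec<i64>,
    /// Validity bitmap: bit i (LSB-first within each byte) is 1 if row i is non-null.
    validity: Vec<u8>,
}

impl Int64Column {
    /// Build the column from row-oriented optional values.
    fn from_options(rows: &[Option<i64>]) -> Self {
        let mut values = Vec::with_capacity(rows.len());
        let mut validity = vec![0u8; (rows.len() + 7) / 8];
        for (i, row) in rows.iter().enumerate() {
            match row {
                Some(v) => {
                    values.push(*v);
                    validity[i / 8] |= 1u8 << (i % 8);
                }
                // The slot still exists so row offsets stay fixed-width.
                None => values.push(0),
            }
        }
        Int64Column { values, validity }
    }

    /// Check the validity bit for row `i`.
    fn is_valid(&self, i: usize) -> bool {
        (self.validity[i / 8] & (1u8 << (i % 8))) != 0
    }

    /// Sum the non-null values with one pass over the contiguous buffer.
    fn sum(&self) -> i64 {
        self.values
            .iter()
            .enumerate()
            .filter(|(i, _)| self.is_valid(*i))
            .map(|(_, v)| *v)
            .sum()
    }
}

fn main() {
    let col = Int64Column::from_options(&[Some(1), None, Some(3), Some(4)]);
    println!("sum of non-null values = {}", col.sum());
}
```

Because every value of the column lives in one buffer, an aggregate such as `sum` scans memory sequentially, which is the access pattern the description above refers to as efficient on modern hardware. The real Arrow format adds further details (alignment, offsets for variable-width types, nested types) that this sketch leaves out.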
DataFusion (https://docs.rs/datafusion/1.0.1/datafusion/), now part of the Arrow project, is an in-memory query engine implemented in Rust that provides SQL and DataFrame APIs for querying CSV and Parquet files (as well as custom data sources).
Ballista (https://github.com/ballista-compute/ballista) is a distributed compute platform, loosely modeled after Apache Spark and primarily implemented in Rust, that leverages Arrow and DataFusion.
Speaker: Andy Grove
Andy Grove is a PMC member of Apache Arrow, to which he donated the initial Rust implementation as well as the DataFusion query engine. More recently, he has become a contributor to Apache Spark.
Video: Data Science with Rust - Arrow, DataFusion, and Ballista by Andy Grove, from the Brooks Builds channel