
Data Science with Rust - Arrow, DataFusion, and Ballista by Andy Grove

Andy Grove spoke at the Denver Rust meetup on 2020-10-20 about Data Science with Rust - Arrow, DataFusion, and Ballista.
https://www.meetup.com/Rust-Boulder-Denver/events/272996842/
Details

Data Science with Rust - Arrow, DataFusion, and Ballista

Andy will explain why Rust is ideally suited for building the next generation of distributed compute platforms that are necessary for modern data science and will give an update on the current status of the various related projects that he is involved in.

Apache Arrow (https://arrow.apache.org/) defines a language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware like CPUs and GPUs. The Arrow memory format also supports zero-copy reads for lightning-fast data access without serialization overhead.
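To make the idea of a columnar layout concrete, here is a minimal sketch in plain Rust using only the standard library. It does not use the Arrow crate or reproduce Arrow's actual buffer format; the `TripRow` and `TripColumns` types and their fields are hypothetical names chosen for illustration. It shows only the general principle: each field is stored in its own contiguous buffer, so an aggregate over one field scans a single dense array, and a sub-range can be handed out as a borrowed slice without copying.

```rust
/// Row-oriented layout: each record keeps all of its fields together.
/// (Hypothetical example type, not part of Arrow.)
#[derive(Debug, Clone, PartialEq)]
struct TripRow {
    passenger_count: u32,
    fare: f64,
}

/// Column-oriented layout: one contiguous buffer per field.
/// (Hypothetical example type, not part of Arrow.)
#[derive(Debug, Default, PartialEq)]
struct TripColumns {
    passenger_count: Vec<u32>,
    fare: Vec<f64>,
}

impl TripColumns {
    /// Pivot a slice of rows into per-field columns.
    fn from_rows(rows: &[TripRow]) -> Self {
        let mut cols = TripColumns::default();
        for row in rows {
            cols.passenger_count.push(row.passenger_count);
            cols.fare.push(row.fare);
        }
        cols
    }

    /// Aggregate over a single column; only the `fare` buffer is read.
    fn total_fare(&self) -> f64 {
        self.fare.iter().sum()
    }

    /// Borrow a window of the `fare` column without copying any values.
    /// Panics if the requested range is out of bounds.
    fn fare_slice(&self, offset: usize, len: usize) -> &[f64] {
        &self.fare[offset..offset + len]
    }
}
```

Calling `TripColumns::from_rows` on a few rows and then `total_fare()` touches only the `fare` buffer, which is the access pattern that analytic queries tend to favor. The real Arrow format additionally specifies details such as validity bitmaps, alignment, and nested types, none of which are modeled here.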

DataFusion (https://docs.rs/datafusion/1.0.1/datafusion/), now part of the Arrow project, is an in-memory query engine implemented in Rust that provides SQL and DataFrame APIs for querying CSV and Parquet files (as well as custom data sources).

Ballista (https://github.com/ballista-compute/ballista) is a distributed compute platform, loosely modeled after Apache Spark and primarily implemented in Rust, that leverages Arrow and DataFusion.

Speaker: Andy Grove

Andy Grove is a PMC member of Apache Arrow, to which he donated the initial Rust implementation as well as the DataFusion query engine. More recently, he has become a contributor to Apache Spark.

Video "Data Science with Rust - Arrow, DataFusion, and Ballista by Andy Grove" from the Brooks Builds channel
Video information
October 26, 2020, 2:19:04
00:50:09