Все видео Новые видео Популярные видео Категории видео

Авто	Видео-блоги	ДТП, аварии	Для маленьких	Еда, напитки
Животные	Закон и право	Знаменитости	Игры	Искусство
Комедии	Красота, мода	Кулинария, рецепты	Люди	Мото
Музыка	Мультфильмы	Наука, технологии	Новости	Образование
Политика	Праздники	Приколы	Природа	Происшествия
Путешествия	Развлечения	Ржач	Семья	Сериалы
Спорт	Стиль жизни	ТВ передачи	Танцы	Технологии
Товары	Ужасы	Фильмы	Шоу-бизнес	Юмор

Using Apache Arrow, Calcite and Parquet to build a Relational Cache | Dremio

WANT TO EXPERIENCE A TALK LIKE THIS LIVE?
Barcelona: https://www.datacouncil.ai/barcelona
New York City: https://www.datacouncil.ai/new-york-city
San Francisco: https://www.datacouncil.ai/san-francisco
Singapore: https://www.datacouncil.ai/singapore

Download slides for this talk: https://goo.gl/eMWk8i

Everybody wants to get to data faster. As we move from more general solution to specific optimization techniques, the level of performance impact grows. This talk will discuss how layering in-memory caching, columnar storage and relational caching can combine to provide a substantial improvement in overall data science and analytical workloads. It will include a detailed overview of how you can use Apache Arrow, Calcite and Parquet to achieve multiple magnitudes improvement in performance over what is currently possible.

We'll start by talking about in-memory caches and the difference between block-based and data-aware caching strategies. We'll discuss the deployment design of this type of solution as well as cover the strengths of each. There will also be a discussion of the relationship of security and predicate application in these scenarios. Then we'll go into detail about how columnar storage formats can further enhance performance by minimizing read time, optimizing for vectorized in-memory processing and powerful compression techniques.

Lastly, we'll introduce a much more advanced way to speed access to data called relational caching. Relational caching builds a cache on columnar in-memory caching techniques but also includes a full comprehension of how data is being used and how different forms of data relate to each other. This will include leveraging multiple sorting and partitioning strategies as well as maintaining multiple related derivations of data for different types of access patterns. As part of this and we also cover approaches to data ttl, relational cache consistency and several different approaches to data mutation and real-time updates.

FOLLOW DATA COUNCIL:
Twitter: https://twitter.com/DataCouncilAI
LinkedIn: https://www.linkedin.com/company/datacouncil-ai
Facebook: https://www.facebook.com/datacouncilai

Видео Using Apache Arrow, Calcite and Parquet to build a Relational Cache | Dremio канала Data Council

Показать

Комментарии отсутствуют