Thesis: Partial State in Dataflow-Based Materialized Views
This is my PhD dissertation presentation, which I gave at MIT (virtually) on October 22nd, 2020. It was immediately followed by my thesis defense, which I passed subject to the typical bits and pieces of revisions the committee wanted to see.
The person who introduces me in the video is Robert Morris, my thesis advisor. The Q&A has been cut out, as I want to edit it a bit and mix in questions from the public presentation I gave on YouTube the day before. I will link the Q&A video here once it is out. You can find the slides at https://jon.thesquareplanet.com/slides/thesis.pdf and at https://docs.google.com/presentation/d/1w2PlmqUIeue8VcNhBqOC0V3GTKQs9zBccrhd9tfAang/edit?usp=sharing. The thesis is available at https://jon.thesquareplanet.com/papers/phd-thesis.pdf.
What follows is the thesis abstract:
This thesis proposes a practical database system that lowers latency and increases supported load for read-heavy applications by using incrementally-maintained materialized views to cache query results. As opposed to state-of-the-art materialized view systems, the presented system builds the cache on demand, and evicts cache entries in response to a shifting workload.
The enabling technique the thesis introduces is partially stateful materialization, which allows entries in materialized views to be missing. The thesis proposes upqueries as a mechanism to fill such missing state on demand using dataflow, and implements them in the materialized view system Noria. The thesis then discusses additional mechanisms needed to establish eventual consistency for partially stateful dataflow.
Noria with partial materialization saves application developers from implementing their own ad hoc caching mechanisms to speed up their database accesses. Instead, the caching is built into the database, and is transparent to the application. Experimental results suggest that the presented system increases supported application load by up to 20x over MySQL and performs similarly to an optimized key-value store cache. Partial state also reduces memory use by up to 2/3 compared to traditional materialized views.
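To make the idea of partial state and upqueries in the abstract a little more concrete, here is a minimal toy sketch in Python. It is not Noria's implementation or API (Noria is a dataflow system with many operators). The class `PartialCountView` and its methods `write`, `read`, and `evict` are hypothetical names invented for this illustration, and the "view" is just a per-article vote count over an in-memory base table. It only shows the three behaviors the abstract describes: entries may be missing, a read of a missing entry fills it on demand by querying upstream, and entries can be evicted again.

```python
class PartialCountView:
    """Toy partially materialized view: number of votes per article.

    Hypothetical illustration only; this is not how Noria is structured.
    """

    def __init__(self):
        self.base = {}      # base table: article_id -> list of voting users
        self.state = {}     # materialized entries; an absent key is a "hole"
        self.upqueries = 0  # how many times a read had to fill a hole

    def write(self, article_id, user):
        # Every write goes to the base table.
        self.base.setdefault(article_id, []).append(user)
        # Incrementally maintain the view only if the entry is materialized.
        # Updates that hit a hole are dropped; a later upquery will see them.
        if article_id in self.state:
            self.state[article_id] += 1

    def read(self, article_id):
        if article_id not in self.state:
            # Upquery: recompute the missing entry from upstream state.
            self.upqueries += 1
            self.state[article_id] = len(self.base.get(article_id, []))
        return self.state[article_id]

    def evict(self, article_id):
        # Evicting turns the entry back into a hole; base data is untouched.
        self.state.pop(article_id, None)


view = PartialCountView()
view.write(1, "alice")
view.write(1, "bob")       # entry for article 1 is a hole, so nothing is cached
first = view.read(1)       # miss: filled by an upquery
view.write(1, "carol")     # entry is present now, so it is updated in place
second = view.read(1)      # hit: no upquery needed
view.evict(1)
view.write(1, "dave")      # hole again: update is dropped from the view
third = view.read(1)       # miss: a second upquery recomputes the count
```

In this sketch the cache is populated only by reads, so memory use follows the keys the application actually asks for, and eviction is safe because any evicted entry can be rebuilt from the base table.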
Video "Thesis: Partial State in Dataflow-Based Materialized Views" from the channel Jon Gjengset