The cost rolling back transactions (postgres/mysql)
The cost of a long-running update transaction that eventually failed in Postgres (or any other database for that matter.
In Postgres, any DML transaction touching a row creates a new version of that row. if the row is referenced in indexes, those need to be updated with the new tuple id as well. There are exceptions with optimization such as heap only tuples (HOT) where the index doesn’t need to be updated but that doesn’t always happens.
If the transaction rolls back, then the new row versions created by this transaction (millions in my case) are now invalid and should NOT be read by any new transaction. You have two solutions to address this, do you clean all dead rows eagerly on transaction rollback? Or do you do it lazily as a post process?
Postgres does the lazy approach, a command called vacuum which is called periodically Postgres attempts to remove those dead rows and free up space in the page.
Whats the harm of leaving those dead rows in? Its not really correctness issues at all, in fact transactions know not to read those dead rows by checking the state of the transaction that created them. This is however expensive, the check to see of the transaction that created this row is committed or rolled-back. Also the fact that those dead rows live in disk pages with alive rows makes an IO not efficient as the database has to filter out dead rows. For example, a page may have contained 1000 rows, but only 1 live row and 999 dead rows, the database will make that IO but only will get a single row of it. Repeat that and you end up making more IOs. More IOs = slower performance.
Other databases do the eager approach and won’t let you even start the database before rolling back is successfully complete, using undo logs. Which one is right and which one is wrong? Here is the fun part! Nothing is wrong or right, its all decisions that we engineers make. Its all fundamentals. Its up to you to understand and pick. Anything can work. You can make anything work if you know what you are dealing with.
If you want to learn about the fundamentals of databases and demystify it check out my udemy course https://database.husseinnasser.com
Become a Member on YouTube
https://www.youtube.com/channel/UC_ML5xP23TOWKUcc-oAE_Eg/join
🔥 Members Only Content
https://www.youtube.com/playlist?list=UUMO_ML5xP23TOWKUcc-oAE_Eg
Support my work on PayPal
https://bit.ly/33ENps4
🧑🏫 Courses I Teach
https://husseinnasser.com/courses
🏭 Backend Engineering Videos in Order
https://backend.husseinnasser.com
💾 Database Engineering Videos
https://www.youtube.com/playlist?list=PLQnljOFTspQXjD0HOzN7P2tgzu7scWpl2
🎙️Listen to the Backend Engineering Podcast
https://husseinnasser.com/podcast
Gears and tools used on the Channel (affiliates)
🖼️ Slides and Thumbnail Design
Canva
https://partner.canva.com/c/2766475/647168/10068
🎙️ Mic Gear
Shure SM7B Cardioid Dynamic Microphone
https://amzn.to/3o1NiBi
Cloudlifter
https://amzn.to/2RAeyLo
XLR cables
https://amzn.to/3tvMJRu
Focusrite Audio Interface
https://amzn.to/3f2vjGY
📷 Camera Gear
Canon M50 Mark II
https://amzn.to/3o2ed0c
Micro HDMI to HDMI
https://amzn.to/3uwCxK3
Video capture card
https://amzn.to/3f34pyD
AC Wall for constant power
https://amzn.to/3eueoxP
Stay Awesome,
Hussein
Видео The cost rolling back transactions (postgres/mysql) канала Hussein Nasser
In Postgres, any DML transaction touching a row creates a new version of that row. if the row is referenced in indexes, those need to be updated with the new tuple id as well. There are exceptions with optimization such as heap only tuples (HOT) where the index doesn’t need to be updated but that doesn’t always happens.
If the transaction rolls back, then the new row versions created by this transaction (millions in my case) are now invalid and should NOT be read by any new transaction. You have two solutions to address this, do you clean all dead rows eagerly on transaction rollback? Or do you do it lazily as a post process?
Postgres does the lazy approach, a command called vacuum which is called periodically Postgres attempts to remove those dead rows and free up space in the page.
Whats the harm of leaving those dead rows in? Its not really correctness issues at all, in fact transactions know not to read those dead rows by checking the state of the transaction that created them. This is however expensive, the check to see of the transaction that created this row is committed or rolled-back. Also the fact that those dead rows live in disk pages with alive rows makes an IO not efficient as the database has to filter out dead rows. For example, a page may have contained 1000 rows, but only 1 live row and 999 dead rows, the database will make that IO but only will get a single row of it. Repeat that and you end up making more IOs. More IOs = slower performance.
Other databases do the eager approach and won’t let you even start the database before rolling back is successfully complete, using undo logs. Which one is right and which one is wrong? Here is the fun part! Nothing is wrong or right, its all decisions that we engineers make. Its all fundamentals. Its up to you to understand and pick. Anything can work. You can make anything work if you know what you are dealing with.
If you want to learn about the fundamentals of databases and demystify it check out my udemy course https://database.husseinnasser.com
Become a Member on YouTube
https://www.youtube.com/channel/UC_ML5xP23TOWKUcc-oAE_Eg/join
🔥 Members Only Content
https://www.youtube.com/playlist?list=UUMO_ML5xP23TOWKUcc-oAE_Eg
Support my work on PayPal
https://bit.ly/33ENps4
🧑🏫 Courses I Teach
https://husseinnasser.com/courses
🏭 Backend Engineering Videos in Order
https://backend.husseinnasser.com
💾 Database Engineering Videos
https://www.youtube.com/playlist?list=PLQnljOFTspQXjD0HOzN7P2tgzu7scWpl2
🎙️Listen to the Backend Engineering Podcast
https://husseinnasser.com/podcast
Gears and tools used on the Channel (affiliates)
🖼️ Slides and Thumbnail Design
Canva
https://partner.canva.com/c/2766475/647168/10068
🎙️ Mic Gear
Shure SM7B Cardioid Dynamic Microphone
https://amzn.to/3o1NiBi
Cloudlifter
https://amzn.to/2RAeyLo
XLR cables
https://amzn.to/3tvMJRu
Focusrite Audio Interface
https://amzn.to/3f2vjGY
📷 Camera Gear
Canon M50 Mark II
https://amzn.to/3o2ed0c
Micro HDMI to HDMI
https://amzn.to/3uwCxK3
Video capture card
https://amzn.to/3f34pyD
AC Wall for constant power
https://amzn.to/3eueoxP
Stay Awesome,
Hussein
Видео The cost rolling back transactions (postgres/mysql) канала Hussein Nasser
Показать
Комментарии отсутствуют
Информация о видео
Другие видео канала
RFC 9000 - QUIC #minuteengineering #quicThe Huge Flaw HTTP 1.0 Had...High severity flaw can crash your WebServer when using OpenSSL - Let us discussThe Cloudflare mTLS vulnerability - A Deep Dive AnalysisYour Backend Might not Be Ready for HTTP/2 - Watch This Before Implementing itcURL TLS 1.3 session ticket proxy host mixup VulnerabilityPublish/Subscribe Backend Systems Explained In 60 Seconds #shorts_husseinYugabyteDB supports read committed isolationNew course alert - Unlocking Backend PerformanceChrome follows FireFox steps - Rolling back SameSite cookie changeEnvoy Proxy Fixes Two Zero Day vulnerabilities (UDP Proxy, TCP Proxy) - CVE-2020-35470This Python And NodeJS IP Address Validation Vulnerability is Severe, Lets discussAmazon Aurora Supports Postgres 14Implementing Offline/Online Status of a Chatting Backend System #shortsByteDance makes Linux kernel reboots faster10 YouTube Backend, Protocols, Networking, Database Channels to Follow in 2021 (I watch them all)How to Run a Tech YouTube Channel and Enjoy it Without Burning outcURL creator @DanielStenberg threatened - The entitlement towards OSS needs to STOP!nulls are weird, Postgres15 makes them slightly betterPostgres System Columns Explained (ctid, xmin,xmax)Postgres Instances hacked and used to mine crypto - Let us discuss how is that possible