SREcon17 Americas - I'm Putting Sloths on the Map
I'm Putting Sloths on the Map
Preetha Appan, Indeed.com
At Indeed, we strive to build systems that can withstand an unreliable network. We want to anticipate and prevent failures rather than just react to them. Our applications run in a private cloud and share infrastructure with other services on the same host. This interconnectedness of systems and resources makes it challenging to induce failures that simulate a slow or lossy network. We need the ability to slow down the network for one service or data source and test how this affects the applications that use it, without causing side effects for other applications on the same host.
In this talk, we’ll describe Sloth, a Go tool for inducing network failures. Sloth is a daemon that runs on every host in our infrastructure, including database and index servers. It works by adding and removing complex traffic shaping rules using the Linux tc and iptables utilities. Sloth includes access control and audit logging, so it stays usable without compromising security. It provides a web UI for manual testing and an API for embedding destructive testing into integration tests. We will discuss specific examples of how we used Sloth to discover and fix problems in monitoring, graceful degradation, and usability.
View the full SREcon17 Americas Program at https://www.usenix.org/conference/srecon17americas/program
Video: SREcon17 Americas - I'm Putting Sloths on the Map, from the USENIX channel