Все видео Новые видео Популярные видео Категории видео

Авто	Видео-блоги	ДТП, аварии	Для маленьких	Еда, напитки
Животные	Закон и право	Знаменитости	Игры	Искусство
Комедии	Красота, мода	Кулинария, рецепты	Люди	Мото
Музыка	Мультфильмы	Наука, технологии	Новости	Образование
Политика	Праздники	Приколы	Природа	Происшествия
Путешествия	Развлечения	Ржач	Семья	Сериалы
Спорт	Стиль жизни	ТВ передачи	Танцы	Технологии
Товары	Ужасы	Фильмы	Шоу-бизнес	Юмор

Systems @Scale 2019 - Disaster Recovery at Facebook Scale

Shruti Padmanabha, Research Scientist, Facebook
Justin Meza, Research Scientist, Facebook
https://code.fb.com/core-data/systems-scale/
Facebook operates dozens of data centers globally, each of which serves thousands of interdependent microservices to provide seamless experiences to billions of users across the family of Facebook products. At this scale, seemingly rare occurrences, from hurricanes looming over a data center to lightning striking a switchboard, have threatened the site’s health. These events cause large-scale machine failures at the scope of a data center or significant portions of it, which cannot be addressed by traditional fault-tolerance mechanisms designed for individual machine failures. Handling these failures requires us to develop solutions across the stack, from placing hardware and spare capacity across fault domains to being able to shift traffic smoothly away from affected fault domains to rearchitecting large-scale distributed systems in a fault domain-aware manner. In this talk, Shruti and Justin will describe principles Facebook follows for designing reliable software, tools we built to mitigate and respond to failures, and our continuous testing and validation process.

Видео Systems @Scale 2019 - Disaster Recovery at Facebook Scale канала Justin Miller

Показать

Комментарии отсутствуют