
Working with time-series data at scale by Javier Ramírez

As individuals, we use time series data in everyday life all the time. If you’re trying to improve your health, you may track how many steps you take daily and relate that to your body weight or size over time to understand how well you’re doing.

This is clearly a small-scale example. At the other end of the spectrum, large-scale time series use cases abound in our current technological landscape: tracking the price of a stock or cryptocurrency that changes every millisecond, monitoring the performance and health metrics of a video streaming application, reading temperature, pressure, and humidity from sensors, or processing the information generated by millions of IoT devices.

Modern digital applications require collecting, storing, and analyzing time series data at extreme scale, and with performance that a relational database simply cannot provide. We have all seen very creative solutions built to work around this problem, but as throughput needs increase, scaling them becomes a major challenge.

To get the job done, developers end up landing, transforming, and moving data around repeatedly, using multiple components pipelined together. Looking at these solutions really feels like looking at Rube Goldberg machines. It’s staggering to see how complex architectures become in order to satisfy the needs of these workloads.

Most importantly, all of this has to be built, managed, and maintained, and it still doesn’t meet very high scale and performance needs. Many time series applications can generate enormous volumes of data. One common example is video streaming.

Delivering high-quality video content is a very complex process. Understanding load latency, video frame drops, and user activity has to happen at massive scale and in real time. This process alone can generate several gigabytes of data every second, while easily running hundreds of thousands, sometimes over a million, queries per hour.
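To make those figures concrete, here is a rough back-of-the-envelope sketch. The 2 GB/s ingest rate is an assumed stand-in for "several gigabytes", not a figure from the talk, and the helper names are hypothetical.

```python
SECONDS_PER_HOUR = 3600
SECONDS_PER_DAY = 86_400


def daily_volume_tb(gb_per_second: float) -> float:
    """Convert a sustained ingest rate in GB/s to decimal terabytes per day."""
    return gb_per_second * SECONDS_PER_DAY / 1000


def queries_per_second(queries_per_hour: float) -> float:
    """Convert an hourly query count to an average per-second rate."""
    return queries_per_hour / SECONDS_PER_HOUR


# Assumed ingest rate: 2 GB/s (illustrative only).
assumed_ingest_gb_per_second = 2.0
print(f"{daily_volume_tb(assumed_ingest_gb_per_second):.1f} TB per day")

# One million queries per hour, the upper figure mentioned above.
print(f"{queries_per_second(1_000_000):.0f} queries per second on average")
```

Under that assumption, a single workload lands well over a hundred terabytes a day while serving a few hundred queries every second.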

A relational database certainly isn’t the right choice here, which is exactly why we built Timestream at AWS. Timestream started out by decoupling data ingestion, storage, and query so that each can scale independently. The design keeps each sub-system simple, making it easier to achieve unwavering reliability while also eliminating scaling bottlenecks and reducing the chances of correlated system failures, which become more important as the system grows.

At the same time, to manage overall growth, the system is cell based: rather than scaling the system as a whole, we segment it into multiple smaller copies of itself so that these cells can be tested at full scale, and a system problem in one cell can’t affect activity in any of the other cells.

In this session, I will introduce the time-series problem, look at some architectures that have been used in the past to work around it, and then introduce Amazon Timestream, a purpose-built database to process and analyze time-series data at scale.
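The cell-based idea described above can be illustrated with a minimal sketch. This is not how Timestream is implemented; the `Cell` and `CellRouter` classes, the tenant identifiers, and the hash-based placement are all assumptions made for illustration. Each tenant is pinned to one cell, so a failure in one cell leaves the others untouched.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass, field


@dataclass
class Cell:
    """One self-contained copy of the system (hypothetical, for illustration)."""

    name: str
    healthy: bool = True
    records: list = field(default_factory=list)

    def write(self, record: dict) -> None:
        # A failed cell rejects writes; other cells are unaffected.
        if not self.healthy:
            raise RuntimeError(f"{self.name} is unavailable")
        self.records.append(record)


class CellRouter:
    """Pins each tenant to exactly one cell using a stable hash."""

    def __init__(self, cells: list[Cell]) -> None:
        if not cells:
            raise ValueError("at least one cell is required")
        self.cells = cells

    def cell_for(self, tenant_id: str) -> Cell:
        # SHA-256 gives the same placement on every run and every machine.
        digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.cells)
        return self.cells[index]

    def write(self, tenant_id: str, record: dict) -> None:
        self.cell_for(tenant_id).write(record)


# Usage: three cells, one write routed by tenant.
router = CellRouter([Cell(f"cell-{i}") for i in range(3)])
router.write("customer-a", {"time": 1, "measure": "cpu", "value": 0.42})
```

Simple modulo placement keeps the sketch short, but it reshuffles tenants whenever the number of cells changes; a real system would more likely use an explicit placement table or consistent hashing.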

The session also covers the architecture of Amazon Timestream and includes demos of how it can be used to ingest and process time-series data at scale as a fully managed service, and how it can be easily integrated with open source tools like Apache Flink or Grafana.

Video: Working with time-series data at scale by Javier Ramírez, from the Big Things Conference channel
Video information
December 16, 2021, 13:53:07
00:41:11