Change for the Better: Improving Predictions by Automating Drift Detection

Change for the Better: Improving Predictions by Automating Drift Detection by Peter Webb & Gokhan Atinc at Big Things Conference 2021

A machine learning solution is only as good as its data. But real-world data does not always stay within the bounds of the training set, posing a significant challenge for the data scientist: how to detect and respond to drifting data? Drifting data poses three problems: detecting and assessing drift-related model performance degradation; generating a more accurate model from the new data; and deploying a new model into an existing machine learning pipeline.



Using a real-world predictive maintenance problem, we demonstrate a solution that addresses each of these challenges: data drift detection algorithms periodically evaluate observation variability and model prediction accuracy; high-fidelity physics-based simulation models precisely label new data; and integration with industry-standard machine learning pipelines supports continuous integration and deployment. We reduce the level of expertise required to operate the system by automating both drift detection and data labelling.
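The abstract does not say which drift detection algorithms the talk uses. As one possible illustration of periodically evaluating observation variability, the sketch below compares a recent window of a single feature against the training data with a two-sample Kolmogorov-Smirnov test. The function names, the temperature feature, and the sample values are all assumptions made for this example, not details from the talk.

```python
from bisect import bisect_right
import math


def ks_statistic(reference, current):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between
    the empirical CDFs of the two samples."""
    ref = sorted(reference)
    cur = sorted(current)
    gap = 0.0
    # The gap between two step functions can only change at sample points,
    # so it is enough to evaluate both CDFs there.
    for x in ref + cur:
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_cur = bisect_right(cur, x) / len(cur)
        gap = max(gap, abs(cdf_ref - cdf_cur))
    return gap


def drift_detected(reference, current, coeff=1.36):
    """Flag drift when the KS statistic exceeds the large-sample critical
    value; coeff=1.36 corresponds to roughly a 5% significance level."""
    n, m = len(reference), len(current)
    critical = coeff * math.sqrt((n + m) / (n * m))
    return ks_statistic(reference, current) > critical


# Hypothetical data: ambient temperatures (deg C) seen during training,
# and a later window recorded in a warmer operating environment.
training_temps = [20 + 0.1 * i for i in range(100)]
recent_temps = [t + 8 for t in training_temps]

print(drift_detected(training_temps, training_temps))
print(drift_detected(training_temps, recent_temps))
```

A per-feature test like this only watches the input distribution. Monitoring model prediction accuracy, which the abstract also mentions, needs labeled data and is a separate check.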



Process automation reduces costs and increases reliability. The lockdowns and social distancing of the last two years reveal another advantage: minimizing human intervention and interaction to reduce risk while supporting essential social services. As we emerge from the worst of this pandemic, accelerating adoption of machine autonomy increases the demand for the automation of human expertise.



Consider a fleet of electric vehicles used for autonomous package delivery. Their batteries degrade over time, increasing charging time and diminishing vehicle range. The batteries are large and expensive to replace, and relying on a statistical estimate of battery lifetime inevitably results in replacing some batteries too soon and some too late. A more cost-effective approach collects battery health and performance data from each vehicle and uses machine learning models to predict the remaining useful lifetime of each battery. But changes in the operating environment may introduce drift into the health and performance data. External temperature, for example, affects a battery's maximum charge and discharge rate. When the incoming data drifts away from the data the model was trained on, the model's predictions become less accurate.



Our solution streams battery data through Kafka to production and training subsystems: a MATLAB Production Server-deployed model that predicts each battery’s remaining useful lifetime and a thermodynamically accurate physical Simulink model of the battery that automatically labels the data for use in training new models. Since simulation-based labeling is much slower than model-based prediction, the simulation cannot be used in production. The production subsystem monitors the deployed model and the streaming data to detect drift. Drift-induced model accuracy degradation triggers the training system to create new models from the most current training sets. Newly trained models are uploaded to a model registry where the production system can retrieve and integrate them into the deployed machine learning pipeline.
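The control loop described above (monitor accuracy, retrain on degradation, publish to a registry, pick up the new model) can be sketched in a few lines. This is a simplified illustration under several assumptions, not the system from the talk. A plain Python iterable stands in for the Kafka stream, labels arrive together with the features rather than from a slow simulation, a least-squares line stands in for the remaining-useful-lifetime model, and `ModelRegistry`, `fit_linear`, and `run_pipeline` are hypothetical names.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, List

Model = Callable[[float], float]  # feature -> predicted remaining useful life


@dataclass
class ModelRegistry:
    """In-memory stand-in for a model registry (hypothetical)."""
    versions: List[Model] = field(default_factory=list)

    def register(self, model: Model) -> int:
        self.versions.append(model)
        return len(self.versions)  # 1-based version number

    def latest(self) -> Model:
        return self.versions[-1]


def fit_linear(samples) -> Model:
    """Least-squares fit of y = slope * x + intercept (training stand-in)."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in samples)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = 0.0 if var_x == 0 else cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def run_pipeline(stream, registry, window=20, max_mae=5.0):
    """Consume (feature, label) pairs and retrain whenever the mean
    absolute error over a full window exceeds max_mae."""
    recent = deque(maxlen=window)
    retrain_count = 0
    for feature, label in stream:
        model = registry.latest()  # always predict with the newest model
        recent.append((feature, label, abs(model(feature) - label)))
        if len(recent) == window:
            mae = sum(err for _, _, err in recent) / window
            if mae > max_mae:
                # Train on the most recent labeled window and publish it.
                registry.register(fit_linear([(f, l) for f, l, _ in recent]))
                recent.clear()
                retrain_count += 1
    return retrain_count


# Hypothetical stream: the label relationship shifts halfway through.
before = [(float(x), 100.0 - 2.0 * x) for x in range(50)]
after = [(float(x), 80.0 - 2.0 * x) for x in range(50, 100)]

registry = ModelRegistry()
registry.register(lambda x: 100.0 - 2.0 * x)  # initial deployed model
retrains = run_pipeline(before + after, registry)
print(retrains, len(registry.versions))
```

In this toy run the first retraining window still mixes pre-drift and post-drift samples, so the loop keeps retraining until a model fitted only on post-drift data is in the registry. A production design would also have to cope with labels that arrive much later than the predictions they score.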

Video "Change for the Better: Improving Predictions by Automating Drift Detection" from the Big Things Conference channel
Video information
December 3, 2021, 19:58:23
00:36:42