Bridging Model-based Safety and Model-free RL through System Identification of Linear Models

Bridging model-based safety and model-free reinforcement learning (RL) for dynamic robots is appealing: model-based methods can provide formal safety guarantees, while RL-based methods can exploit the robot's agility by learning from the full-order system dynamics. However, current approaches to this problem are mostly restricted to simple systems.

In this work, we propose a new method to combine model-based safety with model-free reinforcement learning by explicitly finding a low-dimensional model of the system controlled by an RL policy and applying stability and safety guarantees to that simple model. As an example, we use the bipedal robot Cassie, a high-dimensional, underactuated nonlinear system with hybrid dynamics, together with its RL-based walking controller. We show that a low-dimensional dynamical model is sufficient to capture the dynamics of the closed-loop system. We demonstrate that this model is linear, asymptotically stable, and decoupled across control inputs in all dimensions.
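
To make the idea of identifying a linear closed-loop model concrete, here is a minimal sketch, not the authors' actual procedure. It assumes a discrete-time model x[k+1] ≈ A x[k] + B u[k], fits A and B by ordinary least squares, and checks asymptotic stability through the spectral radius of A. The function names, the 2-state toy system, and the synthetic data are all hypothetical and chosen only for illustration.

```python
import numpy as np

def fit_linear_model(X, U, X_next):
    """Least-squares fit of x[k+1] ~ A x[k] + B u[k].

    X: (N, n) states, U: (N, m) inputs, X_next: (N, n) successor states.
    """
    Z = np.hstack([X, U])                              # regressors, shape (N, n + m)
    theta, *_ = np.linalg.lstsq(Z, X_next, rcond=None) # shape (n + m, n)
    n = X.shape[1]
    A = theta[:n].T                                    # (n, n)
    B = theta[n:].T                                    # (n, m)
    return A, B

def is_asymptotically_stable(A):
    """A discrete-time linear system is asymptotically stable iff
    every eigenvalue of A lies strictly inside the unit circle."""
    return bool(np.max(np.abs(np.linalg.eigvals(A))) < 1.0)

# Hypothetical toy system standing in for the closed-loop robot dynamics.
A_true = np.array([[0.9, 0.1],
                   [0.0, 0.8]])
B_true = np.array([[0.0],
                   [0.5]])

rng = np.random.default_rng(0)
N = 200
U = rng.uniform(-1.0, 1.0, size=(N, 1))  # random excitation commands
X = np.zeros((N + 1, 2))
for k in range(N):
    X[k + 1] = A_true @ X[k] + B_true @ U[k]

A_hat, B_hat = fit_linear_model(X[:-1], U, X[1:])
stable = is_asymptotically_stable(A_hat)
```

On real data the fit would not be exact, so the residual of the regression would also need to be inspected before trusting any guarantee derived from the model.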

We further show that this linearity persists even when different RL control policies are used. These results point to an interesting direction for understanding the relationship between RL and optimal control: whether, in some cases, RL tends to linearize the nonlinear system during training. Furthermore, we illustrate that the identified linear model can provide guarantees through a safety-critical optimal control framework. We use Model Predictive Control with Control Barrier Functions on an example of autonomous navigation with Cassie, while taking advantage of the agility provided by the RL-based controller.
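
For readers unfamiliar with Model Predictive Control with Control Barrier Functions, one common discrete-time formulation on a linear model is sketched below. This is an illustrative form, not necessarily the exact problem solved in the video. The symbols are assumptions introduced here: A and B are the identified model matrices, Q, R, and P are cost weights, N is the horizon, h(x) ≥ 0 defines the safe set, γ is the barrier decay rate, and 𝒰 is the set of admissible inputs.

```latex
\begin{aligned}
\min_{u_0,\dots,u_{N-1}} \quad
  & \sum_{k=0}^{N-1} \left( x_k^\top Q x_k + u_k^\top R u_k \right) + x_N^\top P x_N \\
\text{s.t.} \quad
  & x_{k+1} = A x_k + B u_k, \\
  & h(x_{k+1}) - h(x_k) \ge -\gamma\, h(x_k), \qquad 0 < \gamma \le 1, \\
  & u_k \in \mathcal{U}, \qquad k = 0, \dots, N-1 .
\end{aligned}
```

The barrier constraint is equivalent to h(x_{k+1}) ≥ (1 − γ) h(x_k), so a state that starts in the safe set (h ≥ 0) remains in it along the predicted trajectory.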

Video "Bridging Model-based Safety and Model-free RL through System Identification of Linear Models" from the Hybrid Robotics channel
Video information
May 17, 2022, 0:21:36
00:00:35