2023 Fall Robotics Colloquium: Coline Devin (Google Deepmind)
Title: RoboCat: A Self-Improving Agent for Robotic Manipulation
Speaker: Coline Devin (Google Deepmind)
Date: Friday, November 17, 2023
Abstract: The ability to leverage heterogeneous robotic experience from different robots and tasks to quickly master novel skills and embodiments has the potential to transform robot learning. Inspired by recent advances in foundation models for vision and language, we propose a multi-embodiment, multi-task generalist agent for robotic manipulation. This agent, named RoboCat, is a visual goal-conditioned decision transformer capable of consuming action-labelled visual experience. This data spans a large repertoire of motor control skills from simulated and real robotic arms with varying sets of observations and actions. With RoboCat, we demonstrate the ability to generalise to new tasks and robots, both zero-shot and through adaptation using only 100–1000 examples for the target task. We also show how a trained model itself can be used to generate data for subsequent training iterations, thus providing a basic building block for an autonomous improvement loop. We investigate the agent's capabilities with large-scale evaluations both in simulation and on three different real robot embodiments. We find that as we grow and diversify its training data, RoboCat not only shows signs of cross-task transfer, but also becomes more efficient at adapting to new tasks.
Biography: Coline Devin is a senior research scientist at Google DeepMind. She received her PhD in Computer Science from UC Berkeley, advised by Sergey Levine, Trevor Darrell, and Pieter Abbeel. She is an NSF Graduate Research Fellow and has published work at NeurIPS, ICLR, CoRL, ICRA, and IROS.
This video is closed captioned.
Video published on the Paul G. Allen School channel.