Philipp Krähenbühl - Point-based object detection
August 11th, 2020. MIT CSAIL
Abstract:
Objects are commonly thought of as axis-aligned boxes in an image. Even before deep learning, the best-performing object detectors classified rectangular image regions. On the one hand, this approach conveniently reduces detection to image classification. On the other hand, it has to deal with a nearly exhaustive list of image regions that do not contain any objects. In this talk, I'll present an alternative representation of objects: as points. I'll show how to build an object detector from a keypoint detector of object centers. The presented approach is both simpler and more efficient (faster and/or more accurate) than equivalent box-based detection systems. Our point-based detector easily extends to other tasks, such as object tracking, monocular or LiDAR 3D detection, and pose estimation.
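The abstract does not spell out how detections are read off a center keypoint detector, so the following is only a minimal sketch under stated assumptions: the network is assumed to output a per-pixel center score map and a per-pixel (width, height) map, and a detection is taken wherever the score is a local maximum above a threshold. The function name `decode_centers`, the 3x3 neighborhood, and the threshold value are illustrative choices, not details confirmed by the talk.

```python
import numpy as np

def decode_centers(heatmap, sizes, threshold=0.3):
    """Turn a center heatmap into boxes (illustrative sketch, not the speaker's code).

    heatmap: (H, W) array of center scores in [0, 1] (assumed network output).
    sizes:   (H, W, 2) array of predicted (width, height) at each pixel (assumed).
    Returns a list of (x1, y1, x2, y2, score) tuples.
    """
    heatmap = np.asarray(heatmap, dtype=float)
    sizes = np.asarray(sizes, dtype=float)
    h, w = heatmap.shape

    # Pad with -inf so border pixels can still be local maxima.
    padded = np.pad(heatmap, 1, mode="constant", constant_values=-np.inf)

    # Maximum over each pixel's 3x3 neighborhood (the pixel itself included).
    shifted = [padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)]
    neighborhood_max = np.max(shifted, axis=0)

    # A pixel is a detected center if it is a local maximum and scores high enough.
    is_peak = (heatmap >= neighborhood_max) & (heatmap >= threshold)

    boxes = []
    for y, x in zip(*np.nonzero(is_peak)):
        box_w, box_h = sizes[y, x]
        boxes.append((
            float(x - box_w / 2), float(y - box_h / 2),
            float(x + box_w / 2), float(y + box_h / 2),
            float(heatmap[y, x]),
        ))
    return boxes

# Toy input: one strong center at (x=3, y=2) and a weaker neighbor that is suppressed.
toy_heatmap = np.zeros((5, 5))
toy_heatmap[2, 3] = 0.9
toy_heatmap[2, 2] = 0.5
toy_sizes = np.tile(np.array([2.0, 4.0]), (5, 5, 1))
toy_boxes = decode_centers(toy_heatmap, toy_sizes)
```

In this sketch, the local-maximum test selects one point per object without enumerating candidate image regions, which is the contrast with box-based detection that the abstract draws.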
Most detectors, including ours, are usually trained on a single dataset and then evaluated in that same domain. However, it is unlikely that any user of an object detection system only cares about 80 COCO classes or 23 nuScenes vehicle categories in isolation. More likely than not, the object classes needed in a downstream system are either spread over different data sources or not annotated at all. In the second part of this talk, I'll present a framework for learning object detectors on multiple different datasets simultaneously. We automatically learn the relationship between object annotations in different datasets and merge them into a common taxonomy. The resulting detector then reasons about the union of object classes from all datasets at once. This detector is also easily extended to unseen classes by fine-tuning it on a small dataset with novel annotations.
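To make the idea of a common taxonomy concrete, here is a small sketch of the data structure involved. It makes one important assumption: the talk describes learning the relationships between datasets automatically, whereas this example simply takes a hand-written equivalence table as input. The dataset names `"A"` and `"B"`, their class lists, and the function `build_unified_taxonomy` are all hypothetical.

```python
def build_unified_taxonomy(dataset_classes, equivalences):
    """Merge per-dataset class lists into one label space (illustrative sketch).

    dataset_classes: dict mapping dataset name -> list of class names.
    equivalences:    dict mapping (dataset, class name) -> unified class name.
                     Hand-written here; the talk learns these relations instead.
    Returns (unified, mapping), where unified is the merged class list and
    mapping sends (dataset, local class index) -> unified class index.
    """
    unified = []
    mapping = {}
    for dataset, classes in dataset_classes.items():
        for local_index, name in enumerate(classes):
            # Classes without an entry in the table keep their own name.
            unified_name = equivalences.get((dataset, name), name)
            if unified_name not in unified:
                unified.append(unified_name)
            mapping[(dataset, local_index)] = unified.index(unified_name)
    return unified, mapping

# Hypothetical datasets whose label sets overlap only partially.
example_classes = {
    "A": ["person", "car"],
    "B": ["pedestrian", "car", "bus"],
}
example_equivalences = {("B", "pedestrian"): "person"}
unified_classes, label_mapping = build_unified_taxonomy(
    example_classes, example_equivalences
)
```

With such a mapping, annotations from either dataset can supervise a single output layer over the union of classes, which is what lets one detector reason about all datasets at once.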
Bio:
Philipp is an Assistant Professor in the Department of Computer Science at the University of Texas at Austin. He received his Ph.D. in 2014 from the CS Department at Stanford University and then spent two wonderful years as a postdoc at UC Berkeley. His research interests lie in Computer Vision, Machine Learning, and Computer Graphics. He is particularly interested in deep learning, image understanding, and vision and action.
Video: Philipp Krähenbühl - Point-based object detection, from the channel Vision & Graphics Seminar at MIT
Video information
August 18, 2020, 1:44:56
01:02:40