Bridging the Gap Between Anchor-Based and Anchor-Free Detection via Adaptive Training Sample Sele...

Authors: Shifeng Zhang, Cheng Chi, Yongqiang Yao, Zhen Lei, Stan Z. Li

Description: Object detection has been dominated by anchor-based detectors for several years. Recently, anchor-free detectors have become popular due to the proposal of FPN and Focal Loss. In this paper, we first point out that the essential difference between anchor-based and anchor-free detection is actually how positive and negative training samples are defined, which leads to the performance gap between them. If both adopt the same definition of positive and negative samples during training, there is no obvious difference in final performance, regardless of whether regression starts from a box or a point. This shows that the selection of positive and negative training samples is important for current object detectors. We then propose Adaptive Training Sample Selection (ATSS), which automatically selects positive and negative samples according to the statistical characteristics of the object. It significantly improves the performance of anchor-based and anchor-free detectors and bridges the gap between them. Finally, we discuss the necessity of tiling multiple anchors per location on the image to detect objects. Extensive experiments conducted on MS COCO support this analysis and these conclusions. With the newly introduced ATSS, we improve state-of-the-art detectors by a large margin to 50.7% AP without introducing any overhead. The code is available at https://github.com/sfzhang15/ATSS.
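The description only says that ATSS selects samples "according to statistical characteristics of object" and does not spell out the procedure. The following is a minimal sketch of one common reading of the method, not the authors' implementation: for each ground-truth box, take the k anchors per pyramid level whose centers are closest to the box center, use the mean plus the standard deviation of their IoUs as the threshold, and keep candidates that meet the threshold and whose centers lie inside the box. The function names (`iou`, `center`, `atss_select`), the box format `(x1, y1, x2, y2)`, the default `k=9`, and the use of the population standard deviation are all assumptions made for illustration; the repository linked above is the authoritative reference.

```python
import math
from statistics import mean, pstdev


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def center(box):
    """Center point (cx, cy) of a box."""
    return ((box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0)


def atss_select(gt, anchors_per_level, k=9):
    """Return (level, index) pairs of anchors treated as positives for one
    ground-truth box. All other anchors count as negatives.

    Hypothetical helper: k and the population std are assumptions.
    """
    gt_center = center(gt)

    # Step 1: per pyramid level, keep the k anchors closest to the box center.
    candidates = []
    for level, anchors in enumerate(anchors_per_level):
        order = sorted(
            range(len(anchors)),
            key=lambda i: math.dist(center(anchors[i]), gt_center),
        )
        candidates += [(level, i) for i in order[:k]]
    if not candidates:
        return []

    # Step 2: adaptive threshold from the candidates' IoU statistics.
    ious = [iou(anchors_per_level[lv][i], gt) for lv, i in candidates]
    threshold = mean(ious) + pstdev(ious)

    # Step 3: keep candidates above the threshold whose center is inside gt.
    positives = []
    for (lv, i), value in zip(candidates, ious):
        cx, cy = center(anchors_per_level[lv][i])
        inside = gt[0] < cx < gt[2] and gt[1] < cy < gt[3]
        if value >= threshold and inside:
            positives.append((lv, i))
    return positives


if __name__ == "__main__":
    gt_box = (0, 0, 10, 10)
    levels = [
        [(0, 0, 10, 10), (2, 2, 12, 12), (5, 5, 15, 15)],  # level 0
        [(100, 100, 110, 110)],                            # level 1
    ]
    print(atss_select(gt_box, levels, k=3))  # prints [(0, 0)]
```

In this toy input the threshold works out to roughly 0.79, so only the anchor that coincides with the ground-truth box is kept. The point of the sketch is that the threshold is derived per object from its own candidates rather than fixed in advance.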

Video "Bridging the Gap Between Anchor-Based and Anchor-Free Detection via Adaptive Training Sample Sele..." from the ComputerVisionFoundation Videos channel
Video information
July 21, 2020, 2:56:42
00:05:01