From Pairs to Sequences: Track-Aware Policy Gradients for Keypoint Detection

Keypoint-based matching is a fundamental component of modern 3D vision systems such as Structure-from-Motion (SfM) and SLAM. However, most existing learning-based methods are trained on image pairs, a setup that does not explicitly optimize for the long-term trackability of keypoints across extended sequences under challenging viewpoint and illumination changes. In this paper, we reframe keypoint detection as a sequential decision-making problem. We introduce an end-to-end reinforcement learning framework, dubbed TAP-Point, that optimizes keypoint selection directly on image sequences. Our core innovation is a track-aware reward mechanism, optimized with a policy gradient method, that jointly encourages the repeatability and distinctiveness of keypoints across multiple views. To further improve matching robustness, our descriptor branch is built on a DINOv3 feature extractor, yielding highly discriminative descriptors. Extensive evaluations on sparse matching benchmarks, including relative pose estimation and 3D reconstruction, demonstrate that TAP-Point significantly outperforms current state-of-the-art keypoint detection and description techniques.
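To make the idea of a policy gradient driven by a track-aware reward concrete, here is a minimal sketch in plain Python. It is not the TAP-Point implementation: the abstract does not specify the reward or the policy, so the Bernoulli selection policy, the reward (fraction of frames in which a candidate is matched, minus a constant baseline of 0.5), and the names `track_reward`, `reinforce_step`, and `train` are all assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def track_reward(track_mask, baseline=0.5):
    # Hypothetical track-aware reward (an assumption, not from the paper):
    # fraction of frames in which the keypoint was matched, minus a baseline.
    return sum(track_mask) / len(track_mask) - baseline


def reinforce_step(logits, track_masks, rng, lr=0.5):
    # One REINFORCE update of per-candidate selection logits.
    new_logits = []
    for z, mask in zip(logits, track_masks):
        p = sigmoid(z)
        selected = rng.random() < p  # sample the action from Bernoulli(p)
        if selected:
            # d/dz log Bernoulli(a=1; p) = 1 - p, scaled by the reward.
            z = z + lr * track_reward(mask) * (1.0 - p)
        # Unselected candidates receive zero reward, so their logit is unchanged.
        new_logits.append(z)
    return new_logits


def train(track_masks, steps=200, seed=0):
    rng = random.Random(seed)
    logits = [0.0] * len(track_masks)
    for _ in range(steps):
        logits = reinforce_step(logits, track_masks, rng)
    return logits


if __name__ == "__main__":
    # Candidate 0 is matched in all 5 frames, candidate 1 in only 1 of 5.
    masks = [[1, 1, 1, 1, 1], [1, 0, 0, 0, 0]]
    print(train(masks))
```

Under these assumptions, a candidate that survives across the whole sequence sees its selection logit rise, while one that is lost after the first frame sees it fall. A pairwise reward would only look at two of the five frames and could not separate the two candidates as cleanly.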

Video "From Pairs to Sequences: Track-Aware Policy Gradients for Keypoint Detection" from the channel liwen yang

Video information

May 15, 2026, 7:58:08

00:04:31

liwen yang
