Recent work on 6D object pose estimation focuses on learning keypoint
correspondences between images and object models, and then determines the object
pose either with RANSAC-based algorithms or by directly regressing it through
end-to-end optimisation. We argue that learning point-level discriminative
features is overlooked in the literature. To this end, we revisit Fully
Convolutional Geometric Features (FCGF) and tailor it for 6D object pose
estimation to achieve state-of-the-art performance. FCGF employs sparse
convolutions and learns point-level features using a fully-convolutional
network by optimising a hardest contrastive loss. We outperform recent
competitors on popular benchmarks by modifying the loss and the input data
representations, by carefully tuning the training strategy, and by employing
data augmentations suited to the underlying problem. We
carry out a thorough ablation to study the contribution of each modification.
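
To make the hardest contrastive loss mentioned above concrete, the following is a minimal NumPy sketch of the general idea: matched features are pulled within a positive margin, and each feature is pushed away from its hardest (closest) non-matching feature. It is an illustration under stated assumptions, not the modified loss proposed in this work. The function name, the margin values, the equal weighting of the terms, and the choice to mine negatives only within the batch are all assumptions made for the example.

```python
import numpy as np


def hardest_contrastive_loss(feats_a, feats_b, pos_margin=0.1, neg_margin=1.4):
    """Simplified hardest contrastive loss (illustrative sketch).

    feats_a[i] and feats_b[i] are the features of the i-th matched
    (positive) pair; every other pairing is treated as a negative.
    The margin values are assumed defaults chosen for illustration.
    """
    # Euclidean distance between every feature in A and every feature in B.
    diff = feats_a[:, None, :] - feats_b[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)

    n = dist.shape[0]
    pos_dist = np.diag(dist)  # distances of the matched pairs

    # Exclude the positives so they cannot be selected as negatives.
    masked = np.where(np.eye(n, dtype=bool), np.inf, dist)
    hardest_neg_a = masked.min(axis=1)  # closest non-match for each feature in A
    hardest_neg_b = masked.min(axis=0)  # closest non-match for each feature in B

    # Positives are penalised beyond pos_margin, negatives inside neg_margin.
    pos_term = np.maximum(pos_dist - pos_margin, 0.0) ** 2
    neg_term_a = np.maximum(neg_margin - hardest_neg_a, 0.0) ** 2
    neg_term_b = np.maximum(neg_margin - hardest_neg_b, 0.0) ** 2

    return pos_term.mean() + 0.5 * (neg_term_a.mean() + neg_term_b.mean())


# Two well-separated points whose matched features coincide give zero loss.
aligned = np.array([[0.0, 0.0], [2.0, 0.0]])
loss_aligned = hardest_contrastive_loss(aligned, aligned)

# Swapping the matches makes positives far apart and negatives identical.
loss_swapped = hardest_contrastive_loss(aligned, aligned[::-1])
```

In this sketch only the single closest negative per feature contributes to the loss, which is what distinguishes hardest-negative mining from averaging over all negatives.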