Few-shot instance segmentation extends the few-shot learning paradigm to the
instance segmentation task, which aims to segment object instances in a
query image given only a few annotated examples of novel categories. Conventional
approaches have attempted to address the task via prototype learning, known as
point estimation. However, this mechanism depends on prototypes (e.g., the mean of
the $K$-shot examples) for prediction, leading to performance instability. To overcome the
disadvantage of the point estimation mechanism, we propose a novel approach,
dubbed MaskDiff, which models the underlying conditional distribution of a
binary mask conditioned on an object region and $K$-shot information.
Inspired by augmentation approaches that perturb data with Gaussian noise for
populating low data density regions, we model the mask distribution with a
diffusion probabilistic model. We also propose to utilize classifier-free
guided mask sampling to integrate category information into the binary mask
generation process. Without bells and whistles, our proposed method
consistently outperforms state-of-the-art methods on both base and novel
classes of the COCO dataset while simultaneously being more stable than
existing methods. The source code is available at:
https://github.com/minhquanlecs/MaskDiff.
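
As a rough illustration of the two ingredients named above, the following minimal sketch shows the standard Gaussian forward noising step of a diffusion probabilistic model and the usual classifier-free guidance combination of conditional and unconditional noise predictions. It is not taken from the MaskDiff code base: the function names `forward_noise` and `cfg_combine`, the flat-list mask representation, and the guidance weight `w` are assumptions made for illustration, and the paper's exact parameterization may differ.

```python
import math
import random


def forward_noise(x0, alpha_bar_t, rng):
    """Sample x_t from q(x_t | x_0) = N(sqrt(alpha_bar_t) * x_0, (1 - alpha_bar_t) * I).

    x0 is a flattened mask (list of floats); alpha_bar_t lies in [0, 1].
    Returns the noised mask and the Gaussian noise that was added.
    """
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    scale_signal = math.sqrt(alpha_bar_t)
    scale_noise = math.sqrt(1.0 - alpha_bar_t)
    xt = [scale_signal * x + scale_noise * e for x, e in zip(x0, eps)]
    return xt, eps


def cfg_combine(eps_cond, eps_uncond, w):
    """Classifier-free guidance: (1 + w) * eps_cond - w * eps_uncond.

    eps_cond is the noise predicted with category information,
    eps_uncond the noise predicted without it (hypothetical model outputs).
    With w = 0 this reduces to the purely conditional prediction.
    """
    return [(1.0 + w) * c - w * u for c, u in zip(eps_cond, eps_uncond)]


if __name__ == "__main__":
    rng = random.Random(0)
    mask = [0.0, 1.0, 1.0, 0.0]  # toy flattened binary mask
    noised, noise = forward_noise(mask, alpha_bar_t=0.5, rng=rng)
    guided = cfg_combine([1.0, 2.0], [0.0, 1.0], w=1.0)
    print(noised, guided)
```

A larger `w` pushes the sampled mask more strongly toward the conditioning category, at the cost of sample diversity.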