DocDeshadower: Frequency-aware Transformer for Document Shadow Removal. (arXiv:2307.15318v1 [cs.CV])
The presence of shadows significantly impacts the visual quality of scanned
documents. However, the existing traditional techniques and deep learning
methods used for shadow removal have several limitations. These methods either
rely heavily on heuristics, resulting in suboptimal performance, or require
large datasets to learn shadow-related features. In this study, we propose
DocDeshadower, a multi-frequency Transformer-based model built on a Laplacian
pyramid. DocDeshadower is designed to remove shadows at different frequencies
in a coarse-to-fine manner. To achieve this, we decompose the shadow image into
different frequency bands using a Laplacian pyramid. In addition, we introduce
two novel components to this model: the Attention-Aggregation Network and the
Gated Multi-scale Fusion Transformer. The Attention-Aggregation Network is
designed to remove shadows in the low-frequency part of the image, whereas the
Gated Multi-scale Fusion Transformer refines the entire image at a global scale
with its large receptive field. Our extensive experiments demonstrate that
DocDeshadower outperforms the current state-of-the-art methods in both
qualitative and quantitative terms.
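
The frequency decomposition the abstract refers to can be illustrated with a
minimal Laplacian pyramid sketch. This is not the authors' implementation: the
function names, the 2x2 average pooling used in place of the usual Gaussian
blur, the nearest-neighbour upsampling, and the restriction to single-channel
images whose sides are divisible by 2**levels are all assumptions made to keep
the example short. It only shows how an image splits into high-frequency bands
plus one low-frequency residual, and that the split is invertible.

```python
import numpy as np


def downsample(img):
    """Halve resolution by averaging each 2x2 block.

    Assumption: a simple stand-in for Gaussian blur followed by subsampling.
    """
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def upsample(img):
    """Double resolution by nearest-neighbour repetition."""
    return img.repeat(2, axis=0).repeat(2, axis=1)


def build_laplacian_pyramid(img, levels):
    """Return [high_0, ..., high_{levels-1}, low], finest band first.

    Each high band holds the detail lost when moving to the next coarser
    scale; the last entry is the low-frequency residual.
    """
    bands = []
    current = img.astype(np.float64)
    for _ in range(levels):
        smaller = downsample(current)
        bands.append(current - upsample(smaller))  # detail at this scale
        current = smaller
    bands.append(current)  # coarsest, low-frequency part
    return bands


def reconstruct(bands):
    """Invert the pyramid, coarse to fine."""
    current = bands[-1]
    for high in reversed(bands[:-1]):
        current = upsample(current) + high
    return current
```

Calling build_laplacian_pyramid(img, 3) on a 64x64 array yields three detail
bands of sizes 64x64, 32x32 and 16x16 plus an 8x8 low-frequency residual. In a
coarse-to-fine scheme such as the one described above, a model would process
the residual first and then work upward through the detail bands.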