While self-supervised speech representation learning (SSL) models serve a
variety of downstream tasks, these models have been observed to overfit to the
domain from which the unlabelled data originates. To alleviate this issue, we
propose PADA (Pruning Assisted Domain Adaptation), which zeroes out redundant
weights from models pre-trained on large amounts of out-of-domain (OOD) data.
Intuitively, this frees capacity for target-domain ASR fine-tuning. The
redundant weights can be identified through various pruning strategies, which
we discuss in detail in this work. Specifically, we
investigate the effect of the recently discovered Task-Agnostic and Task-Aware
pruning on PADA and propose a new pruning paradigm based on the latter, which
we call Cross-Domain Task-Aware Pruning (CD-TAW). CD-TAW obtains the initial
pruning mask from a well fine-tuned OOD model, which sets it apart from the
other pruning strategies discussed in the paper. Our proposed
CD-TAW methodology achieves up to 20.6% relative WER improvement over our
baseline when fine-tuned on a 2-hour subset of Switchboard data without
language model (LM) decoding. Furthermore, we conduct a detailed analysis to
highlight the key design choices of our proposed method.
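
As a rough illustration of the CD-TAW idea described above, the following minimal sketch derives a pruning mask from an OOD fine-tuned weight matrix and applies it to the corresponding pre-trained weights before target-domain fine-tuning. It makes several assumptions that are not stated in the abstract: the mask is chosen by unstructured magnitude pruning, a single NumPy matrix stands in for a full speech model, and the names `magnitude_mask`, `cd_taw_init`, and `sparsity` are hypothetical.

```python
import numpy as np


def magnitude_mask(weights, sparsity):
    """Return a 0/1 mask that zeroes the smallest-magnitude entries.

    Assumption: "redundant" weights are those with the smallest absolute
    value; the abstract does not specify the selection criterion.
    """
    k = int(round(sparsity * weights.size))  # number of weights to prune
    if k == 0:
        return np.ones_like(weights)
    # The k-th smallest absolute value acts as the pruning threshold.
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    return (np.abs(weights) > threshold).astype(weights.dtype)


def cd_taw_init(pretrained, ood_finetuned, sparsity):
    """Hypothetical CD-TAW-style initialisation.

    The mask comes from the OOD fine-tuned weights, but it is applied to
    the pre-trained weights that target-domain fine-tuning starts from.
    """
    mask = magnitude_mask(ood_finetuned, sparsity)
    return pretrained * mask, mask


# Toy stand-ins for one layer of a pre-trained and an OOD fine-tuned model.
rng = np.random.default_rng(0)
pretrained = rng.normal(size=(4, 4))
ood_finetuned = pretrained + 0.1 * rng.normal(size=(4, 4))

pruned, mask = cd_taw_init(pretrained, ood_finetuned, sparsity=0.25)
```

In this sketch the zeroed positions in `pruned` are the weights that target-domain fine-tuning would be free to relearn; whether the mask is held fixed or updated during fine-tuning is a design choice the abstract does not settle.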