Click here to flash read.
Detecting negatives (such as non-entailment relationships, unanswerable
questions, and false claims) is an important and challenging aspect of many
natural language understanding tasks. Though manually collecting challenging
negative examples can help models detect them, it is both costly and
domain-specific. In this work, we propose Self-labeled Counterfactuals for
Extrapolating to Negative Examples (SCENE), an automatic method for
synthesizing training data that greatly improves models' ability to detect
challenging negative examples. In contrast with standard data augmentation,
which synthesizes new examples for existing labels, SCENE can synthesize
negative examples zero-shot from only positive ones. Given a positive example,
SCENE perturbs it with a mask infilling model, then determines whether the
resulting example is negative based on a self-training heuristic. With access
to only answerable training examples, SCENE can close 69.6% of the performance
gap on SQuAD 2.0, a dataset where half of the evaluation examples are
unanswerable, compared to a model trained on SQuAD 2.0. Our method also extends
to boolean question answering and recognizing textual entailment, and improves
generalization from SQuAD to ACE-whQA, an out-of-domain extractive QA
benchmark.