Ignorance is Bliss: Exploring Defenses Against Invariance-Based Attacks on Neural Machine Translation Systems
Article Type
Research Article
Publication Title
IEEE Transactions on Artificial Intelligence
Abstract
This article addresses an invariance-based attack on the transformer, a state-of-the-art neural machine translation (NMT) system. Such attacks make multiple changes to the source sentence with the goal of keeping the predicted translation unchanged. Since gold translations are not available for the adversarial sentences, defending against invariance-based attacks is a challenging task. We propose two contrasting defense strategies against such attacks: learn to deal and learn to ignore. In learn to deal, the NMT system is trained not to predict the same translation for a clean text and its noisy counterpart, whereas in learn to ignore, the NMT system is trained to output a dummy sentence in the target language whenever it encounters a noisy text. Experiments on two language pairs, English-German (en-de) and English-French (en-fr), show that the learn to deal strategy reduces the attack success rate from 84.0% to 62.2% for en-de and from 84.6% to 73.8% for en-fr, whereas the learn to ignore strategy reduces the attack success rate from 84.0% to 27.2% for en-de and from 84.6% to 37.0% for en-fr.
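As a rough illustration of the learn to ignore idea and of the reported metric, the sketch below pairs each noisy source sentence with a dummy target and computes an attack success rate. It is not the authors' implementation: the names `DUMMY_TARGET`, `build_ignore_pairs`, and `attack_success_rate` are hypothetical, the noise function and translator are supplied by the caller, and the metric definition (the percentage of adversarial sentences whose translation matches that of the clean sentence) is an assumption inferred from the abstract's description of the attack goal.

```python
from typing import Callable, List, Tuple

# Hypothetical placeholder: the abstract does not state the actual dummy sentence.
DUMMY_TARGET = "<dummy>"


def build_ignore_pairs(
    clean_pairs: List[Tuple[str, str]],
    add_noise: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Keep every clean (source, target) pair and add a noisy copy of the
    source mapped to the dummy target (assumed learn-to-ignore data setup)."""
    augmented: List[Tuple[str, str]] = []
    for source, target in clean_pairs:
        augmented.append((source, target))
        augmented.append((add_noise(source), DUMMY_TARGET))
    return augmented


def attack_success_rate(
    translate: Callable[[str], str],
    clean_sources: List[str],
    adversarial_sources: List[str],
) -> float:
    """Percentage of adversarial sources whose translation is identical to the
    translation of the matching clean source (assumed definition)."""
    if len(clean_sources) != len(adversarial_sources):
        raise ValueError("clean and adversarial lists must have equal length")
    if not clean_sources:
        raise ValueError("at least one sentence pair is required")
    unchanged = sum(
        translate(clean) == translate(adversarial)
        for clean, adversarial in zip(clean_sources, adversarial_sources)
    )
    return 100.0 * unchanged / len(clean_sources)
```

Under this assumed definition, a lower rate means the system less often returns the clean translation for a perturbed input, which is the direction of improvement reported for both strategies.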
First Page
518
Last Page
525
DOI
10.1109/TAI.2021.3123931
Publication Date
8-1-2022
Recommended Citation
Chaturvedi, Akshay; Chakrabarty, Abhisek; Utiyama, Masao; Sumita, Eiichiro; and Garain, Utpal, "Ignorance is Bliss: Exploring Defenses Against Invariance-Based Attacks on Neural Machine Translation Systems" (2022). Journal Articles. 3013.
https://digitalcommons.isical.ac.in/journal-articles/3013