Ignorance is Bliss: Exploring Defenses Against Invariance-Based Attacks on Neural Machine Translation Systems

Article Type

Research Article

Publication Title

IEEE Transactions on Artificial Intelligence

Abstract

This article addresses an invariance-based attack on the transformer, a state-of-the-art neural machine translation (NMT) system. Such attacks make multiple changes to the source sentence with the goal of keeping the predicted translation unchanged. Since the gold translation is not available for the adversarial sentences, tackling invariance-based attacks is a challenging task. We propose two contrasting defense strategies: learn to deal and learn to ignore. In learn to deal, the NMT system is trained not to predict the same translation for a clean text and its noisy counterpart, whereas in learn to ignore, the NMT system is trained to output a dummy sentence in the target language whenever it encounters a noisy text. Experiments on two language pairs, English-German (en-de) and English-French (en-fr), show that the learn to deal strategy reduces the attack success rate from 84.0% to 62.2% for en-de and from 84.6% to 73.8% for en-fr, whereas the learn to ignore strategy reduces the attack success rate from 84.0% to 27.2% for en-de and from 84.6% to 37.0% for en-fr.
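
The abstract reports attack success rates without defining the metric. The sketch below is one plausible reading, not the authors' evaluation code: it assumes an attack succeeds when the translation of the perturbed source is identical to the translation of the clean source. The function names `attack_success_rate` and `point_reduction` are hypothetical. Only the percentages in `REPORTED` come from the abstract.

```python
from typing import Sequence


def attack_success_rate(clean_outputs: Sequence[str],
                        attacked_outputs: Sequence[str]) -> float:
    """Percentage of attacked inputs whose translation is unchanged.

    Assumption (not stated in the abstract): success means an exact string
    match between the clean and attacked translations, ignoring surrounding
    whitespace.
    """
    if len(clean_outputs) != len(attacked_outputs):
        raise ValueError("both lists must have the same length")
    if not clean_outputs:
        raise ValueError("at least one sentence pair is required")
    unchanged = sum(
        clean.strip() == attacked.strip()
        for clean, attacked in zip(clean_outputs, attacked_outputs)
    )
    return 100.0 * unchanged / len(clean_outputs)


def point_reduction(before: float, after: float) -> float:
    """Absolute drop in attack success rate, in percentage points."""
    return round(before - after, 1)


# Attack success rates (%) as reported in the abstract:
# (language pair, strategy) -> (without defense, with defense)
REPORTED = {
    ("en-de", "learn to deal"): (84.0, 62.2),
    ("en-fr", "learn to deal"): (84.6, 73.8),
    ("en-de", "learn to ignore"): (84.0, 27.2),
    ("en-fr", "learn to ignore"): (84.6, 37.0),
}

if __name__ == "__main__":
    for (pair, strategy), (before, after) in REPORTED.items():
        drop = point_reduction(before, after)
        print(f"{pair}, {strategy}: {before}% -> {after}% ({drop} points)")
```

Under this reading, a learn to ignore model that emits its dummy sentence for a noisy input counts as a failed attack, because the dummy sentence differs from the clean translation.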

First Page

518

Last Page

525

DOI

10.1109/TAI.2021.3123931

Publication Date

8-1-2022
