Date of Submission
6-12-2019
Date of Award
12-12-2019
Institute Name (Publisher)
Indian Statistical Institute
Document Type
Master's Dissertation
Degree Name
Master of Technology
Subject Name
Computer Science
Department
Computer Vision and Pattern Recognition Unit (CVPR-Kolkata)
Supervisor
Garain, Utpal (CVPRU-ISI)
Abstract (Summary of the Work)
Deep Neural Network based solutions are now deployed to solve numerous tasks, so it has become important to study the robustness of these systems. Machine Translation is one of the popular applications of Deep Neural Networks. This thesis studies the robustness of Neural Machine Translation systems by generating adversarial examples with the objective of fooling the model. Whenever there is a change in the source, i.e. when a word in the input sentence is replaced by an unrelated word, the translation system is supposed to reflect that change in its translation. A model that fails to do so has learned an invariance, which is undesirable. To exploit this undesirable property of a Neural Machine Translation system, we design an attack called the Invariance-based targeted attack. This attack introduces multiple changes (replacements of words) to the original input sentence while keeping the translation unchanged. To explain the design of the attack we introduce two methods: (i) the Min-Grad method, which identifies the position where replacing the word makes the least change in the translation, and (ii) the Soft-Attn method, which searches for a new replacement word, given a list of choices. The initial part of the report explains the preliminary explorations we carried out to gain insight into how to formulate the problem. These experiments are run on LSTM based models with a single replacement policy. Using what we learned from the first part, we extend the experiments to Transformer and BLSTM based models, which are considered state-of-the-art systems for machine translation.
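The position-selection idea behind the Min-Grad method can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the sketch assumes that "least change in the translation" is approximated by the smallest L2 norm of the gradient of the translation loss with respect to each input token embedding. The function name `min_grad_position` and the toy gradient values are hypothetical and are not taken from the dissertation.

```python
import math


def min_grad_position(grads):
    """Return the index of the token whose gradient vector has the smallest L2 norm.

    `grads` is a list with one gradient vector (list of floats) per source token.
    Assumption: a small gradient norm means that changing the token at that
    position is expected to change the translation the least.
    """
    norms = [math.sqrt(sum(g * g for g in vec)) for vec in grads]
    # On ties, min() returns the first index with the smallest norm.
    return min(range(len(norms)), key=norms.__getitem__)


# Hypothetical per-token gradients for a three-token source sentence.
toy_grads = [
    [0.9, -0.4],
    [0.05, 0.02],
    [0.3, 0.6],
]
position = min_grad_position(toy_grads)
```

In a real system the gradients would come from back-propagating the translation loss through the model to the input embeddings; here they are fixed numbers so that the selection step can be read in isolation.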
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.
Recommended Citation
P., Abijith K., "Adversarial Attack on Neural Machine Translation System" (2019). Master’s Dissertations. 400.
https://digitalcommons.isical.ac.in/masters-dissertations/400
Included in
Artificial Intelligence and Robotics Commons, Computer Engineering Commons, Cybersecurity Commons, Information Security Commons, OS and Networks Commons, Theory and Algorithms Commons
Comments
Master's dissertation submitted in 2019 (Roll No - CS1730)