Natural Language for Visual Question Answering.
Date of Submission
December 2018
Date of Award
Winter 12-12-2019
Institute Name (Publisher)
Indian Statistical Institute
Document Type
Master's Dissertation
Degree Name
Master of Technology
Subject Name
Computer Science
Department
Computer Vision and Pattern Recognition Unit (CVPR-Kolkata)
Supervisor
Garain, Utpal (CVPR-Kolkata; ISI)
Abstract (Summary of the Work)
Visual reasoning with compositional natural language instructions, as described in the recently released Cornell Natural Language Visual Reasoning (NLVR) dataset[1], is a challenging task: the model must build an accurate mapping between diverse phrases and the several objects placed in complex arrangements in the image. Natural language questions are inherently compositional and can be answered by reasoning about their decomposition into modular sub-problems. In the recently proposed End-to-End Module Networks (N2NMNs)[2], the network learns to predict a question-specific network architecture composed of a set of predefined modules. The model learns to generate network structures (by imitating expert demonstrations) while simultaneously learning the network parameters. We have implemented the N2NMN model for the NLVR task. By visualizing the N2NMN model on the NLVR dataset, we found that the model is unable to establish a correspondence between image features and textual features. We propose a modification to the N2NMN model to better capture the mapping between image and textual features.
Control Number
ISI-DISS-2018-396
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.
DOI
http://dspace.isical.ac.in:8080/jspui/handle/10263/6962
Recommended Citation
Sarkar, Debleena, "Natural Language for Visual Question Answering." (2019). Master’s Dissertations. 388.
https://digitalcommons.isical.ac.in/masters-dissertations/388
Comments
ProQuest Collection ID: http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqm&rft_dat=xri:pqdiss:28843743