Natural Language for Visual Question Answering.

Date of Submission

December 2018

Date of Award

Winter 12-12-2019

Institute Name (Publisher)

Indian Statistical Institute

Document Type

Master's Dissertation

Degree Name

Master of Technology

Subject Name

Computer Science

Department

Computer Vision and Pattern Recognition Unit (CVPR-Kolkata)

Supervisor

Garain, Utpal (CVPR-Kolkata; ISI)

Abstract (Summary of the Work)

Visual reasoning with compositional natural language instructions, as posed in the recently released Cornell Natural Language Visual Reasoning (NLVR) dataset [1], is a challenging task: the model must accurately map diverse phrases to the many objects placed in complex arrangements in the image. Natural language questions are inherently compositional and can be answered by reasoning over their decomposition into modular sub-problems. The recently proposed End-to-End Module Networks (N2NMNs) [2] learn to predict a question-specific network architecture composed of a set of predefined modules; the model learns to generate network structures (by imitating expert demonstrations) while simultaneously learning the network parameters. We have implemented the N2NMN model for the NLVR task. By visualizing the model's behavior on the NLVR dataset, we found that it fails to establish correspondences between image features and textual features. We therefore propose a modification to the N2NMN model that captures a better mapping between image and textual features.
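As a rough illustration of the mechanism summarized above, the following is a minimal PyTorch-style sketch of the core N2NMN idea: predefined neural modules assembled and executed according to a predicted, question-specific layout. The module classes (FindModule, AndModule, AnswerModule), tensor shapes, and stack-based postfix execution here are hypothetical, modeled on the general module-network design in [2], not the dissertation's actual implementation.

```python
# Hypothetical sketch of module-network assembly in the spirit of N2NMN [2].
# Names, shapes, and the layout vocabulary are illustrative assumptions.
import torch
import torch.nn as nn

class FindModule(nn.Module):
    """Attends over an image feature map, conditioned on a text embedding."""
    def __init__(self, img_dim, txt_dim):
        super().__init__()
        self.img_proj = nn.Conv2d(img_dim, txt_dim, kernel_size=1)
        self.score = nn.Conv2d(txt_dim, 1, kernel_size=1)

    def forward(self, img_feat, txt_vec):
        # img_feat: (B, C, H, W); txt_vec: (B, txt_dim)
        joint = self.img_proj(img_feat) * txt_vec[:, :, None, None]
        return torch.sigmoid(self.score(joint))  # attention map: (B, 1, H, W)

class AndModule(nn.Module):
    """Intersects two attention maps (elementwise minimum)."""
    def forward(self, att1, att2):
        return torch.minimum(att1, att2)

class AnswerModule(nn.Module):
    """Pools attended image features and predicts true/false."""
    def __init__(self, img_dim, num_answers=2):
        super().__init__()
        self.fc = nn.Linear(img_dim, num_answers)

    def forward(self, img_feat, att):
        # Attention-weighted average pooling over spatial positions.
        pooled = (img_feat * att).sum(dim=(2, 3)) / att.sum(dim=(2, 3)).clamp(min=1e-6)
        return self.fc(pooled)

def execute_layout(layout, img_feat, txt_vecs, modules):
    """Executes a predicted postfix layout, e.g. ["find", "find", "and", "answer"],
    with a stack, as in module-network execution."""
    stack, t = [], 0
    for token in layout:
        if token == "find":
            stack.append(modules["find"](img_feat, txt_vecs[t]))
            t += 1
        elif token == "and":
            b, a = stack.pop(), stack.pop()
            stack.append(modules["and"](a, b))
        elif token == "answer":
            stack.append(modules["answer"](img_feat, stack.pop()))
    return stack.pop()  # logits over {true, false}

# Example usage (hypothetical layout for a statement such as
# "there is a black item touching a blue item"):
# modules = {"find": FindModule(512, 128), "and": AndModule(), "answer": AnswerModule(512)}
# logits = execute_layout(["find", "find", "and", "answer"], img_feat, txt_vecs, modules)
```

In the full model, the layout itself is produced by a sequence-to-sequence layout policy over the question, which is the component trained by imitating expert demonstrations; the sketch above covers only the execution side.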

Comments

ProQuest Collection ID: http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqm&rft_dat=xri:pqdiss:28843743

Control Number

ISI-DISS-2018-396

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Permanent Link (Handle)

http://dspace.isical.ac.in:8080/jspui/handle/10263/6962

This document is currently not available here.
