Master’s Dissertations

Part 1. An Explainer for Information Retrieval Research. Part 2. Open Domain Complex Question Answering.

Sourav Saha, Indian Statistical Institute

Date of Submission

December 2020

Date of Award

Winter 12-12-2021

Institute Name (Publisher)

Indian Statistical Institute

Document Type

Master's Dissertation

Degree Name

Master of Technology

Subject Name

Computer Science

Department

Computer Vision and Pattern Recognition Unit (CVPR-Kolkata)

Supervisor

Mitra, Mandar (CVPR-Kolkata; ISI)

Abstract (Summary of the Work)

This thesis is organised in two parts. The first part addresses explainability in Information Retrieval (IR) research, with a focus on the performance of IR models. We present a toolkit, I-REX, to illustrate the performance and explainability of IR systems. It is an interactive interface built on top of Lucene that gives a white-box view of any proposed method. It is implemented as both a web-based and a shell-based interface to provide intuitive explanations and a view of the performance of IR systems. Baseline retrieval models such as LM, BM25 and DFR, together with a set of well-defined features, enable debugging the performance of retrieval experiments such as ad-hoc IR or query expansion. The second part addresses open-domain complex factoid Question Answering (QA). Creating annotated data for QA requires a lot of resources and is very time consuming. The available datasets are often domain specific and, most of the time, created for specific languages. We therefore focus mainly on answering questions in an unsupervised way. As benchmark data, we used the data provided by Lu et al. (Quest). It mainly focuses on complex questions that cannot be answered directly by knowledge graphs (KGs). Our architecture uses corpus signals over the various documents along with the traditional QA pipeline to answer complex questions. We propose a set of modified evaluation protocols to overcome some serious pitfalls in the evaluation measure used in Quest. We also compare the performance of our architecture with that of another neural benchmark model, DrQA. Experiments on this benchmark dataset show that our model significantly outperforms Quest and DrQA. We find this very encouraging, since DrQA is trained on SQuAD, TREC Questions, WebQuestions and WikiMovies, while our proposed method is unsupervised.

Comments

ProQuest Collection ID: http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqm&rft_dat=xri:pqdiss:28842708

Control Number

ISI-DISS-2020-19

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.

DOI

http://dspace.isical.ac.in:8080/jspui/handle/10263/7173

Recommended Citation

Saha, Sourav, "Part 1. An Explainer for Information Retrieval Research. Part 2. Open Domain Complex Question Answering." (2021). Master’s Dissertations. 19.
https://digitalcommons.isical.ac.in/masters-dissertations/19

This document is currently not available here.
