Increasing Retrieval Efficiency in Noisy Corpora.

Date of Submission

December 2018

Date of Award

Winter 12-12-2019

Institute Name (Publisher)

Indian Statistical Institute

Document Type

Master's Dissertation

Degree Name

Master of Technology

Subject Name

Computer Science


Computer Vision and Pattern Recognition Unit (CVPR-Kolkata)


Mitra, Mandar (CVPR-Kolkata; ISI)

Abstract (Summary of the Work)

This thesis addresses the problem of capturing word variations in a noisy corpus. We first attempted to solve the problem by combining string similarity and context similarity within the Generalized Language Model. Experiments showed, however, that this model did not improve retrieval performance. Investigating why the model performed poorly led us to a simple and effective alternative: a Query Expansion based method that improves retrieval performance.
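The abstract does not give the details of the Query Expansion method, so the following is only an illustrative sketch of the general idea under stated assumptions: query terms are expanded with vocabulary words that look like noisy variants of them. The function name `expand_query`, the threshold of 0.8, and the toy vocabulary are hypothetical, and Python's `difflib.SequenceMatcher` stands in for whatever string similarity measure the thesis actually uses.

```python
from difflib import SequenceMatcher


def expand_query(query_terms, vocabulary, threshold=0.8):
    """Return the query terms followed by vocabulary words that closely resemble them.

    Hypothetical helper for illustration only; not the method from the thesis.
    """
    expanded = list(query_terms)
    for term in query_terms:
        for word in vocabulary:
            if word in expanded:
                continue  # already part of the expanded query
            # ratio() lies in [0, 1]; 1.0 means the two strings are identical
            if SequenceMatcher(None, term, word).ratio() >= threshold:
                expanded.append(word)
    return expanded


# Toy vocabulary containing OCR-style variants (assumed data, not from the thesis)
vocabulary = ["retrieval", "retrieva1", "retreival", "corpus", "c0rpus", "noise"]
print(expand_query(["retrieval", "corpus"], vocabulary))
```

In a real system the candidate variants would more plausibly be filtered by context similarity as well, since string similarity alone can admit unrelated words that merely look alike.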



Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.

