Extension of SMART with Divergence From Randomness (DFR) Model.

Date of Submission

December 2010

Date of Award

Winter 12-12-2011

Institute Name (Publisher)

Indian Statistical Institute

Document Type

Master's Dissertation

Degree Name

Master of Technology

Subject Name

Computer Science


Computer Vision and Pattern Recognition Unit (CVPR-Kolkata)


Mitra, Mandar (CVPR-Kolkata; ISI)

Abstract (Summary of the Work)

The SMART information retrieval system is a sophisticated open-source text processing system based on the vector space model, developed over the last thirty-five years. The SMART system automatically generates vectors for any given text collection and set of queries, and then uses vector similarity to rank the document vectors [9]. The divergence from randomness (DFR) model of information retrieval was proposed in 2002 by Amati and van Rijsbergen [4]. Amati reported his experimental findings in his PhD thesis, which showed that the framework yields a family of nonparametric models that serve as baseline alternatives to the standard tf-idf model. The DFR model is widely used as a benchmark for evaluating other information retrieval models. Because the IR community needs an open-source implementation of the DFR model, we decided to extend the SMART system by implementing DFR-based retrieval within its framework. We hope this will enable IR researchers to run new experiments on the DFR model, and possibly improve it.


ProQuest Collection ID: http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqm&rft_dat=xri:pqdiss:28843111

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.


