The Bibliographic Citation Recommendation Problem.

Date of Submission

December 2015

Date of Award

Winter 12-12-2016

Institute Name (Publisher)

Indian Statistical Institute

Document Type

Master's Dissertation

Degree Name

Master of Technology

Subject Name

Computer Science


Computer Vision and Pattern Recognition Unit (CVPR-Kolkata)


Mitra, Mandar (CVPR-Kolkata; ISI)

Abstract (Summary of the Work)

An essential step in authoring a research paper is the inclusion of appropriate references, or citations. Incorporating relevant references increases the academic weight of a paper by presenting links to similar contributions and highlighting the novelty of the work under discussion. This task is becoming increasingly demanding as the volume of published work grows. Recommender systems for bibliographic citations aim to ease the burden on the author by suggesting possible references both globally and contextually. More than 200 papers have been published in the last two decades exploring various approaches to the problem. In spite of this, no definitive results are available about which approaches work best. Conflicting reports have been published regarding the relative effectiveness of content-based and collaborative-filtering-based techniques. Arguably the most important reason for this lack of consensus is the dearth of standardized test collections and evaluation protocols, such as those provided by TREC-like forums, which forces researchers to use their own data sets for experiments, a practice that makes objective comparison of techniques nearly impossible. The recent publication of “CiteseerX: A scholarly big data set” makes raw material available for addressing the problem, although it has yet to be turned into a standard test and evaluation framework. In this report, we discuss our efforts to design a test collection with a well-defined evaluation protocol by resolving problems with the data set and supplementing it with standard queries and their relevance judgments. We also report the performance of some standard recommendation approaches proposed in the literature on our test setup.


Creative Commons License

Creative Commons Attribution 4.0 International License

