Categorization and Automatic Linking of Web-Pages.

Date of Submission

December 2000

Date of Award

Winter 12-12-2001

Institute Name (Publisher)

Indian Statistical Institute

Document Type

Master's Dissertation

Degree Name

Master of Technology

Subject Name

Computer Science


Computer and Statistical Services Centre (CSSC)


Bagchi, Aditya (CSSC-Kolkata; ISI)

Abstract (Summary of the Work)

The abundant availability of information in electronic form, especially in hyper-text format, has made Text Categorization and Automatic Hyper-text Linking very important. In this dissertation an algorithm for Hierarchical Text Categorization is presented. The algorithm makes use of similarity computations based on keyword matching. A study on the application of the algorithm to documents collected from Google Directory-Science (http://directory, is presented.

In this dissertation an algorithm for Automatic Hyper-text Link Typing of HTML pages is also presented. The algorithm makes use of the similarity of part-pairs, obtained by dividing the documents into parts and comparing each part-pair individually. A set of measures to find the link type between any two documents is proposed.
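The abstract does not give the details of either algorithm, so the following Python sketch is only an illustration of the two ideas it names: keyword-matching similarity used to descend a category hierarchy, and pairwise comparison of document parts. Everything specific here is an assumption rather than the dissertation's method: the Jaccard overlap measure, the greedy top-down descent, the paragraph-based splitting, and the names `keywords`, `similarity`, `categorize`, `part_pair_similarities` and `EXAMPLE_TREE` are all hypothetical.

```python
import re

def keywords(text):
    """Lowercased word set; a stand-in for real keyword extraction (assumption)."""
    return set(re.findall(r"[a-z]+", text.lower()))

def similarity(a, b):
    """Jaccard overlap of two keyword sets (an assumed similarity measure)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def categorize(doc_text, root):
    """Greedy descent: at each level pick the child whose keywords match best."""
    doc = keywords(doc_text)
    node, path = root, [root["name"]]
    while node.get("children"):
        best = max(node["children"],
                   key=lambda c: similarity(doc, keywords(c["keywords"])))
        if similarity(doc, keywords(best["keywords"])) == 0.0:
            break  # no child shares any keyword; stop at the current level
        node = best
        path.append(node["name"])
    return path

def part_pair_similarities(doc_a, doc_b):
    """Split each document into parts (here: blank-line-separated paragraphs,
    an assumption) and score every part-pair individually."""
    parts_a = [p for p in doc_a.split("\n\n") if p.strip()]
    parts_b = [p for p in doc_b.split("\n\n") if p.strip()]
    return [[similarity(keywords(pa), keywords(pb)) for pb in parts_b]
            for pa in parts_a]

# Hypothetical toy hierarchy, not taken from the dissertation's data set.
EXAMPLE_TREE = {
    "name": "Science", "keywords": "", "children": [
        {"name": "Physics", "keywords": "quantum particle energy relativity",
         "children": [
             {"name": "Optics", "keywords": "light lens laser photon"},
             {"name": "Mechanics", "keywords": "force motion energy mass"},
         ]},
        {"name": "Biology", "keywords": "cell gene protein organism"},
    ],
}

print(categorize("The force on a particle changes its motion and energy",
                 EXAMPLE_TREE))
```

A link-typing step would then derive measures from the matrix returned by `part_pair_similarities`; which measures the dissertation proposes is not stated in the abstract, so none are sketched here.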


Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.


