Categorization and Automatic Linking of Web-Pages.

Date of Submission

December 2000

Date of Award

Winter 12-12-2001

Institute Name (Publisher)

Indian Statistical Institute

Document Type

Master's Dissertation

Degree Name

Master of Technology

Subject Name

Computer Science

Department

Computer and Statistical Services Centre (CSSC)

Supervisor

Bagchi, Aditya (CSSC-Kolkata; ISI)

Abstract (Summary of the Work)

The abundant availability of information in electronic form, especially in hypertext format, has made Text Categorization and Automatic Hypertext Linking very important. This dissertation presents an algorithm for Hierarchical Text Categorization that relies on similarity computations based on keyword matching, together with a study of its application to documents collected from the Science section of the Google Directory (http://directory.google.com). The dissertation also presents an algorithm for Automatic Hypertext Link Typing of HTML pages. This algorithm divides each document into parts and compares the parts pairwise, using the similarity of each part-pair individually. A set of measures for determining the link type between any two documents is proposed.
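
The abstract does not spell out the similarity measure or the traversal strategy, so the following is only an illustrative sketch of how keyword-matching similarity could drive a hierarchical categorizer. The `Category` class, the `keyword_similarity` and `categorize` functions, the use of Jaccard overlap, and the greedy top-down descent are all assumptions made for illustration and are not taken from the dissertation.

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    """A node in a hypothetical category hierarchy."""
    name: str
    keywords: set
    children: list = field(default_factory=list)


def keyword_similarity(doc_terms, keywords):
    """Jaccard overlap of two term sets (an assumed stand-in measure)."""
    if not doc_terms or not keywords:
        return 0.0
    return len(doc_terms & keywords) / len(doc_terms | keywords)


def categorize(text, root):
    """Descend the hierarchy greedily, following the best-matching child.

    Stops at a leaf, or as soon as no child shares a keyword with the text.
    Returns the list of category names visited below the root.
    """
    terms = set(text.lower().split())
    path = []
    node = root
    while node.children:
        best = max(node.children,
                   key=lambda c: keyword_similarity(terms, c.keywords))
        if keyword_similarity(terms, best.keywords) == 0.0:
            break
        path.append(best.name)
        node = best
    return path
```

A document would be passed as plain text along with the root of a category tree, and the returned path names the categories chosen at each level. A real system would likely need tokenization, stop-word removal, and term weighting beyond this whitespace split.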

Comments

ProQuest Collection ID: http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqm&rft_dat=xri:pqdiss:28843334

Control Number

ISI-DISS-2000-142

Creative Commons License

Creative Commons Attribution 4.0 International License

DOI

http://dspace.isical.ac.in:8080/jspui/handle/10263/6312

