Master’s Dissertations

Large Scale Hierarchical Text Classification.

Gourab Saha, Indian Statistical Institute

Date of Submission

December 2013

Date of Award

Winter 12-12-2014

Institute Name (Publisher)

Indian Statistical Institute

Document Type

Master's Dissertation

Degree Name

Master of Technology

Subject Name

Computer Science

Department

Computer Vision and Pattern Recognition Unit (CVPR-Kolkata)

Supervisor

Parui, Swapan Kumar (CVPR-Kolkata; ISI)

Abstract (Summary of the Work)

Due to the growing amount of textual data, automatic methods for organizing the data are needed. Automatic text classification is one of these methods. It automatically assigns documents to a set of classes based on the textual content of the document. Large-scale multi-label text classification is an emerging field because real web data contain several million samples and about half a million non-exclusive categories. This is a challenging task because it is hard for a single algorithm to achieve both performance and scalability at the same time. Normally, the set of classes is hierarchically structured, but most of today's classification approaches ignore hierarchical structures, thereby losing valuable human knowledge. This thesis exploits the hierarchical organization of classes to improve accuracy and reduce computational complexity. Experiments are performed on the Track 1 medium-size Wikipedia data set from the ECML/PKDD 2012 Discovery Challenge. A top-down hierarchical classification method is proposed that uses a local classifier at each intermediate node.
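The top-down idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation: the toy hierarchy, the toy documents, the names `LocalNB`, `train`, and `classify`, the choice of a small multinomial Naive Bayes model as the local classifier, and the restriction to one label per document are all assumptions made for illustration only. Each intermediate node trains a classifier over its own children, and a document is routed from the root downwards until it reaches a leaf.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy hierarchy (assumption): internal node -> its children.
HIERARCHY = {
    "root": ["science", "sports"],
    "science": ["physics", "biology"],
    "sports": ["football", "tennis"],
}

# Hypothetical toy training set (assumption): (tokens, leaf label).
TOY_DOCS = [
    ("quantum particle energy".split(), "physics"),
    ("cell gene protein".split(), "biology"),
    ("goal striker league".split(), "football"),
    ("racket serve court".split(), "tennis"),
]

def path_from_root(label, hierarchy):
    """Return the list of nodes from the root down to `label`."""
    parent = {c: p for p, children in hierarchy.items() for c in children}
    path = [label]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path[::-1]

class LocalNB:
    """Small multinomial Naive Bayes over the children of one node."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # child -> token counts
        self.doc_counts = Counter()              # child -> number of docs
        self.vocab = set()

    def add(self, tokens, child):
        self.word_counts[child].update(tokens)
        self.doc_counts[child] += 1
        self.vocab.update(tokens)

    def predict(self, tokens):
        total_docs = sum(self.doc_counts.values())
        best, best_score = None, -math.inf
        for child, n_docs in self.doc_counts.items():
            counts = self.word_counts[child]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus Laplace-smoothed log likelihood of each token.
            score = math.log(n_docs / total_docs)
            for token in tokens:
                score += math.log((counts[token] + 1) / denom)
            if score > best_score:
                best, best_score = child, score
        return best

def train(docs, hierarchy):
    """Train one local classifier per intermediate node."""
    models = defaultdict(LocalNB)
    for tokens, label in docs:
        path = path_from_root(label, hierarchy)
        # Each node on the path learns which of its children the doc went to.
        for node, child in zip(path, path[1:]):
            models[node].add(tokens, child)
    return models

def classify(tokens, models, hierarchy, root="root"):
    """Descend from the root, letting each local classifier pick a child."""
    node = root
    while node in hierarchy and node in models:
        node = models[node].predict(tokens)
    return node

models = train(TOY_DOCS, HIERARCHY)
print(classify("gene protein".split(), models, HIERARCHY))
```

In this sketch each local classifier only has to discriminate among the children of one node, so a prediction evaluates a handful of classes per level instead of every leaf category at once. A multi-label variant, as the abstract's setting would require, would need to follow more than one child per node, which this sketch does not attempt.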

Comments

ProQuest Collection ID: http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqm&rft_dat=xri:pqdiss:28843079

Control Number

ISI-DISS-2013-303

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.

DOI

http://dspace.isical.ac.in:8080/jspui/handle/10263/6460

Recommended Citation

Saha, Gourab, "Large Scale Hierarchical Text Classification." (2014). Master’s Dissertations. 66.
https://digitalcommons.isical.ac.in/masters-dissertations/66

This document is currently not available here.
