Journal Articles

Handling data irregularities in classification: Foundations, trends, and future challenges

Authors

Swagatam Das, Indian Statistical Institute, Kolkata
Shounak Datta, Indian Statistical Institute, Kolkata
Bidyut B. Chaudhuri, Indian Statistical Institute, Kolkata

Article Type

Research Article

Publication Title

Pattern Recognition

Abstract

Most of the traditional pattern classifiers assume their input data to be well-behaved in terms of similar underlying class distributions, balanced size of classes, the presence of a full set of observed features in all data instances, etc. Practical datasets, however, show up with various forms of irregularities that are, very often, sufficient to confuse a classifier, thus degrading its ability to learn from the data. In this article, we provide a bird's eye view of such data irregularities, beginning with a taxonomy and characterization of various distribution-based and feature-based irregularities. Subsequently, we discuss the notable and recent approaches that have been taken to make the existing stand-alone as well as ensemble classifiers robust against such irregularities. We also discuss the interrelation and co-occurrences of the data irregularities including class imbalance, small disjuncts, class skew, missing features, and absent (non-existing or undefined) features. Finally, we uncover a number of interesting future research avenues that are equally contextual with respect to the regular as well as deep machine learning paradigms.

First Page

674

Last Page

693

DOI

10.1016/j.patcog.2018.03.008

Publication Date

9-1-2018

Recommended Citation

Das, Swagatam; Datta, Shounak; and Chaudhuri, Bidyut B., "Handling data irregularities in classification: Foundations, trends, and future challenges" (2018). Journal Articles. 1267.
https://digitalcommons.isical.ac.in/journal-articles/1267
