Conference Articles

Clustering High-dimensional Data with Ordered Weighted ℓ1 Regularization

Chandramauli Chakraborty, Indian Statistical Institute, Kolkata
Sayan Paul, Indian Statistical Institute, Kolkata
Saptarshi Chakraborty, University of California, Berkeley
Swagatam Das, Indian Statistical Institute, Kolkata

Document Type

Conference Article

Publication Title

Proceedings of Machine Learning Research

Abstract

Clustering complex high-dimensional data is particularly challenging because the signal-to-noise ratio in such data is significantly lower than in classical low-dimensional data. This is mainly because most of the features describing a data point carry little to no information about the natural grouping of the data. Filtering such features is thus critical to harnessing meaningful information from such large-scale data. Many recent methods have attempted to find feature importance in a centroid-based clustering setting. Though empirically successful in classical low-dimensional settings, most perform poorly, especially on microarray and single-cell RNA-seq data. This paper extends the merits of weighted center-based clustering through the Ordered Weighted ℓ1 (OWL) norm for better feature selection. Appealing to the elegant properties of block coordinate descent and Frank-Wolfe algorithms, we not only maintain computational efficiency but also outperform the state of the art in high-dimensional settings. The proposal also comes with finite-sample theoretical guarantees, including a rate of O(√(k log p / n)) under model sparsity, bridging the gap between theory and practice of weighted clustering.
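
For readers unfamiliar with the OWL norm named in the abstract, the following is a minimal sketch of its standard definition: the absolute entries of a vector are sorted in decreasing order and paired with a non-negative, non-increasing weight sequence. The function name owl_norm and the example weights are illustrative assumptions, not taken from the paper, and the sketch shows only the norm itself, not the authors' clustering algorithm.

```python
import numpy as np


def owl_norm(x, w):
    """Ordered Weighted l1 norm: sum_i w[i] * |x|_(i).

    |x|_(i) is the i-th largest absolute entry of x, and w must be
    non-negative and non-increasing. (Hypothetical helper for illustration.)
    """
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    if x.shape != w.shape:
        raise ValueError("x and w must have the same shape")
    if np.any(w < 0) or np.any(np.diff(w) > 0):
        raise ValueError("w must be non-negative and non-increasing")
    # Sort absolute values in decreasing order, then weight them.
    abs_sorted = np.sort(np.abs(x))[::-1]
    return float(np.dot(w, abs_sorted))


x = [1.0, -3.0, 2.0]
l1_value = owl_norm(x, [1.0, 1.0, 1.0])    # equal weights recover the l1 norm
linf_value = owl_norm(x, [1.0, 0.0, 0.0])  # a single leading weight recovers the max norm
owl_value = owl_norm(x, [3.0, 2.0, 1.0])   # decreasing weights penalize larger entries more
```

With equal weights the sketch reduces to the ℓ1 norm, and with a single nonzero leading weight it reduces to the ℓ∞ norm, which is why the OWL norm is often described as interpolating between the two.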

First Page

7176

Last Page

7189

Publication Date

1-1-2023

Recommended Citation

Chakraborty, Chandramauli; Paul, Sayan; Chakraborty, Saptarshi; and Das, Swagatam, "Clustering High-dimensional Data with Ordered Weighted ℓ1 Regularization" (2023). Conference Articles. 588.
https://digitalcommons.isical.ac.in/conf-articles/588

This document is currently not available here.
