Pre-selection in cluster Lasso methods for correlated variable selection in high-dimensional linear models
Document Type
Conference Article
Publication Title
CEUR Workshop Proceedings
Abstract
We consider variable selection problems in high-dimensional sparse regression models with strongly correlated variables. To handle correlated variables, the approach of first clustering or grouping the variables and then fitting a model is widely accepted. When the dimension is very high, finding an appropriate group structure is as difficult as the original problem. We propose to use the Elastic-net as a pre-selection step for cluster Lasso methods (i.e., Cluster Group Lasso and Cluster Representative Lasso). The Elastic-net selects correlated relevant variables, but it fails to reveal the correlation structure among the active variables. We use cluster Lasso methods to address this shortcoming of the Elastic-net, and the Elastic-net in turn provides a reduced feature set for the cluster Lasso methods. We theoretically explore the group selection consistency of the proposed combination of algorithms under various conditions, i.e., the Irrepresentable Condition (IC), the Elastic-net Irrepresentable Condition (EIC), and the Group Irrepresentable Condition (GIC). We support the theory using simulated and real dataset examples.
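The two-stage idea described in the abstract can be illustrated with a minimal sketch; it is not the authors' implementation. It assumes standardized predictors, a plain coordinate-descent Elastic-net solver, a simple correlation-threshold clustering rule (a stand-in for whichever clustering the paper uses), and cluster means as representatives in the spirit of Cluster Representative Lasso. All function names, the penalty values, the 0.7 threshold, and the synthetic data are hypothetical choices made for illustration only.

```python
import numpy as np

def soft_threshold(z, t):
    # Scalar soft-thresholding operator used by the L1 penalty.
    return np.sign(z) * max(abs(z) - t, 0.0)

def elastic_net_cd(X, y, l1, l2=0.0, n_sweeps=200):
    """Coordinate descent for (1/2n)||y - Xb||^2 + l1*||b||_1 + (l2/2)*||b||_2^2.
    With l2 = 0 this reduces to the Lasso."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.astype(float).copy()
    for _ in range(n_sweeps):
        for j in range(p):
            resid += X[:, j] * beta[j]      # partial residual without variable j
            rho = X[:, j] @ resid / n
            beta[j] = soft_threshold(rho, l1) / (col_sq[j] + l2)
            resid -= X[:, j] * beta[j]
    return beta

def correlation_clusters(X, threshold=0.7):
    """Assumed clustering rule: connected components of the graph that links
    two columns when their sample correlation exceeds `threshold`."""
    p = X.shape[1]
    adj = np.atleast_2d(np.corrcoef(X, rowvar=False)) > threshold
    labels = -np.ones(p, dtype=int)
    current = 0
    for start in range(p):
        if labels[start] >= 0:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in np.flatnonzero(adj[i]):
                if labels[j] < 0:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

def preselect_then_cluster_lasso(X, y, l1_enet=0.1, l2_enet=0.5,
                                 l1_lasso=0.1, threshold=0.7):
    # Stage 1: Elastic-net pre-selection gives a reduced feature set.
    selected = np.flatnonzero(elastic_net_cd(X, y, l1_enet, l2_enet))
    if selected.size == 0:
        return selected, np.array([], dtype=int)
    # Stage 2: cluster only the pre-selected variables.
    labels = correlation_clusters(X[:, selected], threshold)
    # Stage 3: Lasso on cluster representatives (cluster means).
    reps = np.column_stack([X[:, selected[labels == k]].mean(axis=1)
                            for k in range(labels.max() + 1)])
    active = np.flatnonzero(elastic_net_cd(reps, y, l1_lasso))
    keep = np.isin(labels, active)
    return selected[keep], labels[keep]

# Synthetic example: columns 0-2 and 3-5 form two correlated groups,
# and only the first group drives the response.
rng = np.random.default_rng(0)
n, p = 100, 20
X = rng.standard_normal((n, p))
z = rng.standard_normal((n, 2))
X[:, 0:3] = z[:, [0]] + 0.1 * rng.standard_normal((n, 3))
X[:, 3:6] = z[:, [1]] + 0.1 * rng.standard_normal((n, 3))
y = X[:, 0:3].sum(axis=1) + 0.1 * rng.standard_normal(n)
X = (X - X.mean(axis=0)) / X.std(axis=0)
y = y - y.mean()

selected, labels = preselect_then_cluster_lasso(X, y)
print("selected variables:", selected)
print("cluster labels:    ", labels)
```

In this sketch the clustering step only sees the variables that survive the Elastic-net stage, which is the practical point of pre-selection: the group structure is estimated on a much smaller set than the full design matrix. Replacing the cluster-mean step with a group penalty over the same clusters would give a Cluster Group Lasso variant instead.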
First Page
43
Last Page
55
Publication Date
1-1-2017
Recommended Citation
Gauraha, Niharika and Parui, Swapan, "Pre-selection in cluster Lasso methods for correlated variable selection in high-dimensional linear models" (2017). Conference Articles. 318.
https://digitalcommons.isical.ac.in/conf-articles/318