Post lasso stability selection for high dimensional linear models

Document Type

Conference Article

Publication Title

ICPRAM 2017 - Proceedings of the 6th International Conference on Pattern Recognition Applications and Methods


Lasso and sub-sampling based techniques (e.g. Stability Selection) are nowadays the most commonly used methods for detecting the set of active predictors in high-dimensional linear models. The consistency of Lasso-based variable selection requires the strong irrepresentable condition on the design matrix to be fulfilled, and repeated sampling procedures over a large feature set make Stability Selection computationally slow. Alternatively, two-stage procedures (e.g. thresholding or adaptive Lasso) are used to achieve consistent variable selection under weaker conditions (sparse eigenvalue). Such two-step procedures involve choosing several tuning parameters, which seems easy in principle but is difficult in practice. To address these problems efficiently, we propose a new two-step procedure called Post Lasso Stability Selection (PLSS). In the first step, Lasso screening is applied with a small regularization parameter to generate a candidate subset of active features. In the second step, Stability Selection using weighted Lasso is applied to recover the most stable features from the candidate subset. We show that under a mild (generalized irrepresentable) condition, this approach yields a consistent variable selection method that is computationally fast even for a very large number of variables. Promising performance properties of the proposed PLSS technique are also demonstrated numerically using both simulated and real data examples.
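
The two steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the weights, the subsample size, the selection threshold, or the solver, so everything below is an assumption made for illustration. In particular, the function names `lasso_cd` and `plss`, the adaptive-style weights 1/|initial coefficient|, the half-size subsamples, and all default parameter values are hypothetical choices.

```python
import numpy as np


def lasso_cd(X, y, lam, weights=None, n_iter=200, tol=1e-6):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * sum_j w_j |b_j|."""
    n, p = X.shape
    w = np.ones(p) if weights is None else np.asarray(weights, dtype=float)
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = np.asarray(y, dtype=float).copy()  # residual for beta = 0
    for _ in range(n_iter):
        max_delta = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            old = beta[j]
            # Correlation of column j with the partial residual (j excluded).
            rho = X[:, j] @ resid / n + col_sq[j] * old
            # Soft-thresholding update with a feature-specific penalty.
            new = np.sign(rho) * max(abs(rho) - lam * w[j], 0.0) / col_sq[j]
            if new != old:
                resid -= X[:, j] * (new - old)
                beta[j] = new
                max_delta = max(max_delta, abs(new - old))
        if max_delta < tol:
            break
    return beta


def plss(X, y, lam_screen=0.05, lam_stab=0.1, n_subsamples=30,
         threshold=0.6, seed=0):
    """Two-step sketch: Lasso screening, then stability selection."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]

    # Step 1: Lasso screening with a small regularization parameter.
    beta_init = lasso_cd(X, y, lam_screen)
    candidates = np.flatnonzero(beta_init)
    if candidates.size == 0:
        return candidates, candidates, np.zeros(0)

    # ASSUMPTION: adaptive-style weights from the screening fit.
    weights = 1.0 / np.abs(beta_init[candidates])

    # Step 2: weighted Lasso on random half-samples of the candidate subset.
    counts = np.zeros(candidates.size)
    for _ in range(n_subsamples):
        idx = rng.choice(n, size=n // 2, replace=False)
        b = lasso_cd(X[idx][:, candidates], y[idx], lam_stab, weights)
        counts += (b != 0)
    freq = counts / n_subsamples  # selection frequency per candidate

    # Keep candidates selected in at least `threshold` of the subsamples.
    return candidates[freq >= threshold], candidates, freq
```

Calling `plss(X, y)` returns the stable features, the screened candidate set, and the selection frequency of each candidate. Because the second step only refits on the screened columns, each subsample fit works with far fewer features than the full design matrix, which is the source of the speed-up the abstract refers to.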

Open Access, Hybrid Gold
