Connectedness-based subspace clustering
Knowledge and Information Systems
This paper proposes an algorithm for density-based subspace clustering. Unlike existing density-based subspace clustering algorithms, which find clusters using spatial proximity, the proposed method groups features on the condition that they share common high-density regions. The method can find subspace clusters based on both linear and nonlinear relationships between features. Also unlike existing density-based subspace clustering algorithms, it does not require the user to supply the parameter values for density estimation. Instead, these values are calculated for each pair of features from the data distribution in the space corresponding to that pair. This allows the proposed approach to find subspace clusters in which relationships between different features exist at different scales. The performance of the proposed algorithm is compared with that of other subspace clustering methods on artificial and real-life datasets. The proposed method finds the subspace clusters embedded in 5 artificial datasets with a higher G score, and it finds subspace clusters corresponding to known classes in 4 real-life datasets with higher accuracy.
Jain, Namita and Murthy, C. A., "Connectedness-based subspace clustering" (2019). Journal Articles. 978.