Abstract
Sparse subspace clustering (SSC) reveals the structure and distribution of high-dimensional data from an algebraic perspective. It is a two-phase clustering technique: it first computes a sparse representation of the high-dimensional data and then cuts the induced affinity graph. This sequential scheme cannot achieve an optimal or expected clustering result. To address this challenge, this paper proposes subspace representation collaborative clustering (SRCC) for incomplete high-dimensional data. In the proposed model, the sparse subspace representation and clustering phases are integrated into a unified optimization, in which a fuzzy partition matrix is introduced as a bridge to cluster the extracted sparse representation features of the data. The missing entries are adaptively imputed alongside the two phases. To generalize SRCC to the semi-supervised case, an adjacency matrix of the incomplete data is constructed from ‘Must-link’ and ‘Cannot-link’ constraints. In addition, a semi-supervised indicator matrix is introduced to strengthen the model's capacity to reveal the global and local structures of incomplete data and to improve clustering performance. The resulting model is semi-supervised sparse representation collaborative clustering (S3RCC). Extensive experiments on real-world benchmark datasets demonstrate that the two proposed models outperform state-of-the-art methods on imputation and clustering of incomplete data.
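The abstract does not spell out how the ‘Must-link’ and ‘Cannot-link’ adjacency matrix is built. As a rough illustration only, the sketch below shows one common way to encode such pairwise constraints from partially known labels. The function name `pairwise_constraints`, the +1/-1/0 encoding, and the toy labels are assumptions made for this example, not the paper's construction.

```python
from typing import List, Optional


def pairwise_constraints(labels: List[Optional[int]]) -> List[List[int]]:
    """Build a signed constraint matrix from partially known labels.

    Entry [i][j] is +1 for a Must-link pair (both labels known and equal),
    -1 for a Cannot-link pair (both labels known and different), and 0 when
    at least one label is unknown (None) or when i == j.
    This encoding is an assumption for illustration.
    """
    n = len(labels)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if labels[i] is None or labels[j] is None:
                continue  # no supervision available for this pair
            value = 1 if labels[i] == labels[j] else -1
            matrix[i][j] = value
            matrix[j][i] = value  # pairwise constraints are symmetric
    return matrix


# Hypothetical toy data: five samples, the last two unlabeled.
labels = [0, 0, 1, None, None]
constraints = pairwise_constraints(labels)
```

Here samples 0 and 1 form a Must-link pair, each of them forms a Cannot-link pair with sample 2, and every pair involving an unlabeled sample is left unconstrained.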
Notes
The Yale face dataset: http://cvc.cs.yale.edu/cvc/projects/yalefaces/yalefaces.html
The Extended Yale B (EYB) face dataset: http://www.cad.zju.edu.cn/home/dengcai/Data/FaceData.html
The JAFFE face dataset: http://www.kasrl.org/jaffe.html
The AR face dataset: http://www2.ece.ohio-state.edu/aleix/ARdatabase.html
The COIL20 object dataset: https://www.cs.columbia.edu/CAVE/software/softlib/coil-20.php
The COIL100 object dataset: https://cave.cs.columbia.edu/repository/COIL-100
The Tomlins-2006-v1 dataset: https://schlieplab.org/Static/Supplements/CompCancer/datasets.htm
The Su-2001 dataset: https://schlieplab.org/Static/Supplements/CompCancer/datasets.htm
Acknowledgements
This work was supported by the National Natural Science Foundation of China under grant 12171115.
Ethics declarations
Competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
About this article
Cite this article
Deng, T., Wang, J., Jia, Q. et al. Semi-supervised sparse representation collaborative clustering of incomplete data. Appl Intell 53, 31077–31105 (2023). https://doi.org/10.1007/s10489-023-05168-1
DOI: https://doi.org/10.1007/s10489-023-05168-1