14 April 2023 Contrast clustering based on representation learning
Zhida Wang
Proceedings Volume 12613, International Conference on Computer Vision, Application, and Algorithm (CVAA 2022); 1261316 (2023) https://doi.org/10.1117/12.2673276
Event: International Conference on Computer Vision, Application, and Algorithm (CVAA 2022), 2022, Chongqing, China
Abstract
As an important research field of machine learning, clustering aims to partition unlabeled data samples into clusters. Traditional clustering methods such as K-means are concise and simple to use, but their performance on varied real-world data is very limited. Recently, clustering methods based on neural networks have been proposed to handle complicated, large-scale data. However, these methods are time-consuming and still struggle to extract effective features, which limits their application; moreover, most of them cannot be applied to online clustering. To address these problems, we propose the Representation Contrast Clustering (RCC) method in this paper. To strengthen feature extraction, we introduce contrastive learning into clustering, which makes it possible to extract clustering-friendly, effective features from complex data: even without labels, contrastive learning is comparable to supervised learning at extracting features. Moreover, RCC adopts a “pre-training & fine-tuning” structure, which saves clustering time and supports online clustering. In the pre-training phase, a contrastive learning framework combining data augmentation and neural networks extracts a cluster-friendly representation. In the fine-tuning phase, the representations are clustered by the “label-as-representation” strategy. Experimental results show that the proposed RCC method achieves state-of-the-art performance on most datasets. For example, on CIFAR-10, RCC’s NMI of 0.764 exceeds the best known methods by 6 percentage points, and its ACC of 0.855 is at least 6 percentage points higher than the others. In summary, RCC does not require the complete dataset in either the pre-training or the fine-tuning stage, so it can run on large data and be used for online clustering. The method is very efficient: once pre-training is completed, the downstream clustering task requires very little training time, while still achieving state-of-the-art performance on most datasets in the clustering experiments.
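The abstract describes a two-phase structure: a contrastive pre-training phase that learns cluster-friendly representations from augmented views, and a fine-tuning phase that assigns cluster labels directly from the representations. The paper does not give implementation details, so the following is only a minimal numpy sketch under common assumptions: the pre-training objective is taken to be a standard InfoNCE-style contrastive loss between two augmented views, and the “label-as-representation” assignment is sketched as an argmax over similarities to learnable cluster centers. All function names and the toy data are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def info_nce_loss(z1, z2, temperature=0.5):
    """Contrastive (InfoNCE-style) loss between two augmented views.

    z1, z2: (n, d) L2-normalized embeddings; row i of z1 and row i of z2
    form a positive pair, and all other rows act as negatives.
    """
    n = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0)            # (2n, d) both views stacked
    sim = z @ z.T / temperature                     # pairwise cosine similarities
    np.fill_diagonal(sim, -np.inf)                  # exclude self-similarity
    pos = np.concatenate([np.arange(n, 2 * n),      # positive index for each row
                          np.arange(n)])
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), pos].mean()  # -log p(positive | row)

def cluster_labels(z, centers):
    """Sketch of a 'label-as-representation' style assignment:
    the argmax over similarities to cluster centers is the label."""
    return (z @ centers.T).argmax(axis=1)

# Toy run: a second "view" is a small perturbation of the first.
z1 = rng.normal(size=(8, 16))
z1 /= np.linalg.norm(z1, axis=1, keepdims=True)
z2 = z1 + 0.05 * rng.normal(size=z1.shape)
z2 /= np.linalg.norm(z2, axis=1, keepdims=True)

loss = info_nce_loss(z1, z2)                        # pre-training objective
centers = rng.normal(size=(3, 16))                  # hypothetical cluster centers
labels = cluster_labels(z1, centers)                # fine-tuning-style assignment
```

In a real implementation the embeddings would come from a neural encoder trained by gradient descent on this loss, and the cluster head would be fine-tuned rather than randomly initialized; the sketch only shows the shape of the two phases.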
© (2023) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Zhida Wang "Contrast clustering based on representation learning", Proc. SPIE 12613, International Conference on Computer Vision, Application, and Algorithm (CVAA 2022), 1261316 (14 April 2023); https://doi.org/10.1117/12.2673276
KEYWORDS
Machine learning
Feature extraction
Education and training
Data modeling
Neural networks
Chemical elements
Matrices
