Relation chain based clustering analysis
15 August 2011
Cheng-ning Zhang, Ming-yang Zhao, Hai-bo Luo
Abstract
Clustering analysis is one of the well-developed branches of data mining. Its goal is to find the hidden structures in a multidimensional space called the feature space or pattern space. A datum in this space is usually a vector whose elements represent specifically selected features, typically chosen for their relevance to the problem at hand. Clustering analysis generally falls into two divisions: one based on agglomerative clustering and the other based on divisive clustering. The former is a bottom-up process that starts by treating each datum as a singleton cluster, while the latter is a top-down process that starts by treating the entire data set as one cluster. The literature surveyed indicates that divisive clustering currently dominates both application and research. Although several well-known divisive clustering methods have been designed and well developed, clustering problems are still far from solved. The k-means algorithm is the original divisive clustering method. It requires some important values to be assigned in advance, such as the number of clusters and the initial positions of the cluster prototypes, and these choices may be unreasonable in certain situations. Beyond the initialization problem, the k-means algorithm may also fall into a local optimum, it clusters in a rigid way, and it is not suited to non-Gaussian distributions. The search for a good or natural clustering result in fact originates from one's understanding of the concept of clustering, so confusion about or misunderstanding of the definition of clustering tends to produce unsatisfactory clustering results. The definition deserves deep and serious consideration. This paper examines the nature of clustering, offers a way of understanding clustering, discusses the methodology of designing a clustering algorithm, and proposes a new clustering method based on relation chains among 2D patterns.
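The sensitivity of k-means to its initial prototype positions can be illustrated with a small sketch. The code below is not taken from the paper: it is a plain implementation of the standard Lloyd iteration, and the function names (`kmeans`, `sq_dist`, `mean`), the four-point data set, and the two starting configurations are all assumptions chosen for illustration.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(xs) / len(cluster) for xs in zip(*cluster))


def kmeans(points, centers, iters=100):
    """Standard Lloyd iteration from caller-supplied initial centers.

    Returns the final centers and the sum of squared errors (SSE).
    """
    centers = list(centers)
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            clusters[j].append(p)
        # Update step: move each center to the mean of its cluster.
        new_centers = [mean(c) if c else centers[i]
                       for i, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    sse = sum(min(sq_dist(p, c) for c in centers) for p in points)
    return centers, sse


# Illustrative data: corners of a 4 x 1 rectangle (two tight pairs).
POINTS = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

# Two hypothetical initializations with the same k = 2.
_, sse_a = kmeans(POINTS, [(0.0, 0.0), (4.0, 0.0)])
_, sse_b = kmeans(POINTS, [(2.0, 0.0), (2.0, 1.0)])
print(sse_a, sse_b)  # 1.0 16.0
```

With the first initialization the iteration groups the left and right pairs (SSE 1.0). With the second it stays at a fixed point that groups the top and bottom pairs instead (SSE 16.0), which is a local optimum of the kind the abstract refers to.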
The new method presented in this paper is called relation chain based clustering. It demonstrates that arbitrary distribution shape and density are not the essential factors in clustering research. In other words, clusters that are usually described by particular expressions should instead be given a uniform mathematical description, which this paper calls a "relation chain". The relation chain indicates the relation between each pair of spatial points and evaluates the connection between the two points. The relation chain based clustering algorithm first assigns a neighborhood evaluation radius to the points. It then increases the radius while assessing the clustering result by the inner-cluster variance of each cluster, adjusts the radius accordingly, and finally outputs the clustering result. Experiments conducted with the proposed method show that the hidden data structure is well explored.
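The abstract does not give the details of the algorithm, so the following is only a rough sketch of one possible reading of the steps described above. It assumes that two points are "related" when they lie within the evaluation radius of each other, that a cluster is a set of points linked by a chain of such relations, and that the radius is increased until the largest inner-cluster variance exceeds a user-supplied limit. The names `chain_clusters`, `inner_variance`, `sweep_radius` and the parameter `max_var` are hypothetical, and the stopping rule is an assumption rather than the authors' criterion.

```python
from itertools import combinations
from math import dist


def chain_clusters(points, radius):
    """Group points linked by chains of pairwise distances <= radius."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if dist(points[i], points[j]) <= radius:
            parent[find(i)] = find(j)

    groups = {}
    for i, p in enumerate(points):
        groups.setdefault(find(i), []).append(p)
    return list(groups.values())


def inner_variance(cluster):
    """Mean squared distance of a cluster's points to its centroid."""
    n = len(cluster)
    center = tuple(sum(xs) / n for xs in zip(*cluster))
    return sum(dist(p, center) ** 2 for p in cluster) / n


def sweep_radius(points, radii, max_var):
    """Increase the radius; keep the last result within the variance limit.

    Assumed rule: stop once any cluster's inner variance exceeds max_var.
    Returns (radius, clusters), or None if even the smallest radius fails.
    """
    best = None
    for r in sorted(radii):
        clusters = chain_clusters(points, r)
        if max(inner_variance(c) for c in clusters) > max_var:
            break
        best = (r, clusters)
    return best


# Illustrative 2D data: two elongated rows, five units apart.
POINTS = [(float(x), y) for y in (0.0, 5.0) for x in range(4)]
radius, clusters = sweep_radius(POINTS, [0.5, 1.0, 1.5, 6.0], max_var=2.0)
print(radius, len(clusters))  # 1.5 2
```

In this toy example each row forms one chain once the radius reaches the spacing between neighbors, even though the rows are elongated rather than compact. At radius 6.0 the two rows merge, the inner-cluster variance jumps, and the sweep keeps the previous result.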
© (2011) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Cheng-ning Zhang, Ming-yang Zhao, and Hai-bo Luo "Relation chain based clustering analysis", Proc. SPIE 8196, International Symposium on Photoelectronic Detection and Imaging 2011: Space Exploration Technologies and Applications, 81960G (15 August 2011); https://doi.org/10.1117/12.899499
KEYWORDS
Data mining

Algorithm development

Associative arrays

Data modeling

Mathematical modeling

Prototyping

Current controlled current source
