Abstract:
We propose a new hierarchical clustering algorithm, based on competition theory, that handles non-convex and other complex cluster shapes in massive data sets at low computational cost. First, we partition the data into sub-clusters according to a given rudimentary clustering radius. Then, building on this first-level clustering, we establish a criterion that strengthens the association weight between sub-clusters; the criterion follows the idea of data competition and depends on the data density between the sub-clusters. Finally, sub-clusters whose association weights qualify are grouped into the resultant clusters, which resolves complex clustering problems such as non-convex clustering. The clustering accuracy and noise resistance of the new hierarchical clustering algorithm are superior to those of the traditional K-means algorithm and the density-based DBSCAN algorithm. Because of its low computational complexity, the proposed algorithm is suitable for clustering analysis of big data.
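
The three steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the competition criterion, so the association weight used here (the number of cross-cluster point pairs lying within the radius of each other) is only a simple stand-in for the density between sub-clusters. The function names and the `min_weight` threshold are likewise hypothetical.

```python
import math


def first_level(points, radius):
    """Step 1: greedy first-level clustering.

    Each point joins the first sub-cluster whose center lies within
    `radius`; otherwise it becomes the center of a new sub-cluster.
    """
    centers, labels = [], []
    for p in points:
        for i, c in enumerate(centers):
            if math.dist(p, c) <= radius:
                labels.append(i)
                break
        else:
            centers.append(p)
            labels.append(len(centers) - 1)
    return centers, labels


def association_weights(points, labels, radius):
    """Step 2 (assumed criterion, not taken from the paper).

    The weight of a sub-cluster pair is the number of point pairs, one
    point from each sub-cluster, that lie within `radius` of each other.
    The pairwise loop is quadratic and is kept only for clarity.
    """
    weights = {}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            a, b = labels[i], labels[j]
            if a != b and math.dist(points[i], points[j]) <= radius:
                key = (min(a, b), max(a, b))
                weights[key] = weights.get(key, 0) + 1
    return weights


def merge_subclusters(n_sub, weights, min_weight):
    """Step 3: union sub-clusters whose weight reaches `min_weight`."""
    parent = list(range(n_sub))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), w in weights.items():
        if w >= min_weight:
            parent[find(a)] = find(b)
    return [find(i) for i in range(n_sub)]


def cluster(points, radius, min_weight):
    """Run the three steps and return one final label per point."""
    centers, labels = first_level(points, radius)
    weights = association_weights(points, labels, radius)
    roots = merge_subclusters(len(centers), weights, min_weight)
    return [roots[label] for label in labels]


# Usage: two elongated groups that a single radius would split apart.
line_a = [(0.25 * k, 0.0) for k in range(21)]
line_b = [(20.0 + 0.25 * k, 0.0) for k in range(21)]
final_labels = cluster(line_a + line_b, radius=1.0, min_weight=3)
```

In this toy input each line is first cut into several sub-clusters, which are then chained back together because neighbouring sub-clusters share many close point pairs, while the two lines stay separate. Chaining densely connected sub-clusters is what allows elongated or non-convex groups to be recovered in this sketch.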