Hierarchical Clustering Algorithm Based on Competitive Learning

Abstract: To handle non-convex and other complex cluster shapes in massive data analysis while keeping computation fast, we propose a new fast hierarchical clustering algorithm based on the idea of competition. First, the data are partitioned into a number of small sub-clusters according to a given neighborhood radius (first-level clustering). Then, building on the first-level result and following the idea of data competition, we establish a criterion for increasing the association weight between sub-clusters, based on the data density between them. Finally, the association weights of connected sub-clusters are computed according to this criterion, and sub-clusters whose weights reach a weight threshold are merged, which resolves non-convex and other complex clustering problems. Simulation experiments show that the clustering accuracy and noise robustness of the algorithm are superior to those of the traditional K-means algorithm and the density-based DBSCAN (density-based spatial clustering of applications with noise) algorithm. Because its computational complexity is low, the algorithm is expected to be well suited to clustering analysis of big data.
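
The three steps in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the actual weight-increase criterion, so the rule used here is an assumption: for each data point, its two nearest sub-cluster centers "compete" for it, and if the runner-up lies within twice the radius, the association weight of that pair is increased by one. The function names (`first_level`, `association_weights`, `merge`, `cluster`) and the factor of 2 are hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def first_level(points, radius):
    """First-level pass: a point joins the nearest existing center within
    `radius`; otherwise it becomes the center of a new sub-cluster."""
    centers, labels = [], []
    for p in points:
        best, best_d = -1, radius
        for i, c in enumerate(centers):
            d = math.dist(p, c)
            if d <= best_d:
                best, best_d = i, d
        if best < 0:
            centers.append(p)
            best = len(centers) - 1
        labels.append(best)
    return centers, labels


def association_weights(points, centers, radius):
    """ASSUMED rule (not from the paper): the two centers closest to a point
    compete for it; if the runner-up is within 2 * radius, the weight of
    that pair of sub-clusters is incremented."""
    weights = defaultdict(int)
    for p in points:
        ranked = sorted(range(len(centers)),
                        key=lambda i: math.dist(p, centers[i]))
        if len(ranked) >= 2 and math.dist(p, centers[ranked[1]]) <= 2 * radius:
            a, b = sorted(ranked[:2])
            weights[(a, b)] += 1
    return weights


def merge(n, weights, threshold):
    """Union-find merge of sub-clusters whose weight reaches the threshold."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), w in weights.items():
        if w >= threshold:
            parent[find(a)] = find(b)
    return [find(i) for i in range(n)]


def cluster(points, radius, threshold):
    """Run both levels and return one final cluster id per input point."""
    centers, labels = first_level(points, radius)
    weights = association_weights(points, centers, radius)
    roots = merge(len(centers), weights, threshold)
    return [roots[label] for label in labels]


# Two elongated, well-separated chains of points (hypothetical toy data).
chain_a = [(0.5 * k, 0.0) for k in range(7)]
chain_b = [(0.5 * k, 10.0) for k in range(7)]
result = cluster(chain_a + chain_b, radius=1.0, threshold=3)
```

On this toy data the first-level pass splits each chain into three sub-clusters, and the merge step joins them back into one cluster per chain because points between neighboring centers are dense. Raising `threshold` above the accumulated weights leaves the sub-clusters unmerged, which is how the threshold controls sensitivity to sparse, noisy bridges between clusters.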

     
