LI Zhigang, LYU Li, TAN Dekun, KANG Ping, FAN Tanghuai. Density Peaks Clustering Algorithm Based on Weighted Kernel Density Estimation and Micro-cluster Merging[J]. INFORMATION AND CONTROL, 2024, 53(3): 302-314. DOI: 10.13976/j.cnki.xk.2024.3013

Density Peaks Clustering Algorithm Based on Weighted Kernel Density Estimation and Micro-cluster Merging

  • Abstract: The density peaks clustering (DPC) algorithm is a density-based clustering algorithm that is widely used because of its simplicity and efficiency. However, DPC tends to split a single high-density cluster into multiple clusters and is highly prone to chained assignment errors, in which one misassigned sample causes the samples that follow it to be misassigned as well. To address these problems, we propose a density peaks clustering algorithm based on weighted kernel density estimation and micro-cluster merging (WEMCM-DPC). The algorithm redefines the local density using kernel density estimation and weighted K-nearest neighbors, which narrows the local density gap between high-density clusters and sparse clusters and makes the identification of cluster centers more accurate. It also introduces a new similarity measure between micro-clusters that reduces the influence of overly sparse or overly dense samples on other samples, provides a basis for merging micro-clusters, and mitigates the chained assignment errors of DPC, making the clustering results more accurate. Experiments on datasets with uneven density distributions and on real-world datasets show that WEMCM-DPC produces better clustering results than DPC and four improved algorithms.
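The abstract does not state the formulas used by WEMCM-DPC, so the following is only a rough sketch of the general idea behind density peaks clustering with a K-nearest-neighbor kernel density, not the authors' definition. The Gaussian kernel form, the fixed bandwidth `h`, and all function names (`knn_kernel_density`, `dpc_delta`, `select_centers`) are assumptions made for illustration. The weighting scheme and the micro-cluster merging step are not reproduced here.

```python
import math


def pairwise_distances(points):
    """Euclidean distance matrix for a list of equal-length tuples."""
    return [[math.dist(p, q) for q in points] for p in points]


def knn_kernel_density(dist, k, h):
    """Assumed local density: Gaussian kernel summed over the k nearest neighbors.

    This is a generic KNN kernel density, not the weighted form from the paper.
    """
    rho = []
    for row in dist:
        neighbors = sorted(row)[1:k + 1]  # drop the zero self-distance
        rho.append(sum(math.exp(-(d / h) ** 2) for d in neighbors))
    return rho


def dpc_delta(dist, rho):
    """Standard DPC relative distance: distance to the nearest denser point.

    The densest point has no denser neighbor, so it gets its largest distance.
    """
    delta = []
    for i, row in enumerate(dist):
        higher = [row[j] for j in range(len(rho)) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(row))
    return delta


def select_centers(points, k, h, n_centers):
    """Return indices of the n_centers points with the largest rho * delta."""
    dist = pairwise_distances(points)
    rho = knn_kernel_density(dist, k, h)
    delta = dpc_delta(dist, rho)
    gamma = [r * d for r, d in zip(rho, delta)]
    ranked = sorted(range(len(points)), key=lambda i: gamma[i], reverse=True)
    return ranked[:n_centers]


if __name__ == "__main__":
    # One dense cluster near the origin and one sparse cluster near (10, 10).
    pts = [(0, 0), (0.1, 0), (0, 0.1), (-0.1, 0), (0, -0.1),
           (10, 10), (11, 10), (10, 11), (9, 10), (10, 9)]
    print(sorted(select_centers(pts, k=3, h=1.0, n_centers=2)))
```

In this toy setup the sparse cluster's center has a much lower density than every point of the dense cluster, which is the kind of density gap the abstract says the redefined local density is meant to narrow.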

     
