WANG Kang, MA Zongfang, TIAN Hongpeng, SONG Lin. Belief Density Peak Clustering Algorithm for Uncertain Data[J]. INFORMATION AND CONTROL, 2022, 51(3): 349-360. DOI: 10.13976/j.cnki.xk.2022.1203
Citation: WANG Kang, MA Zongfang, TIAN Hongpeng, SONG Lin. Belief Density Peak Clustering Algorithm for Uncertain Data[J]. INFORMATION AND CONTROL, 2022, 51(3): 349-360. DOI: 10.13976/j.cnki.xk.2022.1203

Belief Density Peak Clustering Algorithm for Uncertain Data

More Information
  • Received Date: May 23, 2021
  • Revised Date: November 29, 2021
  • Accepted Date: July 20, 2021
  • Available Online: December 01, 2022
  • Published Date: June 19, 2022
  • The density peak clustering algorithm is simple and efficient and does not require iterative calculations. It has the advantages of setting the number of clusters in advance, but it is easy to produce a "domino"effect when dividing non-centered samples. Moreover, it cannot accurately partition the samples and noise in the overlapping area. To solve the above problems, the belief density peak clustering algorithm for uncertain data is proposed. First, the algorithm uses the K-nearest neighbors of non-class center samples to determine the degree of belief of the samples belonging to different clusters based on the density peak clustering algorithm so as to obtain the cluster center samples and partition the samples into a meta-cluster with the largest degree of belief to obtain the preliminary clustering results of K-nearest neighbors. Then, the upper quantile of the density is calculated to obtain the density threshold and credal partition under the framework of evidence reasoning, and isolated samples whose density is less than the threshold are classified into the noise cluster. Afterward, the samples in the overlapping part are partitioned into the composite cluster composed of related single clusters. The degree of belief strongly supports the classification of samples belonging to a certain cluster into the corresponding single cluster. The algorithm introduces the composite cluster and noise cluster to accurately show the uncertainty of the sample under the existing attribute information. Experimental results show that this algorithm can achieve better clustering performance compared with other algorithms on artificial and UCI datasets.

  • [1]
    陈叶旺, 申莲莲, 钟才明, 等. 密度峰值聚类算法综述[J]. 计算机研究与发展, 2020, 57(2): 378-394. https://www.cnki.com.cn/Article/CJFDTOTAL-JFYZ202002013.htm

    Chen Y W, Shen L L, Zhong C M, et al. A review of density peak clustering algorithms[J]. Journal of Computer Research and Development, 2020, 57(2): 378-394. https://www.cnki.com.cn/Article/CJFDTOTAL-JFYZ202002013.htm
    [2]
    丁伟, 王宏伟, 崔龙, 等. 基于聚类分析和运动描述语言的扑翼飞行机器人行为规划[J]. 信息与控制, 2021, 50(1): 102-112. doi: 10.13976/j.cnki.xk.2021.0211

    Ding W, Wang H W, Cui L, et al. Behavior planning of flapping-wing flying robot based on cluster analysis and motion description language[J]. Information and Control, 2021, 50(1): 102-112. doi: 10.13976/j.cnki.xk.2021.0211
    [3]
    章永来, 周耀鉴. 聚类算法综述[J]. 计算机应用, 2019, 39(7): 1869-1882. https://www.cnki.com.cn/Article/CJFDTOTAL-JSJY201907002.htm

    Zhang Y L, Zhou Y J. Review of clustering algorithms[J]. Computer Applications, 2019, 39(7): 1869-1882. https://www.cnki.com.cn/Article/CJFDTOTAL-JSJY201907002.htm
    [4]
    Alex R, Alessandro L. Clustering by fast search and find of density peaks[J]. Science, 2014, 344(6191): 1492-1496. doi: 10.1126/science.1242072
    [5]
    金辉, 钱雪忠. 自然最近邻优化的密度峰值聚类算法[J]. 计算机科学与探索, 2019, 13(4): 711-720. https://www.cnki.com.cn/Article/CJFDTOTAL-KXTS201904019.htm

    Jin H, Qian X Z. Density peak clustering algorithm based on natural nearest neighbour optimization[J]. Computer Science and Exploration, 2019, 13(4): 711-720. https://www.cnki.com.cn/Article/CJFDTOTAL-KXTS201904019.htm
    [6]
    Li C Z, Zhang Y N. Density peak clustering based on relative density optimization[J]. Mathematical Problems in Engineering, 2020. DOI: 10.1155/2020/2816102.
    [7]
    Kamali T, Stashuk D W, Discovering density-based clustering structures using neighborhood distance entropy consistency[J]. IEEE Transactions on Computational Social Systems, 2020, 7(4): 1069-1080. doi: 10.1109/TCSS.2020.3003538
    [8]
    丁志成, 葛洪伟. 优化分配策略的密度峰值聚类算法[J]. 计算机科学与探索, 2020, 14(5): 792-802. https://www.cnki.com.cn/Article/CJFDTOTAL-KXTS202005009.htm

    Ding Z C, Ge H W. A density peak clustering algorithm for optimal allocation strategy[J]. Computer Science and Exploration, 2020, 14(5): 792-802. https://www.cnki.com.cn/Article/CJFDTOTAL-KXTS202005009.htm
    [9]
    Jiang J H, Hao D H, Chen Y J, et al. GDPC: Gravitation-based density peaks clustering algorithm[J]. Physica A: Statistical Mechanics and its Applications, 2018, 502: 345-355. doi: 10.1016/j.physa.2018.02.084
    [10]
    贾露, 张德生, 吕端端. 物理学优化的密度峰值聚类算法[J]. 计算机工程与应用, 2020, 56(13): 47-53. https://www.cnki.com.cn/Article/CJFDTOTAL-JSGG202013006.htm

    Jia L, Zhang D S, Lü D D. Physical optimization of density peak clustering algorithm[J]. Computer Engineering and Applications, 2020, 56(13): 47-53. https://www.cnki.com.cn/Article/CJFDTOTAL-JSGG202013006.htm
    [11]
    Yu D H, Liu G J, Guo M Z, et al. Density peaks clustering based on weighted local density sequence and nearest neighbor assignment[J]. IEEE Access, 2019, 7: 34301-34317. doi: 10.1109/ACCESS.2019.2904254
    [12]
    Liu R, Wang H, Yu X M. Shared-nearest-neighbor-based clustering by fast search and find of density peaks[J]. Information Sciences, 2018, 450: 200-226. doi: 10.1016/j.ins.2018.03.031
    [13]
    朱庆峰, 葛洪伟. K近邻相似度优化的密度峰聚类[J]. 计算机工程与应用, 2019, 55(2): 148-153. https://www.cnki.com.cn/Article/CJFDTOTAL-JSGG201902024.htm

    Zhu Q F, Ge H W. Density peaks clustering optimized by K-nearest neighbor's similarity[J]. Computer Engineering and Applications, 2019, 55(2): 148-153. https://www.cnki.com.cn/Article/CJFDTOTAL-JSGG201902024.htm
    [14]
    王鸿飞, 刘海斌, 邓鑫洋, 等. 基于幂均算子和证据理论的故障诊断方法[J]. 信息与控制, 2019, 48(5): 567-572. doi: 10.13976/j.cnki.xk.2019.8531

    Wang H F, Liu H B, Deng X Y, et al. Fault diagnosis method based on power-average operator and evidence theory[J]. Information and Control, 2019, 48(5): 567-572. doi: 10.13976/j.cnki.xk.2019.8531
    [15]
    Shafer G. A mathematical theory of evidence turns 40[J]. International Journal of Approximate Reasoning, 2016, 79: 7-25. doi: 10.1016/j.ijar.2016.07.009
    [16]
    Ma Z F, Tian H P, Liu Z C, et al. A new incomplete pattern belief classification method with multiple estimations based on KNN[J]. Applied Soft Computing, 2020. DOI: 10.1016/j.asoc.2020.106175.
    [17]
    Ma Z F, Liu Z, Zhang Y R, et al. Credal transfer learning with multi-estimation for missing data[J]. IEEE Access, 2020, 8: 70316-70328. doi: 10.1109/ACCESS.2020.2983319
    [18]
    曹可劲, 赵宗贵, 江汉. 不确定性证据聚类问题讨论[J]. 信息与控制, 2006, 35(1): 55-58. http://xk.sia.cn/CN/

    Cao K J, Zhao Z G, Jiang H. Discussion on the nonspecific evidence clustering problem[J]. Information and Control, 2006, 35(1): 55-58. http://xk.sia.cn/CN/
    [19]
    段中兴, 毕瀚元, 张作伟. 基于D-S证据理论的不完整数据混合分类算法[J]. 信息与控制, 2020, 49(4): 455-463. doi: 10.13976/j.cnki.xk.2020.9359

    Duan Z X, Bi H Y, Zhang Z W. A hybrid classification algorithm for incomplete data based on d-s evidence theory[J]. Information and Control, 2020, 49(4): 455-463. doi: 10.13976/j.cnki.xk.2020.9359
    [20]
    柯小路, 马荔瑶, 李子懿, 等. 证据推理规则的性质研究及方法修正[J]. 信息与控制, 2016, 45(2): 165-170. doi: 10.13976/j.cnki.xk.2016.0165

    Ke X L, Ma L Y, Li Z Y, et al. Property research and approach modification of evidential reasoning rule[J]. Information and Control, 2016, 45(2): 165-170. doi: 10.13976/j.cnki.xk.2016.0165
    [21]
    Du M J, Ding S F, Jia H J. Study on density peaks clustering based on k-nearest neighbors and principal component analysis[J]. Knowledge-Based Systems, 2016. DOI: 10.1016/j.knosys.2016.02.001.
    [22]
    Shafer G. A Mathematical theory of evidence[M]. Princeton, USA: Princeton University Press, 1976: 56-85.
    [23]
    Dempster A P. Upper and lower probabilities induced by multi-valued mapping[J]. Annals of Mathematical Statistics, 1967, 38(4): 325-339.
    [24]
    Smets P. Analyzing the combination of conflicting belief functions[J]. Information Fusion, 2007, 8(4): 387-412. doi: 10.1016/j.inffus.2006.04.003
    [25]
    Su Z G, Denoeux T. BPEC: Belief-peaks evidential clustering[J]. IEEE Transactions on Fuzzy Systems, 2019, 27(1): 111-123. doi: 10.1109/TFUZZ.2018.2869125
    [26]
    丁志成, 葛洪伟, 周竞. 基于KL散度的密度峰值聚类算法[J]. 重庆邮电大学学报(自然科学版), 2019, 31(3): 367-374. https://www.cnki.com.cn/Article/CJFDTOTAL-CASH201903011.htm

    Ding Z C, Ge H W, Zhou J. Density peak clustering algorithm based on KL divergence[J]. Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition), 2019, 31(3): 367-374. https://www.cnki.com.cn/Article/CJFDTOTAL-CASH201903011.htm
    [27]
    Liu Z G, Pan Q, Dezert J, et al. Credal C-means clustering method based on belief functions[J]. Knowledge-Based Systems, 2015, 74(5): 119-132.
  • Related Articles

    [1]ZHANG Wei, ZHENG Hongxuan. Multi-strategy Competitive Swarm Optimization Algorithm Based on Fusion Index Partition[J]. INFORMATION AND CONTROL, 2025, 54(1): 97-109. DOI: 10.13976/j.cnki.xk.2024.3459
    [2]LI Zhigang, LYU Li, TAN Dekun, KANG Ping, FAN Tanghuai. Density Peaks Clustering Algorithm Based on Weighted Kernel Density Estimation and Micro-cluster Merging[J]. INFORMATION AND CONTROL, 2024, 53(3): 302-314. DOI: 10.13976/j.cnki.xk.2024.3013
    [3]WU Xiaodong, XIONG Weili. kNN Fault Detection Based on Multi-block Information Extraction and Mahalanobis Distance[J]. INFORMATION AND CONTROL, 2021, 50(3): 287-296. DOI: 10.13976/j.cnki.xk.2021.0279
    [4]ZHANG Cheng, ZHENG Xiaofang, GUO Qingxiu, FENG Liwei, DAI Xunian, LI Yuan. Fault Detection Method Based on Neighborhood Preserving Embedding-weighted k-nearest Neighbors and its Application in Semiconductor Etching Process[J]. INFORMATION AND CONTROL, 2019, 48(6): 738-744. DOI: 10.13976/j.cnki.xk.2019.9234
    [5]KE Xiaolu, MA Liyao, LI Ziyi, WANG Yong. Property Research and Approach Modification of Evidential Reasoning Rule[J]. INFORMATION AND CONTROL, 2016, 45(2): 165-170. DOI: 10.13976/j.cnki.xk.2016.0165
    [6]GUO Jinyu, CHEN Haibin, LI Yuan. kNN Fault Detection Method for Batch Process Based on Principal Sample Modeling Upgraded Online[J]. INFORMATION AND CONTROL, 2014, 43(4): 495-500. DOI: 10.13976/j.cnki.xk.2014.0495
    [7]ZHANG Yuxian, LI Song, DONG Xiao. Multiple Neural Network Model Based on Data Partition Using Feature Clustering[J]. INFORMATION AND CONTROL, 2013, 42(6): 693-699. DOI: 10.3724/SP.J.1219.2013.00693
    [8]WANG Gaitang, LI Ping, SU Chengli. Soft-Sensing Model Based on Multiple K-Nearest Neighbour Regression Algorithm[J]. INFORMATION AND CONTROL, 2011, 40(5): 639-645.
    [9]WANG Saifang, DAI Fang, LIANG Bo, ZHANG Xiaoyu. A Path-based Clustering Algorithm of Partition[J]. INFORMATION AND CONTROL, 2011, 40(1): 141-144.
    [10]LIN Jian-ning, WU Hui-zhong. Research on a Trust Model Based on QoS Constraints[J]. INFORMATION AND CONTROL, 2007, 36(4): 427-433.
  • Cited by

    Periodical cited type(2)

    1. 赵嘉,马清,陈蔚昌,肖人彬,崔志华,潘正祥. 面向流形数据的加权自然近邻密度峰值聚类算法. 兰州大学学报(自然科学版). 2024(05): 652-660+669 .
    2. 张利,路颜萍,侯晴,张皓博. K近邻空间密度分布的模糊聚类算法. 辽宁大学学报(自然科学版). 2023(04): 289-301 .

    Other cited types(5)

Catalog

    Article views (162) PDF downloads (40) Cited by(7)
    Related

    /

    DownLoad:  Full-Size Img  PowerPoint
    Return
    Return