Abstract:
The density peak clustering algorithm is simple and efficient, requires no iterative calculation, and does not need the number of clusters to be set in advance. However, it is prone to a "domino" effect when assigning non-center samples, and it cannot accurately partition noise or the samples in overlapping areas. To solve these problems, a belief density peak clustering algorithm for uncertain data is proposed. First, building on the density peak clustering algorithm to obtain the cluster center samples, the algorithm uses the K-nearest neighbors of the non-center samples to determine each sample's degree of belief in the different clusters, and partitions each sample into the meta-cluster with the largest degree of belief to obtain the preliminary K-nearest-neighbor clustering result. Then, under the framework of evidential reasoning, the upper quantile of the density is calculated to obtain the density threshold and the credal partition, and isolated samples whose density is less than the threshold are classified into the noise cluster. Afterward, samples in the overlapping part are partitioned into the composite cluster composed of the related single clusters, while samples whose degree of belief strongly supports a certain cluster are classified into the corresponding single cluster. By introducing the composite cluster and the noise cluster, the algorithm accurately shows the uncertainty of a sample under the existing attribute information. Experimental results on artificial and UCI datasets show that this algorithm achieves better clustering performance than other algorithms.
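
As a rough illustration of the density quantities the abstract refers to, the following Python sketch computes the two standard density peak statistics (local density and distance to the nearest denser sample) and flags low-density samples with a quantile threshold. The Gaussian kernel, the cutoff `d_c`, the function names, and the quantile level `q` are assumptions made for illustration, not values or definitions taken from this paper; the K-nearest-neighbor belief assignment and the composite clusters are not shown.

```python
import numpy as np

def density_peaks(X, d_c):
    """Return local density rho and separation delta for each row of X."""
    # Pairwise Euclidean distances, shape (n, n).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Gaussian-kernel density; subtract 1 to drop each sample's own exp(0) term.
    rho = np.exp(-(dist / d_c) ** 2).sum(axis=1) - 1.0
    # delta: distance to the nearest sample with strictly higher density.
    n = len(X)
    delta = np.empty(n)
    for i in range(n):
        higher = rho > rho[i]
        # By convention, the densest sample gets its largest distance.
        delta[i] = dist[i, higher].min() if higher.any() else dist[i].max()
    return rho, delta

def noise_mask(rho, q=0.05):
    """Flag samples whose density is below the q-quantile of rho.

    q is a hypothetical level chosen for this sketch only.
    """
    return rho < np.quantile(rho, q)
```

Samples with both large `rho` and large `delta` are the usual candidates for cluster centers, while samples flagged by `noise_mask` correspond loosely to the noise cluster described above.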