Improved Rough Fuzzy K-means Clustering based on Imbalanced Measure of Cluster Sizes

  • Abstract: Rough fuzzy K-means (RFKM) clustering combines the complementary strengths of rough sets and fuzzy sets and is an effective method for clustering data with fuzzy boundaries. However, most existing RFKM algorithms and their variants consider only the fuzzy measure of the spatial distribution of samples within a cluster and ignore the effect of imbalanced cluster sizes on the clustering result, so they adapt poorly to datasets whose clusters differ greatly in size. To handle such datasets directly at the algorithmic level, we introduce an adaptive measure of the degree of cluster-size imbalance and, building on the existing RFKM algorithm, propose an improved RFKM clustering algorithm based on this measure. Experiments on an artificial dataset and UCI standard datasets demonstrate the effectiveness of the algorithm.

     
