Abstract:
The rough fuzzy K-means (RFKM) algorithm, which combines the advantages of rough sets and fuzzy sets, is an effective method for handling data with fuzzy boundaries. Most existing RFKM algorithms and their improved variants consider only the imbalanced spatial distribution of samples within a cluster and ignore the impact of imbalanced cluster sizes on the clustering results. As a result, these algorithms may adapt poorly to imbalanced datasets. To address this problem at the algorithmic level, we introduce a measure of the degree of imbalance in cluster sizes. Building on the existing RFKM algorithm, we then develop an improved RFKM clustering algorithm based on this measure of cluster-size imbalance. The validity of the algorithm is demonstrated through experimental analysis on an artificial dataset and standard UCI datasets.