Optimization of Text Clustering Based on EM Algorithm
Abstract
A text clustering optimization model (TCOM) based on the expectation-maximization (EM) algorithm is proposed to address the problem that existing text clustering algorithms cannot achieve satisfactory results. The model describes the similarity distributions of similar and non-similar cluster pairs, and the importance distributions of important and unimportant documents. The TCOM-based method improves performance by merging the results of different text clusterings. Experimental results show that both clustering precision and recall are improved, and that the method outperforms both hard and soft clustering methods.
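The abstract does not give TCOM's formulation, so the following is only a rough sketch of the general idea of using EM to separate "similar" from "non-similar" cluster pairs by their similarity scores. It assumes, purely for illustration, that the scores follow a two-component one-dimensional Gaussian mixture; the function name `em_two_gaussians` and the sample scores are hypothetical and are not taken from the paper.

```python
import math


def em_two_gaussians(values, iters=50):
    """Fit a two-component 1-D Gaussian mixture to `values` with EM.

    Illustrative assumption only: this is not the paper's TCOM model.
    Returns (weights, means, variances, resp), where resp[i] is the
    posterior probability that values[i] belongs to component 1
    (the component whose mean is initialized at the largest value).
    """
    lo, hi = min(values), max(values)
    means = [lo, hi]                          # initialize at the extremes
    spread = max((hi - lo) ** 2 / 4.0, 1e-6)  # shared initial variance
    variances = [spread, spread]
    weights = [0.5, 0.5]
    resp = [0.5] * len(values)

    def pdf(x, mu, var):
        return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

    for _ in range(iters):
        # E-step: posterior probability of component 1 for each value.
        resp = []
        for x in values:
            p0 = weights[0] * pdf(x, means[0], variances[0])
            p1 = weights[1] * pdf(x, means[1], variances[1])
            total = p0 + p1
            resp.append(p1 / total if total > 0.0 else 0.5)

        # M-step: re-estimate weights, means, and variances.
        n1 = sum(resp)
        n0 = len(values) - n1
        if n0 < 1e-9 or n1 < 1e-9:
            break  # one component has collapsed; keep the last estimates
        means = [
            sum((1.0 - r) * x for r, x in zip(resp, values)) / n0,
            sum(r * x for r, x in zip(resp, values)) / n1,
        ]
        variances = [
            max(sum((1.0 - r) * (x - means[0]) ** 2 for r, x in zip(resp, values)) / n0, 1e-6),
            max(sum(r * (x - means[1]) ** 2 for r, x in zip(resp, values)) / n1, 1e-6),
        ]
        weights = [n0 / len(values), n1 / len(values)]

    return weights, means, variances, resp


# Hypothetical pairwise similarity scores between clusters from different runs.
similarities = [0.05, 0.10, 0.12, 0.08, 0.85, 0.90, 0.88, 0.92]
weights, means, variances, resp = em_two_gaussians(similarities)

# Pairs with a posterior above 0.5 are treated as "similar" and could be merged.
similar_pairs = [i for i, r in enumerate(resp) if r > 0.5]
```

In a sketch like this, the posterior in `resp` plays the role of a soft decision about which cluster pairs to merge; an analogous two-component fit could in principle be applied to per-document importance scores, though the abstract does not say how the paper does this.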