Abstract:
Web page classification is the problem of automatically assigning electronic text documents to pre-specified categories. In this paper, we focus on the self-organizing feature map (SOFM) algorithm applied to features that are derived automatically from the frequencies of titles and the frequencies of keywords, and we investigate the effect of adding these features on text classification performance. Our investigation into keywords, selected on the basis of their frequencies, confirms that adding keywords does improve accuracy and, moreover, that the larger the proportion of keyword features added, the larger the gain. We adopt an unsupervised SOFM network to classify the web pages approximately. After that, a modified learning vector quantization (LVQ) method is used to clearly classify the overlapping areas between classes. The results are quite promising.
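
The two-stage idea described above (a coarse unsupervised SOFM pass followed by a supervised LVQ refinement) can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: the map is one-dimensional, the refinement is the standard LVQ1 rule rather than the modified LVQ, the 2-D feature vectors are toy stand-ins for normalized title and keyword frequencies, and all function names, class names, and parameter values are hypothetical.

```python
import math
import random

def nearest(units, x):
    """Index of the unit whose weight vector is closest to x (squared Euclidean)."""
    return min(range(len(units)),
               key=lambda i: sum((w - v) ** 2 for w, v in zip(units[i], x)))

def train_sofm(data, n_units=4, epochs=30, lr0=0.5, radius0=1.0, seed=0):
    """Unsupervised stage: train a 1-D SOFM (assumed topology, no labels used)."""
    rng = random.Random(seed)
    units = [list(rng.choice(data)) for _ in range(n_units)]
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs           # linear decay schedule (assumption)
        lr = lr0 * decay
        radius = max(radius0 * decay, 0.3)
        for x in rng.sample(data, len(data)):  # one shuffled pass over the data
            bmu = nearest(units, x)            # best-matching unit
            for i, w in enumerate(units):
                # Gaussian neighborhood on the map index
                h = math.exp(-((i - bmu) ** 2) / (2.0 * radius ** 2))
                for d in range(len(w)):
                    w[d] += lr * h * (x[d] - w[d])
    return units

def label_units(units, data, labels):
    """Give each unit the majority label of the samples it wins; drop idle units."""
    votes = [{} for _ in units]
    for x, y in zip(data, labels):
        v = votes[nearest(units, x)]
        v[y] = v.get(y, 0) + 1
    kept = [(u, max(v, key=v.get)) for u, v in zip(units, votes) if v]
    return [u for u, _ in kept], [y for _, y in kept]

def lvq1(units, unit_labels, data, labels, epochs=20, lr0=0.1, seed=0):
    """Supervised stage: standard LVQ1 (NOT the paper's modified LVQ)."""
    rng = random.Random(seed)
    pairs = list(zip(data, labels))
    for epoch in range(epochs):
        lr = lr0 * (1.0 - epoch / epochs)
        rng.shuffle(pairs)
        for x, y in pairs:
            i = nearest(units, x)
            # Pull the winner toward same-class samples, push it away otherwise
            sign = 1.0 if unit_labels[i] == y else -1.0
            for d in range(len(x)):
                units[i][d] += sign * lr * (x[d] - units[i][d])
    return units

def classify(units, unit_labels, x):
    """Predict the label of the nearest unit."""
    return unit_labels[nearest(units, x)]

# Toy data: two hypothetical page classes with synthetic 2-D features.
rng = random.Random(1)
centers = {"sports": (0.9, 0.1), "finance": (0.1, 0.9)}
data, labels = [], []
for name, (cx, cy) in centers.items():
    for _ in range(30):
        data.append((rng.gauss(cx, 0.05), rng.gauss(cy, 0.05)))
        labels.append(name)

units = train_sofm(data)
units, unit_labels = label_units(units, data, labels)
units = lvq1(units, unit_labels, data, labels)
print(classify(units, unit_labels, (0.85, 0.2)))
```

In this sketch the SOFM stage only places the units; labels enter when units are named by majority vote and again in the LVQ1 pass, which is where samples near a class boundary would shift the winning unit.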