Mining Knowledge from Unbalanced Data: Effect of Class Distribution on SVM Classification

Abstract: Building on the standard support vector machine (SVM), bounds on the number (and rate) of support vectors and of boundary support vectors are proposed and proved. These bounds are then extended to the positive class and the negative class separately. On this basis, it is proved that the positive class yields lower classification and predictive accuracy, in probability, than the negative class. Simulations on both artificial and benchmark data sets support the effectiveness of the proposed method and the correctness of this conclusion.
