Abstract:
To address the problems of sample labeling and concept drift in data stream classification, we propose an instance-based transfer learning model for classifying data streams. First, the model uses a support vector machine as its base learner. The support vectors constitute the source domain, and the current data block forms the target domain. Then, we select from the source domain the true neighbors of the target domain according to the mutual neighbor concept, so that the risk of negative transfer becomes negligible. Finally, we combine the target domain with the transferred samples to form a training set, which enlarges the number of labeled samples and improves the generalization ability of the classifier. Theoretical analysis and experimental results show that the method is feasible and achieves higher classification accuracy than the other learning methods.
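
The sample selection and training set construction steps can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the mutual neighbor concept means mutual k-nearest neighbors under Euclidean distance, and the function names (`k_nearest`, `select_mutual_neighbors`, `build_training_set`) and the parameter `k` are hypothetical. SVM training is omitted; the source samples stand in for support vectors kept from an earlier classifier.

```python
import math


def k_nearest(query, points, k):
    """Return the indices of the k points closest to query (Euclidean distance)."""
    order = sorted(range(len(points)), key=lambda i: math.dist(query, points[i]))
    return set(order[:k])


def select_mutual_neighbors(source, target, k=1):
    """Return indices of source samples that are mutual k-nearest neighbors
    of at least one target sample.

    Assumption: a source sample s and a target sample t are mutual neighbors
    when s is among the k nearest source samples of t AND t is among the
    k nearest target samples of s.
    """
    # For each target sample, the indices of its k nearest source samples.
    target_to_source = [k_nearest(t, source, k) for t in target]
    selected = []
    for i, s in enumerate(source):
        # For this source sample, the indices of its k nearest target samples.
        source_to_target = k_nearest(s, target, k)
        if any(i in target_to_source[j] for j in source_to_target):
            selected.append(i)
    return selected


def build_training_set(source_X, source_y, target_X, target_y, k=1):
    """Combine the target block with the transferred source samples."""
    keep = select_mutual_neighbors(source_X, target_X, k)
    X = list(target_X) + [source_X[i] for i in keep]
    y = list(target_y) + [source_y[i] for i in keep]
    return X, y, keep
```

With source samples `[(0, 0), (1, 0), (10, 10)]` and a target block `[(0.1, 0), (0.9, 0.1)]`, the distant sample `(10, 10)` has no mutual neighbor in the target block and is therefore not transferred, which is the intended guard against negative transfer. The combined set returned by `build_training_set` would then be passed to the classifier for retraining.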