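The transductive support vector machine (TSVM) extends the standard support vector machine to semi-supervised learning. Existing TSVM algorithms, however, train slowly, rely heavily on backtracking-style relearning, and show unstable learning performance. To address these problems, an improved transductive support vector machine algorithm, ITSVM, is proposed. The algorithm determines the numbers of positive and negative samples among the unlabeled training samples fairly accurately, which effectively resolves the excessive backtracking of traditional TSVM. It also does not need a large number of unlabeled training samples, which reduces the computational load. Experiments show that ITSVM has some advantage over TSVM in classification accuracy, classification speed, and the number of samples used.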
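In the transductive support vector machine (TSVM), samples mislabeled during one iteration propagate their errors: they lower the labeling accuracy of the next iteration, so errors accumulate and the final separating hyperplane shifts. On imbalanced datasets, the traditional support vector machine (SVM) has a high classification error rate, so TSVM labels samples with low accuracy in each iteration. To address this, the paper proposes a transductive learning algorithm for imbalanced datasets. The algorithm dynamically computes the penalty factor of each class from the density distribution of that class's support vectors, which improves the labeling accuracy of each iteration. It retains the progressive labeling and dynamic adjustment rules while reducing the shift of the separating hyperplane. Simulation experiments on the KDD CUP99 dataset show that the algorithm improves TSVM classification performance on imbalanced data and lowers both the false alarm rate and the missed detection rate.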
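As a rough illustration of per-class penalty factors, the sketch below assigns a larger penalty to the rarer class. It does not reproduce the paper's rule, which derives the penalties from the density distribution of support vectors; it swaps in the common inverse-class-frequency heuristic instead. The function name `class_penalties` and the parameter `base_c` are hypothetical.

```python
from collections import Counter


def class_penalties(labels, base_c=1.0):
    """Return one penalty factor per class, larger for rarer classes.

    Assumption: inverse-frequency weighting, C_k = base_c * n / (K * n_k).
    This is a stand-in for the support-vector-density rule in the abstract.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        cls: base_c * n_samples / (n_classes * count)
        for cls, count in counts.items()
    }


# 90 majority-class samples and 10 minority-class samples.
labels = [1] * 90 + [-1] * 10
penalties = class_penalties(labels)
```

With these labels the minority class receives the larger penalty, so errors on it cost more during training.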
This paper proposes a novel graph-based transductive learning algorithm based on manifold regularization. First, manifold regularization is introduced into a probabilistic discriminant model for the semi-supervised classification task. A variant of the expectation maximization (EM) algorithm is then derived to solve the optimization problem, which leads to an iterative algorithm. Although the method is developed in a probabilistic framework, it makes no assumption about the specific form of the data distribution, and the crucial updating formula has a closed form. The method was evaluated for text categorization on two standard datasets, 20 Newsgroups and Reuters-21578. Experiments show that the approach outperforms state-of-the-art graph-based transductive learning methods.
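To make the manifold regularization idea concrete, the following sketch computes the standard graph smoothness penalty, 0.5 * sum over i, j of w_ij * (f_i - f_j)^2, which equals f^T L f for a symmetric weight matrix. It shows only this generic term, not the paper's probabilistic objective or its EM updates. The names `laplacian_penalty`, `weights`, and `scores` are hypothetical.

```python
def laplacian_penalty(weights, scores):
    """Graph smoothness penalty for one score per node.

    weights: symmetric n x n list of lists of edge weights (assumed input).
    scores:  list of n real-valued predictions, one per node.
    """
    n = len(scores)
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += weights[i][j] * (scores[i] - scores[j]) ** 2
    # Each undirected edge is visited twice, hence the factor 0.5.
    return 0.5 * total


# Chain graph 0 - 1 - 2 with unit edge weights.
chain = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
smooth = laplacian_penalty(chain, [1.0, 1.0, 1.0])
one_flip = laplacian_penalty(chain, [1.0, 1.0, -1.0])
rough = laplacian_penalty(chain, [1.0, -1.0, 1.0])
```

Predictions that change across strongly connected nodes are penalized more, which is what pushes the classifier to respect the geometry of the data.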
Recent years have witnessed increasing interest in transfer learning. This paper addresses the classification problem in which the target domain follows a different distribution from the source domain and is entirely unlabeled, and it aims to build an inductive model for unseen data. First, we analyze the problem of class ratio drift in previous work on transductive transfer learning and propose a normalization method that moves the predictions toward the desired class ratio. Second, we develop a hybrid regularization framework for inductive transfer learning. It considers three factors: the distribution geometry of the target domain through manifold regularization, the entropy of the prediction probabilities through entropy regularization, and the class prior through expectation regularization. The framework is used to adapt the inductive model learned from the source domain to the target domain. Finally, experiments on real-world text data show the effectiveness of the inductive transfer learning method, which can also handle unseen test points.
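One simple way to picture a normalization step against class ratio drift is sketched below: each class probability is reweighted by the ratio of the desired class share to the current average predicted share, and every row is then renormalized. This is an assumed illustration, not necessarily the normalization used in the paper. The names `normalize_class_ratio`, `probs`, and `target_ratio` are hypothetical.

```python
def normalize_class_ratio(probs, target_ratio):
    """Rescale predicted class probabilities toward a desired class ratio.

    probs:        list of rows, each a list of class probabilities.
    target_ratio: desired share of each class (assumed to sum to 1).
    """
    n = len(probs)
    k = len(target_ratio)
    # Average predicted probability mass per class before adjustment.
    current = [sum(row[j] for row in probs) / n for j in range(k)]
    adjusted = []
    for row in probs:
        # Reweight by desired share / current share; leave a class
        # untouched if it currently has no predicted mass.
        scaled = [
            row[j] * target_ratio[j] / current[j] if current[j] > 0 else row[j]
            for j in range(k)
        ]
        total = sum(scaled)
        adjusted.append([value / total for value in scaled])
    return adjusted


# Predictions that drift toward class 0; the desired ratio is balanced.
probs = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.6, 0.4]]
adjusted = normalize_class_ratio(probs, [0.5, 0.5])
mean_class1 = sum(row[1] for row in adjusted) / len(adjusted)
```

A single pass moves the average class share toward the target without matching it exactly; repeating the step would move it closer.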
In many machine learning problems, a large amount of data is available but only a small part of it can be labeled easily. This motivates a line of research on combining unlabeled and labeled data effectively to infer the labels of the unlabeled data, that is, transductive learning. This article builds on pattern classification via single sphere (SSPC), which seeks a hypersphere that separates the data with the maximum separation ratio, and proposes a progressive transductive pattern classification method via single sphere (PTSSPC) that constructs the classifier from both labeled and unlabeled data. PTSSPC uses the additional information in the unlabeled samples and obtains better classification performance than SSPC when labeled data are insufficient. Experimental results show that the algorithm can yield better performance.
Funding: Supported by the National Natural Science Fund project "Mechanism Socialist Method and Higher Intelligence Theory" (60873001).
Funding: Supported by the National Science Foundation of China (Grant Nos. 60435010, 60675010), the National High Technology Research and Development of China (Grant Nos. 2006AA01Z128, 2007AA01Z132), the National Basic Research Priorities Programme (Grant No. 2007CB311004), and the National Science and Technology Support Plan (Grant No. 2006BAC08B06).
Funding: Supported by the National Natural Science Foundation of China (60574075, 60705004).