Abstract
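Term relation extraction is cast as a classification problem, and a machine-learning-based pipeline for automatic term relation extraction is presented. To address the shortcomings of existing generative and discriminative learning algorithms, a hybrid classification algorithm, HC, is proposed. In this algorithm, one part of the feature values is estimated from the training data and the other part is obtained by training a discriminant function. Experimental results show that the algorithm outperforms the original generative and discriminative learning algorithms and achieves good classification performance on a small, manually annotated training set.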
A term relation extraction approach is proposed in which extraction is cast as a classification task. A hybrid classification algorithm that combines the advantages of naive Bayes and the perceptron is also presented. In this algorithm, one subset of the feature parameters is estimated from the training data, and another subset is trained with a discriminative function. Experimental results show that the proposed hybrid algorithm almost always outperforms the naive Bayes and perceptron algorithms when the training set is small.
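The abstract does not spell out how the two feature subsets are combined, so the following is only a minimal sketch of one way a naive Bayes / perceptron hybrid could look, not the authors' HC algorithm. It assumes binary features, labels in {+1, -1}, a linear scoring function, and a simplified presence-only log-odds weight for the generative subset; the names `train_hybrid`, `predict`, `gen_idx`, and `disc_idx` are hypothetical.

```python
import math


def train_hybrid(X, y, gen_idx, disc_idx, epochs=10, alpha=1.0):
    """Sketch of a hybrid linear classifier (assumed design, not the paper's).

    X: list of binary feature vectors; y: labels in {+1, -1}.
    gen_idx: features whose weights are estimated from counts (naive Bayes style).
    disc_idx: features whose weights are trained by perceptron updates.
    """
    n_pos = sum(1 for label in y if label == 1)
    n_neg = len(y) - n_pos
    w = [0.0] * len(X[0])

    # Generative part: smoothed log-odds of the feature being present per class.
    for j in gen_idx:
        c_pos = sum(x[j] for x, label in zip(X, y) if label == 1)
        c_neg = sum(x[j] for x, label in zip(X, y) if label == -1)
        p_pos = (c_pos + alpha) / (n_pos + 2 * alpha)
        p_neg = (c_neg + alpha) / (n_neg + 2 * alpha)
        w[j] = math.log(p_pos / p_neg)
    bias = math.log((n_pos + alpha) / (n_neg + alpha))  # smoothed class prior

    # Discriminative part: perceptron updates touch only disc_idx weights,
    # so the generative weights stay fixed at their estimated values.
    for _ in range(epochs):
        for x, label in zip(X, y):
            score = bias + sum(wj * xj for wj, xj in zip(w, x))
            if label * score <= 0:  # misclassified (or on the boundary)
                for j in disc_idx:
                    w[j] += label * x[j]
    return w, bias


def predict(w, bias, x):
    """Return +1 or -1 from the sign of the linear score."""
    score = bias + sum(wj * xj for wj, xj in zip(w, x))
    return 1 if score > 0 else -1


# Toy usage: feature 0 is estimated generatively, features 1 and 2 discriminatively.
X = [[1, 0, 1], [1, 0, 0], [0, 1, 1], [0, 1, 0]]
y = [1, 1, -1, -1]
w, bias = train_hybrid(X, y, gen_idx=[0], disc_idx=[1, 2])
predictions = [predict(w, bias, x) for x in X]
```

Fixing the count-based weights first and letting the perceptron correct only the remaining errors is one plausible reading of "a subset estimated from training data, another subset trained by a discriminative function"; other couplings, such as joint training, are equally consistent with the abstract.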
Source
《计算机科学》
CSCD
Peking University Core Journal (北大核心)
2010, No. 2, pp. 189-191, 215 (4 pages)
Computer Science
Funding
Supported by Shaanxi Provincial Department of Education projects (09JK768, 09JK774, 09JK738)
Keywords
Machine learning
Term relation extraction
Hybrid learning algorithm
Machine learning, Term relation extraction, Classification algorithm