
Associative Text Classification Based on Mutual Information Rule Pruning (cited by: 1)

On Classification of Associative Text Based on Rules Pruning of Mutual Information
Abstract: Traditional associative text classification algorithms generate a huge number of rules. Leaving the rules unpruned hurts classification efficiency, while earlier pruning methods reduce classification accuracy to varying degrees. This paper therefore proposes pruning the rules of each class by mutual information: the rules with strong classifying capacity are selected to form the classifier, which is then used to classify incoming texts. After pruning, the number of rules drops substantially, and the resulting classifier performs better than both a classifier built on the unpruned rule set and the ARC-BC algorithm with its earlier pruning method. Extensive experiments show that the method is effective.
Source: Journal of Nanjing Normal University (Engineering and Technology Edition), CAS, 2008, No. 4, pp. 173-177 (5 pages)
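Funding: Supported by the Ministry of Education Start-up Fund for Returned Overseas Scholars; the Open Project Fund of the Institute of Software, Chinese Academy of Sciences (SYSKF0701); the Fuzhou University Science and Technology Development Fund (2005-XQ-13); and the Fujian Provincial Department of Education Fund (JB06023).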
Keywords: mutual information; rule pruning; associative classification
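
The pruning idea described in the abstract can be illustrated with a minimal sketch. The record does not give the paper's exact mutual information formula, its thresholds, or its scoring scheme, so everything below is an assumption made for illustration: the score is the pointwise mutual information between a rule's antecedent (a set of terms) and its class, only rules with a positive score are kept, at most `top_k` rules survive per class, and a text is assigned to the class whose matching rules have the largest total score. The names `rule_mi`, `prune_rules`, `classify`, `DOCS`, and `RULES` are hypothetical.

```python
import math
from collections import defaultdict

# Toy training set (assumed data): each document is (set of terms, class label).
DOCS = [
    (frozenset({"goal", "match"}), "sports"),
    (frozenset({"goal", "team"}), "sports"),
    (frozenset({"stock", "market"}), "finance"),
    (frozenset({"stock", "team"}), "finance"),
]

# Candidate association rules (assumed): antecedent terms => class label.
RULES = [
    (frozenset({"goal"}), "sports"),
    (frozenset({"team"}), "sports"),
    (frozenset({"stock"}), "finance"),
    (frozenset({"team"}), "finance"),
]


def rule_mi(antecedent, label, docs):
    """Pointwise mutual information log(P(a, c) / (P(a) * P(c))) of a rule."""
    n = len(docs)
    n_rule = sum(1 for terms, _ in docs if antecedent <= terms)
    n_class = sum(1 for _, c in docs if c == label)
    n_both = sum(1 for terms, c in docs if c == label and antecedent <= terms)
    if n_both == 0:
        # The antecedent never co-occurs with the class.
        return float("-inf")
    return math.log((n_both * n) / (n_rule * n_class))


def prune_rules(rules, docs, top_k):
    """Keep, per class, at most top_k rules with the highest positive score."""
    by_class = defaultdict(list)
    for antecedent, label in rules:
        score = rule_mi(antecedent, label, docs)
        if score > 0:  # drop rules that carry no positive evidence for the class
            by_class[label].append((score, antecedent))
    pruned = {}
    for label, scored in by_class.items():
        scored.sort(key=lambda pair: pair[0], reverse=True)
        pruned[label] = scored[:top_k]
    return pruned


def classify(terms, pruned):
    """Pick the class whose matching pruned rules have the largest total score."""
    totals = {
        label: sum(score for score, antecedent in kept if antecedent <= terms)
        for label, kept in pruned.items()
    }
    return max(totals, key=totals.get)


if __name__ == "__main__":
    pruned = prune_rules(RULES, DOCS, top_k=1)
    print(classify(frozenset({"goal", "team", "referee"}), pruned))
```

In this toy data the term "team" occurs equally often in both classes, so both "team" rules score zero and are pruned, which is the kind of uninformative rule a mutual information filter is meant to discard.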

References (6)

  • [1] Liu B, Hsu W, Ma Y M. Integrating classification and association rule mining[C]//ACM Int'l Conf on Knowledge Discovery and Data Mining. New York: ACM Press, 1998: 80-86. (cited by: 1)
  • [2] Li W, Han J, Pei J. CMAR: Accurate and efficient classification based on multiple classification rules[C]//Cercone N. Proc of the 2001 IEEE Int'l Conf on Data Mining. California: IEEE Press, 2001: 369-376. (cited by: 1)
  • [3] Zaïane O R, Antonie M L. Classifying text documents by associating terms with text categories[C]//Zhou X F. Proc of the 13th Australasian Database Conf. Melbourne: Australian Computer Society, 2002: 215-222. (cited by: 1)
  • [4] Agrawal R, Srikant R. Fast algorithms for mining association rules[C]//Bocca J B, Jarke M, Zaniolo C. Proc of the 20th Very Large Data Bases Conference. Santiago, 1994: 487-499. (cited by: 1)
  • [5] Han J, Pei J, Yin Y W. Mining frequent patterns without candidate generation[J]. Data Mining and Knowledge Discovery, 2004, 8(1): 53-87. (cited by: 1)
  • [7] http://sewm.pku.edu.cn/QA/reference/ICTCLAS/FreeICTCLAS/[OL]. The Site of Chinese Natural Language Processing Platform, 2006. (in Chinese) (cited by: 1)

Co-cited documents (7)

  • 1. CHANG K C C, CHO J. Accessing the Web: from search to integration[C]//Proc of ACM SIGMOD International Conference on Management of Data. Chicago: ACM Press, 2006: 804-805. (cited by: 1)
  • 2. HE Bin, CHEN-CHUAN C K. Statistical schema matching across Web query interfaces[C]//Proc of ACM SIGMOD International Conference on Management of Data. San Diego: ACM Press, 2006: 217-228. (cited by: 1)
  • 3. WU W, DOAN A, YU C. WebIQ: learning from the Web to match Deep Web query interfaces[C]//Proc of International Conference on Data Engineering. 2006: 44. (cited by: 1)
  • 4. WANG J, WEN J R, LOCHOVSKY F H. Instance-based schema matching for Web databases by domain-specific query probing[C]//Proc of VLDB. 2004: 408-419. (cited by: 1)
  • 5. HE Bin, CHEN-CHUAN C K, HAN Jia-wei. Discovering complex matchings across Web query interfaces: a correlation mining approach[C]//Proc of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. [S.l.]: ACM Press, 2004: 148-157. (cited by: 1)
  • 6. WU W, YU C, DOAN A, et al. An interactive clustering-based approach to integrating source query interfaces on the Deep Web[C]//Proc of ACM SIGMOD International Conference on Management of Data. 2004: 95-106. (cited by: 1)
  • 7. HE B. BAMM extracted query schemas[D]. [S.l.]: Computer Science Department, University of Illinois, 2002. (cited by: 1)

Citing documents (1)

Second-level citing documents (1)
