Weighting Binary Transformation Algorithm for Non Cooccurrence Data
Abstract: The categorical data-sequential information bottleneck (CD-sIB) algorithm assumes that every attribute feature contributes equally to the binary transformation, which lowers the transformation quality. This paper proposes a weighting binary transformation method that reflects the characteristics of non co-occurrence data: it highlights the representative attributes and suppresses the non-representative (redundant) ones in order to obtain a better co-occurrence representation. Two weighting principles for non co-occurrence data are introduced: applicability to stochastically distributed data, and an unsupervised weighting computation. Based on the concept of weighting granularity, the weighted categorical data-sequential information bottleneck (WCD-sIB) algorithm is then constructed. The experimental results show that the weighting binary transformation method generates a good co-occurrence data representation and that the WCD-sIB algorithm achieves higher clustering accuracy than the other algorithms compared.
Authors: JI Bo (姬波), YE Yangdong (叶阳东)
Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), 2013, No. 6, pp. 584-591 (8 pages). Indexed in EI, CSCD, and the Peking University Core Journals list.
Funding: National Natural Science Foundation of China (No. 61170223)
Keywords: Non Co-occurrence Data, Feature Weighting, Information Bottleneck, Categorical Data-Sequential Information Bottleneck (CD-sIB) Algorithm, Binary Transformation
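
The abstract does not give the weighting formula, so the following is only a minimal sketch of the general idea, not the WCD-sIB algorithm itself. It assumes that the binary transformation turns each (attribute, value) pair into one binary feature, and it uses the Shannon entropy of each attribute as a stand-in unsupervised weight. The function names `attribute_entropy` and `weighted_binary_transform` and the entropy-based weight are illustrative assumptions, not taken from the paper.

```python
import math
from collections import Counter


def attribute_entropy(column):
    """Shannon entropy (in bits) of one categorical column.

    ASSUMPTION: used here only as a placeholder unsupervised weight;
    the paper's actual weighting scheme is not stated in the abstract.
    """
    counts = Counter(column)
    n = len(column)
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def weighted_binary_transform(rows, weights=None):
    """Turn categorical rows into a weighted co-occurrence-style matrix.

    Each (attribute index, value) pair becomes one binary feature.
    A row's entry for a feature is the attribute's weight if the row
    has that value, else 0. The matrix is normalised to sum to 1 so it
    can be read as a joint distribution p(x, y).
    """
    n_attrs = len(rows[0])
    columns = [[row[j] for row in rows] for j in range(n_attrs)]
    if weights is None:
        # Hypothetical default: weight each attribute by its entropy.
        weights = [attribute_entropy(col) for col in columns]

    # One binary feature per distinct value of each attribute.
    features = [(j, v) for j in range(n_attrs) for v in sorted(set(columns[j]))]

    matrix = [
        [weights[j] if row[j] == v else 0.0 for (j, v) in features]
        for row in rows
    ]
    total = sum(sum(r) for r in matrix)
    if total == 0:
        raise ValueError("all attribute weights are zero")
    joint = [[x / total for x in r] for r in matrix]
    return features, joint


if __name__ == "__main__":
    data = [
        ("red", "s", "a"),
        ("red", "l", "a"),
        ("blue", "s", "a"),
        ("blue", "l", "a"),
    ]
    features, joint = weighted_binary_transform(data)
    for row in joint:
        print(row)
```

In this toy data the third attribute is constant, so its placeholder weight is zero and it drops out of the representation, while the two varying attributes share the probability mass equally. Passing explicit `weights` recovers the unweighted case (all weights equal), which corresponds to the uniform-contribution assumption the abstract attributes to CD-sIB.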