Projection-pursuit-based dimension reduction for visualization of text features (cited by: 3)

Abstract A projection pursuit model, with its projection direction optimized by a genetic algorithm, projects high-dimensional text feature data onto a low-dimensional (2- or 3-dimensional) visualization space. The projected feature values in this low-dimensional space reflect the linear and nonlinear structures or features of the high-dimensional data, so the method reduces dimensionality and makes the text features visualizable. It greatly reduces the computational complexity of text mining and also helps determine the number of initial cluster centers for the K-means clustering algorithm, which improves the algorithm's accuracy. Experiments verify the effectiveness of the method for text feature dimension reduction.
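Authors GAO Mao-ting, LU Peng
Source Journal of Computer Applications (《计算机应用》), CSCD, Peking University Core Journal, 2008, No. 6, pp. 1411-1413, 1416 (4 pages)
Funding National Natural Science Foundation of China (60275020); Shanghai Municipal Education Commission Research Project (06FZ007); Shanghai Maritime University Key Discipline Construction Project (XL0101)
Keywords projection pursuit; dimension reduction; text mining; genetic algorithm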
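
The abstract describes the method only at a high level, so the following Python sketch is an illustration under stated assumptions, not the authors' implementation. It assumes the projection index Q(a) = S_z · D_z that is common in the projection pursuit clustering literature, where S_z is the standard deviation of the projected values and D_z is a local density summed over point pairs closer than a window radius R (taken here as 0.1 · S_z). The genetic algorithm is a plain real-coded one with truncation selection, arithmetic crossover, and Gaussian mutation. The names `projection_index`, `ga_projection`, `project_2d`, and all parameter defaults are hypothetical. The second projection axis is obtained by removing the first direction from the data and searching again, which is one simple choice among several.

```python
import numpy as np

def projection_index(z, radius_scale=0.1):
    """Assumed index Q = S_z * D_z for 1-D projected values z."""
    s_z = z.std(ddof=1)                      # spread of the projected points
    r = np.abs(z[:, None] - z[None, :])      # pairwise distances r_ij
    R = radius_scale * s_z                   # local-density window radius (assumption)
    d_z = np.sum((R - r) * (r < R))          # sum of (R - r_ij) over pairs inside the window
    return s_z * d_z

def ga_projection(X, pop_size=40, generations=60, mutation=0.1, seed=0):
    """Search for a unit direction maximizing projection_index (simple real-coded GA)."""
    rng = np.random.default_rng(seed)

    def normalize(P):
        return P / np.linalg.norm(P, axis=1, keepdims=True)

    def evaluate(P):
        return np.array([projection_index(X @ a) for a in P])

    pop = normalize(rng.normal(size=(pop_size, X.shape[1])))
    for _ in range(generations):
        order = np.argsort(evaluate(pop))[::-1]
        elite = pop[order[: pop_size // 2]]          # truncation selection, elite is kept
        n_children = pop_size - len(elite)
        i = rng.integers(0, len(elite), size=n_children)
        j = rng.integers(0, len(elite), size=n_children)
        w = rng.random((n_children, 1))
        children = w * elite[i] + (1 - w) * elite[j]  # arithmetic crossover
        children += mutation * rng.normal(size=children.shape)  # Gaussian mutation
        pop = normalize(np.vstack([elite, children]))
    return pop[np.argmax(evaluate(pop))].copy()

def project_2d(X, **ga_kwargs):
    """Project X onto two orthogonal directions found one after the other."""
    a1 = ga_projection(X, **ga_kwargs)
    X_res = X - np.outer(X @ a1, a1)         # remove the component along a1
    a2 = ga_projection(X_res, **ga_kwargs)
    a2 -= (a2 @ a1) * a1                     # enforce orthogonality to a1
    a2 /= np.linalg.norm(a2)
    return X @ np.column_stack([a1, a2])

if __name__ == "__main__":
    # Synthetic stand-in for normalized text feature vectors: three groups in 8-D.
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(c, 0.05, size=(20, 8)) for c in (0.2, 0.5, 0.8)])
    Z = project_2d(X, pop_size=20, generations=10)
    print(Z.shape)  # one 2-D point per document
```

Plotting the two columns of `Z` as a scatter plot gives the kind of low-dimensional view the abstract refers to; the number of visually separated groups can then serve as a candidate number of initial centers for K-means.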