
Research on Feature Extraction Method of Social Network Text (cited by: 2)

Abstract: Applications built on social network text are developing rapidly, and studying text features and classification is valuable for extracting important information. This paper introduces the common feature selection algorithms and feature representation methods, the basic principles, advantages, and disadvantages of SVM and KNN, and the evaluation indexes of classification algorithms. For the mutual information feature selection function, it describes the processing flow, shortcomings, and optimizations. Because mutual information does not balance positively and negatively correlated features, a balance weight attribute factor and a feature difference factor are introduced to compensate. The experimental stage covers word segmentation, stop-word removal, feature selection with several algorithms (including the optimized mutual information), and TF-IDF weighting. Under the two classification algorithms, SVM and KNN, the feature selection algorithms are compared according to the evaluation indexes. Experiments show that the optimized mutual information feature selection performs well, and that it performs better under SVM than under KNN, which supports its validity.
Source: Journal of New Media (English-language journal), 2021, No. 2, pp. 73-80 (8 pages in total)
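
The mutual information feature selection step named in the abstract can be illustrated with a minimal sketch. It assumes the standard pointwise definition MI(t, c) = log(P(t, c) / (P(t) P(c))) estimated from document counts, and it shows only the unoptimized baseline: the abstract does not give formulas for the balance weight attribute factor or the feature difference factor, so they are not reproduced here. The function names `mutual_information` and `top_k` and the toy corpus are hypothetical.

```python
import math
from collections import Counter


def mutual_information(docs, labels, target):
    """Score each term by pointwise mutual information with class `target`.

    docs   : list of token lists (assumed already segmented, stop words removed)
    labels : list of class labels, one per document
    """
    n = len(docs)
    n_class = sum(1 for y in labels if y == target)
    df = Counter()        # number of documents containing the term
    df_class = Counter()  # number of `target` documents containing the term
    for tokens, y in zip(docs, labels):
        for t in set(tokens):
            df[t] += 1
            if y == target:
                df_class[t] += 1
    scores = {}
    for t, n_t in df.items():
        a = df_class[t]
        # MI(t, c) = log(P(t, c) / (P(t) * P(c))) = log(a * n / (n_t * n_class)).
        # The log is undefined when the term never occurs in the class,
        # so such terms are given -inf and sort last.
        scores[t] = math.log(a * n / (n_t * n_class)) if a else float("-inf")
    return scores


def top_k(scores, k):
    """Return the k highest-scoring terms, breaking ties alphabetically."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [t for t, _ in ranked[:k]]


# Toy corpus (made up for illustration only).
docs = [
    ["goal", "match", "team"],
    ["team", "win", "goal"],
    ["stock", "market", "team"],
    ["market", "price", "stock"],
]
labels = ["sport", "sport", "finance", "finance"]

scores = mutual_information(docs, labels, "sport")
selected = top_k(scores, 2)
print(selected)
```

In this toy corpus a term that appears only in the other class scores lowest, so the baseline ranks features by positive association with the target class alone. This is one plausible reading of the imbalance between positively and negatively correlated features that the abstract says the optimized variant addresses.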
