Abstract
To improve the management of library literature, this paper proposes an automatic classification method for library collections based on random forests. The TFC weighting algorithm is used to extract document features and compute the weight of each feature. Classification decision trees are then constructed, and a post-pruning algorithm is used to control the accuracy of the initial classification. The decision trees are integrated to form the document classifier, and the margin function is incorporated to complete the random forest classification algorithm. Experimental results show that the method achieves high classification accuracy and effectively improves the classification speedup ratio and parallel classification performance.
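The abstract names two building blocks without giving their formulas: TFC feature weighting and the random forest margin function. The sketch below illustrates both under stated assumptions. It takes TFC to be the usual cosine-normalized TF-IDF weight and the margin to be Breiman's definition (the share of trees voting for the true class minus the largest share for any other class). The paper's own formulas may differ, and the function names, toy documents, and class labels are hypothetical.

```python
import math
from collections import Counter


def tfc_weights(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed TFC definition: tf * log(N / df), then cosine-normalized
    so that each document vector has unit length.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weighted = []
    for doc in docs:
        tf = Counter(doc)
        raw = {t: f * math.log(n_docs / df[t]) for t, f in tf.items()}
        norm = math.sqrt(sum(w * w for w in raw.values()))
        weighted.append({t: (w / norm if norm > 0 else 0.0) for t, w in raw.items()})
    return weighted


def margin(votes, true_label):
    """Assumed Breiman-style margin for one sample.

    votes: the class predicted by each tree in the forest.
    A positive value means the forest classifies the sample correctly.
    """
    counts = Counter(votes)
    total = len(votes)
    correct_share = counts.get(true_label, 0) / total
    best_wrong_share = max(
        (c for label, c in counts.items() if label != true_label), default=0
    ) / total
    return correct_share - best_wrong_share


# Hypothetical toy corpus: three tokenized document titles.
docs = [
    ["random", "forest", "classification"],
    ["library", "catalog", "classification"],
    ["decision", "tree", "pruning", "forest"],
]
weights = tfc_weights(docs)

# Hypothetical votes from four trees for a document whose true class is "computing".
votes = ["computing", "computing", "library science", "computing"]
m = margin(votes, "computing")
print(weights[0])
print(m)
```

In this sketch a term that occurs in every document receives weight zero, and a larger margin indicates a more confident forest vote for the correct class.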
Author
王清
WANG Qing (Shandong Jianzhu University, Jinan 250101, China)
Source
《自动化技术与应用》
2022, Issue 7, pp. 51-53, 72 (4 pages in total)
Techniques of Automation and Applications
Keywords
decision tree
library management
text classification
random forest
pruning algorithm
speedup ratio