Abstract
To address the low accuracy of the algorithms used in traditional classification-indexing systems and their difficulty with linearly inseparable data, this article introduces the support vector machine (SVM) model and designs an SVM-based intelligent classification and detection system for bibliographic data. Using bibliographic data from the library of Xi’an Aeronautical University as the sample, a classifier is trained through data preprocessing, TF-IDF feature extraction, chi2 feature dimensionality reduction, and LinearSVC modeling. Its performance is then evaluated on a test set and compared experimentally with logistic regression, random forest, and naive Bayes. The experimental results show a recall of 0.82, an F1 score of 0.82, a precision of 0.83, and an accuracy of 0.85, all higher than those of the other machine learning models, indicating high accuracy, strong generalization ability, and good applicability.
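The four figures reported in the abstract (recall, F1 score, precision, accuracy) measure different things, which is why accuracy can differ from recall on the same test set. The sketch below computes them with the standard library only. It is an illustration, not the article's evaluation code: the abstract does not say how per-class scores were averaged, so macro-averaging is an assumption here, and the function name `macro_report` and the class labels `"T"`, `"G"`, `"I"` are hypothetical toy values.

```python
from collections import Counter


def macro_report(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall and F1.

    Macro-averaging (unweighted mean over classes) is an assumption;
    the abstract does not state which averaging scheme was used.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1          # correct prediction for this class
        else:
            fp[pred] += 1           # predicted class gains a false positive
            fn[truth] += 1          # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical toy labels, not data from the article.
y_true = ["T", "T", "T", "T", "G", "G", "I", "I"]
y_pred = ["T", "T", "T", "G", "G", "G", "I", "T"]
report = macro_report(y_true, y_pred)
for name, value in report.items():
    print(f"{name}: {value:.2f}")
```

With imbalanced classes, as is common in library holdings, macro-averaged scores weight rare classes as heavily as frequent ones, whereas accuracy is dominated by the frequent classes.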
Author
柴源
Chai Yuan (Library, Xi’an Aeronautical University, Xi’an 710077, China)
Source
Heilongjiang Science (《黑龙江科学》)
2021, No. 24, pp. 18-21 (4 pages)
Funding
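Scientific Research Program of the Shaanxi Provincial Department of Education (19JK0334)
Scientific Research Program of the Shaanxi Provincial Department of Education (20JK0199)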