Abstract
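In summer and autumn tea, stalks and leaves differ little in color, so conventional color sorters struggle to separate them. This paper proposes a multi-feature-vector sorting method based on the morphological features of tea, with the aim of enabling rapid modeling of the tea sorting algorithm and improving sorting accuracy. Images of tea were captured during dynamic free fall, and an image-processing-based feature extraction program was developed to extract morphological feature parameters automatically from multiple groups of tea samples. A random forest algorithm was used to determine the feature weights and perform feature selection. Three classification algorithms, logistic regression (LR), decision tree (DT) and support vector machine (SVM), were then built to classify the samples, verify the separability of the features, and analyze how the choice of classification algorithm affects the classification of complex tea samples. The results showed that: 1) circularity E had the largest importance weight, 0.467; with the importance threshold finally set to 0.05, five morphological features (circularity E, rectangularity R, linearity Len, perimeter C and compactness J) were selected to build the dataset; 2) on the test dataset, the average accuracy of the three classification algorithms was 0.924, indicating that the selected features are clearly separable; 3) according to the output confusion matrices, SVM performed best of the three algorithms, with an accuracy of 93.8% and an F1 score (harmonic mean) of 94.7%. The method can be applied quickly to the sorting of other types of tea and to actual tea production, effectively improving tea quality.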
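The selection and classification steps described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes scikit-learn, uses synthetic data from `make_classification` in place of the tea measurements (12 columns, matching the 4 simple plus 8 complex descriptors listed in the English abstract), and the hyperparameter grids are arbitrary. The 0.05 importance threshold, the 80/20 split and the 10-fold cross-validation follow the abstract; placing the scaler inside the pipeline is a choice made for this sketch.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the tea data (assumption): 12 features, 2 classes.
X, y = make_classification(n_samples=400, n_features=12, n_informative=5,
                           n_redundant=2, random_state=0)

# 80 % for training, 20 % for testing, as stated in the abstract.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Random forest importance weights; keep features at or above the 0.05 threshold.
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X_train, y_train)
importances = forest.feature_importances_
selected = np.flatnonzero(importances >= 0.05)

# The three classifiers, each with a small illustrative grid (assumed values).
candidates = {
    "LR": (LogisticRegression(max_iter=1000),
           {"logisticregression__C": [0.1, 1, 10]}),
    "DT": (DecisionTreeClassifier(random_state=0),
           {"decisiontreeclassifier__max_depth": [3, 5, None]}),
    "SVM": (SVC(), {"svc__C": [0.1, 1, 10]}),
}

results = {}
for name, (model, grid) in candidates.items():
    # Normalization is fitted on training folds only, inside the pipeline.
    pipe = make_pipeline(MinMaxScaler(), model)
    search = GridSearchCV(pipe, grid, cv=10)  # 10-fold cross-validation
    search.fit(X_train[:, selected], y_train)
    y_pred = search.predict(X_test[:, selected])
    results[name] = (accuracy_score(y_test, y_pred), f1_score(y_test, y_pred))

for name, (acc, f1) in results.items():
    print(f"{name}: accuracy={acc:.3f}, F1={f1:.3f}")
```

The numbers printed by this sketch describe the synthetic data only and are not comparable to the accuracies reported in the paper.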
Stalks and leaves of summer and autumn tea are similar in color, which makes it difficult for a traditional color sorter to separate them on the basis of optical characteristics. To enable rapid modeling of the tea selection algorithm and improve sorting accuracy, this paper introduces a method for separating good tea from inferior product using multi-feature vectors based on morphological characteristics. First, Wuyishan Dahongpao tea was selected as the test sample, and images were collected during the dynamic drop process. The blue-channel image was extracted, and the binary image and edge of each individual sample were obtained by analyzing the connected regions of the whole image. Then, a feature extraction program based on image processing algorithms was developed to extract the morphological feature parameters of the tea samples automatically. Four simple shape descriptors were extracted: the sample perimeter, the area, and the length and width of the minimum bounding rectangle. On this basis, eight complex shape descriptors were calculated: circularity, rectangularity, linearity, slightness, diameter, diagonal of the minimum bounding rectangle, compactness and centroid. In addition, the random forest algorithm was used to determine the weights of these features, and features were selected according to a weight threshold. Finally, three classification algorithms, logistic regression (LR), decision tree (DT) and support vector machine (SVM), were established to classify the samples, verify the validity of the features and analyze the effects of different classification algorithms on the classification of tea. The original data were normalized and randomly split, with 80% used for training and 20% for testing. 10-fold cross-validation was used to select the optimal parameters of each classification model: the training dataset was randomly divided into 10 parts, of which 9 parts were used for training and the remaining 1 part for validation. The above parameter optimization process was used to obtain the logistic regression, decision tree and support vector machi
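To illustrate how complex shape descriptors can be derived from the four simple ones, here is a small sketch. The abstract does not give the formulas, so the definitions below are common textbook ones and are assumptions: circularity as 4πA/C², rectangularity as the area divided by the bounding-rectangle area, and the bounding-rectangle aspect ratio as a simple elongation measure. The names `SimpleShape` and `complex_descriptors` are hypothetical, and the two example shapes are idealized rectangles rather than measured tea samples.

```python
import math
from dataclasses import dataclass


@dataclass
class SimpleShape:
    perimeter: float   # contour length C
    area: float        # region area A
    mbr_length: float  # long side of the minimum bounding rectangle
    mbr_width: float   # short side of the minimum bounding rectangle


def complex_descriptors(s: SimpleShape) -> dict:
    """Derive a few complex descriptors using assumed textbook definitions."""
    return {
        # 1.0 for a perfect circle, smaller for elongated or ragged outlines
        "circularity": 4.0 * math.pi * s.area / s.perimeter ** 2,
        # fraction of the minimum bounding rectangle filled by the region
        "rectangularity": s.area / (s.mbr_length * s.mbr_width),
        # bounding-rectangle aspect ratio, a simple elongation measure
        "aspect_ratio": s.mbr_length / s.mbr_width,
        # diagonal of the minimum bounding rectangle
        "mbr_diagonal": math.hypot(s.mbr_length, s.mbr_width),
    }


# Idealized shapes: a 2 x 2 square and a 10 x 1 strip (stalk-like outline).
square = SimpleShape(perimeter=8.0, area=4.0, mbr_length=2.0, mbr_width=2.0)
strip = SimpleShape(perimeter=22.0, area=10.0, mbr_length=10.0, mbr_width=1.0)

print(complex_descriptors(square))
print(complex_descriptors(strip))
```

Under these definitions the elongated strip has a much lower circularity than the square, which is consistent with circularity carrying the largest importance weight when separating stalks from leaves.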
Authors
吴正敏
曹成茂
王二锐
罗坤
张金炎
孙燕
Wu Zhengmin;Cao Chengmao;Wang Errui;Luo Kun;Zhang Jinyan;Sun Yan(College of Engineering,Anhui Agricultural University,Hefei 230036,China)
Source
《农业工程学报》
EI
CAS
CSCD
PKU Core Journal
2019, Issue 11, pp. 315-321 (7 pages)
Transactions of the Chinese Society of Agricultural Engineering
Funding
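Anhui Province Science and Technology Major Project (18030701195)
Natural Science Research Project of Universities in Anhui Province (KJ2016A233), jointly funded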
Keywords
morphology
decision tree
support vector machine
logistic regression
random forest
tea