
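Intelligent Question-Answering System for Fruit and Vegetable Agricultural Technology Knowledge Based on Large AI Model Technology (cited by: 8)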

Agricultural Technology Knowledge Intelligent Question-Answering System Based on Large Language Model
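Abstract [Objective/Significance] The rural revitalization strategy places new demands on agricultural technology extension, and the way extension knowledge is supplied needs further innovation. Driven by the demand for fruit and vegetable agricultural technology knowledge services, and building on state-of-the-art large language model (LLM) technology, this study constructs an intelligent question-answering system for fruit and vegetable agricultural technology knowledge, targeting extension services such as guided reading of new agricultural knowledge and knowledge question answering. [Methods] Based on an analysis of strawberry growers' needs, agricultural technology knowledge on strawberry cultivation was divided into themes, giving two downstream LLM tasks: knowledge object recognition and knowledge question answering. A small-sample, high-quality training corpus was built by combining automatic machine annotation with manual annotation. Four existing LLMs (Baichuan2-13B-Chat, ChatGLM2-6B, Llama-2-13B-Chat, and ChatGPT) were compared, and the best-performing model was selected as the base model. Following a "high-quality corpus + pre-trained large model + fine-tuning" approach, a deep neural network with semantic analysis, contextual association, and generation capabilities, able to adapt to multiple downstream tasks, was trained to build a large model for agricultural knowledge question answering. Several strategies, including data optimization and retrieval-augmented generation, were used to mitigate LLM hallucination. The resulting question-answering system generates high-precision, unambiguous agricultural knowledge answers and supports multi-turn dialogue. [Results and Discussion] With precision and recall as the performance metrics for the named entity recognition task, the mainstream Chinese models evaluated all exceeded 85% average precision across knowledge themes after fine-tuning, while average recall varied; factors such as the number of knowledge entity types and the size of the annotated corpus affect model performance. With hallucination rate and semantic similarity as the performance metrics for the knowledge question-answering task, strategies such as data optimization and retrieval-augmented generation reduced the hallucination rate by 10% to 40% and improved semantic similarity. [Conclusions] In named entity recognition and knowledge question-answering tasks in the agricultural domain, the pre-trained large model ChatGLM performed best. Fine-tuning pre-trained large models for downstream tasks and optimizing them with Retrieval-Augmented Generation (RAG) can mitigate hallucination and significantly improve performance. Large-model technology has the potential to innovate agricultural technology knowledge service models, …
[Objective] The rural revitalization strategy presents new requirements for agricultural technology extension. The conventional approach, however, suffers from a mismatch between supply and demand, so the form in which agricultural knowledge is supplied needs further innovation. Recent advances in artificial intelligence, such as deep learning and large-scale neural networks, and in particular the advent of large language models (LLMs), make human-like, intelligent agricultural technology extension feasible. Driven by the demand for fruit and vegetable agricultural technology knowledge services, this research built an LLM-based intelligent question-answering system that provides extension services, including guidance on new agricultural knowledge and question-and-answer sessions, so that farmers can access high-quality agricultural knowledge at their convenience. [Methods] Through an analysis of the demands of strawberry farmers, agricultural technology knowledge on strawberry cultivation was categorized into six themes: basic production knowledge, variety screening, interplanting knowledge, pest diagnosis and control, disease diagnosis and control, and pesticide injury diagnosis and control. Considering the current state of agricultural technology, two primary tasks were formulated: named entity recognition and question answering on agricultural knowledge. A small but high-quality training corpus of entity type annotations and question-answer pairs was constructed by combining automatic machine annotation with manual annotation. After comparing four existing LLMs (Baichuan2-13B-Chat, ChatGLM2-6B, Llama 2-13B-Chat, and ChatGPT), the best-performing model was chosen as the base LLM for the intelligent question-answering system for agricultural technology knowledge. Utilizing a high-quality corpus, pre-training of a Large Lang…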
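The abstract evaluates named entity recognition by precision and recall. As a rough illustration only, the sketch below computes both at the entity level. It assumes exact-match scoring (a predicted entity counts only if its span and type both match a gold annotation), which the abstract does not specify; the `(start, end, type)` representation, the function name, and the type labels are hypothetical.

```python
from typing import Iterable, Set, Tuple

# Hypothetical entity representation: (start_offset, end_offset, entity_type).
Entity = Tuple[int, int, str]


def precision_recall(gold: Iterable[Entity],
                     predicted: Iterable[Entity]) -> Tuple[float, float]:
    """Entity-level precision and recall under exact span-and-type matching."""
    gold_set: Set[Entity] = set(gold)
    pred_set: Set[Entity] = set(predicted)
    true_positives = len(gold_set & pred_set)
    # Precision: share of predicted entities that are correct.
    precision = true_positives / len(pred_set) if pred_set else 0.0
    # Recall: share of gold entities that were found.
    recall = true_positives / len(gold_set) if gold_set else 0.0
    return precision, recall


# Toy annotations for one sentence (labels are illustrative, not from the paper).
gold = [(0, 2, "VARIETY"), (5, 8, "PEST"), (10, 13, "DISEASE")]
predicted = [(0, 2, "VARIETY"), (5, 8, "DISEASE")]
p, r = precision_recall(gold, predicted)
```

In this toy case one of the two predictions is correct and one of the three gold entities is recovered, which shows how precision can stay high while recall drops when a model predicts few entities.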
Authors WANG Ting (王婷); WANG Na (王娜); CUI Yunpeng (崔运鹏); LIU Juan (刘娟) (Agricultural Information Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China; Key Laboratory of Big Agri-data, Ministry of Agriculture and Rural Affairs, Beijing 100081, China; Unit 96962, Beijing 102206, China)
Source Smart Agriculture (《智慧农业(中英文)》), CSCD, 2023, No. 4, pp. 105-116 (12 pages)
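Funding Beijing Digital Agriculture Innovation Team Project (BAIC10-2023); Fundamental Research Funds of the Chinese Academy of Agricultural Sciences (JBYW-AII-2023-31); National Key Research and Development Program of China (2022YFF0711902).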
Keywords large language model (LLM); generative pre-trained transformer; agricultural technology knowledge; intelligent question answering; named entity recognition
