Abstract
Representation learning is a method for representing the intrinsic information of a research object as a dense, low-dimensional, real-valued vector; its basic idea is to find a better representation of the original data. With its ability to extract features automatically, representation learning is efficient when processing large amounts of data for which human prior understanding is limited. Supervised and unsupervised representation learning models have been applied to the analysis of plant phenotypic data such as text, images, and 3D point clouds. With the rapid growth of data volume in recent years and the rapid development of genomics research, plant phenotypic research data have become high-throughput and high-precision, and representation learning models have gained attention in the analysis of massive, high-dimensional plant phenotypic data. This paper briefly introduces the concepts of representation learning and the research progress of representation learning techniques, compares and analyzes supervised and unsupervised representation learning models, and describes the concept of plant phenotypic data and its processing methods. Focusing on plant species identification, pest and disease detection and analysis, yield prediction, gene research, and the computation of morphological and structural phenotypic data, it discusses the significance of applying representation learning to plant phenotyping and the problems that remain. Finally, it points out development directions for representation learning in plant phenotyping: developing representation learning models applicable to the analysis of phenotypic data from different plant species, with the goals of high integration and high generality; improving the real-time performance and accuracy of representation learning models to enhance their practicality; and representation learning on multimodal phenotypic data, which can provide a unified data view for interdisciplinary data analysis research.
Authors
袁培森
李润隆
任守纲
顾兴健
徐焕良
YUAN Peisen; LI Runlong; REN Shougang; GU Xingjian; XU Huanliang (College of Information Science and Technology, Nanjing Agricultural University, Nanjing 210095, China)
Source
《农业机械学报》
EI
CAS
CSCD
PKU Core (Peking University Core Journals)
2020, Issue 6, pp. 1-14 (14 pages)
Transactions of the Chinese Society for Agricultural Machinery
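Funding
National Natural Science Foundation of China (61502236, 61806097)
Fundamental Research Funds for the Central Universities (KYZ201752)
Undergraduate Innovation and Entrepreneurship Training Program (S20190025)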
Keywords
agricultural big data
deep learning
representation learning
hash learning
plant phenotypes