Abstract
Recent advances in data science and materials science make it feasible to build reliable machine learning models for predicting materials properties. Here we use a dataset of 170,714 inorganic crystalline compounds, obtained from high-throughput first-principles calculations, to train machine learning models that predict the formation energy of inorganic compounds. Unlike previous studies, the model is trained on the entire dataset and can therefore be applied across a large phase space of inorganic materials; the Dense Net neural network model reaches R² = 0.982 and a mean absolute error (MAE) of 0.072 eV atom⁻¹. The improvement comes from several structure-dependent descriptors designed to capture the electronegativity difference between neighboring atoms and the local atomic structure, and thereby the interactions between atoms. The model provides a fast, cost-effective tool for predicting the energy landscape of compound systems in the search for new materials.
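For readers unfamiliar with the two accuracy metrics quoted in the abstract, the following is a minimal sketch of how R² and MAE are conventionally computed from reference and predicted formation energies. The function names and the four sample values are hypothetical and chosen only for illustration; they are not taken from the paper's dataset or code.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute deviation between reference and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all reference values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical formation energies in eV/atom (illustrative only).
reference = [-1.0, -2.0, -0.5, -1.5]
predicted = [-1.1, -1.9, -0.6, -1.4]

print("MAE (eV/atom):", mean_absolute_error(reference, predicted))
print("R^2:", r_squared(reference, predicted))
```

An MAE expressed in eV per atom is directly comparable across compounds of different cell sizes, which is presumably why the abstract reports it alongside the dimensionless R².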
Authors
梁英宗
陈明威
王亚南
贾华显
芦腾龙
谢帆恺
蔡光辉
王宗国
孟胜
刘淼
Yingzong Liang; Mingwei Chen; Yanan Wang; Huaxian Jia; Tenglong Lu; Fankai Xie; Guanghui Cai; Zongguo Wang; Sheng Meng; Miao Liu (Songshan Lake Materials Laboratory, Dongguan 523808, China; Beijing National Laboratory for Condensed Matter Physics, Institute of Physics, Chinese Academy of Sciences, Beijing 100190, China; Tencent AI Lab, Tencent, Shenzhen 518075, China; Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China; Center of Materials Science and Optoelectronics Engineering, University of Chinese Academy of Sciences, Beijing 100049, China)
Funding
Financial support was provided by the Chinese Academy of Sciences (CAS-WX2021PY-0102, ZDBS-LY-SLH007, and XDB33020000).