
Research on Text Mining of Syndrome Element Syndrome Differentiation by Natural Language Processing (cited by: 5)

Original Chinese title: 运用自然语言处理对证素辨证学进行文本挖掘研究 (article in English)
Abstract Objective: To use natural language processing (NLP) to mine and visualize the core content of syndrome element syndrome differentiation (SESD). Methods: First, a text mining and analysis environment was built in Python, and a corpus was constructed from the core chapters of SESD. Second, the corpus was digitized; the main steps were word segmentation, information cleaning and merging, construction of a document-term matrix, dictionary compilation, and information conversion. Third, the internal information of the SESD corpus was mined and displayed by means of word clouds, keyword extraction, and visualization. Results: NLP facilitated computer recognition and comprehension of SESD. Keywords and their weights differed across chapters. Deficiency syndrome elements, such as "Qi deficiency", "Yang deficiency", and "Yin deficiency", were an important component of SESD. Important excess (substantiality) syndrome elements included "Blood stasis" and "Qi stagnation", among others. The core syndrome elements were closely related to one another. Conclusions: Syndrome differentiation and treatment is the core of SESD. Using NLP to mine SESD could help reveal the internal relationships among syndrome elements and provide a basis for artificial intelligence to learn SESD.
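The digitization and keyword steps named in the Methods can be pictured with a minimal sketch. It is not the authors' code: the record does not say which segmenter, stop list, or weighting scheme was used, so the sketch assumes already-segmented input, a one-word stop list, and TF-IDF as the keyword weight. The chapter names, tokens, and function names are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

# Hypothetical, pre-segmented toy "chapters"; NOT data from the paper.
chapters = {
    "chapter_a": ["qi deficiency", "fatigue", "qi deficiency", "yang deficiency", "the"],
    "chapter_b": ["blood stasis", "qi stagnation", "pain", "blood stasis", "the"],
    "chapter_c": ["yin deficiency", "qi deficiency", "night sweat", "the"],
}
STOPWORDS = {"the"}  # assumed stop list for the cleaning step


def clean(tokens):
    """Drop stop words and empty tokens (the 'information cleaning' step)."""
    return [t for t in tokens if t and t not in STOPWORDS]


def build_dictionary(docs):
    """Map each distinct term to a column index, in sorted order."""
    vocab = sorted({t for tokens in docs.values() for t in tokens})
    return {term: i for i, term in enumerate(vocab)}


def document_term_matrix(docs, dictionary):
    """Return {doc_name: [term count per dictionary column]}."""
    matrix = {}
    for name, tokens in docs.items():
        row = [0] * len(dictionary)
        for term, n in Counter(tokens).items():
            row[dictionary[term]] = n
        matrix[name] = row
    return matrix


def tfidf_keywords(matrix, dictionary, top_k=3):
    """Rank terms per document by TF-IDF (an assumed weighting choice)."""
    n_docs = len(matrix)
    terms = sorted(dictionary, key=dictionary.get)
    # Number of documents in which each term occurs at least once.
    doc_freq = [sum(1 for row in matrix.values() if row[j] > 0)
                for j in range(len(terms))]
    result = {}
    for name, row in matrix.items():
        total = sum(row) or 1
        scores = {
            terms[j]: (row[j] / total) * math.log(n_docs / doc_freq[j])
            for j in range(len(terms)) if row[j] > 0
        }
        # Highest score first; ties broken alphabetically.
        result[name] = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
    return result


cleaned = {name: clean(tokens) for name, tokens in chapters.items()}
dictionary = build_dictionary(cleaned)
dtm = document_term_matrix(cleaned, dictionary)
keywords = tfidf_keywords(dtm, dictionary)
```

In a sketch like this, `keywords["chapter_b"]` holds that chapter's top-weighted terms, and the same weights could feed a word cloud. For Chinese source text, a segmentation step would have to come before `clean`.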
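Authors DENG Wen-Xiang (邓文祥); ZHU Jian-Ping (朱建平); LI Jing (李静); YUAN Zhi-Ying (袁志鹰); WU Hua-Ying (吴华英); YAO Zhong-Hua (姚中华); ZHANG Yi-Ge (张弋戈); ZHANG Wen-An (张文安); HUANG Hui-Yong (黄惠勇) (Hunan University of Chinese Medicine, Changsha, Hunan 410208; Institute of TCM Diagnostics, Hunan University of Chinese Medicine, Changsha, Hunan 410208; Hunan Provincial Administration of Traditional Chinese Medicine, Changsha, Hunan 410008; 广州市佳医帮健康管理有限公司 (a health management company), Guangzhou, Guangdong 510000)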
Source Digital Chinese Medicine (数字中医药, English-language journal), 2019, Issue 2, pp. 61-71 (11 pages)
Funding National Natural Science Foundation of China (No. 81874429); Digital and Applied Research Platform for Diagnosis of Traditional Chinese Medicine (No. 49021003005); 2018 Hunan Provincial Postgraduate Research Innovation Project (No. CX2018B465); Excellent Youth Project of Hunan Education Department in 2018 (No. 18B241)
Keywords Syndrome element syndrome differentiation (SESD); Natural language processing (NLP); Diagnostics of TCM; Artificial intelligence; Text mining
