Automatic extraction and information quality evaluation of structured information in clinical randomized controlled trials of traditional Chinese medicine
Abstract
Objective To improve the use of data reported in clinical randomized controlled trials (RCTs) of traditional Chinese medicine (TCM), this study automatically extracted structured information from the included literature and evaluated the extracted information.
Methods CNKI, Wanfang Data, and VIP were searched for clinical TCM RCT literature published from January 1986 to December 2020 on six conditions: diabetes, rheumatoid arthritis, obesity, knee osteoarthritis, pediatric diarrhea, and colorectal cancer. After the search results were sorted, 5506 articles were randomly included. Optical character recognition (OCR) was used to recognize articles in Portable Document Format (PDF) and convert them to text, and regular expressions were then used to extract information from the text. The extracted information was evaluated on two measures: extraction rate and accuracy.
Results The extraction rates of the nine fields "data", "methods", "total number of participants", "age of participants", "number of participants", "duration of treatment in days", "exclusion criteria", "inclusion criteria", and "funding" were 96.60%, 93.30%, 92.60%, 42.23%, 28.29%, 80.20%, 62.60%, 46.00%, and 21.10%, respectively. The accuracies of the same nine fields were 97.9%, 98.9%, 89.7%, 100.0%, 100.0%, 94.5%, 97.3%, 89.0%, and 94.7%, respectively.
Conclusion The completeness of structured information in clinical TCM RCT literature can be identified and assessed automatically, and the extracted structured information can provide data support for building a network system for clinical TCM RCTs. On this basis, the authors propose structured writing for clinical TCM RCT literature.
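The regular-expression step and the two evaluation measures named in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the abstract does not publish the authors' patterns or the exact definitions of the two measures, so the three patterns in `FIELD_PATTERNS`, the field keys, and the helpers `extract_fields`, `extraction_rate`, and `accuracy` are hypothetical. The sketch assumes the OCR step has already produced plain text, takes extraction rate to be the share of documents in which a field was found, and takes accuracy to be the share of extracted values that agree with a manually checked value.

```python
import re

# Hypothetical patterns for three of the nine fields. The paper's actual
# regular expressions are not given in the abstract.
FIELD_PATTERNS = {
    # e.g. "选取患者120例" / "纳入患者60例" -> "120" / "60"
    "total_participants": re.compile(r"(?:纳入|选取)[^。]*?(\d+)\s*例"),
    # e.g. "疗程为14天" -> "14"
    "treatment_days": re.compile(r"疗程\s*(?:为|共)?\s*(\d+)\s*(?:天|d)"),
    # e.g. "基金项目:..." -> text up to the end of the sentence
    "funding": re.compile(r"基金(?:项目)?[::\s]*([^。\n]+)"),
}


def extract_fields(text):
    """Return {field: first captured value, or None if no match} for one document."""
    result = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        result[name] = match.group(1).strip() if match else None
    return result


def extraction_rate(records, field):
    """Assumed definition: share of documents in which `field` was found."""
    if not records:
        return 0.0
    found = sum(1 for record in records if record[field] is not None)
    return found / len(records)


def accuracy(extracted, gold):
    """Assumed definition: share of extracted (non-None) values equal to the
    manually checked value at the same position."""
    pairs = [(e, g) for e, g in zip(extracted, gold) if e is not None]
    if not pairs:
        return 0.0
    return sum(1 for e, g in pairs if e == g) / len(pairs)


if __name__ == "__main__":
    sample = "基金项目:国家重点研发计划项目(2019YFC1709801)。选取患者120例,疗程为14天。"
    print(extract_fields(sample))
```

Under these assumptions, a field can show a low extraction rate and a high accuracy at the same time, because accuracy is computed only over the documents where something was extracted.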
Authors ZHANG Yunan; LIU Heyuan; HUANG Zhe; DOU Zhili; HAN Dongran (College of Life Science, Beijing University of Chinese Medicine, Beijing 102488, China)
Source China Medical Herald (《中国医药导报》), CAS, 2023, No. 11, pp. 183-187, 192 (6 pages)
Funding National Key Research and Development Program of China (2019YFC1709801).
Keywords Traditional Chinese medicine; Randomized controlled trial; Optical character recognition; Structured; Scientific writing