Abstract
Patient self-descriptions of their conditions (disease readmes) are a common form of information in online medical consultation. To discover latent disease knowledge and user semantics in this irregular data, an unsupervised learning method is proposed that constructs a knowledge graph and uses it to assist diagnosis. Feature keywords are extracted from the readmes of the same disease, and a feature association network is built from the probabilistic and semantic associations among these keywords. Feature clique patterns that are frequently used when describing a disease are then identified in the network, and the knowledge graph is constructed from the semantic relations between feature cliques. Structured features are extracted from the knowledge graph, and each readme is represented by the Jaccard coefficient between every structured feature and the readme text. Finally, the readmes are classified with an SVM; both the micro-average and the macro-average of the classification results exceed 80%. The method can be used for structured disease knowledge discovery and user intent analysis, and for preliminary diagnosis of the disease type described in a readme.
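The text-representation step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how readmes are tokenized or over which sets the Jaccard coefficient is computed, so the sketch assumes that both a readme and a structured feature are treated as sets of keywords. The names `jaccard`, `represent`, and the sample features are hypothetical and are not taken from the paper.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard coefficient |A ∩ B| / |A ∪ B|; defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def represent(tokens, structured_features):
    """Map one tokenized readme to a vector with one Jaccard score per structured feature."""
    token_set = set(tokens)
    return [jaccard(token_set, set(feature)) for feature in structured_features]


# Hypothetical structured features (assumption: each is a set of keywords
# extracted from the knowledge graph).
features = [
    {"cough", "fever"},
    {"headache", "nausea", "dizziness"},
]

# Hypothetical tokenized readme.
readme_tokens = ["cough", "fever", "three", "days"]

vector = represent(readme_tokens, features)
print(vector)  # [0.5, 0.0]
```

Vectors of this kind, one per readme, could then be passed to an SVM classifier such as scikit-learn's `SVC`; which SVM implementation and kernel the authors used is not stated in the abstract.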
Source
《计算机应用与软件》
Peking University Core Journal (北大核心)
2018, Issue 2, pp. 161-166 (6 pages)
Computer Applications and Software
Funding
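National Natural Science Foundation of China (61332004, 81502869)
Suzhou Science and Technology Plan Project, Industrial Technology Innovation Special Program (People's Livelihood Science and Technology, SS201509)
Suzhou "Revitalizing Health through Science and Education" Youth Science and Technology Project (KJXW2014060)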
Keywords
Disease readme
Knowledge graph
Structured feature
Text representation
Auxiliary diagnosis