Abstract
Non-interactive literature-based knowledge discovery is an informatics method proposed by Professor Swanson for mining implicit links between bodies of literature. It comprises two processes: open discovery, which generates a hypothesis and can be written as A→B→C, and closed discovery, which tests a hypothesis and can be written as A→B←C. In this article we carry out open knowledge discovery with Medline as the data source and the subject headings in the MeSH field as the unit of content analysis. The work has two parts. In part 1 we reproduce Swanson's classic discoveries linking Raynaud's disease to fish oil and migraine to magnesium deficiency, taking "raynaud disease" and "migraine" as source subjects and searching for the target subjects "fish oils" and "magnesium deficiency", respectively. In part 2 we apply the subject heading analysis method in a medical setting, taking "diabetes mellitus, type 2" as the source subject to look for chemical substances that may play a role in the disease. The results indicate that using subject headings as the unit of content analysis is technically easy to implement and is a feasible knowledge mining method. In practice, the method still needs better subject heading statistics and more precise category restriction before it can be applied more fully in informatics and scientific research.
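The open process A→B→C described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in the article: the function name `open_discovery`, the `top_b` cutoff, the use of raw co-occurrence counts as the ranking statistic, and the toy records are all hypothetical. Each record stands in for the set of subject headings attached to one Medline citation.

```python
from collections import Counter

# Toy records, invented for illustration only (not real Medline data).
# Each set plays the role of the subject headings of one citation.
records = [
    {"raynaud disease", "blood viscosity"},
    {"raynaud disease", "blood viscosity", "platelet aggregation", "vasoconstriction"},
    {"raynaud disease", "platelet aggregation"},
    {"fish oils", "blood viscosity"},
    {"fish oils", "platelet aggregation"},
    {"aspirin", "platelet aggregation"},
]

def open_discovery(records, source, top_b=2):
    """Rank candidate C terms for source term A through intermediate B terms."""
    # Step 1 (A -> B): count headings that co-occur with the source term A.
    b_counts = Counter()
    for terms in records:
        if source in terms:
            b_counts.update(t for t in terms if t != source)
    b_terms = {t for t, _ in b_counts.most_common(top_b)}

    # Headings already seen together with A are not new hypotheses.
    known = set(b_counts) | {source}

    # Step 2 (B -> C): in records that do not mention A, count headings
    # that co-occur with at least one selected B term.
    c_counts = Counter()
    for terms in records:
        if source not in terms and terms & b_terms:
            c_counts.update(t for t in terms if t not in known)
    return b_terms, c_counts.most_common()

b_terms, candidates = open_discovery(records, "raynaud disease")
print(sorted(b_terms))
print(candidates)
```

On the toy data, "fish oils" ranks first among the candidate C terms because it shares two intermediate headings with the source term. A real run would need a better statistic than raw counts and a restriction of B and C terms to selected heading categories, which is the refinement the abstract calls for.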
Source
《情报学报》
CSSCI
Peking University Core Journals (北大核心)
2007, No. 5, pp. 741-747 (7 pages)
Journal of the China Society for Scientific and Technical Information