Abstract
This paper proposes an ontology-based method for discovering Deep Web data sources. It applies webpage classification, form content classification, and form structure classification to identify Deep Web query interfaces that belong to a given domain. Modules for semi-automatic ontology construction and automatic ontology extension are introduced into the webpage classification and form content classification steps, and heuristic rules are added to the form structure classification step. Experimental results show that the method effectively improves both the recall and the precision of Deep Web data source discovery.
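The three classification steps named in the abstract can be pictured as successive filters over a page and its forms. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the abstract gives no algorithmic detail, so the flat term set `BOOK_TERMS` (standing in for a domain ontology), the `min_hits` threshold, the specific heuristic rules, and all function names are hypothetical.

```python
import re
from html.parser import HTMLParser

# Hypothetical flat vocabulary standing in for a domain ontology (assumption).
BOOK_TERMS = {"book", "author", "title", "isbn", "publisher"}


class FormCollector(HTMLParser):
    """Collect the control types and field names of every <form> on a page."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self._current = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {"types": [], "names": []}
            self.forms.append(self._current)
        elif self._current is not None and tag in ("input", "select", "textarea"):
            # <input> without a type attribute defaults to a text field.
            kind = attrs.get("type", "text") if tag == "input" else tag
            self._current["types"].append((kind or "text").lower())
            if attrs.get("name"):
                self._current["names"].append(attrs["name"].lower())

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


def tokens(text):
    """Split text into a set of lowercase alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def page_is_relevant(html, terms, min_hits=2):
    # Webpage classification: require several distinct domain terms (assumed rule).
    return len(tokens(html) & terms) >= min_hits


def form_is_searchable(form):
    # Form structure classification: assumed heuristics that reject login forms
    # and require at least one control a user can fill in or choose from.
    if "password" in form["types"]:
        return False
    return any(t in ("text", "search", "select", "textarea") for t in form["types"])


def form_matches_domain(form, terms):
    # Form content classification: field names must mention a domain term.
    return bool(tokens(" ".join(form["names"])) & terms)


def find_query_interfaces(html, terms=BOOK_TERMS):
    """Return the forms that pass all three filters for the given term set."""
    if not page_is_relevant(html, terms):
        return []
    parser = FormCollector()
    parser.feed(html)
    return [f for f in parser.forms
            if form_is_searchable(f) and form_matches_domain(f, terms)]
```

In this sketch a page with a book search form and a login form would yield only the search form. A real ontology would carry concepts, synonyms, and relations rather than a flat term set, and the semi-automatic construction and automatic extension described in the abstract are not modeled here.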
Source
Computer Engineering (《计算机工程》)
CAS
CSCD
2012, No. 4, pp. 52-54 (3 pages)
Funding
National Natural Science Foundation of China (70671035)
Keywords
Deep Web
ontology
data sources
semi-automatic construction
classification model