Abstract
This paper studies the fast mining of teaching resources in intelligent teaching systems. Such systems obtain teaching resources by searching web pages for keywords, but many spam pages contain the same keywords, which makes it hard to mine teaching resources quickly from the mass of web information. Traditional keyword search is affected by these spam pages: the search volume becomes too large, so teaching resources are not obtained in a timely manner. To address this, the paper proposes applying Web information extraction to intelligent teaching resource mining. Relevant Web pages are fetched in batches according to the resource requirements; the pages are cleaned using the XPath language combined with the search request and the features of each page's topic information blocks; and the resources needed for teaching are then mined according to a Web text feature model. Simulation experiments show that the method effectively overcomes interference from spam pages and completes teaching resource mining quickly, with satisfactory results.
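The cleaning and mining steps named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract gives neither the XPath expressions nor the text feature model, so the block selector `BLOCK_XPATH`, the function names, the `threshold` parameter, and the term-coverage score used here are all assumptions made for illustration. It also assumes pages are well-formed XHTML so that Python's standard `xml.etree.ElementTree` can parse them.

```python
import xml.etree.ElementTree as ET

# Hypothetical XPath for "topic information blocks"; the actual block
# features used in the paper are not given in the abstract.
BLOCK_XPATH = ".//div[@class='content']"


def topic_text(page_xml):
    """Return the lower-cased text of all topic blocks in an XHTML page."""
    root = ET.fromstring(page_xml)
    blocks = root.findall(BLOCK_XPATH)
    return " ".join("".join(b.itertext()) for b in blocks).lower()


def score(text, query_terms):
    """Fraction of query terms found in the text (a stand-in feature model)."""
    if not query_terms:
        return 0.0
    hits = sum(1 for term in query_terms if term.lower() in text)
    return hits / len(query_terms)


def mine_resources(pages, query_terms, threshold=0.5):
    """Keep pages whose topic blocks cover enough of the query terms.

    `pages` maps a URL to its XHTML source. A page whose keywords appear
    only outside the topic blocks (navigation, ads) scores 0 and is
    dropped as spam. Results are sorted by descending score.
    """
    kept = []
    for url, source in pages.items():
        s = score(topic_text(source), query_terms)
        if s >= threshold:
            kept.append((url, s))
    return sorted(kept, key=lambda item: item[1], reverse=True)
```

Under these assumptions, a page that carries the query keywords only in its navigation bar is filtered out, while a page whose main content block matches the query is kept and ranked by its score.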
Source
《科技通报》
Peking University Core Journal (北大核心)
2013, Issue 4, pp. 21-22, 25 (3 pages in total)
Bulletin of Science and Technology
Keywords
intelligent teaching
spam web pages
information extraction