Abstract
Most existing web service classification mechanisms rely largely on manual classification, and semi-automatic classification techniques achieve relatively low precision and recall. To address these problems, this paper studies automatic web service classification based on the suffix tree clustering algorithm. It also proposes a hierarchical tree structure of concepts and examples to represent cluster labels that stand in hypernym-hyponym or synonym relations, and performs a secondary clustering of these labels on top of the suffix tree clustering results. Automatic service classification is achieved by combining text preprocessing with WordNet-based semantic similarity calculation. Experimental results show that the proposed algorithm achieves good precision and recall. In addition, the abstract cluster labels extracted with WordNet support classification of the rapidly growing number of web services at a more abstract level, which improves the efficiency of classifying large numbers of web services.
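The clustering idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a plain word n-gram index in place of a real suffix tree (both yield a phrase-to-documents mapping), an assumed overlap threshold of 0.5 for merging base clusters, and hypothetical function names and sample service descriptions. The WordNet-based secondary clustering of labels is omitted.

```python
from collections import defaultdict
from itertools import combinations

def base_clusters(docs, max_len=3, min_docs=2):
    """Map each word n-gram (up to max_len words) to the documents containing it.

    Assumption: an n-gram index stands in for a suffix tree here; a real
    suffix tree produces the same phrase -> documents mapping more efficiently.
    """
    phrase_docs = defaultdict(set)
    for doc_id, text in enumerate(docs):
        words = text.lower().split()
        for i in range(len(words)):
            for j in range(i + 1, min(i + max_len, len(words)) + 1):
                phrase_docs[" ".join(words[i:j])].add(doc_id)
    # Keep only phrases shared by at least min_docs documents.
    return {p: d for p, d in phrase_docs.items() if len(d) >= min_docs}

def merge_clusters(clusters, threshold=0.5):
    """Merge base clusters whose document sets overlap strongly in both directions."""
    phrases = list(clusters)
    parent = {p: p for p in phrases}

    def find(p):
        # Union-find lookup with path halving.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for a, b in combinations(phrases, 2):
        common = len(clusters[a] & clusters[b])
        if (common / len(clusters[a]) > threshold
                and common / len(clusters[b]) > threshold):
            parent[find(a)] = find(b)

    merged = defaultdict(lambda: (set(), set()))
    for p in phrases:
        labels, members = merged[find(p)]
        labels.add(p)
        members.update(clusters[p])
    return [(sorted(labels), sorted(members)) for labels, members in merged.values()]

# Hypothetical service descriptions, for illustration only.
docs = [
    "weather forecast service",
    "city weather forecast",
    "currency exchange rate",
    "exchange rate converter",
]
result = merge_clusters(base_clusters(docs))
for labels, members in result:
    print(labels, members)
```

Each merged cluster carries several candidate labels (shared phrases). In the approach described above, such labels would then be grouped further by semantic similarity, which this sketch does not attempt.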
Source
《信息技术》
2013, Issue 9, pp. 13-17 (5 pages)
Information Technology
Funding
Scientific Research Innovation Project of the Shanghai Municipal Education Commission (12zz146)