An API Service Recommendation Method via Combining Self-Organization Map-Based Functionality Clustering and Deep Factorization Machine-Based Quality Prediction
Original title: 融合SOM功能聚类与DeepFM质量预测的API服务推荐方法
Cited by: 23
Abstract: As more and more enterprises and organizations encapsulate their business, data, or resources as services and publish them on the Internet as APIs, the number of API services is growing rapidly. In this context, quickly and effectively finding API services that meet developers' Mashup requirements within such a large collection has become a challenging problem. This paper therefore focuses on recommending suitable API services for building high-quality Mashup applications. Building on service content-oriented functionality clustering, combined with score prediction over multi-dimensional quality of service (QoS), it proposes an API service recommendation method that combines self-organizing map (SOM) based functionality clustering with deep factorization machine (DeepFM) based quality prediction.

The method first uses Wikipedia as an external corpus to expand the content of API service documents and models their topic distribution with the hierarchical Dirichlet processes (HDP) model. WikiExtractor extracts corpus data from Wikipedia, and the Word2vec tool is trained on that corpus to obtain a word vector model. The trained Wikipedia word vector model is then used to expand the API service description documents. On the expanded documents, HDP topic modeling mines the latent topic information and automatically determines the optimal number of topics, so that the semantic similarity between API service documents can be measured accurately.

Next, an SOM neural network performs topic-oriented clustering of the API services. After HDP topic modeling, the resulting "API service document-topic" vectors are clustered with the SOM algorithm. Through the self-organizing process, the API services are divided into functional clusters, each containing multiple API services with similar functionality.

Then, for the functionally similar API services within a cluster, the DeepFM model is used to model and mine the complex interactions among their multi-dimensional QoS attributes. DeepFM automatically extracts effective feature combinations, both low-order and high-order, from the QoS data (including popularity, co-occurrence count, and so on), predicts and ranks the quality score of each API service with respect to the target Mashup application, and recommends the top N API services to the developer.

Finally, the method is compared and analyzed experimentally on a real-world Web service dataset. The experimental results show that the proposed…
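The low-order feature interactions that the abstract attributes to DeepFM come from its factorization machine (FM) component. The following is a minimal sketch of a second-order FM score and a top-N ranking over the APIs in one cluster. It is not the authors' implementation: the deep (high-order) component is omitted, the parameters `w0`, `w`, and `v` are assumed to be already trained, and the function names, API names, and feature values are hypothetical.

```python
from itertools import combinations


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))


def fm_score_naive(x, w0, w, v):
    """Second-order FM score computed over explicit feature pairs:
    w0 + sum_i w[i]*x[i] + sum_{i<j} <v[i], v[j]> * x[i] * x[j]."""
    linear = w0 + dot(w, x)
    pairwise = sum(
        dot(v[i], v[j]) * x[i] * x[j]
        for i, j in combinations(range(len(x)), 2)
    )
    return linear + pairwise


def fm_score_fast(x, w0, w, v):
    """Same score in O(k*n) time, using the identity
    sum_{i<j} <v_i, v_j> x_i x_j
        = 0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i (v_if x_i)^2]."""
    linear = w0 + dot(w, x)
    pairwise = 0.0
    for f in range(len(v[0])):
        terms = [v[i][f] * x[i] for i in range(len(x))]
        total = sum(terms)
        pairwise += 0.5 * (total * total - sum(t * t for t in terms))
    return linear + pairwise


def top_n(candidates, w0, w, v, n):
    """Rank candidate APIs (name -> QoS feature vector) by FM score,
    highest first, and return the names of the first n."""
    ranked = sorted(
        candidates,
        key=lambda name: fm_score_fast(candidates[name], w0, w, v),
        reverse=True,
    )
    return ranked[:n]


if __name__ == "__main__":
    # Hypothetical parameters: 3 QoS features, 2 latent factors per feature.
    w0 = 0.5
    w = [0.1, 0.2, 0.3]
    v = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    # Hypothetical QoS feature vectors for the APIs of one functional cluster.
    cluster = {
        "api_a": [1.0, 2.0, 3.0],
        "api_b": [0.0, 0.0, 0.0],
        "api_c": [1.0, 0.0, 1.0],
    }
    print(top_n(cluster, w0, w, v, n=2))
```

Both scoring functions return the same value; the second is the form usually used in practice because it avoids enumerating all feature pairs.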
Authors: CAO Bu-Qing (曹步清); XIAO Qiao-Xiang (肖巧翔); ZHANG Xiang-Ping (张祥平); LIU Jian-Xun (刘建勋). School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, Hunan 411201
Source: Chinese Journal of Computers (《计算机学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2019, No. 6, pp. 1367-1383 (17 pages)
Funding: Supported by the National Natural Science Foundation of China (61873316, 61872139, 61772193, 61702181) and the Natural Science Foundation of Hunan Province (2017JJ2098, 2017JJ4036, 2018JJ2139, 2018JJ2136)
Keywords: API recommendation; Mashup application; Hierarchical Dirichlet Processes (HDP) topic model; Self-Organizing Map (SOM) neural network; deep factorization machine (DeepFM)
