Abstract
Service discovery research faces two main problems: (1) the number of Web services is growing rapidly, which makes service management and matching difficult; and (2) the search-engine-based service discovery that users commonly rely on in API marketplaces suffers from semantically sparse user queries. To address these two challenges, this paper proposes a service discovery method based on clustering and Gaussian LDA. The method first uses Doc2Vec to map the service dataset to service paragraph vectors and then clusters these vectors with K-Means++. Next, contextual information generated by Word2Vec is used to expand the user query and enrich the service descriptions, and the enriched descriptions are fed into Gaussian LDA to obtain service description representations. Finally, services are ranked by the probabilistic relevance between the service description representations and the expanded query. Experimental results show that the proposed model outperforms the TFIDF-K, LDA, Doc2Vec-K, and GLDA-QE methods on Precision@5, Recall@50, and F-Measure@50, improving the accuracy of service search.
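The query-expansion and ranking steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy two-dimensional embeddings, the service names, and the parameters `k` and `min_sim` are all hypothetical, and a plain cosine similarity between averaged word vectors stands in for the Gaussian LDA relevance score used in the paper.

```python
import math

# Hypothetical 2-D word embeddings. A real system would train Word2Vec
# on the service descriptions instead of hard-coding vectors.
EMBEDDINGS = {
    "map": (1.0, 0.1),
    "geocode": (0.9, 0.2),
    "location": (0.95, 0.05),
    "payment": (0.1, 1.0),
    "invoice": (0.2, 0.9),
    "billing": (0.05, 0.95),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand(terms, k=1, min_sim=0.8):
    """Append up to k nearest vocabulary words to each known term.

    k and min_sim are illustrative parameters, not values from the paper.
    """
    expanded = list(terms)
    for term in terms:
        if term not in EMBEDDINGS:
            continue  # out-of-vocabulary terms are kept but not expanded
        candidates = [
            (cosine(EMBEDDINGS[term], vec), word)
            for word, vec in EMBEDDINGS.items()
            if word not in expanded
        ]
        candidates.sort(reverse=True)
        expanded.extend(word for sim, word in candidates[:k] if sim >= min_sim)
    return expanded


def mean_vector(terms):
    """Average the embeddings of all in-vocabulary terms."""
    vecs = [EMBEDDINGS[t] for t in terms if t in EMBEDDINGS]
    if not vecs:
        return (0.0, 0.0)
    return tuple(sum(component) / len(vecs) for component in zip(*vecs))


def rank(query_terms, services):
    """Rank services by cosine similarity to the expanded query.

    Cosine similarity is a stand-in here for Gaussian LDA relevance.
    """
    query_vec = mean_vector(expand(query_terms))
    scored = [
        (name, cosine(query_vec, mean_vector(expand(description))))
        for name, description in services.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical services, each described by a short bag of words.
SERVICES = {
    "GeoAPI": ["map", "geocode"],
    "PayAPI": ["payment", "invoice"],
}

ranking = rank(["location"], SERVICES)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy setup the single-word query `location` shares no term with either description, so a purely lexical match would fail; expansion adds the neighbouring word `map`, which is what lets the mapping service rank first. The `min_sim` threshold is one possible way to keep unrelated neighbours out of the expansion when the vocabulary is small.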
Authors
唐菊
聂彤羽
TANG Ju; NIE Tongyu (Sichuan Instrument Industry School, Chongqing 400702, China; School of Big Data and Software Engineering, Chongqing University, Chongqing 400044, China)
Source
《自动化与仪器仪表》
2022, No. 12, pp. 36-43, 50 (9 pages in total)
Automation & Instrumentation
Keywords
service discovery
clustering
semantic sparseness
Gaussian LDA
word embedding