Abstract
To address mobile Internet traffic identification, this paper uses multiple performance evaluation metrics to analyze how the K-means and spectral clustering algorithms perform on traffic data sets described by different feature sets or labeled with different identification targets (class sets), and proposes an ensemble clustering method that combines the clustering results obtained on multiple feature sets. Comparative experiments show that the same clustering algorithm performs differently across feature sets and across identification targets, and that the ensemble clustering method effectively improves on clustering that uses a single feature set. The ensemble clustering method is further applied to correlation analysis of mobile Apps; the results provide an objective basis for grouping Apps and analyzing user behavior.
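The abstract does not say how the per-feature-set clusterings are combined. As a rough illustration only, the sketch below assumes a co-association (evidence accumulation) scheme: count how often two flows land in the same cluster across the base clusterings, then merge flows that agree in more than half of them. The function names, the 0.5 threshold, the feature-set names in the comments, and the label vectors are all hypothetical; in practice the label vectors would come from running K-means or spectral clustering on each feature set.

```python
from itertools import combinations


def co_association(labelings):
    """Fraction of base clusterings that put each pair of flows in the same cluster."""
    n = len(labelings[0])
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        matrix[i][i] = 1.0
    for i, j in combinations(range(n), 2):
        agree = sum(1 for labels in labelings if labels[i] == labels[j])
        matrix[i][j] = matrix[j][i] = agree / len(labelings)
    return matrix


def consensus_labels(matrix, threshold=0.5):
    """Merge flows whose co-association exceeds the threshold (union-find)."""
    n = len(matrix)
    parent = list(range(n))

    def find(x):
        # Follow parent links to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if matrix[i][j] > threshold:
            parent[find(j)] = find(i)

    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(n)]


# Hypothetical base clusterings of six flows, one per assumed feature set.
base_labelings = [
    [0, 0, 0, 1, 1, 1],  # e.g. packet-size features (assumed)
    [1, 1, 0, 0, 0, 0],  # e.g. inter-arrival-time features (assumed)
    [2, 2, 2, 0, 0, 1],  # e.g. flow-duration features (assumed)
]
consensus = consensus_labels(co_association(base_labelings))
print(consensus)
```

Working on label vectors rather than raw features keeps the combination step independent of which base algorithm produced each clustering, and it sidesteps the fact that cluster IDs from different runs are not aligned with one another.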
Authors
吴志敏
刘珍
王若愚
陈洁桐
Wu Zhimin; Liu Zhen; Wang Ruoyu; Chen Jietong (School of Medical Information Engineering, Guangdong Pharmaceutical University, Guangzhou 510006, China; Information & Network Engineering & Research Center, South China University of Technology, Guangzhou 510041, China; Communication & Computer Network Laboratory of Guangdong, Guangzhou 510041, China)
Source
Application Research of Computers (《计算机应用研究》)
Indexed in: CSCD; PKU Core Journals (北大核心)
2019, Issue 10, pp. 3101-3106 (6 pages)
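Funding
National Natural Science Foundation of China (61501128)
Natural Science Foundation of Guangdong Province (2017A030313345)
National Undergraduate Training Program for Innovation and Entrepreneurship (201710573005)
Guangdong Pharmaceutical University Innovation and University-Strengthening Project
Fundamental Research Funds for the Central Universities (x2rj/D2174870)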
Keywords
mobile App traffic
traffic statistics features
ensemble clustering
traffic identification