Abstract
Mining activity patterns is challenging because human activities are complex and multidimensional. This paper proposes a method that computes the similarity between users from their time-series activity sequences, mines activity patterns and socio-demographic patterns through clustering analysis, and then performs trend analysis on those patterns. Experimental results show that the proposed O(p(m-p)) similarity algorithm supports effective clustering. On this basis, activity patterns and their socio-demographic patterns are mined through visualization of time-series activity diagrams and probability density function (PDF) plots, together with statistical analysis. Mining data from multiple consecutive years then yields the development trends of these activity and socio-demographic patterns. The study concludes that people with similar activity behavior have similar socio-demographic characteristics.
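As a rough illustration of the idea of scoring two users by the longest common subsequence of their activity sequences, the sketch below uses the textbook dynamic-programming LCS. It is not the paper's O(p(m-p)) algorithm, which the abstract does not describe. The function names, the activity labels, and the normalization by the longer sequence length are all assumptions made for this example.

```python
from typing import Sequence


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence of a and b.

    Classic dynamic programming, O(len(a) * len(b)) time; this is a
    stand-in, not the O(p(m-p)) algorithm mentioned in the abstract.
    """
    prev = [0] * (len(b) + 1)  # DP row for the previous element of a
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """LCS length divided by the longer sequence length (assumed normalization)."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))


# Hypothetical daily activity sequences for two users.
user_a = ["sleep", "eat", "work", "eat", "sleep"]
user_b = ["sleep", "work", "eat", "sleep"]
score = similarity(user_a, user_b)
```

A pairwise matrix of `1 - similarity(...)` values could then serve as a distance input to a clustering routine; which clustering method the authors used is not stated in the abstract.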
Authors
SONG Ling
LYU Shunming
LIU Hongxin
LYU Qiang
NIU Xiaofei
LIU Xinfeng
Affiliations: College of Computer Science and Technology, Shandong Jianzhu University, Ji’nan 250101, China; State Grid Information & Telecommunication Branch, Beijing 100031, China; State Grid Institute of Technology, Ji’nan 250002, China
Source
Applied Science and Technology (《应用科技》)
CAS
2022, Issue 2, pp. 40-48 (9 pages)
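Funding
National Natural Science Foundation of China (62177031, 51975332)
Natural Science Foundation of Shandong Province (ZR202102190312)
Major Science and Technology Innovation Project of Shandong Province (2019JZZY010435)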
Keywords
activity pattern mining
socio-demographic pattern mining
time series activity sequence
similarity computation
longest common activity subsequence
clustering
trend analysis