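As the number of microblog users keeps growing, microblog networks have become a platform for information exchange among users. Because post length is limited, traditional community detection algorithms cannot effectively handle the sparsity of microblog networks. To address this, the DC-DTM (discovery community by dynamic topic model) algorithm is proposed. DC-DTM first maps the microblog network to a directed weighted network in which the direction of an edge reflects the follow relationship between nodes. The proposed DTM (dynamic topic model) is used to compute the semantic similarity between nodes, and this similarity serves as the weight of the edge between them. DTM is a microblog topic model that not only mines the topic distribution of microblogs but also computes the influence of users within a given topic. Next, the proposed low-complexity label propagation algorithm WLPA (weighted label propagation) is used to detect communities in the microblog network. In its initialization phase, the algorithm takes high-influence user nodes as the initial nodes, and labels propagate in descending order of node influence. This avoids the backflow phenomenon of traditional label propagation and improves the stability of the algorithm. Experimental results on real data show that the DTM model mines microblog topics well and that DC-DTM effectively discovers communities in microblog networks.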
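The influence-ordered propagation described in this abstract can be illustrated with a minimal sketch. It is not the authors' WLPA: the function name, the toy graph, the influence scores, and the rule that a node only accepts labels from higher-influence neighbours are all assumptions made for illustration, and the edge weights stand in for the DTM semantic similarities. Every node that appears in an edge is assumed to have an influence score.

```python
from collections import defaultdict


def weighted_label_propagation(edges, influence, max_iters=20):
    """Illustrative influence-ordered label propagation (hypothetical sketch).

    edges: {(follower, followee): weight}, weight = assumed semantic similarity
    influence: {node: score}, assumed to come from a topic model
    """
    # Simplification: use the weighted edges symmetrically for neighbour lookup.
    neighbours = defaultdict(dict)
    for (u, v), w in edges.items():
        neighbours[u][v] = neighbours[u].get(v, 0.0) + w
        neighbours[v][u] = neighbours[v].get(u, 0.0) + w

    # Every node starts with its own label.
    labels = {node: node for node in influence}
    # Visit nodes from most to least influential (ties broken by name).
    order = sorted(influence, key=lambda n: (-influence[n], str(n)))

    for _ in range(max_iters):
        changed = False
        for node in order:
            scores = defaultdict(float)
            for nb, w in neighbours[node].items():
                # Labels only flow from higher-influence neighbours,
                # so a label never travels back "upstream".
                if influence[nb] > influence[node]:
                    scores[labels[nb]] += w
            if not scores:
                continue  # no more-influential neighbour: keep own label
            # Deterministic tie-break: first label in sorted order wins.
            best = max(sorted(scores), key=lambda lab: scores[lab])
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break
    return labels


# Toy data (hypothetical): two influential users "a" and "d".
edges = {("b", "a"): 0.9, ("c", "a"): 0.8, ("c", "b"): 0.3,
         ("e", "d"): 0.7, ("f", "d"): 0.6, ("c", "d"): 0.1}
influence = {"a": 1.0, "b": 0.5, "c": 0.4, "d": 0.9, "e": 0.3, "f": 0.2}
communities = weighted_label_propagation(edges, influence)
```

On this toy graph the nodes settle into two groups, one labelled by "a" and one by "d". Processing nodes in a fixed influence order, rather than a random order, is what makes repeated runs return the same result.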
The objective of this study is to understand the current mental status of college students in mainland China. Microblog content of 60 thousand college students, posted from January 2014 to June 2014, was collected. The emotional energy levels developed by the psychologist David R. Hawkins were taken as the basis for an ontology database that divides students' emotions into three states: positive, negative, and neutral. An ontology-based semantic analysis method was used to analyze the microblog data. The results show that 46.38% of the Sina microblog data reflects a positive psychological status, while neutral and negative statuses account for 19.77% and 33.85%, respectively. In other words, roughly one third of the microblogs reflect some negative mentality. The semantic analysis of this big data suggests that most students have a healthy mental status, but the negative status of students should not be ignored.
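To make the three-way split and the reported ratios concrete, here is a minimal sketch of lexicon-based classification. It is not the study's ontology-based method: the word lists, the function names, and the sample posts are hypothetical placeholders, and a real ontology would be far richer than two small word sets.

```python
from collections import Counter

# Hypothetical toy lexicon, standing in for the study's ontology database.
POSITIVE = {"joy", "love", "peace", "courage"}
NEGATIVE = {"fear", "anger", "grief", "shame"}


def classify(post):
    """Label a post by counting positive and negative lexicon words."""
    words = post.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"


def status_ratios(posts):
    """Return the percentage of posts in each status, rounded to 2 decimals."""
    statuses = ("positive", "neutral", "negative")
    if not posts:
        return {s: 0.0 for s in statuses}
    counts = Counter(classify(p) for p in posts)
    return {s: round(100 * counts[s] / len(posts), 2) for s in statuses}


# Made-up sample posts, for illustration only.
posts = ["joy and love today", "fear of exams",
         "going to class", "anger and shame but courage"]
ratios = status_ratios(posts)
```

The three percentages partition the posts, which is consistent with the figures in the abstract: 46.38% + 19.77% + 33.85% = 100%.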
Funding: Supported by the Natural Science Foundation of Hubei Province (2013CFB292)