Abstract: Microblogging provides a new platform for communicating and sharing information among Web users. Users can express opinions and record daily life using microblogs. The microblogs that users post indicate their interests to some extent. We aim to mine user interests via keyword extraction from microblogs. Traditional keyword extraction methods are usually designed for formal documents such as news articles or scientific papers. Messages posted by microblogging users, however, are usually noisy and full of new words, which is a challenge for keyword extraction. In this paper, we combine a translation-based method with a frequency-based method for keyword extraction. In our experiments, we extract keywords for microblog users from the largest microblogging website in China, Sina Weibo. The results show that our method can identify users' interests accurately and efficiently.
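As a rough illustration of how a translation-based score and a frequency-based score might be combined, the sketch below uses linear interpolation. This is an assumption, not the paper's formula: the abstract does not say how the two methods are combined. The function names, the `lam` weight, and the toy translation table `trans_prob` (the probability that a word in a post "translates" to a candidate keyword) are all hypothetical.

```python
from collections import Counter


def frequency_scores(tokens):
    """Relative frequency of each token in one user's posts."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def translation_scores(tokens, trans_prob):
    """Score each candidate t as sum over words w of P(t | w) * P(w | posts).

    trans_prob maps a word w to {candidate: P(candidate | w)}; in practice
    such a table would be learned from data, here it is supplied by hand.
    """
    word_prob = frequency_scores(tokens)
    scores = {}
    for word, p_word in word_prob.items():
        for candidate, p_trans in trans_prob.get(word, {}).items():
            scores[candidate] = scores.get(candidate, 0.0) + p_trans * p_word
    return scores


def combined_keywords(tokens, trans_prob, lam=0.5, top_k=3):
    """Rank candidates by lam * translation score + (1 - lam) * frequency score."""
    freq = frequency_scores(tokens)
    trans = translation_scores(tokens, trans_prob)
    candidates = set(freq) | set(trans)
    scored = {
        c: lam * trans.get(c, 0.0) + (1 - lam) * freq.get(c, 0.0)
        for c in candidates
    }
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(scored, key=lambda c: (-scored[c], c))[:top_k]
```

One reason such a combination could help with noisy posts is that the translation table can surface a keyword (for example, a topic word) that never appears literally in the user's messages, while the frequency term keeps the words the user actually writes most often.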
Funding: Supported by the Natural Science Foundation of Hubei Province (2013CFB292)
Abstract: The objective of this study is to understand the current mental status of college students in mainland China. Microblog content posted by 60,000 college students between January 2014 and June 2014 was collected. The emotional energy levels developed by the psychologist David R. Hawkins were taken as the basis for an ontology database that divides students' emotions into three categories: positive, negative, and neutral. An ontology-based semantic analysis method was used to analyze the microblog data. The results show that 46.38% of the Sina microblog data reflect a positive psychological status, while the shares reflecting neutral and negative psychological status are 19.77% and 33.85%, respectively. In other words, about one third of the microblogs reflect some negative mentality. The semantic analysis of the big data suggests that most students have a healthy mental status, but the negative status of some students should not be ignored.
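The three-way split reported above can be illustrated with a minimal lexicon-based sketch. It is only a stand-in for the study's ontology-based semantic analysis, which the abstract does not describe in detail: the tiny `LEXICON`, the majority-vote rule, and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical mini-lexicon mapping words to an emotional status.
# The study's actual ontology database is not reproduced here.
LEXICON = {
    "happy": "positive",
    "grateful": "positive",
    "tired": "negative",
    "anxious": "negative",
}


def classify_post(tokens, lexicon=LEXICON):
    """Label one tokenized post by majority vote of its lexicon words."""
    votes = Counter(lexicon[t] for t in tokens if t in lexicon)
    if votes["positive"] > votes["negative"]:
        return "positive"
    if votes["negative"] > votes["positive"]:
        return "negative"
    return "neutral"  # no lexicon words, or a tie


def status_shares(posts, lexicon=LEXICON):
    """Percentage of posts in each of the three status categories."""
    if not posts:
        raise ValueError("at least one post is required")
    labels = Counter(classify_post(p, lexicon) for p in posts)
    total = sum(labels.values())
    return {s: 100.0 * labels[s] / total for s in ("positive", "neutral", "negative")}
```

By construction the three shares sum to 100%, which is also true of the figures reported in the abstract (46.38% + 19.77% + 33.85%).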
Abstract: Ethiopia has a mountainous landscape that is divided into the Northwestern and Southeastern plateaus by the Main Ethiopian Rift and the Afar Depression. The Debre Sina area is located in central Ethiopia along the escarpment, where landslides are frequent due to steep slopes, complex geology, rift tectonics, heavy rainfall, and seismicity. Preparing a landslide susceptibility map is an important step in tackling this problem. For this purpose, GIS-based frequency ratio (FR) and logistic regression (LR) models were applied using a landslide inventory and nine landslide factors (lithology, land use, distance from rivers, distance from faults, slope, aspect, elevation, curvature, and annual rainfall). The major steps were database construction, weighting of each factor class or factor, preparation of the susceptibility map, and validation. Both models require a rasterized landslide inventory and landslide factor maps. The inventory was divided into training and validation landslides. In the FR model, weights for each factor class were calculated and assigned so that all the weighted factor maps could be added to produce a landslide susceptibility map. In the LR model, the entire study area is first divided into landslide and non-landslide areas using the training landslides. These areas are then converted into landslide and non-landslide points in order to extract the FR maps of the nine landslide factors. A linear relationship between the training landslides and the landslide factors is then established in SPSS. Based on this relationship, the final landslide susceptibility map is prepared using the LR equation. The success rate and prediction rate of the FR model were 74.8% and 73.5%, while those of the LR model were 75.7% and 74.5%, respectively. A close similarity in the prediction and validation rates showed that the model is acceptable. The LR model is slightly more accurate than the FR model in predicting the landslide susceptibility of the area.
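The FR weighting step described above can be illustrated with a small sketch. It assumes the commonly used definition of the frequency ratio for a factor class: the share of landslide cells that fall in the class divided by the share of all cells that fall in the class. The function names, the flat-list representation of a raster, and the toy data in the usage note are hypothetical and are not taken from the study.

```python
from collections import Counter


def frequency_ratio(factor_classes, landslide_mask):
    """Frequency ratio per factor class.

    factor_classes: class label of each raster cell (flattened raster).
    landslide_mask: 1 if the cell is a training landslide cell, else 0.
    FR(class) = (landslide cells in class / all landslide cells)
                / (cells in class / all cells)
    """
    if len(factor_classes) != len(landslide_mask):
        raise ValueError("rasters must have the same number of cells")
    total_cells = len(factor_classes)
    total_slides = sum(landslide_mask)
    if total_slides == 0:
        raise ValueError("the training inventory contains no landslide cells")
    cells_per_class = Counter(factor_classes)
    slides_per_class = Counter(
        cls for cls, slide in zip(factor_classes, landslide_mask) if slide
    )
    return {
        cls: (slides_per_class[cls] / total_slides) / (n_cells / total_cells)
        for cls, n_cells in cells_per_class.items()
    }


def susceptibility_index(factor_maps, fr_tables):
    """Add the FR weights of all factor maps cell by cell.

    factor_maps: {factor name: list of class labels per cell}
    fr_tables:   {factor name: {class label: FR weight}}
    """
    n_cells = len(next(iter(factor_maps.values())))
    return [
        sum(fr_tables[name][factor_maps[name][i]] for name in factor_maps)
        for i in range(n_cells)
    ]
```

For example, with eight cells of which three are "steep" and five are "gentle", and four landslide cells of which three lie on steep cells, the steep class gets FR = (3/4) / (3/8) = 2.0 and the gentle class gets FR = (1/4) / (5/8) = 0.4. An FR above 1 marks a class where landslides are over-represented relative to its area.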
Funding: Supported by the National Natural Science Foundation of China under Grant No. 61172072, the Beijing Natural Science Foundation under Grant No. 4112045, and the Fundamental Research Funds for the Central Universities under Grant No. 2011YJS215
Abstract: Traditional algorithms for ranking user weight, which are based on the user's in-degree distribution, usually neglect the differences among a user's followers and the features of the user's tweets. To identify the factors that affect user weight, this paper analyzes data collected from the SINA Microblog network and finds that user influence and active degree are the dominant factors. The proposed algorithm evaluates user influence by the user's number of followers, the influence of those followers, and the reciprocity between users. A user's active degree is modeled by the user's participation and the quality of the user's tweets. The models are tested on different data groups to confirm the parameters for the final calculation. Finally, this paper compares the computed results with the user ranking given by the official SINA application. The algorithm shows stronger stability in the fluctuation range of the user weight values.
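The ingredients named above (follower count, followers' influence, reciprocity, participation, and tweet quality) can be combined in many ways, and the abstract does not give the paper's formulas. The sketch below is one hypothetical arrangement: a PageRank-style iteration in which influence flows from followers to the users they follow, with extra edge weight for mutual follows, followed by a weighted sum with an activity term. The parameters `reciprocity_bonus`, `damping`, and `alpha`, and all function names, are assumptions.

```python
def influence_scores(follows, reciprocity_bonus=0.5, damping=0.85, iterations=50):
    """PageRank-style influence on a follower graph.

    follows: {user: set of users that this user follows}.
    A user gains influence from each follower, in proportion to that
    follower's own influence; edges that are followed back get extra weight.
    Influence held by users who follow nobody is not redistributed, so the
    scores are relative and need not sum to 1.
    """
    users = set(follows) | {v for targets in follows.values() for v in targets}
    score = {u: 1.0 / len(users) for u in users}
    for _ in range(iterations):
        new_score = {u: (1.0 - damping) / len(users) for u in users}
        for follower, followees in follows.items():
            if not followees:
                continue
            # Edge weight 1, plus a bonus when the followee follows back.
            weights = {
                v: 1.0 + (reciprocity_bonus if follower in follows.get(v, ()) else 0.0)
                for v in followees
            }
            total = sum(weights.values())
            for v, w in weights.items():
                new_score[v] += damping * score[follower] * w / total
        score = new_score
    return score


def user_weight(influence, participation, quality, alpha=0.6):
    """Weighted sum of influence and active degree.

    Active degree is taken here as participation * quality; all inputs
    are assumed to be normalized to the range [0, 1].
    """
    active_degree = participation * quality
    return alpha * influence + (1.0 - alpha) * active_degree
```

In a three-user graph where `a` and `b` both follow `c`, and `c` follows `a` back, this sketch ranks `c` first, `a` second, and `b` last, since `b` has no followers at all.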
Funding: Financial support from the National Natural Science Foundation of China (No. 72073019)
Abstract: With the advent of the new media era, government social media have become an important paradigm for social governance. We perform a large-sample regression and find that the higher the quality of taxation bureaus' operation of government social media, the lower the degree of local enterprises' tax avoidance. This effect works by reducing tax avoidance incentives and increasing the difficulty of committing tax avoidance. Moreover, government social media act as a substitute for tax enforcement and administration. We also find that government social media should focus on strengthening their official, formal, and professional characteristics. Given the significant recent changes in how enterprises handle taxation, the proportion of information that taxation bureaus post on system operation should be appropriately increased.
Funding: Science & Technology Department of Sichuan Province (No. 21ZDYF2090).
Abstract: In response to COVID-19, social media big data has played an important role in epidemic warning, tracking the source of infection, and public opinion monitoring, providing strong technical support for China's epidemic prevention and control work. This paper used Sina Weibo posts related to COVID-19 hashtags as the data source and built a BERT-CNN deep learning model to perform fine-grained, high-precision topic classification on massive social media posts. Taking Shenzhen as the region of interest, we mined the "epidemic data bulletin" and "daily life impact" posts published during the epidemic for spatial analysis. The results show that the confirmed communities and designated hospitals in Shenzhen as a whole follow a "sparse east and dense west" pattern, and that there is a strong positive spatial correlation between the number of confirmed cases and the social media response. Specifically, Nanshan District, Futian District, and Luohu District have more confirmed cases due to large population movements and dense transportation networks; there, social media has responded more intensely and people's lives have been greatly affected. In contrast, Yantian District, Pingshan District, and Dapeng New District showed the opposite characteristics. The case study results further show that using deep learning methods to mine text information in social media is scientifically feasible for improving situational awareness and decision support during the COVID-19 epidemic.