Abstract: The user's intent to seek online information has been an active area of research in user profiling. User profiling considers user characteristics, behaviors, activities, and preferences to sketch user intentions, interests, and motivations. Determining user characteristics can help capture implicit and explicit preferences and intentions for effective user-centric and customized content presentation. The user's complete online experience in seeking information is a blend of activities such as searching, verifying, and sharing it on social platforms. However, a combination of multiple behaviors in profiling users has yet to be considered. This research takes a novel approach and explores user intent types based on multidimensional online behavior in information acquisition. It explores information search, verification, and dissemination behavior and identifies diverse types of users based on their online engagement using machine learning. The research proposes a generic user profile template that explains user characteristics based on internet experience and uses it as ground truth for data annotation. User feedback is based on online behavior and practices collected using a survey method. The participants include both males and females from different occupation sectors and different ages. The collected data is subjected to feature engineering, and the significant features are presented to unsupervised machine learning methods to identify user intent classes or profiles and their characteristics. Different techniques are evaluated, and the K-Means clustering method successfully generates five user groups with different user characteristics, with an average silhouette of 0.36 and a distortion score of 1136. Feature averages are computed to identify user intent type characteristics. The user intent classes are then further generalized to create a user intent template with an Inter-Rater Reliability of 75%. This research successfully extracts different user types based on their preferences in online content, platforms,
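The average silhouette reported in this abstract can be illustrated with a minimal sketch. The toy points, the labels, and the function name `mean_silhouette` are assumptions made for illustration; they are not the study's survey data or pipeline. The code computes the standard mean silhouette coefficient with Euclidean distance for a given cluster assignment.

```python
from math import dist

def mean_silhouette(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention: singleton clusters score 0
            continue
        # a: mean distance to the other members of the point's own cluster
        # (dist(p, p) is 0, so summing over the whole cluster is safe)
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(p, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

# Hypothetical toy data: two tight, well-separated groups.
toy_points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
good = mean_silhouette(toy_points, [0, 0, 1, 1])
bad = mean_silhouette(toy_points, [0, 1, 0, 1])
print(f"good labels: {good:.3f}, bad labels: {bad:.3f}")
```

Values near 1 indicate compact, well-separated clusters, and values near 0 indicate overlapping ones, so a score such as 0.36 suggests moderate separation.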
Abstract: This study examines the database search behaviors of individuals, focusing on gender differences and the impact of planning habits on information retrieval. Data were collected from a survey of 198 respondents, categorized by their discipline, schooling background, internet usage, and information retrieval preferences. Key findings indicate that females are more likely to plan their searches in advance and prefer structured methods of information retrieval, such as using library portals and leading university websites. Males, however, tend to use web search engines and self-archiving methods more frequently. This analysis provides valuable insights for educational institutions and libraries to optimize their resources and services based on user behavior patterns.
Funding: This work was supported by the National Grand Fundamental Research of China (Grant No. G1999032706).
Abstract: In this paper, we first study the distribution characteristics of user behaviors based on log data from a massive web search engine. Analysis shows that the stochastic distribution of user queries follows a power-law function and exhibits strong similarity, and that users' queries and clicked URLs present dramatic locality, which implies that a query cache and a 'hot click' cache can be employed to improve system performance. Three typical cache replacement policies are then compared: LRU, FIFO, and LFU with attenuation. In addition, the distribution characteristics of web information are analyzed, which demonstrates that the link popularity and replica popularity of a URL have a positive influence on its importance. Finally, the variance between link popularity and user popularity, and the variance between replica popularity and user popularity, are analyzed, which gives important insight for improving the ranking algorithms in a search engine.
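The link between query locality and caching can be sketched as follows. This is a minimal illustration under stated assumptions: the stream is synthetic (a Zipf-like sampler standing in for the power-law query distribution), the names `hit_rate` and `zipf_stream` and all parameter values are hypothetical, and only LRU and FIFO are shown; the paper's LFU with attenuation is not reproduced because the abstract does not specify it.

```python
import random
from collections import OrderedDict

def hit_rate(stream, capacity, policy):
    """Replay a query stream through a fixed-size cache; return the hit fraction."""
    if policy not in ("LRU", "FIFO"):
        raise ValueError("policy must be 'LRU' or 'FIFO'")
    if capacity < 1 or not stream:
        raise ValueError("need capacity >= 1 and a non-empty stream")
    cache = OrderedDict()  # keys ordered from next-to-evict to last-to-evict
    hits = 0
    for query in stream:
        if query in cache:
            hits += 1
            if policy == "LRU":
                cache.move_to_end(query)  # a hit refreshes recency under LRU only
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the entry at the front
            cache[query] = True
    return hits / len(stream)

def zipf_stream(n_queries, vocab_size, exponent=1.0, seed=0):
    """Synthetic power-law stream: rank r is drawn with weight 1 / r**exponent."""
    rng = random.Random(seed)
    weights = [1.0 / (rank ** exponent) for rank in range(1, vocab_size + 1)]
    return rng.choices(range(vocab_size), weights=weights, k=n_queries)

stream = zipf_stream(n_queries=20_000, vocab_size=5_000)
for policy in ("LRU", "FIFO"):
    print(policy, round(hit_rate(stream, capacity=200, policy=policy), 3))
```

The two policies differ only in whether a hit moves the entry to the back of the eviction order, which is why a single ordered map is enough for both.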
Abstract: Modern search engines record user interactions and use them to improve search quality. In particular, user click-through has been successfully used to improve click-through rate (CTR), Web search ranking, and query recommendations and suggestions. Although click-through logs can provide implicit feedback on users' click preferences, deriving accurate absolute relevance judgments is difficult because of click noise and behavior biases. Previous studies showed that user clicking behaviors are biased in many respects, such as "position" (a user's attention decreases from top to bottom) and "trust" (Web site reputation affects a user's judgment). To address these problems, researchers have proposed several behavior models (usually referred to as click models) to describe users' practical browsing behaviors and to obtain an unbiased estimation of result relevance. In this study, we review recent efforts to construct click models for better search ranking and propose a novel convolutional neural network architecture for building click models. Compared to traditional click models, our model not only considers user behavior assumptions as input signals but also uses the content and context information of search engine result pages. In addition, our model uses parameters from traditional click models to restrict the meaning of some outputs in our model's hidden layer. Experimental results show that the proposed model can achieve considerable improvement over state-of-the-art click models based on the evaluation metric of click perplexity.
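Click perplexity, the metric named in this abstract, can be sketched as below. This follows the commonly used definition for a single result position, 2 raised to the negative mean log2-likelihood of the observed clicks. The function name, the `eps` clamp, and the sample inputs are assumptions for illustration, and the abstract does not state whether the paper averages over positions in the same way.

```python
from math import log2

def click_perplexity(clicks, predicted_probs, eps=1e-12):
    """Perplexity of predicted click probabilities at one result position.

    clicks: observed 0/1 click indicators, one per session.
    predicted_probs: model-predicted click probability for each session.
    """
    if len(clicks) != len(predicted_probs) or not clicks:
        raise ValueError("inputs must be non-empty and of equal length")
    log_likelihood = 0.0
    for c, q in zip(clicks, predicted_probs):
        q = min(max(q, eps), 1.0 - eps)  # clamp to avoid log2(0)
        log_likelihood += log2(q) if c else log2(1.0 - q)
    return 2.0 ** (-log_likelihood / len(clicks))

# Hypothetical sessions: an uninformative model versus a sharper one.
observed = [1, 0, 0, 1]
print(click_perplexity(observed, [0.5, 0.5, 0.5, 0.5]))
print(click_perplexity(observed, [0.9, 0.1, 0.2, 0.8]))
```

A perfect predictor reaches a perplexity of 1 and a model that always predicts 0.5 reaches 2, so lower values are better.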
Abstract: User behavior analysis has become one of the most important research topics, especially in terms of performance optimization, architecture analysis, and system maintenance, due to the rapid growth in the number of search engine users. By adequately analyzing log data, researchers and Internet companies can obtain guidance for improving search engines. In this paper, we base our analysis on approximately 750 million search request entries obtained from the logs of a real commercial search engine. Several aspects of user behavior are studied, including query length, ratio of query refining, recommendation access, and so on. Different information needs may lead to different behaviors, and we discuss this in the paper. We firmly believe that these analyses will help improve both the effectiveness and the efficiency of search engines.
Funding: The National Natural Science Foundation of China (Grant Nos. 61672368, 61373097, and 61672367), the Research Foundation of the Ministry of Education and China Mobile (MCM20150602), and the Science and Technology Plan of Jiangsu (BK20151222).
Abstract: We present a very different cause of search engine user behaviors: fascination. It is generally identified as the initial effect of a product attribute on users' interest and purchase intentions. Considering that in most cases the cursor is driven directly by a hand via a mouse (or touchpad), we use cursor movement as the critical feature for analyzing the personal reaction to fascinating search results. This paper provides a deep insight into the goal-directed cursor movement that occurs within a remarkably short period of time (<30 milliseconds), which is the interval between a user's click-through and decision-making behaviors. Instead of the fundamentals, we focus on revealing the characteristics of the split-second cursor movement. Our empirical findings showed that a user may push or pull the mouse with slightly greater strength when fascinated by a search result. As a result, the cursor slides toward the search result with increased momentum. We model the momentum through a combination of translational and angular kinetic energy calculations. Based on Fitts' law, we implement goal-directed cursor movement identification. Supported by the momentum, together with other physical features, we built different fascination-based search result reranking systems. Our experiments showed that goal-directed cursor momentum is an effective feature in detecting fascination. In particular, they show feasibility in both the personalized and cross-media cases. In addition, we detail the advantages and disadvantages of both click-through rate and cursor momentum for re-ranking search results.
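The two physical ingredients mentioned in this abstract can be sketched roughly as follows. Everything specific here is an assumption: the abstract does not give the paper's formulas, so the sketch uses the Shannon formulation of Fitts' index of difficulty, treats the rate of heading change as angular velocity, and uses hypothetical unit `mass` and `inertia` constants because a cursor has no physical mass.

```python
from math import atan2, hypot, log2, pi

def fitts_index_of_difficulty(distance, width):
    """Shannon formulation of Fitts' index of difficulty, in bits."""
    return log2(distance / width + 1.0)

def kinetic_energies(samples, mass=1.0, inertia=1.0):
    """Translational and rotational kinetic energy between velocity estimates.

    samples: list of (t, x, y) cursor positions with strictly increasing t.
    mass, inertia: hypothetical unit constants (assumption, not from the paper).
    """
    # One velocity vector per consecutive pair of position samples.
    velocities = []
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        dt = t1 - t0
        velocities.append((t1, (x1 - x0) / dt, (y1 - y0) / dt))

    energies = []
    for (t0, vx0, vy0), (t1, vx1, vy1) in zip(velocities, velocities[1:]):
        speed = hypot(vx1, vy1)
        # Heading change between consecutive velocities, wrapped to [-pi, pi).
        turn = atan2(vy1, vx1) - atan2(vy0, vx0)
        turn = (turn + pi) % (2 * pi) - pi
        omega = turn / (t1 - t0)  # assumed angular velocity: heading change rate
        energies.append((0.5 * mass * speed ** 2, 0.5 * inertia * omega ** 2))
    return energies

# Hypothetical trajectory: move right, then make a right-angle turn upward.
trajectory = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (2.0, 1.0, 1.0)]
print(kinetic_energies(trajectory))
print(fitts_index_of_difficulty(distance=7.0, width=1.0))
```

A reranking feature in the spirit of the abstract could combine the two energy terms per movement segment, but how the paper weights them is not stated in the abstract.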