With the rapid development and widespread use of the Global Positioning System (GPS) in devices such as smartphones and touch pads, many people share their personal experiences through the trajectories they record while visiting places of interest. Trajectory query processing has therefore emerged in recent years to help users find their best trajectories. However, given the huge number of trajectory points and associated text descriptions, such as the activities users practice at these points, organizing the data in an index becomes cumbersome, and parallel processing becomes indispensable. In this paper, we investigate the problem of distributed trajectory query processing based on distance and frequent activities. A query is specified by the start and final points of the trajectory, a distance threshold, and a set of frequent activities associated with the points of interest along the trajectory. The query returns the shortest trajectory that includes the most frequent activities with high support and high confidence. To simplify query processing, we implemented the Distributed Mining Trajectory R-Tree index (DMTR-Tree). In this method, we first organize the large trajectory dataset into distributed R-Tree indexes. Then, within each index, we apply the Apriori frequent itemset algorithm at each point to select the frequent activity set. To speed up these algorithms, we use the Apache Spark cluster computing framework with MapReduce as the programming model. The experimental results show that the DMTR-Tree index and the query-processing algorithm are efficient and scale well.
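The frequent-activity selection step can be illustrated with a small single-machine sketch of Apriori. This is not the DMTR-Tree implementation: the function name `frequent_activity_sets`, the sample `visits` data, and the 0.5 support threshold are all assumptions made for illustration, and the distributed R-Tree and Spark layers are omitted.

```python
from itertools import combinations


def frequent_activity_sets(transactions, min_support):
    """Return {activity set: support} for sets with support >= min_support.

    `transactions` is a list of activity collections, e.g. the activities
    recorded at one point of interest on different visits (hypothetical input).
    """
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level 1 candidates: every single activity that occurs at all.
    candidates = {frozenset([a]) for t in transactions for a in t}
    frequent = {}
    k = 1
    while candidates:
        # Keep the candidates that appear in enough transactions.
        level = {}
        for c in candidates:
            support = sum(1 for t in transactions if c <= t) / n
            if support >= min_support:
                level[c] = support
        frequent.update(level)
        # Join frequent k-sets into (k+1)-sets and prune any candidate with
        # an infrequent k-subset (the Apriori property).
        k += 1
        candidates = {
            a | b
            for a in level
            for b in level
            if len(a | b) == k
            and all(frozenset(s) in level for s in combinations(a | b, k - 1))
        }
    return frequent


# Hypothetical activity sets observed at a single point of interest.
visits = [
    {"coffee", "shopping"},
    {"coffee", "shopping", "museum"},
    {"coffee", "museum"},
    {"shopping"},
]
result = frequent_activity_sets(visits, min_support=0.5)
```

With this sample data, `{"coffee", "shopping"}` is kept because it occurs in two of the four visits, while `{"shopping", "museum"}` falls below the threshold and is never extended to a three-activity candidate.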
Data mining is a powerful emerging technology that helps extract hidden information from large volumes of historical data. This paper is concerned with finding the frequent trajectories of moving objects in spatio-temporal data using a novel method that combines clustering and sequential pattern mining. The algorithms logically split the area spanned by the trajectories into clusters and then apply the k-means algorithm to these clusters until the squared error is minimized. The new method applies a threshold to obtain the active clusters and arranges them in descending order of the number of trajectories passing through them. From these active clusters, inter-cluster patterns are found with a sequential pattern mining technique. The process is repeated until all the active clusters are linked. The clusters thus linked in sequence are the frequent trajectories. A set of experiments conducted on real datasets shows that the proposed method performs roughly five times better than existing methods. The results are compared with those of other algorithms, and the variation is analyzed with statistical methods. Further, tests of significance are conducted with ANOVA to find the most effective threshold value for plotting frequent trajectories. The results are analyzed and found to be superior to those of existing methods. With minor alterations, this approach may be relevant to finding alternate paths in busy networks (congestion control), finding the frequent paths of migratory birds, finding the frequent paths of balls in certain games, or even predicting the next level of pattern characteristics in time series data.
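The thresholding and linking steps can be sketched as follows, assuming the clustering has already been done so that each trajectory is a sequence of cluster ids. The function names, the sample `paths`, and the use of simple consecutive-transition counts in place of a full sequential pattern miner are assumptions for illustration, not the paper's algorithm.

```python
from collections import Counter


def active_clusters(cluster_paths, min_trajectories):
    """Rank clusters by how many distinct trajectories pass through them."""
    counts = Counter()
    for path in cluster_paths:
        counts.update(set(path))  # each trajectory counts once per cluster
    active = [(c, n) for c, n in counts.items() if n >= min_trajectories]
    # Descending by trajectory count; ties broken by cluster id.
    return sorted(active, key=lambda cn: (-cn[1], cn[0]))


def frequent_transitions(cluster_paths, active_ids, min_trajectories):
    """Count ordered moves between consecutive active clusters."""
    counts = Counter()
    for path in cluster_paths:
        kept = [c for c in path if c in active_ids]
        # set(): a trajectory supports a given transition at most once.
        counts.update(set(zip(kept, kept[1:])))
    return {pair: n for pair, n in counts.items() if n >= min_trajectories}


# Hypothetical trajectories, already mapped to cluster ids.
paths = [[0, 1, 2], [0, 1, 3], [1, 2], [0, 1, 2, 2]]
ranked = active_clusters(paths, min_trajectories=2)
links = frequent_transitions(paths, {c for c, _ in ranked}, min_trajectories=2)
```

Here cluster 3 is dropped as inactive, and the surviving links 0 to 1 and 1 to 2 chain into the frequent path 0, 1, 2.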
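To meet the demands of future mobile networks (small cells, frequent handovers, and support for large numbers of users and multimedia applications), this paper analyzes location prediction and handover in depth and proposes HDLP (Handover Decision based on Location Prediction), a handover scheme based on location prediction. Its basic idea is to: (1) mine frequent trajectories from the large volume of historical movement trajectory data of mobile users; (2) generate movement rules from the set of mined frequent trajectories; and (3) apply the movement rules to handover decisions in cellular mobile communication. Simulation results show that, compared with traditional handover schemes, the proposed algorithm reduces the number of unnecessary handovers, lowers the erroneous handover rate, and improves handover accuracy, which in turn reduces communication cost to some extent and improves the capacity and QoS of the communication system.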
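The three HDLP steps can be illustrated with a minimal sketch in which movement rules are reduced to first-order "current cell to next cell" transitions with a confidence value. The function names, the sample `histories`, and the 0.6 confidence threshold are assumptions for illustration; the paper's actual rule format and decision logic are not given in the abstract.

```python
from collections import Counter, defaultdict


def mine_movement_rules(cell_histories, min_confidence):
    """Derive rules 'current cell -> most likely next cell' from histories."""
    transitions = defaultdict(Counter)
    for history in cell_histories:
        for cur, nxt in zip(history, history[1:]):
            transitions[cur][nxt] += 1
    rules = {}
    for cur, nexts in transitions.items():
        nxt, count = nexts.most_common(1)[0]
        confidence = count / sum(nexts.values())
        if confidence >= min_confidence:
            rules[cur] = (nxt, confidence)
    return rules


def should_hand_over(rules, current_cell, candidate_cell):
    """Accept a handover only if a rule predicts the candidate cell."""
    predicted = rules.get(current_cell)
    return predicted is not None and predicted[0] == candidate_cell


# Hypothetical historical cell sequences of mobile users.
histories = [
    ["A", "B", "C"],
    ["A", "B", "D"],
    ["A", "B", "C"],
    ["E", "B", "C"],
]
rules = mine_movement_rules(histories, min_confidence=0.6)
```

With this data, a user in cell B is predicted to move to C (three of four observed moves), so a handover to C is accepted while one to D is treated as likely unnecessary.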
Funding (DMTR-Tree trajectory query study): Partially supported by the National Natural Science Foundation of China (Nos. U1509216 and 61472099), the National Sci-Tech Support Plan (No. 2015BAH10F01), the Scientific Research Foundation for the Returned Overseas Chinese Scholars of Heilongjiang Province (No. LC2016026), and the MOE-Microsoft Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology.
Funding (frequent trajectory mining study): Research supported by a TATA Consultancy Services scholarship.