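Abstract: This paper proposes an outlier measure for categorical data streams, the weighted frequent pattern outlier factor (WFPOF), and builds on it a fast outlier detection algorithm for data streams, FODFP-Stream (fast outlier detection for high dimensional categorical data streams based on frequent pattern). The algorithm computes the outlier degree by dynamically discovering and maintaining frequent patterns. It handles high-dimensional categorical data streams effectively and can be extended to numerical and mixed-attribute data streams. Experiments on both synthetic and real data sets confirm that the algorithm is applicable and effective.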
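The abstract does not give the WFPOF formula, so the following is only a minimal sketch of the general idea of scoring a record by the frequent patterns it contains. The weighting by pattern length, the brute-force pattern enumeration over a static batch, and the names `frequent_patterns` and `weighted_fp_score` are all assumptions made for illustration; they are not the paper's definitions, and the streaming maintenance of patterns is not shown.

```python
from collections import Counter
from itertools import combinations


def frequent_patterns(records, min_support, max_len=2):
    """Brute-force frequent itemsets (up to max_len items) with their support."""
    counts = Counter()
    for rec in records:
        items = sorted(rec)
        for k in range(1, max_len + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1
    n = len(records)
    return {p: c / n for p, c in counts.items() if c / n >= min_support}


def weighted_fp_score(record, patterns):
    """Length-weighted share of frequent patterns contained in the record.

    Assumption: weight = pattern length. A low score means the record
    matches few frequent patterns and is therefore more outlier-like.
    """
    if not patterns:
        return 0.0
    matched = sum(sup * len(p) for p, sup in patterns.items() if p <= record)
    norm = sum(len(p) for p in patterns)
    return matched / norm


# Records are sets of (attribute, value) pairs.
normal = frozenset({("color", "red"), ("shape", "round")})
odd = frozenset({("color", "blue"), ("shape", "square")})
records = [normal] * 4 + [odd]

patterns = frequent_patterns(records, min_support=0.5)
normal_score = weighted_fp_score(normal, patterns)
odd_score = weighted_fp_score(odd, patterns)
```

Here the record that shares no frequent pattern with the rest of the batch receives the lowest score, which is the behaviour a frequent-pattern-based outlier factor is meant to capture.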
Funding: the National Natural Science Foundation of China (Grant no. 40631006); the National Major Science Project of China for Global Change Research (Grant no. 2010CB951403)
Abstract: Monthly mean sea ice motion vectors and monthly mean sea level pressure (SLP) for the period 1979-2006 are investigated to understand the spatial and temporal changes of Arctic sea-ice drift. Based on the distinct differences in the monthly mean ice velocity field and in the distribution of SLP, there are four primary drift types in the Arctic Ocean: Beaufort Gyre+Transpolar Drift, Anticyclonic Drift, Cyclonic Drift and Double Gyre Drift. These four types account for 81% of the total and show distinct seasonal variations. The Cyclonic Drift, with a large-scale anticlockwise ice motion pattern, tends to prevail in summer, while the Anticyclonic Drift, with the opposite pattern, tends to prevail in winter and spring. The prevailing seasons for the Beaufort Gyre+Transpolar Drift are spring and autumn, while the Double Gyre Drift tends to prevail in winter, especially in February. The annual numbers of occurrences of the Anticyclonic Drift and the Cyclonic Drift are closely correlated with the yearly mean Arctic Oscillation (AO) index, with correlation coefficients of -0.54 and 0.54, respectively (both significant at the 99% confidence level). When the AO index stays in a high positive (negative) condition, the sea-ice motion in the Arctic Ocean shows a more anticlockwise (clockwise) drifting pattern as a whole. When the AO index stays in a neutral condition, the sea-ice motion becomes much more complicated and more transitional types tend to occur.
Funding: Supported by Ministry of Science and Technology of China (2009CB825200); Joint Funds of National Natural Science Foundation of China (11079008, 11121092); Natural Science Foundation of China (10905091); SRF for ROCS of SEM
Abstract: A pattern-matching-based tracking algorithm, named MdcPatRec, is used for the reconstruction of charged tracks in the drift chamber of the BESIII detector. This paper addresses the shortcomings of segment finding in the MdcPatRec algorithm. An extended segment construction scheme and the corresponding pattern dictionary are presented. Evaluation with Monte Carlo and experimental data shows that the new method can achieve higher efficiency for low transverse momentum tracks.
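Abstract: A data stream is continuous and changes rapidly over time, so concept drift arises when frequent patterns are mined from it. In some data stream applications, the most recent data is generally considered the most valuable. Data stream mining produces a large number of useless patterns; to reduce them while guaranteeing lossless compression, closed patterns need to be mined. The paper therefore proposes TDMCS (Time-Decay-Model-based Closed frequent pattern mining on data Stream), a closed pattern mining method for data streams based on a time decay model and a closure operator. The algorithm uses the time decay model to differentiate the weights of historical and recent transactions within the sliding window, uses the closure operator to improve the efficiency of closed pattern mining, adopts a three-layer framework of minimum support, maximum error rate and decay factor to avoid concept drift, and designs a mean decay factor to balance high recall against high precision. Experimental analysis shows that the algorithm is suited to mining dense data streams with long patterns, is efficient, performs stably across different sliding window sizes, and outperforms comparable algorithms.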
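The time decay idea in the abstract above can be illustrated with a small sketch. It covers only the decay step, in which older transactions contribute less to a count than recent ones; the closure operator, the sliding window and the three-layer framework are not shown. The function name `decayed_counts`, the restriction to single items, and the choice to apply one fixed decay factor per arriving transaction are assumptions for illustration, not the TDMCS procedure itself.

```python
def decayed_counts(transactions, decay):
    """Time-decayed item counts over a stream of transactions.

    Each time a new transaction arrives, every existing count is
    multiplied by `decay` (assumed 0 < decay <= 1), so a transaction
    that arrived k steps ago contributes decay**k to the count.
    """
    counts = {}
    for tx in transactions:
        # Age all counts accumulated so far.
        for item in counts:
            counts[item] *= decay
        # The newest transaction enters with full weight 1.0.
        for item in tx:
            counts[item] = counts.get(item, 0.0) + 1.0
    return counts


stream = [{"a"}, {"a"}, {"b"}]
counts = decayed_counts(stream, decay=0.5)
```

With a decay factor of 0.5, the item seen only in the newest transaction ends up with a higher decayed count than the item seen twice earlier, which is how such a model favours recent data.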