Funding: the Fundação para a Ciência e a Tecnologia [UIDB/50021/2020].
Abstract: Global meteorological data are now widely used in many areas, but one of their applications, weather analogues, still requires exhaustive searches over the whole historical record. We present two optimisations for state-of-the-art weather analogue search algorithms: a parallelization and a heuristic search. The heuristic search (NDRank) limits the final number of results and performs initial searches on a lower-resolution dataset to find candidates that are locally validated in a second phase. These optimisations were deployed in the Cloud and evaluated with ERA5 data from ECMWF. The proposed parallelization attained speedups close to optimal, and NDRank attained speedups higher than 4. NDRank can be applied to any parallel search, adding similar speedups. A substantial number of executions returned a set of analogues similar to that of the existing exhaustive search, and most of the remaining results differed in numerical value by less than 0.1%. The results demonstrate that it is now possible to search for weather analogues faster (even compared with parallel searches) with little to no error. Furthermore, NDRank can be applied to existing exhaustive searches, providing faster results with a small reduction in precision.
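The two-phase idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' NDRank implementation: the block-averaging `downsample`, the RMSE distance, the shortlist size of `4 * k`, and all function names are assumptions made for illustration, and "local validation" is simplified here to re-scoring the shortlisted time steps at full resolution.

```python
from math import sqrt


def downsample(field, factor):
    """Block-average a 2D grid (list of rows) by an integer factor."""
    rows, cols = len(field), len(field[0])
    coarse = []
    for r in range(0, rows, factor):
        coarse_row = []
        for c in range(0, cols, factor):
            block = [field[i][j]
                     for i in range(r, min(r + factor, rows))
                     for j in range(c, min(c + factor, cols))]
            coarse_row.append(sum(block) / len(block))
        coarse.append(coarse_row)
    return coarse


def rmse(a, b):
    """Root-mean-square difference between two grids of equal shape."""
    diffs = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sqrt(sum(diffs) / len(diffs))


def exhaustive_search(query, history, k):
    """Baseline: score every historical field at full resolution."""
    scored = sorted((rmse(query, f), t) for t, f in enumerate(history))
    return [t for _, t in scored[:k]]


def two_phase_search(query, history, k, factor=2, n_candidates=None):
    """Phase 1 ranks on coarse grids; phase 2 re-scores the shortlist."""
    # Hypothetical shortlist size; the abstract does not state one.
    n_candidates = n_candidates or 4 * k
    coarse_query = downsample(query, factor)
    # A real system would precompute the coarse history once, not per query.
    coarse_scored = sorted(
        (rmse(coarse_query, downsample(f, factor)), t)
        for t, f in enumerate(history)
    )
    shortlist = [t for _, t in coarse_scored[:n_candidates]]
    # Full-resolution comparison only for the shortlisted time steps.
    fine_scored = sorted((rmse(query, history[t]), t) for t in shortlist)
    return [t for _, t in fine_scored[:k]]
```

In this sketch, phase 1 only pays off when the coarse grids are stored ahead of time, so that each query touches `1 / factor**2` of the data before the full-resolution check on a handful of candidates. The shortlist can miss a true analogue whose coarse distance is misleading, which is consistent with the small precision loss the abstract reports.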
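Abstract: The additive property of key performance indicators (KPIs) with multi-dimensional attributes makes it possible to locate the root cause of failures in large-scale Internet services. A change in KPI data caused by one or more anomalous root causes leads to changes in a large number of related KPI values. This paper proposes PASER (pruning search model based on anomaly similarity and effectiveness factor for root cause location), an anomaly localization model built on pruned search. Based on a multi-dimensional KPI anomaly propagation model, it introduces an anomaly potential score that measures how likely a candidate set is to be the root cause, together with an influence-based layer-by-layer pruning search algorithm that reduces the root cause localization time to about 5.3 s on average. In addition, comparative experiments were conducted on the accuracy and timeliness of the time series forecasting algorithms used in root cause localization. On the datasets used, PASER achieved an F-score of 0.99.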
Funding: Supported by the National Natural Science Foundation of China (No. 60873235 & 60473099), the Science-Technology Development Key Project of Jilin Province of China (No. 20080318), and the Program of New Century Excellent Talents in University of China (No. NCET-06-0300).
Abstract: The theory of nu-support vector regression (Nu-SVR) is employed to model time series variation for prediction. To avoid the degradation in prediction performance caused by improper parameters, a parallel multidimensional step search (PMSS) method is proposed for selecting the best parameters when training a support vector machine to obtain a prediction model. A series of tests is performed to evaluate the modeling mechanism, and the prediction results indicate that Nu-SVR models can reflect the variation tendency of time series with low prediction error on both familiar and unfamiliar data. Statistical analysis is also employed to verify the optimization performance of the PMSS algorithm, and comparative results indicate that the training error reaches its minimum over the interval around the planar data point corresponding to the selected parameters. Moreover, the introduction of parallelization can remarkably speed up the optimization procedure.
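The abstract does not spell out how PMSS works, so the following is only one plausible reading of a "parallel multidimensional step search", given as a hedged sketch: evaluate a small grid around the current parameter point in parallel, move to the best grid point if it improves the error, and otherwise halve the step. The function name `step_search`, its arguments, and the halving rule are assumptions, and `error_fn` is a stand-in for training a Nu-SVR model and returning its error.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def step_search(error_fn, center, step, rounds=6, workers=4):
    """Search a parameter space by stepping toward lower error.

    error_fn: maps a parameter tuple to an error (stand-in for model training).
    center:   starting parameter tuple, one entry per dimension.
    step:     initial step size per dimension.
    """
    best_point = tuple(center)
    best_error = error_fn(best_point)
    step = list(step)
    # Threads keep the sketch simple; CPU-bound training would need
    # a process pool to gain a real speedup.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(rounds):
            # Candidates: current point minus, at, and plus one step per dimension.
            axes = [(x - s, x, x + s) for x, s in zip(best_point, step)]
            grid = list(product(*axes))
            # All grid points are independent, so they are evaluated in parallel.
            errors = list(pool.map(error_fn, grid))
            round_error, round_point = min(zip(errors, grid))
            if round_error < best_error:
                best_error, best_point = round_error, round_point
            else:
                # No neighbour improves on the centre: refine the step.
                step = [s / 2 for s in step]
    return best_point, best_error
```

For example, with a toy quadratic error whose minimum lies at `(3.0, 0.5)`, calling `step_search(lambda p: (p[0] - 3) ** 2 + (p[1] - 0.5) ** 2, (0.0, 0.0), (1.0, 0.25))` walks to that point in three moves and then only shrinks the step. With a real model, the two dimensions might be parameters such as `C` and `nu`, which is again an assumption rather than something the abstract states.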